Glad that our paper has been accepted to NeurIPS 2025! With gradient variance minimization (GVM), we balance the training data by difficulty and by each example's contribution to the model, and we achieve improvements on math reasoning. Please check the original post for more details.
(1/5) Super excited to release our new paper on Reinforcement Learning:
"Self-Aligned Reward: Towards Effective and Efficient Reasoners"!
Preprint: arxiv.org/pdf/2509.05489
🤝 Can LLM agents really understand us?
We introduce UserBench: a user-centric gym environment for benchmarking how well agents align with nuanced human intent, not just follow commands.
📄 arxiv.org/pdf/2507.22034
💻 github.com/SalesforceAIRe…
(1/4)🚨 Introducing Goedel-Prover V2 🚨
🔥🔥🔥 The strongest open-source theorem prover to date.
🥇 #1 on PutnamBench: Solves 64 problems with far less compute.
🧠 New SOTA on MiniF2F:
* 32B model hits 90.4% at Pass@32, beating DeepSeek-Prover-V2-671B’s 82.4%.
* 8B > 671B: Our 8B…
Reward models (RMs) are key to language model post-training and inference pipelines. But little is known about the relative pros and cons of different RM types.
📰 We investigate why RMs implicitly defined by language models (LMs) often generalize worse than explicit RMs
🧵
1/6
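For intuition, the "RMs implicitly defined by language models" above are DPO-style rewards read off the LM's own log-probabilities, as opposed to an explicit scalar reward head. A minimal sketch with made-up numbers (`BETA` and all values below are illustrative, not from the paper):

```python
# Toy contrast between an implicit RM (reward defined by the LM's own
# log-probabilities, DPO-style) and an explicit RM (a separate scalar head).
# BETA and every number below are illustrative, not values from the paper.

BETA = 0.1  # hypothetical scaling coefficient

def implicit_reward(logp_policy: float, logp_ref: float, beta: float = BETA) -> float:
    """Implicit reward: beta times the policy-vs-reference log-prob ratio."""
    return beta * (logp_policy - logp_ref)

def explicit_reward(features: list[float], weights: list[float]) -> float:
    """Explicit RM: a learned scalar head over response features."""
    return sum(f * w for f, w in zip(features, weights))

# A response the policy likes more than the reference gets positive reward.
print(implicit_reward(logp_policy=-10.0, logp_ref=-12.0))  # ≈ 0.2
```

The implicit reward needs no extra parameters, which is convenient, but it is tied to the policy's own likelihoods; the thread's question is why that tends to generalize worse than a dedicated head.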
🎥 Video is already a tough modality for reasoning. Egocentric video? Even tougher! It is longer, messier, and harder.
💡 How do we tackle these extremely long, information-dense sequences without exhausting GPU memory or hitting API limits?
We introduce 👓Ego-R1: A framework…
Can LLMs make rational decisions like human experts?
📖Introducing DecisionFlow: Advancing Large Language Model as Principled Decision Maker
We introduce a novel framework that constructs a semantically grounded decision space to evaluate trade-offs in hard decision-making…
(1/5) Want to make your LLM a skilled persuader?
Check out our latest paper: "ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind"!
For details:
📄arXiv: arxiv.org/pdf/2505.22961
🛠️GitHub: github.com/ulab-uiuc/ToMAP
📢 New Paper Drop: From Solving to Modeling!
LLMs can solve math problems — but can they model the real world? 🌍
📄 arXiv: arxiv.org/pdf/2505.15068
💻 Code: github.com/qiancheng0/Mod…
Introducing ModelingAgent, a breakthrough system for real-world mathematical modeling with LLMs.
How can we improve test-time scalability?
- Separate thinking & solution phases to control performance under budget constraints
- Budget-Constrained Rollout + GRPO
- Outperforms baselines on math/code
- Cuts token usage by 30% without hurting performance
huggingface.co/papers/2505.05…
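As a rough illustration of the two-phase idea, here is a toy decode loop that hard-caps the thinking phase and then forces the solution phase to begin. The `generate` helper, tag format, and budgets are hypothetical stand-ins, not the paper's implementation:

```python
# Toy two-phase decode loop: thinking is hard-capped, then the solution
# phase is forced to start. `generate` is a hypothetical stand-in for a
# real LLM call; the tags and budgets are illustrative, not the paper's.

THINK_BUDGET = 256  # hypothetical cap on thinking tokens
ANSWER_BUDGET = 64

def generate(prompt: str, stop: str, max_tokens: int) -> str:
    # Stand-in: a real implementation would call the model here.
    return "...reasoning..."[:max_tokens]

def budgeted_solve(question: str) -> str:
    # Phase 1: thinking, truncated at THINK_BUDGET no matter what.
    thinking = generate(question + "\n<think>", stop="</think>",
                        max_tokens=THINK_BUDGET)
    # Phase 2: the solution starts even if thinking was cut off, so the
    # final answer always fits within the overall budget.
    prompt = f"{question}\n<think>{thinking}</think>\n<answer>"
    return generate(prompt, stop="</answer>", max_tokens=ANSWER_BUDGET)

print(budgeted_solve("What is 2 + 2?"))
```

Separating the phases is what makes the budget controllable: the answer phase is never starved by an overlong chain of thought.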
🚀 Can we cast reward modeling as a reasoning task?
📖 Introducing our new paper:
RM-R1: Reward Modeling as Reasoning
📑 Paper: arxiv.org/pdf/2505.02387
💻 Code: github.com/RM-R1-UIUC/RM-…
Inspired by recent advances in long chain-of-thought (CoT) approaches to reasoning-intensive tasks, we…
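One way to picture "reward modeling as reasoning" is a judge model that writes a critique before committing to a final verdict, which the pipeline then parses out. The `<verdict>` tag format below is an illustrative assumption, not the paper's exact protocol:

```python
import re

# Sketch of a reasoning-style reward model: the judge emits a free-form
# critique followed by a final verdict tag, and we parse the verdict out.
# The <verdict> tag format is an illustrative assumption.

def parse_verdict(judge_output: str) -> str:
    """Extract the final preference from a reasoning-style judge output."""
    m = re.search(r"<verdict>\s*([AB])\s*</verdict>", judge_output)
    if m is None:
        raise ValueError("no verdict found in judge output")
    return m.group(1)

sample = (
    "Response A cites the theorem correctly; Response B drops a sign "
    "in step 3, so its final answer is wrong.\n<verdict>A</verdict>"
)
print(parse_verdict(sample))  # A
```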
We introduce Gradient Variance Minimization (GVM)-RAFT, a principled dynamic sampling strategy that minimizes gradient variance to improve the efficiency of chain-of-thought (CoT) training in LLMs.
– Achieves 2–4× faster convergence than RAFT
– Improves accuracy on math…
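A toy sketch of the dynamic-sampling idea, assuming per-prompt reward variance as the gradient-variance proxy; the allocation rule below is illustrative, not the paper's exact estimator:

```python
# Toy GVM-style dynamic sampling: split a fixed rollout budget across
# prompts in proportion to an estimated gradient-variance proxy (here,
# observed reward variance per prompt). All numbers are illustrative.

def allocate_rollouts(variance_estimates: list[float], total_budget: int,
                      min_per_prompt: int = 1) -> list[int]:
    """Give high-variance prompts more samples, keeping the total fixed."""
    n = len(variance_estimates)
    total_var = sum(variance_estimates) or 1.0
    spare = total_budget - min_per_prompt * n
    alloc = [min_per_prompt + int(spare * v / total_var)
             for v in variance_estimates]
    # Hand any rounding leftovers to the highest-variance prompt.
    alloc[max(range(n), key=lambda i: variance_estimates[i])] += total_budget - sum(alloc)
    return alloc

print(allocate_rollouts([0.9, 0.1, 0.0], total_budget=16))  # [13, 2, 1]
```

Prompts whose gradients are already low-variance (too easy or fully solved) get the minimum, so sampling effort concentrates where it reduces variance most.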
Thrilled to announce that our paper Sparse VideoGen got into #ICML2025! 🎉
Our new approach speeds up video generation by 2×. Details in the thread/paper.
Huge thanks to my collaborators!
Blog: svg-project.github.io
Paper: arxiv.org/abs/2502.01776
Code:…
Thrilled to share my first project at NVIDIA! ✨
Today’s language models are pre-trained on vast and chaotic Internet texts, but these texts are unstructured and poorly understood. We propose CLIMB — Clustering-based Iterative Data Mixture Bootstrapping — a fully automated…
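The clustering step behind a CLIMB-style pipeline can be pictured as k-means over document embeddings, after which a mixture over the resulting clusters is searched iteratively. The 2-D "embeddings" below are fake and the iterative search loop is omitted:

```python
import random

# Rough sketch of the clustering step: embed documents, k-means them into
# topical clusters, then search a mixture over clusters. The 2-D points
# stand in for real embeddings; the search loop is not shown.

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                  + (p[1] - centers[c][1]) ** 2)
            groups[j].append(p)
        # Recompute each center as the mean of its group.
        centers = [(sum(x for x, _ in g) / len(g),
                    sum(y for _, y in g) / len(g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

docs = [(0.1, 0.2), (0.0, 0.1), (5.0, 5.1), (5.2, 4.9)]
centers, groups = kmeans(docs, k=2)
print(sorted(len(g) for g in groups))  # [2, 2]
```

Once documents are grouped this way, the mixture weights over clusters become a small, searchable space rather than per-document decisions.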