Multi-Agent Debate (MAD) has been hyped as a collaborative reasoning paradigm — but let me drop the bomb: majority voting, without any debate, often performs on par with MAD.
This is what we formally prove in our #NeurIPS2025 Spotlight paper:
“Debate or Vote: Which Yields…
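For reference, the majority-voting baseline mentioned above can be sketched in a few lines of Python. This is a minimal illustration and not the paper's code: the function name `majority_vote` and the sample answers are hypothetical, and each agent is assumed to return its final answer as a plain string.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among independent agent outputs.

    Ties go to the answer that appeared first, because Counter.most_common
    keeps insertion order for equal counts.
    """
    if not answers:
        raise ValueError("need at least one answer to vote on")
    counts = Counter(answers)
    return counts.most_common(1)[0][0]


# Hypothetical final answers from five agents, with no debate rounds.
agent_answers = ["42", "41", "42", "40", "42"]
print(majority_vote(agent_answers))
```

No agent sees another agent's output here, which is the contrast with debate: the only aggregation step is the final count.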
There are 70+ "reasoning" papers accepted at COLM 2025 (Oct 7-10, Montreal). Most papers elicit long reasoning for different tasks or understand the reasoning abilities/limitations of LLMs.
I wrote a blog post covering ~30 of those papers 👇
AI efficiency is important. Today, Google is sharing a technical paper detailing our comprehensive methodology for measuring the environmental impact of Gemini inference. We estimate that the median Gemini Apps text prompt uses 0.24 watt-hours of energy (equivalent to watching an…
Test-time scaling w/ GRPO boosts accuracy, but also adds “filler tokens” increasing length w/o real progress.
We present Group Filtered Policy Optimization (GFPO):🧵
1️⃣ Sample more per prompt
2️⃣ Rank by token efficiency (reward ÷ length)
3️⃣ Train on top-k
4️⃣ 🚀 Cut 80% of…
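The filtering steps listed above (sample a group, rank by reward ÷ length, keep the top-k) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the `Sample` class, the `filter_top_k` helper, and the toy rewards and lengths are all hypothetical, and the policy update on the retained samples is not shown.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    text: str        # sampled response (placeholder label here)
    reward: float    # scalar reward for the response
    num_tokens: int  # response length in tokens


def token_efficiency(sample):
    """Reward per token; the max() guards against a zero-length response."""
    return sample.reward / max(sample.num_tokens, 1)


def filter_top_k(samples, k):
    """Rank a sampled group by token efficiency and keep the k best."""
    ranked = sorted(samples, key=token_efficiency, reverse=True)
    return ranked[:k]


# Hypothetical group of responses sampled for one prompt.
group = [
    Sample("short correct", reward=1.0, num_tokens=100),
    Sample("long correct", reward=1.0, num_tokens=400),
    Sample("short wrong", reward=0.0, num_tokens=50),
    Sample("medium correct", reward=1.0, num_tokens=200),
]

kept = filter_top_k(group, k=2)
print([s.text for s in kept])  # ['short correct', 'medium correct']
```

In this toy group the long correct response is dropped even though its reward equals that of the shorter ones, which is how ranking by efficiency discourages filler tokens.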
OpenAI hasn’t open-sourced a base model since GPT-2 in 2019. they recently released GPT-OSS, which is reasoning-only...
or is it?
turns out that underneath the surface, there is still a strong base model. so we extracted it.
introducing gpt-oss-20b-base 🧵
I implemented GRPO and DPO from scratch in vanilla PyTorch to unravel every training detail. I hope it is helpful for those who care about the implementation details of these algorithms. 👉 github.com/mingyin0312/RL… #AI #RL #LLM
gpt-oss is out!
we made an open model that performs at the level of o4-mini and runs on a high-end laptop (WTF!!)
(and a smaller one that runs on a phone).
super proud of the team; big triumph of technology.
✨Huge thanks for the interest in Mixture-of-Recursions! The code is officially out!
It's been a long journey exploring Early-exiting with Recursive Architecture.
I'll soon post my 👨‍🎓PhD thesis on Adaptive Computation too!
Code: github.com/raymin0223/mix…
Paper: arxiv.org/abs/2507.10524
Introducing our new work: 🚀Mixture-of-Recursions!
🪄We propose a novel framework that dynamically allocates recursion depth per token.
🪄MoR is an efficient architecture with fewer params, reduced KV cache memory, and 2× greater throughput— maintaining comparable performance!
R.I.P McKinsey.
You don’t need a $300k consultant anymore.
You can now run full competitive market analysis using Grok 4.
Here are the exact 3 mega-prompts I use to replicate McKinsey-style insights for free:
🚨New Paper Alert
As a game company, @Krafton_AI is actively exploring how to apply LLM agents to video games.
We present Orak—a foundational video gaming benchmark for LLM agents!
Includes Pokémon, StarCraft II, Slay the Spire, Darkest Dungeon, Ace Attorney, and more in🧵
Super happy and proud to share our novel scalable RNN model - the MesaNet!
This work builds upon beautiful ideas of 𝗹𝗼𝗰𝗮𝗹𝗹𝘆 𝗼𝗽𝘁𝗶𝗺𝗮𝗹 𝘁𝗲𝘀𝘁-𝘁𝗶𝗺𝗲 𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴 (TTT), and combines ideas of in-context learning, test-time training and mesa-optimization.
Shocker! Claude 4 system prompt was leaked, and it's a goldmine!
The Claude system prompt incorporates several identifiable agentic AI patterns as described in "A Pattern Language For Agentic AI." Here's an analysis of the key patterns used:
Run-Loop Prompting: Claude…
Small language models struggle with complex reasoning tasks where large models excel.
This paper introduces the SMART framework, where a small model performs reasoning but selectively requests corrections from a large model only for steps identified as uncertain via a scoring…
Academia should focus on discovering simplifying and unifying principles and mechanisms behind intelligence; and industry is obviously better equipped to manifest and scale up. That is the same as physics/mechanics to building big airplanes... But I do not believe the current…
Devastatingly, we have lost a bright light in our field. Felix Hill was not only a deeply insightful thinker -- he was also a generous, thoughtful mentor to many researchers. He majorly changed my life, and I can't express how much I owe to him.
Even now, Felix still has so much…
🥪New Paper! 🥪Introducing Byte Latent Transformer (BLT): a tokenizer-free model that scales better than BPE-based models, with better inference efficiency and robustness. 🧵