Silver Professor at NYU Courant and CDS, Research Scientist at FAIR
Research in Machine Learning, past in Quantum Computing & Finance. Posts my own.
Joined April 2024
New @AIatMeta paper.
The paper teaches a model to think using continuous tokens during reasoning, then answer using normal tokens.
The big deal is that it gives the diversity benefits of continuous reasoning without changing serving or prompts.
This keeps single try accuracy…
🔥New preprint: Soft Tokens, Hard Truths
Introduces the first scalable continuous-token RL method for LLMs - no reference CoTs needed; scales to hundreds of thought tokens. Best to train soft, infer hard! Pass@1 parity ⚖️, Pass@32 gains 📈 & better robustness 🛡️ vs. hard CoT
1/🧵
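For readers unfamiliar with the Pass@1 and Pass@32 figures mentioned above, here is a minimal sketch of the commonly used unbiased pass@k estimator: sample n answers per problem, count the c correct ones, and estimate the chance that at least one of k randomly chosen samples is correct. The function name `pass_at_k` and the argument names are illustrative assumptions, and the post does not say which estimator the preprint actually uses.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled answers, c of which are correct.

    Hypothetical helper for illustration; not taken from the preprint.
    """
    if not (0 <= c <= n):
        raise ValueError("c must satisfy 0 <= c <= n")
    if not (1 <= k <= n):
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random
    # size-k subset of the n samples contains no correct answer.
    # math.comb returns 0 when k > n - c, so the estimate becomes 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Same samples, two budgets: a model with diverse samples can match
    # another at k=1 while pulling ahead at larger k.
    print(pass_at_k(n=64, c=8, k=1))
    print(pass_at_k(n=64, c=8, k=32))
```

With k=1 the estimate reduces to c/n, the single-try accuracy; larger k rewards models whose samples are diverse enough that at least one lands on the right answer.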
🔥 NEW PAPER: What makes reasoning traces effective in LLMs? Spoiler: It's NOT length or self-checking. We found a simple graph metric that predicts accuracy better than anything else—and proved it causally. 🧵[1/n]
Beautiful @AIatMeta paper.
Shows that when models are rewarded only for getting the final answer right, they do become more accurate, but they also lose variety in the answers they generate. That lack of variety hurts real-world use, because sampling multiple answers at test…
LLMs lose diversity after RL post-training, and this hurts test-time scaling & creativity.
Why does this collapse happen, and how can we fix it?
Our new work introduces:
🔍 RL as Sampling (analysis)
🗺️ Outcome-based Exploration (intervention)
[1/n]
Our new Simons Collaboration on the Physics of Learning and Neural Computation will employ and develop powerful tools from #physics, #math, computer science and theoretical #neuroscience to understand how large neural networks learn, compute, scale, reason and imagine:…
How would you make an LLM "forget" the concept of dog — or any other arbitrary concept? 🐶❓
We introduce SAMD & SAMI — a novel, concept-agnostic approach to identify and manipulate attention modules in transformers.
❓How to balance negative and positive rewards in off-policy RL❓
In Asymmetric REINFORCE for off-Policy RL, we show that giving less weight to negative rewards is enough to stabilize off-policy RL training for LLMs! 💪 (1/8)
Paper: arxiv.org/abs/2506.20520