Aashu Singh @iam_aashusingh
ML Engg @Facebook, Alum @GeorgiaTech. Joined April 2010
Tweets: 339
Followers: 95
Following: 522
Likes: 995
@giffmana @laurence_ai @TheGregYang Shameless plug: there is this outdated blog notion.so/cloneofsimo/Wh… that you gave some input on, actually lol
a good set of tips for GRPO RL training in @willccbb's verifiers repo
New video, starting to look at Diffusion Language Models. This one introduces some ideas, then shows how I turn ModernBERT into a LLaDA-style generative model. Lots of avenues to explore from here! Join me in playing with this? Project ideas in thread :) youtube.com/watch?v=Ds_cTc…
I love Cutlass, and this new Python DSL looks very well-designed. Will for sure accelerate kernel dev + exploring new ideas in ML + GPU. I'm already playing with it and having fun
We’re also releasing the SkyAgent-v0 models which achieve promising results on SWE-Bench-Verified across model lines. Check it out! Blog: novasky-ai.notion.site/skyrl-v0 Model Collection: huggingface.co/collections/No… Github: github.com/NovaSky-AI/Sky… 3/N
A deep conversation with @SavinovNikolay, the Gemini long context pre-training co-lead… We go from the basics to what is needed to scale to infinite context to long context best practices for devs:
Thrilled to share our new paper: MetaQueries! We've created a novel approach that bridges MM-LLMs and diffusion models using learnable queries. The method enables knowledge-augmented image generation while preserving SOTA understanding capabilities.
Llama4 models are out! Open sourced! Check them out: “Native multimodality, mixture-of-experts models, super long context windows, step changes in performance, and unparalleled efficiency. All in easy-to-deploy sizes custom fit for how you want to use it” llama.com
Pretty cool "Multi-Head Attention Shape Transformations (Cheat Sheet)" shared by a reader: github.com/rasbt/LLMs-fro…
We are bringing back Stanford’s CS 25 Transformers Course (cs25.stanford.edu) today! It’s open to everybody! This is one of @Stanford's hottest seminar courses. We open the course through Zoom to the public. Lectures start today (Tuesdays), 3-4:20pm PDT, at…
Lecture 15: Quantization (Guest lecture by @Tim_Dettmers) youtu.be/YXZZaje76r4 - Quantization basics - Quantized foundation models: LLM.int8() - Finetuning foundation models: QLoRA - Quantization and users
Since launching Agent S2, many folks working on GUI/computer-use agents asked for our tech report. Here we go! 🎉New SOTA on 3 major computer use benchmarks. • OSWorld (15 steps): 27.0% 🚀 (+18.9%) • OSWorld (50 steps): 34.5% 🚀 (+32.7%) • WindowsAgentArena: 29.8% 🚀… https://t.co/2AYVcE9IHa
Blog post: all-hands.dev/blog/introduci… Model: huggingface.co/all-hands/open…
🚨Multi-Token Attention🚨 📝: arxiv.org/abs/2504.00927 Attention is critical for LLMs, but its weights are computed by single query & key vectors, limiting capability. MTA combines query, key & head operations over multiple tokens, improving performance in terms of PPL, std…
Interesting paper: Video-R1 improves temporal reasoning in MM LLMs using T-GRPO, a variant of GRPO, and high-quality curated data for SFT. Here's a summary: medium.com/@aashus18_1308… Original paper: arxiv.org/abs/2503.21776
🎨 Understanding GPU Architecture from Cornell This GPU architecture roadmap is a good starting point for diving deeper, along with the CUDA C++ programming guide PDF - both freely available from Cornell and NVIDIA.
I read the R1 zero paper and the method is very simple, just a tweak to PPO to fine-tune DeepSeek-V3 base using a verifiable sparse binary reward. The fact that they got it to work even though others failed is likely due to better data and/or their very efficient implementation
For those trying to understand DeepSeek's Group Relative Policy Optimization (GRPO): GRPO is just PPO without a value function, using Monte Carlo estimates of the advantage. So, study why PPO exists (lots of docs / writing on that) and understand that value functions are tricky…
I re-recorded the post-training part of our NeurIPS tutorial on language models, added some more slides, and wrote up a mini state of the union on @interconnectsai. Enjoy! Links in QT. 00:00 Introduction 10:00 Prompts & Skill Selection 14:19 Instruction Finetuning 21:45… https://t.co/ckTcQU5PqU
10 short videos about LLM infrastructure to help you appreciate Pages 12-18 of the DeepSeek-v3 paper (arxiv.org/abs/2412.19437) 🧵 youtube.com/watch?v=76gulN…

Aruheej @Aruheej182
24 Followers 1K Following
Hawehe @Hawehe1847549
68 Followers 2K Following
fx__evoIutıons… @fx_evoIution
1K Followers 7K Following 🚀Ready to level up? Join 20,000+ traders getting free weekly insights & pro strategies. 💹 Don't miss out-grab yours now! 👉 https://t.co/PCglJC36Te
Alice Cruickshank @AliceC73115
105 Followers 2K Following
Srikanth Vidapanakal @sreak1089
844 Followers 4K Following Founder, https://t.co/ggFrmJ8WEw Research Engineer, Data Scientist, Applied Math guy, interested in building embodied intelligence products
Gail Ward @GailWard361957
52 Followers 3K Following
Apurva Pathak @technoapurva
129 Followers 462 Following Software Engineer @ Facebook | Ex- Microsoft | Alumni University of California San Diego | NIT Rourkela
Shlok Kumar Mishra @shlokkkk
426 Followers 1K Following Research Scientist @AIatMeta | Prev @GoogleAI | CS PhD UMD
returnhome @returnhome7
713 Followers 3K Following
Mr. Jack Tung @MrJackTung
293 Followers 6K Following
Gpbhupinder @gpbhupinder
471 Followers 7K Following 👨💻 Full-Stack Developer & AI Integration Expert 🚀 From concept to launch, we bring your tech vision to life
Wenhao Chai @wenhaocha1
2K Followers 2K Following Ph.D. Student @PrincetonCS. Prev @Stanford @UW @pika_labs @MSFTResearch @UofIllinois. I used to work on computer vision, but it's not all I do.
Eva Louise Marie Gabr... @e681554349
11 Followers 7K Following
King Hong Chuang @KingHongChuang
37 Followers 2K Following
Dung Doan @dungdx34
333 Followers 7K Following
Satya Narayan Shukla @ImSNShukla
431 Followers 661 Following Senior Research Scientist @MetaAI | PhD @UMassAmherst | Prev @MSFTResearch, @facebookai and @Bosch_AI | BTech @IITKgp
Xichen Pan @xichen_pan
632 Followers 497 Following CS Ph.D. Student @NYU_Courant, Visiting Researcher @metaai | Prev: @MSFTResearch, @AlibabaGroup, https://t.co/EVVU493Kwp, @sjtu1896
λux @novasarc01
20K Followers 2K Following tensor shepherd in a non-euclidean pasture | grazing on cuda cores
Miroslav Pekárek @MiroslavPe79985
1K Followers 8K Following
Abhay Sharma @abhay110011
36 Followers 739 Following
Make money easily @sGqXS4i7Ojsuj
18 Followers 572 Following MEXC focuses on financial management, stocks, cryptocurrencies, digital assets and investments. Currently, new users can get free dollars when they sign up.
Pramit Saha @PramitSaha5
335 Followers 1K Following DPhil Candidate @UniofOxford @oxengsci working with Alison Noble on #Multimodal #Federated Learning #PEFT | MASc @ECEUBC | @MICCAI Young Scientist Award Winner
chris Judge @judgefws
495 Followers 7K Following
SwissCognitive, AI Ve... @SwissCognitive
146K Followers 100K Following We are committed to unleashing the power of AI in the business world. With our AI research, advisory, and ventures, we bring a blend of expertise to the Table.
Martin Görner @martin_gorner
14K Followers 6K Following AI/ML engineer. Previously at Google: Product Manager for Keras and TensorFlow and developer advocate on TPUs. Passionate about democratizing Machine Learning.
Mehmet Can @mehmetcansvs
25 Followers 286 Following
Web Culture @realwebculture
212 Followers 2K Following Technology, AI, programming, crypto and blockchain enthusiast.
MBH Corporation PLC @MBH_Corporation
11K Followers 13K Following Giving investors access to profitable businesses in the $1m-$10m EBITDA range through a 'Buy and Build' approach, creating shareholder value through synergies.
Gustavo Rayo 🇳🇮... @rayogustavo
90 Followers 819 Following Software developer, chess player. Interested in AI and languages.
Asif Razzaq @asifrazzaq1988
6K Followers 7K Following Unleashing AI's potential. Editor and CEO at @marktechpost : AI News Platform with over 1.5 Million Visits per month
Nathan Benaich @nathanbenaich
62K Followers 34K Following solo member of superinvestment staff @airstreet @airstreetpress @stateofaireport @raais
Sergios Karagiannakos @KarSergios
2K Followers 1K Following Writing about AI on https://t.co/qn6ZyTwnrj | Senior Data Engineer at @CausalyAI | 📖 Deep Learning course: https://t.co/e3QHPwOBnA
Naman Goyal @NamanGoyal21
2K Followers 637 Following Research @thinkymachines, previously pretraining LLAMA at GenAI Meta
Tanmay Pal @tanmay_pal
43 Followers 79 Following
MIT CSAIL @MIT_CSAIL
327K Followers 21K Following MIT's Computer Science & Artificial Intelligence Laboratory (CSAIL). Media Inquiries: [email protected] Check out the latest CSAIL content ⬇️
Snehal Lokhande 🦋 @snehal3105
329 Followers 1K Following Be happy for this moment this moment is your life. Python | Cloud | Data Analytics
Siddarth Venkatraman @siddarthv66
603 Followers 472 Following PhD at Mila | RL and other stuff I find interesting
Rohan Pandey @khoomeik
39K Followers 2K Following descending cross-entropy to ascend entropy || prev research @OpenAI @CarnegieMellon '23
🇺🇦 Dzmitry Bahd... @DBahdanau
10K Followers 37 Following Team member at something young. Adjunct Prof @ McGill. Member of Mila, Quebec AI Institute. Stream of consciousness is my own.
Christian Richardt @c_richardt
2K Followers 658 Following Research Scientist at @RealityLabs. Working on novel-view synthesis etc. Previously @UniofBath, #IVCI @Saar_Uni, #MPI_Informatik, @Inria, @Cambridge_Uni.
Yuvraj Singh @YuvrajS9886
2K Followers 581 Following Ex - @turboml, @puch_ai | @iitmadras (left), @iiserkol, @UofMaryland, AIISC | YESIST '24 Finalist | LLM x RL | Building SmolHub, NeatRL |
Joe Fioti @joefioti
2K Followers 372 Following it's not possible, it's necessary. building a compiler @luminal_ai to make models go really fast.
raulpuri.eth @TheRealRPuri
9K Followers 350 Following AI things @ OpenAI - ChatGPT Multimodal, Her, GPT4V, 4, 4o, 3.5, Codex | past: NVIDIA - megatron, sentiment neurons | go bears 🐻
Sachin Goyal @goyalsachin007
1K Followers 712 Following PhD @ CMU MLD || Past intern at Meta, Google and MSR
Richard Sutton @RichardSSutton
52K Followers 64 Following Student of mind and nature, libertarian, chess player, cancer survivor. @ Keen, UAlberta, Amii, https://t.co/u8za2Kod54, The Royal Society, Turing Award
Jacob Kahn @jacob_d_kahn
194 Followers 3 Following AI Researcher at FAIR, @MetaAI. CS Faculty at @Penn.
Krishna Mohan @KMohan2006
3K Followers 344 Following Denoising present to hopefully get brighter future | loves diffusion models
SemiAnalysis @SemiAnalysis_
37K Followers 18 Following
Noam Brown @polynoamial
92K Followers 856 Following Researching reasoning @OpenAI | Co-created Libratus/Pluribus superhuman poker AIs, CICERO Diplomacy AI, and OpenAI o3 / o1 / 🍓 reasoning models
Bert Maher @tensorbert
3K Followers 374 Following I’m a software engineer building high-performance kernels and compilers at Anthropic! Previously at Facebook/Meta (PyTorch, HHVM, ReDex)
Zach Mueller @TheZachMueller
13K Followers 603 Following Let's make billions of parameters go brr https://t.co/rUxXIfNpwh
Jiawei Zhao @jiawzhao
3K Followers 242 Following Research Scientist at Meta FAIR @AIatMeta, PhD @Caltech, GaLore, DeepConf
Jacob Austin @jacobaustin132
7K Followers 920 Following Research at @GoogleDeepMind. Currently making LLMs go fast. I also play piano and climb. NYC. Opinions my own
ARC Prize @arcprize
28K Followers 173 Following A North Star for open AGI. Co-founders: @fchollet @mikeknoop. President: @gregkamradt. Help support the mission - make a donation today.
Feng Yao @fengyao1909
1K Followers 662 Following Ph.D. student @UCSD_CSE | Intern @Amazon Rufus Foundation Model Ex. @MSFTResearch @TsinghuaNLP
Jack Morris @jxmnop
46K Followers 994 Following research @cornell // language models, information theory, science of AI
Vipin PIllai @vipin2pillai
129 Followers 516 Following Applied Scientist at Amazon Just Walk Out (previously Amazon Go) Computer Vision Ph.D. from UMBC.
verl project @verl_project
1K Followers 5 Following Open RL library for LLMs. https://t.co/Xpaq0thhgi Join us on https://t.co/uWI5Zbd6IH
Denny Zhou @denny_zhou
22K Followers 540 Following Founded the Reasoning Team in Google Brain (now in the Gemini Core team of Google DeepMind). Build LLMs to reason. Opinions my own.
Kimi.ai @Kimi_Moonshot
53K Followers 100 Following Built by Moonshot AI to empower everyone to be superhuman.
Micah Goldblum @micahgoldblum
8K Followers 769 Following 🤖Prof at Columbia University 🏙️. All things machine learning.🤖
David Hall @dlwh
3K Followers 1K Following Research Engineering Lead at @StanfordCRFM. Previously co-founder at Semantic Machines ⟶ MSFT. Lead developer of Levanter and Marin @[email protected]
Alexander Kolesnikov @__kolesnikov__
12K Followers 194 Following
Xiaohua Zhai @XiaohuaZhai
11K Followers 311 Following Researcher at Meta (previously at OpenAI Zürich, Google DeepMind)
Wenhu Chen @WenhuChen
23K Followers 674 Following AI researcher. Interested in Reasoning, Multimodal. I direct TIGER-Lab. Author of PoT, MMMU, MMLU-Pro, MAmmoTH, LongRAG, MAP-Neo, YuE, VL-Rethinker
William Wang @WilliamWangNLP
19K Followers 762 Following CEO & Founder, @AlphaDesignAI. We make https://t.co/1LfDYicsF2 I'm also Mellichamp Chair Prof. at UCSB CS. PhD @ CMU SCS.
Percy Liang @percyliang
85K Followers 420 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Ai2 @allen_ai
74K Followers 410 Following Breakthrough AI to solve the world's biggest problems. › Join us: https://t.co/MjUpZpKPXJ › Newsletter: https://t.co/k9gGznstwj
Rose Yu @yuqirose
9K Followers 582 Following Machine Learning Prof @UCSanDiego, Scholar @amazon, Previously @google, @Northeastern, @Caltech, @USC, #Physics-Guided #AI, MIT TR-35 Innovator.
Akari Asai @AkariAsai
19K Followers 870 Following Incoming Assistant Professor @SCSatCMU & research scientist @allen_ai. akariasai @ 🦋
Yu Su (hiring postdoc... @ysu_nlp
11K Followers 960 Following cooking something new | prof. @osunlp | sloan fellow | intelligence and agents | author of Mind2Web, SeeAct, MMMU, HippoRAG, BioCLIP, UGround.
Jim Fan @DrJimFan
327K Followers 3K Following NVIDIA Director of Robotics & Distinguished Scientist. Co-Lead of GEAR lab. Solving Physical AGI, one motor at a time. Stanford Ph.D. OpenAI's 1st intern.
Boshi Wang @BoshiWang2
2K Followers 508 Following Fourth-year Ph.D. @OhioState. Prev intern @MSFTResearch
Yoav Artzi @yoavartzi
17K Followers 182 Following Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry7yHs / asso. faculty director @arxiv / building https://t.co/nwrbEuwfaK and @COLM_conf
Bill Yuchen Lin @billyuchenlin
25K Followers 3K Following Grok Code @xAI. Ex: Affiliate Assistant Prof @UW, Research Scientist @allen_ai, Google AI, Meta FAIR.
Jingbo Shang @shangjingbo
452 Followers 85 Following Assoc Prof at UC San Diego CSE & HDSI. Research on weak supervision and LLM. UIUC PhD.
MiniMax (official) @MiniMax__AI
18K Followers 11 Following Our mission is to build a world where intelligence thrives with everyone. MiniMax Agent: https://t.co/XzaTmAos0V
William Merrill @lambdaviking
5K Followers 669 Following Incoming Assistant Prof, Toyota Technical Institute at Chicago @TTIC_Connect Recruiting PhD students (start 2026) 👀 Will irl - TC0 enthusiast