Saurav Muralidharan @srv_m
Research Scientist @NVIDIA | Making LLMs More Efficient sauravm.com Joined March 2008
Tweets: 251
Followers: 186
Following: 247
Likes: 1K
Sharing our team’s latest work on Hymba - an efficient small language model with hybrid architecture. Tech report: arxiv.org/abs/2411.13676 Discover the tradeoff between Mamba and Attention, how they can be combined, how attention sink and forced-to-attend phenomena can be…
We are hiring researchers working in LLM and VLM efficiency! Applications are open for PhD students graduating in 2025 and for senior researchers with a PhD. Check requirements for the position. Apply here: nvidia.wd5.myworkdayjobs.com/NVIDIAExternal… Senior researchers: nvidia.wd5.myworkdayjobs.com/NVIDIAExternal……
🚀 @NeurIPSConf Spotlight! 🥳 Imagine fine-tuning an LLM with just a sparsity mask! In our latest work, we freeze the LLM and use 2:4 structured sparsity to learn binary masks for each linear layer. Thanks to NVIDIA Ampere’s 2:4 sparsity, we can achieve up to 2x compute…
👀 Experience high-efficiency NVIDIA Llama-3.1-Nemotron-51B - a NAS-optimized model that achieves 2x throughput while preserving accuracy and runs on a single H100 GPU. ✨Try out the Llama-3.1-Nemotron-51B NIM through the API from ai.nvidia.com or download from @huggingface.…
Introducing NVLM 1.0, a family of frontier-class multimodal LLMs that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., InternVL 2). Remarkably, NVLM 1.0 shows improved text-only…
LLM Pruning and Distillation in Practice: The Minitron Approach abs: arxiv.org/abs/2408.11796 models: huggingface.co/nvidia/Mistral… huggingface.co/nvidia/Llama-3… huggingface.co/nvidia/Llama-3… Compressing Llama 3.1 8B and Mistral NeMo 12B to 4B and 8B, respectively, with teacher correction,…
Today we released Mistral-NeMo-Minitron 8B, a pruned and distilled version of the open @MistralAI NeMo 12B model, achieving high accuracy across nine popular benchmarks for chatbots, virtual assistants, content generation, coding, and educational tools. ➡️…
🌟 The best 8B Base model via pruning and distillation! 🚀 Introducing Mistral-NeMo-Minitron-8B-Base model we derived from the recent Mistral-NeMo-12B. Our recipe: finetune teacher on 100B tokens, prune to 8B params, run teacher-student distillation on <400B tokens. Result: the…
See how our #NVIDIAResearch team has developed a method to efficiently create smaller, accurate language models by using structured weight pruning and knowledge distillation - offering several advantages for developers: ✅ 16% better performance on MMLU scores ✅ 40x fewer…
🚀 We've pruned Llama 3.1 down to 4B parameters, delivering a smaller and more efficient model! Based on our recent paper: arxiv.org/abs/2407.14679 📖 Learn all about it in our blog: developer.nvidia.com/blog/how-to-pr… 🔗 Meta's announcement: ai.meta.com/blog/nvidia-ll… 👐 Checkpoints at HF this…
🤖 Excited to announce Minitron, a new family of language models obtained through a combination of weight pruning and knowledge distillation! Our models are available on HF with a permissive license. Give them a try today!
@MistralAI and @nvidia announce Mistral-NeMo 12B, an awesome bite-size model released under Apache 2.0 that we jointly trained. FP8 aligned checkpoint and 128k context window, great benchmark scores. blogs.nvidia.com/blog/mistral-n… mistral.ai/news/mistral-n…
🚀 Introducing Flextron - a Many-in-One LLM - Oral at ICML! Train one model and get many optimal models for each GPU at inference without any additional retraining. 🌟 🔗 Paper: arxiv.org/abs/2406.10260 Main benefits with only 5% post-training finetuning: ✅ Best model for…
We are now having full conversations with Figure 01, thanks to our partnership with OpenAI. Our robot can: - describe its visual experience - plan future actions - reflect on its memory - explain its reasoning verbally Technical deep-dive 🧵:
Fun story from our internal testing on Claude 3 Opus. It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval. For background, this tests a model’s recall ability by inserting a target sentence (the "needle") into a corpus of…
here is sora, our video generation model: openai.com/sora today we are starting red-teaming and offering access to a limited number of creators. @_tim_brooks @billpeeb @model_mechanic are really incredible; amazing work by them and the team. remarkable moment.
Announcing FlashAttention-2! We released FlashAttention a year ago, making attn 2-4x faster, and it is now widely used in most LLM libraries. Recently I’ve been working on the next version: 2x faster than v1, 5-9x vs standard attn, reaching 225 TFLOPs/s training speed on A100. 1/

Chetan Kumar @ck_pinna
99 Followers 632 Following
Yosef Antonius @YosefAnton87916
1 Followers 13 Following
Ash @arshiailaty
514 Followers 3K Following PhD @UCIrvine & @SDSU | EX-SWE-intern@Tesla The night is the moon's graveyard!
Indrabhan Chaudhary @Indrabhan_09
20 Followers 701 Following 💻 Java Developer | ☕ Spring Boot | 🌐 Web App Builder | 📊 API Integration | 🚀 Learning & Growing | 📍 India
Ferhat Culfaz @ModernStoic00
138 Followers 1K Following Data scientist with a passion for machine learning, art, history, politics and classics.
Ankit Shah @thellamapriest
182 Followers 966 Following
Prudvish కొర్... @prudvishh
261 Followers 753 Following Founder of @SafeSoundAI Ecology and economy should move hand in hand Only then we can have Sustainable planet....#humanity #empathy Mother Nature
Mary Jo Riggs @gomez101_morgan
208 Followers 2K Following 🔑 Entrepreneur - Consultant- Investor - 📊 - Leader - Mentor- Catalyst 🗺️ 42 Countries 🇺🇸🇵🇭🇫🇷🇪🇸🇬🇧🇩🇪
Wajahat Ali Basharat ... @WajahatAli_231
8 Followers 562 Following AI/ML Researcher | LLMs | Computer Vision | NLP 📊 | https://t.co/bOFDHCm2je 🇸🇦 | MSCS @ NUST 🇵🇰 | Python • PyTorch • TensorFlow | Sharing Research
Tal Be'ery @TalBeerySec
10K Followers 2K Following Security Research Manager. Co-Founder, CTO @ZenGo. Advisor @ZeroNetworks. x-VP Research Aorato, acq by @Microsoft. 9 times @BlackHatEvents speaker.
kimi5 @xianyue_
27 Followers 1K Following
Tejpal Singh @0xTejpal
2K Followers 5K Following training sota agents @openblocklabs prev. ml research @stanford, cs @carnegiemellon
Gautam Goswami @NotTheBudha
821 Followers 6K Following AWS-Bedrock | Ex-AWS-SageMaker | Ex-Meta | Working with 🧠🤖💥
super intelligence @eacc72
14 Followers 2K Following
Chand @ChandMoham71862
44 Followers 3K Following
artifacts @artifacts475461
5 Followers 112 Following
Aditya Bhaskara @aditya_bhaskara
120 Followers 105 Following Faculty at the University of Utah, interested in theoretical CS and machine learning
Yu Yang @YuYang_i
6K Followers 784 Following reasoning research @OpenAI 🍓 | UCLA CS PhD | Ex. Microsoft Research, Meta FAIR, NVIDIA Research
Ajay Jaiswal @ajayjaiswal1994
250 Followers 348 Following Amazon Science PhD Fellow @UTAustin || Research Intern @Apple || Ex-Adobe || Ex-IITKGP || Ex-Samsung
KD @ChekhovianGun
128 Followers 3K Following Engineer @Nvidia. Opinions are my own. I like to discuss AI, Philosophy and everything in between.
Noythuse @NoythusefGiZnw
56 Followers 1K Following
Cameron Shinn @CameronShinn7
22 Followers 121 Following PhD student at UC Davis. Researching deep learning performance engineering. I also play lots of MTG.
Gopalakrishnan @MGopalakrishnan
197 Followers 2K Following
Ruairi ⚽🍊... @ruairiSpain
306 Followers 2K Following
Camila Galloway @cami__453
283 Followers 2K Following Nothing can dim the light that shines from within.
Dayan Fernando @Dayanferbps
30 Followers 851 Following
Hongyu Wang🥕 @realHongyu_Wang
1K Followers 529 Following Fight for 1-bit AI🚀, Scalable and Efficient Foundation Models & Deep learning, PhD student, Prev: Research Intern@Microsoft Research
Mayank Bhaskar @cataluna84
3K Followers 4K Following Machine Learning Consultant 🧑🏽💻 | @twimlai & @Cohere_Labs Community Lead 👥 | @AILucknow ⌨ | #engineer 🛠 🧮 | #datavisualization 📊 | #sports ⚽ 🏓 🏋🏽 🎮
Shizhe Diao @shizhediao
4K Followers 2K Following Research Scientist @NVIDIA focusing on efficient post-training of LLMs. Finetuning your own LLMs with LMFlow: https://t.co/UTykmQBwFr Views are my own.
Joseph Pollack... @josephpollack
2K Followers 5K Following 🤖AI❤️Data enjoyer, building robots to help folks learn things quicker.
sk @sk80104705
10 Followers 486 Following
Praveen @freeze0xBFF
350 Followers 5K Following Intern @aws Aurora CS grad @penn_state, Ex-SE @AMD worked on GPU address sanitizer and compilers!, GSoC @llvm, Interested in Systems!
Mads Toftrup @mabeto5p
21 Followers 360 Following CS PhD @ Aarhus University, Algorithms and foundations of Machine Learning Group Looking for Summer/Fall 2025 internships Working on efficient LLM training
Rejina sen @RejinaSen
214 Followers 416 Following Drama Queen with A Heart of Gold 💖 🌙 Moon Child 🌙 🎨 Creative Soul 🎨 🍦 Ice Cream Aficionado 🍦 💫Living Life Unapologetically
Ananya Rai @Ananya_Rai09
200 Followers 225 Following 🌸 𝒫𝑜𝓈𝒾𝓉𝒾𝓋𝒾𝓉𝓎 𝒫𝓇𝒾𝓃𝒸𝑒𝓈𝓈 💄 𝐵𝑒𝒶𝓊𝓉𝓎 𝐵𝓁𝑜𝑔𝑔𝑒𝓇 📚 𝐵𝑜𝑜𝓀𝓈 & 𝐵𝑜𝒷𝒶 ✨ 𝒟𝓇𝑒𝒶𝓂𝑒𝓇 & 𝒟𝑜𝑒𝓇
Ashutosh Mehra @ashutoshmehra
2K Followers 7K Following Senior Principal Scientist at Adobe. Working on Acrobat AI Assistant, LLMs, and document ML.
Aditya Kusupati @adityakusupati
5K Followers 2K Following Been places..... Done things.... Next-Gen Modelling @GoogleDeepMind
Dr. S. Jaishankar @DrSJaishankar
4.0M Followers 35 Following External Affairs Minister of India. Member of Parliament (Rajya Sabha) from Gujarat State.
Tendy’s @Wendys
3.7M Followers 455 Following We like our tweets the way we like our tendys: hot, crispy, and better than anyone expects from a fast food restaurant.
DeepSeek @deepseek_ai
972K Followers 0 Following Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
Yu Yang @YuYang_i
6K Followers 784 Following reasoning research @OpenAI 🍓 | UCLA CS PhD | Ex. Microsoft Research, Meta FAIR, NVIDIA Research
Rajesh Balasubramania... @RajeshB18566468
2K Followers 58 Following Math teacher, Entrepreneur in that order.
Ajay Jaiswal @ajayjaiswal1994
250 Followers 348 Following Amazon Science PhD Fellow @UTAustin || Research Intern @Apple || Ex-Adobe || Ex-IITKGP || Ex-Samsung
Matt Walsh @MattWalshBlog
3.9M Followers 598 Following Theocratic fascist, bestselling children’s author, world renowned DEI consultant
Alex @TickerSymbolYOU
48K Followers 390 Following I break down high-tech companies to help you invest in the best growth stocks | Former @MITLL rocket scientist turned full-time investor | $7M AUM (and growing)
Emad @EMostaque
291K Followers 25 Following Distributing Intelligence @ii_posts. Founder @StabilityAI.
Benjamin Marie @bnjmn_marie
1K Followers 202 Following Independent AI researcher (LLM, NLP). My blog, The Kaitchup - AI on a Budget: https://t.co/tyXQ2R8xgV
Joseph Pollack... @josephpollack
2K Followers 5K Following 🤖AI❤️Data enjoyer, building robots to help folks learn things quicker.
Deepak Narayanan @deepakn94
1K Followers 1K Following Research Scientist at @nvidia. Interested in the intersection of Computer Systems and ML. Occasionally tweet about sports. Views are my own.
Shizhe Diao @shizhediao
4K Followers 2K Following Research Scientist @NVIDIA focusing on efficient post-training of LLMs. Finetuning your own LLMs with LMFlow: https://t.co/UTykmQBwFr Views are my own.
Krish Ashok @krishashok
94K Followers 713 Following Techie (https://t.co/dULL1kpela), Author of Masala Lab (https://t.co/MdeRPLmgVz…), Musician (https://t.co/9sVjqvRpa4)
Nikita Bier @nikitabier
607K Followers 2K Following head of product @x, advisor @solana, venture partner @lightspeedvp, ex-founder @gasappteam (acq by discord), ex-founder @thetbhapp (acq by facebook)
unusual_whales @unusual_whales
2.5M Followers 2K Following Stocks/Options/Crypto/Market News/Tools. Not advice @Polymarket partner Open a tastytrade account: https://t.co/wGf2ZdlpzY Discord: https://t.co/0xJ9e0Zr98 More: https://t.co/nsxZlPUsA4
xAI @xai
1.8M Followers 38 Following
Zhiding Yu @ZhidingYu
8K Followers 565 Following Working to make machines understand the world like human beings. Words are my own.
Modular @Modular
20K Followers 2 Following The future of AI development starts here. Sign up to our 📪 Newsletter → https://t.co/gpuHGRyHTs. We are hiring → https://t.co/cPTAes0HMt 🚀
Nithin Kamath @Nithin0dha
739K Followers 187 Following Founder & CEO @Zerodhaonline @Rainmatterin Learning at @RainmatterOrg Musings on business & life: https://t.co/gQi9cu6E5h. Views are personal, Nothing is advice.
The Browser Company @browsercompany
143K Followers 0 Following Building a better way to use the internet, with @diabrowser and @arcinternet.
MLSys Conference @MLSysConf
469 Followers 17 Following Machine Learning & Systems Conference April 11-14th 2022
Tim Cook @tim_cook
14.9M Followers 70 Following Apple CEO Auburn 🏀 🏈 Duke 🏀 National Parks 🏞️ “Life's most persistent and urgent question is, 'What are you doing for others?'” - MLK. he/him
Dr. Parik Patel, BA, ... @ParikPatelCFA
730K Followers 990 Following Aswath Damodaran 🙏🏾 Dhandho Investor Chapter 3:4 🐂 | God first, full employment second 😤 | Investor @SamosaCapital | Subscribe to my newsletter 👇🏾
Song Han @songhan_mit
9K Followers 171 Following
Washington State Dept... @waDNR
138K Followers 7K Following Managing, sustaining, and protecting the health & productivity of Washington's lands and waters. 👤@CPL_Dave 🔥@WaDNR_Fire 🌲@waDNR_Forests
Dr. Rhonda Patrick @foundmyfitness
611K Followers 215 Following Ph.D in biomedical science interested in nutrition, brain & aging. Host of FoundMyFitness podcast https://t.co/rirQwqebxL
Trung Phan @TrungTPhan
727K Followers 4K Following Write on business with @workweekinc. Building a privacy-first AI research app (https://t.co/fZ5ObIy3Ra) and LLM API management platform (https://t.co/VTMMh1UFSj)
Aakanksha Chowdhery @achowdhery
11K Followers 5K Following @Stanford @reflection_ai // Previously @GoogleDeepMind :: PaLM, Gemini // @MSFTResearch, @Princeton // views my own and subject to change
Jeremy Howard @jeremyphoward
261K Followers 6K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Prev: professor @ UQ; Stanford fellow; @kaggle president; @fastmail/@enlitic/etc founder https://t.co/16UBFTX7mo
niki parmar @nikiparmar09
15K Followers 915 Following Working @Anthropic. Views expressed here are my own.
Ashish Vaswani @ashVaswani
26K Followers 2K Following
The Paperclip @Paperclip_In
83K Followers 115 Following A digital media house. Binding stories from India & beyond. History | Culture | Sports | Politics | Life
Andrew D. Huberman, P... @hubermanlab
1.6M Followers 2K Following Professor of Neurobiology and Ophthalmology at Stanford Medicine • Host of Huberman Lab • Focused on science and health research and public education
bandish @bandish
233 Followers 455 Following Engineer @MosaicML, I work on making DL efficient and accessible.
Abhi Venigalla @ml_hardware
7K Followers 1K Following Researcher @Databricks. Former @MosaicML, @CerebrasSystems. Addicted to all things compute.
Cade Metz @CadeMetz
31K Followers 1K Following New York Times reporter, covering A.I., driverless cars, and other changes: [email protected]. My book, "Genius Makers": https://t.co/TJBqNRKR5Q.
Adept @AdeptAILabs
31K Followers 19 Following Adept has built the most robust and reliable agent tech stack on the market.
Ekta Prashnani @ekta_prashnani
1K Followers 475 Following Research Scientist @NVIDIA AI. PhD @UCSantaBarbara. Views my own.
Distributed AI Resear... @DAIRInstitute
23K Followers 446 Following AI is not inevitable. We DAIR to imagine, build & use AI deliberately. Follow us on Mastodon at @[email protected]