Marc Sun @_marcsun
Machine Learning Engineer @huggingface Open Source team New York Joined February 2023
646 Tweets 2K Followers 454 Following 3K Likes
LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.…
Made an interactive bank conflict visualizer. I built it so I could easily debug swizzling functions.
Why is your KV so small? 🤏 In continuous batching, if you increase the max number of tokens per batch, you must decrease the memory allocated for your cache. In transformers, we make sure they are perfectly balanced (as all things should be). No matter how big your model is🦠🐋
Training long-context LLMs is getting easier! TRL now supports Context Parallelism (CP), letting you scale sequences across multiple GPUs, even multi-node setups, seamlessly 💆 Combine TRL and accelerate to run it effortlessly!
I do not get why Dan's group does not get more attention: best quantization methods, best quantization kernels, and they even put everything into open-source libraries. Meanwhile, we see slop papers/software explode. If frontier labs ask me whose students to hire, I go like 👇
We're releasing the DASLab GGUF Quantization Toolkit! 🚀 First open-source toolkit bringing GPTQ + EvoPress to @ggerganov's GGUF format, enabling heterogeneous quantization based on importance. Result: Better models at the same file size. [1/5]
Excited to share what friends and I have been working on at @Standard_Kernel We've raised from General Catalyst (@generalcatalyst), Felicis (@felicis), and a group of exceptional angels. We have some great H100 BF16 kernels in pure CUDA+PTX, featuring: - Matmul 102%-105% perf…
🚀 Life update: I’ve joined 🤗@huggingface as AI Scientist & Educator, starting a new track on **Mechanistic Interpretability of LLMs** 🧠🤖 Over the past 7 years at Ubisoft 🎮, I explored how AI, science & gameplay intersect. I worked on cutting-edge LLM-powered NPCs,…
📢📢 Some news about my presence here: I'm splitting it in two! This account becomes « Science étonnante » and will keep relaying my articles, videos, and talks 🔭🧪 … I've re-created a separate account, @dlouapre, to talk more about my professional activities and my…
Beginner-friendly LLM course exploring foundations, architectures, training, deployment, and current trends: bit.ly/47iexAz v/@Hesamation
I have strong opinions that tests should drive development; I don't consider a feature working unless it is actively tested. Very happy about this dashboard put out by the team to follow failure rates across popular/canonical models Come check it out and hold us accountable 🤗
At Thinking Machines, our work includes collaborating with the broader research community. Today we are excited to share that we are building a vLLM team at @thinkymachines to advance open-source vLLM and serve frontier models. If you are interested, please DM me or @barret_zoph!…
🚀 Just shipped TRL v0.23 - train with *any* context length This release brings Context Parallelism, which allows training with arbitrary context lengths, along with major improvements for post-training. Here’s what’s new 🧵👇
After working on Transformers for a number of years, I'm now switching things a bit, working on evaluations w/ HF/lighteval. The first PRs will be focused on stabilizing, testing, and improving the overall API. Let us know in case you faced any difficulty using the tool!
looks like a great resource
🚀 Excited to announce QuTLASS v0.1.0 🎉 QuTLASS is a high-performance library for low-precision deep learning kernels, following NVIDIA CUTLASS. The new release brings 4-bit NVFP4 microscaling and fast transforms to NVIDIA Blackwell GPUs (including the B200!) [1/N]
i was reading the torchao paper today and came across a pleasant surprise 😳 contributions were super minimal but kinda cool that the folks over at torchao mentioned me 🤗 defo wanted to contribute more impactfully but didn't know how to write triton nor fully understood…
New in-depth blog post - "Inside vLLM: Anatomy of a High-Throughput LLM Inference System". Probably the most in depth explanation of how LLM inference engines and vLLM in particular work! Took me a while to get this level of understanding of the codebase and then to write up…
Day 14 of 14 Days of Distributed! We've got a number of cool people still that are talking since we started this list, so today we're going to rapid fire them all (in no particular order)! Let's buckle up and go! @winglian @FerdinandMom @m_sirovatka @mervenoyann @charles_irl

Quantization @Quantization89
0 Followers 28 Following Quantization, graph transforms & low-bit inference | Turning big models into lean, fast ones
shah @dist_all_reduce
0 Followers 12 Following
mav @mav3ri3k
67 Followers 320 Following CS @ VIT • 🦀 • prev GSOC'24 Rust Compiler • i'll be ml researcher
Harpreet Singh @harpreetmann24
9 Followers 826 Following
Aexyn @Aexyn
0 Followers 3K Following
Benedikt Koehler @furukama
6K Followers 5K Following Loves building things and training LLMs • Founder @DataLion_EN https://t.co/LyGyDQ2MPL • Crunching Numbers & Synthwave Music & Chen Taiji • PhD @LMU_Muenchen
nwyin @_nwyin
507 Followers 577 Following
Théo Pomies @theopomies
297 Followers 2K Following CTO @ https://t.co/BRhXDz6tjE, Roman Catholic, Bitcoin Maximalist, AI dabbler, Weight training x Training weights
Saurabh Baji @sbaji
1K Followers 2K Following CTO/SVP Eng @CohereAI LLMs Ex - VP, AI and Data @ Unity, Quantcast, AWS / EMR, Athena. AI, ML, Big Data - always hiring; DM if interested. Tweets are my own.
Sriram K @ram_sharanga
2 Followers 25 Following
Matthieu LC @matthieulc
1K Followers 1K Following robots and llms / prev built typeless to make doctors happy, acq by @Doctolib
David Louapre @dlouapre
4K Followers 151 Following ML/AI scientist @huggingface 🤗 · Creator of @sciencetonnante (1.4M YouTube subs) 🎥 PhD in quantum gravity 🎓 · ex-Scientific Director @Ubisoft 🎮
Timmi @TimmiTimmey
4 Followers 143 Following
Nadav Timor @NadavTimor
1K Followers 8K Following AI inference, speculative decoding, open source. Built novel decoding algorithms – default in Hugging Face Transformers (150+ ⭐). Making AI faster + cheaper
Robert Scoble @Scobleizer
543K Followers 24K Following The best from ML/AI community | Ex-Microsoft, Rackspace, Fast Company | Wrote eight books about the future | Silicon Valley robots, holodecks, BCIs, & startups.
Shekswess @Shekswess
238 Followers 676 Following AWS Ambassador @awscloud | Machine Learning Lead @lokahq | College Professor @Brainsterio
Tarek Masryo @TarekMasryo
2 Followers 43 Following AI/ML Engineer | Generative AI · MLOps · Open-Source | Building & Sharing
Aman Swar @AmanSwar_
2 Followers 151 Following MLSys. Hacking on CUDA kernels, compilers, and LLM infra. Pushing performance
Sourik @Sourik24
285 Followers 2K Following Making GPUs and CPUs go Brrrrr @ https://t.co/CXXbtt3IPU , GPU tinkerer, Compiler Fanatic, Code Slinger, Harry Potter and Star Trek Nerd, Full-Time LEGO Connoisseur
Sam Foreman @saforem2
2K Followers 5K Following https://t.co/oi6qzoIAB8 | making rocks think @argonne | models lead https://t.co/C75Km4rkpG
daryl martis @realdarylmartis
329 Followers 2K Following
myron koch @myronkoch
571 Followers 2K Following Saxophone | Technology | Film | Blockchain & AI Research
Kazuki Fujii @okoge_kaz
3K Followers 2K Following TokyoTech CS Master Swallow LLM Project: https://t.co/eKjBfnQjUo Distributed Training, Systems for Machine Learning, Low Precision Training
ELONMUSKTESLA @elonneurlink
33 Followers 781 Following Live life to the fullest, keep things simple, truthful & filter the noise. I am a long term investor MAGA🚀🇺🇸
Jannik @JannikHWX
19 Followers 1K Following
ye dongxi @YDongxi
74 Followers 2K Following A data set, data annotation sales, selling high-quality annotation solutions similar to AI for science/autonomous driving/lean4 data topics.
Stewart Caitlin @StewartCaitlin3
35 Followers 365 Following
Sai Vignan @vignan_sai
112 Followers 2K Following ML Engineering @Microsoft, prev ML @sprinklr, CS @iitdelhi, Interested in ML, Bio Informatics
sandya mannarswamy @sandyasm
1K Followers 7K Following Natural Language Processing Researcher. https://t.co/oYoCTKS2Ho
Rahul ✨ @Geek4PM
63 Followers 785 Following Helping fashion studios & brands replace $400/SKU photoshoots with custom, consistent, and scalable AI-generated visuals. Book a call with 8M Studio
Kashish Jagga @kashish_jagga
23 Followers 3K Following
Hamid Soorghali @soorghali
1K Followers 5K Following Connecting space-based infrastructure and services to the downstream industries and markets | Strategy @SatAppsCatapult | @SOAS & @InterpolAber alumnus
Wen-Ding Li @xu3kev
3K Followers 6K Following LLM for code and reasoning. PhD student at Cornell. Previously Student Researcher at @google. Previously intern at @theteamatx.
Nina @Majorg09355
37 Followers 2K Following
Ilias Miraoui @iliasmiraoui
681 Followers 1K Following Hacking with LLMs⚒️ https://t.co/Q9bogacIgC https://t.co/CTcyYe1bmv
Christian Lim @christian_lim_
195 Followers 1K Following VP of Engineering @ArklexAI | Adjunct @Columbia | Director of Internships @ICPCNews | Board of Directors @SAPAAC1 | @stanford (BS '11 MS '13)
Utaiwi @Utaiwi36604
106 Followers 2K Following
Daniel Bis @danielbis01
111 Followers 770 Following LLMs at Amazon AI | prev Samsung, RMS | Opinions expressed are my own
Sergio Soage @Sergio_Soage
877 Followers 6K Following artificial intelligence, math. Random stuff @ https://t.co/tqV9OIPsWE
Ant Ling @AntLingAGI
2K Followers 114 Following A series of open-source large models from Ant Group, Ling for LLM, Ring for Reasoning LLM, Ming for MLLM. See us at inclusionAI.
Jackmin @jackminong
2K Followers 769 Following brutally slashing misbehaving computers @PrimeIntellect 🇺🇸. Previously @JinaAI_ 🇩🇪 @MoneyLion 🇲🇾.
Anne Ouyang @anneouyang
7K Followers 927 Following Building @Standard_Kernel, CS PhD student @Stanford | prev: cuDNN @Nvidia, M.Eng, B.S. in CS @MIT | efficient scalable self-improving AI systems | 🌽KernelBench
David Louapre @dlouapre
4K Followers 151 Following ML/AI scientist @huggingface 🤗 · Creator of @sciencetonnante (1.4M YouTube subs) 🎥 PhD in quantum gravity 🎓 · ex-Scientific Director @Ubisoft 🎮
Irwan Bello @IrwanBello
7K Followers 3K Following Supercomputers & Friends AGI research & products founding team @reflection_ai ex @OpenAI, founding team @character_ai
Christopher De Sa @chrismdesa
495 Followers 23 Following
Bert Maher @tensorbert
3K Followers 375 Following I’m a software engineer building high-performance kernels and compilers at Anthropic! Previously at Facebook/Meta (PyTorch, HHVM, ReDex)
Romain Huet @romainhuet
33K Followers 8K Following Head of Developer Experience @OpenAI. Empowering builders with GPT-5, Codex, gpt-oss, and more. Previously, Product Lead @Stripe.
Stuart Sul @stuart_sul
1K Followers 118 Following ml research @cursor_ai, cs @Stanford, mlsys @HazyResearch
Zeyuan Allen-Zhu, Sc.... @ZeyuanAllenZhu
21K Followers 465 Following physics of language models @ Meta (FAIR, not GenAI, not TBD) 🎓:Tsinghua Physics — MIT CSAIL — Princeton/IAS 🏅:IOI x 2 — ACM-ICPC — USACO — Codejam — math MCM
Eric Hartford @QuixiAI
17K Followers 579 Following We make AI models Dolphin and Samantha BTC 3ENBV6zdwyqieAXzZP2i3EjeZtVwEmAuo4 https://t.co/3ri2GbWU13 https://t.co/zH0F3pSLuq @dphnAI
Wanchao Liang @wanchao_
1K Followers 230 Following building @thinkymachines ex-PyTorch @ Meta. Author of PyTorch DTensor and TorchTitan. Opinions are my own
Rémi Ouazan @remi_or_
266 Followers 60 Following Crafting cutting-edge GPU kernels at Hugging Face 🤗
Dan Saunders @djsaunde
498 Followers 2K Following mle @axolotl_ai making OSS LM training tools. prev @awscloud, startups, research
Cheng @zcbenz
3K Followers 90 Following hacking CUDA and MLX. creator of @electronjs. check https://t.co/ZDJujd4fAN for the open source things I built.
Federico Cassano @ellev3n11
2K Followers 241 Following training big models @cursor_ai prev @neu_prl, @scale_AI, @Roblox, @trailofbits
Christian Szegedy @ChrSzegedy
42K Followers 3K Following #deeplearning, #ai research scientist. Opinions are mine.
Harry Mellor @hmellor_
176 Followers 33 Following ML Engineer @huggingface maintaining @vllm_project, prev @graphcoreai, @uniofoxford
LMSYS Org @lmsysorg
8K Followers 180 Following Large Model Systems Organization: Join our Slack: https://t.co/mSPNyKTLTS We developed SGLang https://t.co/jEqIJcGwGA, Chatbot Arena (now @lmarena_ai), and Vicuna!
Vasiliy Kuznetsov @vkuzo
39 Followers 27 Following
turboderp @turboderp_
771 Followers 36 Following
Vijay @__tensorcore__
2K Followers 525 Following MLIR, CUTLASS, Tensor Core arch @NVIDIA. Mechanic @hpcgarage. Exercise of any 1st amendment rights are for none other than myself.
Prime Intellect @PrimeIntellect
48K Followers 28 Following find compute. train models. contribute to open superintelligence. https://t.co/ZRZOsRRbwr
joe @official_j3rck
249 Followers 203 Following training things @pytorch @aiatmeta | previously language research at @usc_isi
fal @fal
34K Followers 6 Following the generative media cloud. hiring https://t.co/JrbUk989MN. for support/discounts, e-mail us at [email protected].
Alex Zhang @a1zhang
13K Followers 596 Following phd student @MIT_CSAIL, ugrad @Princeton, 🫵🏻 go participate in the @GPU_MODE kernel competitions!
Jeff Rasley @jeffra45
883 Followers 1K Following @Snowflake AI Research Team. @DeepSpeedAI co-founder, @BrownCSDept PhD, @uwcse alum
Hao AI Lab @haoailab
4K Followers 343 Following Hao AI Lab at UCSD. Our mission is to democratize large machine learning models, algorithms, and their underlying systems.
kalomaze @kalomaze
19K Followers 2K Following ML researcher (@primeintellect), speculator • extremely silly jester
You Jiacheng @YouJiacheng
9K Followers 2K Following a big fan of TileLang Follow TileLang, meow! Follow TileLang, thank you, meow! https://t.co/utshC0jrCO a fan of ten years
ℏεsam @Hesamation
39K Followers 609 Following ai engineer | rigorously overfitting on a learning curve
Sergio Paniego @SergioPaniego
3K Followers 2K Following Machine Learning Engineer @huggingface 🤗 AI PhD. Technology enables us to be more human. 🏳️‍🌈
Taishi Nakamura @Setuna7777_2
2K Followers 6K Following Working on scalable and efficient LLM (MoE pretraining, RL, reasoning). CS MS at @sciencetokyo_en Intern @SakanaAILabs
Mira Murati @miramurati
371K Followers 574 Following Now building @thinkymachines. Previously CTO @OpenAI
zhyncs @zhyncs42
3K Followers 538 Following 🌁 OPINIONS ARE MY OWN, Homepage https://t.co/saCowtppUm, Just for fun @lmsysorg SGLang, Prev @basetenco @meituan @Baidu_Inc
Eldar Kurtić @_EldarKurtic
739 Followers 629 Following Principal Research Scientist @RedHat_AI & Dan Alistarh's group @ISTAustria