Zihao Ye @ye_combinator
Proud to be an engineer. I'm building flashinfer (https://t.co/PabCM3ksjN) Opinions are my own. Seattle Joined October 2017
Tweets 153
Followers 2K
Following 555
Likes 2K
🚀 Follow-up to our last breakthrough on DeepSeek V3/R1 inference! On NVIDIA GB200 NVL72, SGLang now achieves 26k input tokens/s and 13k output tokens/s per GPU with FP8 attention + NVFP4 MoE - that’s a 3.8× / 4.8× speedup vs H100 settings. See the details in the 🧵 (1/4)
🚀 Introducing Sparse VideoGen2 (SVG2) — Pareto-frontier video generation acceleration with semantic-aware sparse attention! 🏆Spotlight paper accepted by #NeurIPS2025 ✅ Training-free & plug-and-play ✅ Up to 2.5× faster on HunyuanVideo, 1.9× faster on Wan 2.1 ✅ SOTA quality…
The main branch of sglang now supports deterministic inference with user-specified per-request seeds! It uses kernels from @thinkymachines and introduces new optimizations & coverage. It runs out of the box on most hardware backends and PyTorch versions.
@LigengZhu Glad that you enjoyed it! To be precise, it's EP64 on the inference side, around 30GB per inference GPU. So it's around 30GB / 1.3s = 23 GB/s.
Excited to share what friends and I have been working on at @Standard_Kernel We've raised from General Catalyst (@generalcatalyst), Felicis (@felicis), and a group of exceptional angels. We have some great H100 BF16 kernels in pure CUDA+PTX, featuring: - Matmul 102%-105% perf…
At Thinking Machines, our work includes collaborating with the broader research community. Today we are excited to share that we are building a vLLM team at @thinkymachines to advance open-source vLLM and serve frontier models. If you are interested, please DM me or @barret_zoph!…
Awesome work from @thinkymachines and @cHHillee! The importance of determinism might be underestimated. Like with LLM-based compression (bellard.org/ts_zip/) - you really need things to work the same way whether you're doing prefill/decode or different batching setups. Here's…
@JingyuanLiu123 This is the advantage of large nvlink domains or TPUs topology - the main reason to do PP is that you are bottlenecked on your DP comms and cannot scale TP further. But if you have high enough bandwidth across a large enough domain (like TPUs or NVL72), you don't need to do PP…
🚀 Presenting LiteASR: a method that halves the compute cost of speech encoders, leveraging low-rank approximation of activations. LiteASR is accepted to #EMNLP2025 (main) @emnlpmeeting
Sub-10-microsecond Haskell Sudoku solver implemented in hardware. unsafeperform.io/papers/2025-hs…
Tilelang now supports SM120 — give it a try if you have RTX 5090 🚀😎
🎉 Excited to share: We’ve open-sourced Triton-distributed MegaKernel! A fresh, powerful take on MegaKernel for LLMs—built entirely on our Triton-distributed framework. github.com/ByteDance-Seed… Why it’s awesome? 🧩 Super programmable ⚡ Blazing performance 📊 Rock-solid precision
One nice thing you can do with an interactive world model, look down and see your footwear ... and if the model understands what puddles are. Genie 3 creation.
🚀 Excited to announce day-0 support from @NVIDIAAIDev for @OpenAI's gpt-oss model in flashinfer v0.2.10! github.com/flashinfer-ai/… ✅ Speed-of-light Blackwell mxfp4/mxfp8 MoE kernels + attention-sink from trtllm-gen ✅ FA2/FA3 template-based attention-sink support for earlier…
Like SGLang? Want speed-of-light decode perf? Check out: github.com/sgl-project/sg…
Powered by TensorRT-LLM Gen kernels. Available via flashinfer and TRT-LLM. 🚀
Great to see StepFun acknowledge the idea of Attention-FFN disaggregation from our Megascale-infer work and take it to the next level 🚀🚀🚀 arxiv.org/abs/2504.02263
I’ve been starting to collaborate with the folks who are building FlashInfer: nice project and pretty amazing set of people! @ye_combinator @tqchenml and everyone.
SGLang is an early user of FlashInfer and witnessed its rise as the de facto LLM inference kernel library. It won best paper at MLSys 2025, and Zihao now leads its development @NVIDIAAIDev. SGLang’s GB200 NVL72 optimizations were made possible with strong support from the…

Tianqi Chen @tqchenml
18K Followers 1K Following AssistProf @CarnegieMellon. Distinguished Eng @NVIDIA. Creator of @XGBoostProject, @ApacheTVM. Member https://t.co/QYyfjQNp4p, @TheASF. Views are my own
Kiv @kivdaychen
3K Followers 1K Following cmu mcse '25 | broke things @M5tTrading @RisingWaveLabs @Hyperledger, @BytedanceTalk and 3 others.
Horace He @cHHillee
42K Followers 540 Following @thinkymachines Formerly @PyTorch "My learning style is Horace twitter threads" - @typedfemale
¬¬Mike (Deyuan) He @1SHL10
996 Followers 573 Following 3rd-year PhD @PrincetonCS PL Group; PL/Systems; Prev @AWSCloud @Intel @Taichi_Lang @uwplse
Dr. Jian "Daye" Weng @b1antaidaye
5K Followers 696 Following Father of 2 | PhD @UCLAComSci | AssistProf @cemseKAUST | Compilers | Computer Arch | Sw/hw Co-designs | IMDB: PTSD | Abstraction is work, abstraction is also life | 川粉
Ce Gao @gaocegege
7K Followers 788 Following Co-founder and CEO @TensorChord, building postgres-based vector extension https://t.co/7WGvl1sR56 | Father of 1 cat | Married
Beidi Chen @BeidiChen
15K Followers 400 Following Asst. Prof @CarnegieMellon, @amazon Scholar, Prev: Visiting Researcher @Meta, Postdoc @Stanford, Ph.D. @RiceUniversity, Large-Scale ML, a fan of Dota2.
Talia Ringer 💚 @TaliaRinger
30K Followers 7K Following Professor, @plfmse, @IllinoisCS! Proof Automation. @SigplanM & CCF Founder. Israeli-American for peace, equality, justice. Mom. They/היא, ND, bi
Ligeng Zhu @LigengZhu
2K Followers 2K Following Research Scientist at @Nvidia exploring efficient training, previously @MIT, @SFU and @ZJU_China.
Xuanwo @OnlyXuanwo
11K Followers 941 Following ASF Member. @ApacheOpenDAL PMC Chair. VISION: Data Freedom. Working on #RBIR with @LanceDB
鹿 𝕟𝕠𝕜𝕚... @IIInoki
10K Followers 2K Following Nobody. "Liberal arts" 🐶 PhD. The mortal world isn't worth it. Lifestyle account. "The wind rises; we must try to live." Header image by @IIInoki
Yuliang Xiu @yuliangxiu
7K Followers 5K Following Assistant Professor @Westlake_Uni, Ph.D. @MPI_IS, previously @USC_ICT. Focusing on democratizing human digitization. Intern @RealityLabs @Ubisoft
Luis Ceze @luisceze
4K Followers 2K Following computer architect. marveled by biology. professor @uwcse. ceo @OctoAICloud. venture partner @madronaventures.
Ji Lin @jilin_14
6K Followers 949 Following Research @Meta Superintelligence Lab | Prev: Research @OpenAI; PhD @MIT
Anne Ouyang @anneouyang
8K Followers 938 Following Building @Standard_Kernel, CS PhD student @Stanford | prev: cuDNN @Nvidia, M.Eng, B.S. in CS @MIT | efficient scalable self-improving AI systems | 🌽KernelBench
HappyQQ_AI @HappyQQ_AI
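19K Followers 3K Following Slack off when young, end up an old code monkey. X is my confessional; minors, please follow this Q-bro, who occasionally talks nonsense and floors the gas pedal, only with parental consent and supervision. Focused on product R&D and commercial applications in frontier fields such as AI, cybersecurity, and system architecture. WeChat official account: 移动互联网. No-bragging alt account: @HappyQQ_CN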
Ljubomir Josifovski @ljupc0
6K Followers 5K Following FOLLOWS 🫵 https://t.co/F7MzDOTC1k ML/AI R&D sci/eng, quant trading, ASR in noise, TTS. Open ASI compute for */acc; it's more fun to compute 🥰
Xinwei Qiang @QiangXinwe38067
3 Followers 35 Following
이덕영 @ideogyeong6445
0 Followers 14 Following
VioP @AcousimHss
37 Followers 576 Following the more you laugh the more you cry the more you cry the more you laugh
swh @swhsiang
58 Followers 167 Following building humanoid | prev ML @CashApp Infra @salesforce Purdue ECE
Red Hat AI @RedHat_AI
8K Followers 2K Following Accelerating AI innovation with open platforms and community. The future of AI is open.
Nam Đức @Namc1524397
3 Followers 116 Following
RockyParadox @RockyParadox44
135 Followers 8K Following
Oliver Sieberling @osieberling
101 Followers 475 Following PhD student @MIT_CSAIL | CS @ETH | LLM Architectures, Efficiency, Evolutionary Algorithms
P @IiHq7kK57ZxosbT
93 Followers 2K Following
KC 🪐 @erwangto
185 Followers 689 Following
Adarsh @adarshxs
1K Followers 2K Following 20 | Founder @tensoic | multimodal team @ SGLang(@lmsysorg) | Prev: @iiscbangalore
Eric @eelbaz
94 Followers 495 Following Knowledge is a passion. Lead by example. (opinions are my own)
Adam Zweiger @AdamZweiger
959 Followers 445 Following Rethinking how language models learn | Researcher @MIT_CSAIL
NexusFire👹 @NexusFireX
101 Followers 827 Following .°. Community | AI | Open Science | Interdisciplinary Curricula .°.
Allison Zhan @AllisonXinyuan
89 Followers 399 Following Investing in early stage AI. First partner to @lobehub , @klavis_ai , @LeptonAI (acq NVIDIA) etc. | co-founder of @EvalSysOrg | Views are my own
Tahsin Mayeesha @tahsin_mayeesha
757 Followers 6K Following PhD Student, Information Science (University of North Texas) | AI Engineer & Researcher | NLP · HCI · AI Policymaking · Human-Centered ML.
shiredude95 @shiredude95
8 Followers 301 Following
Neo @tianyuzhang1214
6 Followers 166 Following
HuldaBarrett @W0B2ri0x5Sh5j
15 Followers 640 Following
Chiranjiv @meChiranjiv
89 Followers 849 Following
ErinPhilemon @9S2iX7RC1uZ0D8
19 Followers 1K Following
playerUnknown @Syyuuvh
21 Followers 206 Following
Dhamodharan @CosmicNewage
171 Followers 4K Following
RokunuzJahan Rudro @rudro12356
51 Followers 744 Following Learning ML bit by bit || MS student @ University of Kansas || X-Machine Learning Intern @ AssetWorks Inc.
AI Butterflies @ShalabyAI
12 Followers 396 Following
Mike Menalis @MikeMenalis
8 Followers 83 Following Built world class engineering and product teams at Google, Uber, Meta, DoorDash and startups.
559240422 @dddddtttttt2019
0 Followers 2K Following
We live in a time of ... @eemin72
19 Followers 5K Following HUMAN RIGHTS was a lie. FREE SPEECH was a lie. DEMOCRACY was a lie. We live in a time of monsters…
Piyush Byahut @piyush_byahut
2 Followers 592 Following
城陽人 @minamijoyo
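948 Followers 660 Following A web infrastructure "kindergartner" who loves books and canned coffee, meow. I want to become air or water. Note: tweets are my personal views and do not represent the organization I belong to.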
Ai Parrot @Theaiparrot
18 Followers 190 Following Ai Parrot 🦜 | Pioneering the largest AI blogging chain in history | Empowering voices with cutting-edge AI insights & tools | Join the flock! #AIRevolution
gmtmambim @SECRETGARDDEN
4 Followers 148 Following
Bert Maher @tensorbert
3K Followers 375 Following I’m a software engineer building high-performance kernels and compilers at Anthropic! Previously at Facebook/Meta (PyTorch, HHVM, ReDex)
Kim Lee @lee_kim16855
61 Followers 1K Following
Tianqi Chen @tqchenml
18K Followers 1K Following AssistProf @CarnegieMellon. Distinguished Eng @NVIDIA. Creator of @XGBoostProject, @ApacheTVM. Member https://t.co/QYyfjQNp4p, @TheASF. Views are my own
Kiv @kivdaychen
3K Followers 1K Following cmu mcse '25 | broke things @M5tTrading @RisingWaveLabs @Hyperledger, @BytedanceTalk and 3 others.
Horace He @cHHillee
42K Followers 540 Following @thinkymachines Formerly @PyTorch "My learning style is Horace twitter threads" - @typedfemale
¬¬Mike (Deyuan) He @1SHL10
996 Followers 573 Following 3rd-year PhD @PrincetonCS PL Group; PL/Systems; Prev @AWSCloud @Intel @Taichi_Lang @uwplse
Dr. Jian "Daye" Weng @b1antaidaye
5K Followers 696 Following Father of 2 | PhD @UCLAComSci | AssistProf @cemseKAUST | Compilers | Computer Arch | Sw/hw Co-designs | IMDB: PTSD | Abstraction is work, abstraction is also life | 川粉
Beidi Chen @BeidiChen
15K Followers 400 Following Asst. Prof @CarnegieMellon, @amazon Scholar, Prev: Visiting Researcher @Meta, Postdoc @Stanford, Ph.D. @RiceUniversity, Large-Scale ML, a fan of Dota2.
Talia Ringer 💚 @TaliaRinger
30K Followers 7K Following Professor, @plfmse, @IllinoisCS! Proof Automation. @SigplanM & CCF Founder. Israeli-American for peace, equality, justice. Mom. They/היא, ND, bi
Shriram Krishnamurthi... @ShriramKMurthi
21K Followers 4K Following @BrownCSDept/@BrownUniversity • @BootstrapWorld • @PyretLang • @racketlang • Unreasonably excited about compsci, education, cycling, cricket, human experience.
Ligeng Zhu @LigengZhu
2K Followers 2K Following Research Scientist at @Nvidia exploring efficient training, previously @MIT, @SFU and @ZJU_China.
Xuanwo @OnlyXuanwo
11K Followers 941 Following ASF Member. @ApacheOpenDAL PMC Chair. VISION: Data Freedom. Working on #RBIR with @LanceDB
Yuliang Xiu @yuliangxiu
7K Followers 5K Following Assistant Professor @Westlake_Uni, Ph.D. @MPI_IS, previously @USC_ICT. Focusing on democratizing human digitization. Intern @RealityLabs @Ubisoft
Luis Ceze @luisceze
4K Followers 2K Following computer architect. marveled by biology. professor @uwcse. ceo @OctoAICloud. venture partner @madronaventures.
Ji Lin @jilin_14
6K Followers 949 Following Research @Meta Superintelligence Lab | Prev: Research @OpenAI; PhD @MIT
Vinod Grover @vinodg
3K Followers 1K Following Sr Distinguished Engineer @nvidia. Compilers, CUDA C++, PL, Machine Learning and Systems. tweets and opinions are personal.
Hawkingrei @suohawking
3K Followers 3K Following mono repo enthusiast | wisecracker | Database Developer | SW-1518-1200-8238 | ADHD https://t.co/x1xMF1BYtc
Yueying Li @lisali126
216 Followers 557 Following Ph.D. @Cornell Start new adventures at @MITCSAIL soon. Former SJTU @Umich @Apple Intel Labs @MSFTResearch
Anne Ouyang @anneouyang
8K Followers 938 Following Building @Standard_Kernel, CS PhD student @Stanford | prev: cuDNN @Nvidia, M.Eng, B.S. in CS @MIT | efficient scalable self-improving AI systems | 🌽KernelBench
Haoran Qiu @haoran_qiu98
444 Followers 428 Following Systems for Efficient AI @Microsoft Azure Research | Prev. @Google @IBMResearch @MSFTResearch | CS PhD @IllinoisCS, B.Eng @HKUniversity
Red Hat AI @RedHat_AI
8K Followers 2K Following Accelerating AI innovation with open platforms and community. The future of AI is open.
Robert Scoble @Scobleizer
543K Followers 24K Following The best from ML/AI community | Ex-Microsoft, Rackspace, Fast Company | Wrote eight books about the future | Silicon Valley robots, holodecks, BCIs, & startups.
Zed @zeddotdev
57K Followers 47 Following A next-generation code editor that enables high-performance collaboration with AI and your team. https://t.co/4Ua0UqLrsv
Oliver Sieberling @osieberling
101 Followers 475 Following PhD student @MIT_CSAIL | CS @ETH | LLM Architectures, Efficiency, Evolutionary Algorithms
Adarsh @adarshxs
1K Followers 2K Following 20 | Founder @tensoic | multimodal team @ SGLang(@lmsysorg) | Prev: @iiscbangalore
Adam Zweiger @AdamZweiger
959 Followers 445 Following Rethinking how language models learn | Researcher @MIT_CSAIL
Bert Maher @tensorbert
3K Followers 375 Following I’m a software engineer building high-performance kernels and compilers at Anthropic! Previously at Facebook/Meta (PyTorch, HHVM, ReDex)
yinghai @yinghai
47 Followers 50 Following
Jiayi Yuan @JiayiYuan99
159 Followers 307 Following CS PhD candidate at Rice | Intern at NVIDIA | Efficient LLM
Thien Tran @gaunernst
1K Followers 229 Following
John Schulman @johnschulman2
66K Followers 1K Following Recently started @thinkymachines. Interested in reinforcement learning, alignment, birds, jazz music
Masahiro Hiramori @mshrh3
64 Followers 111 Following Apache TVM committer. ML compiler engineer. Creator and maintainer of Verilog-HDL/SystemVerilog for VS @code extension. Opinions are my own.
JingyuanLiu @JingyuanLiu123
3K Followers 433 Following https://t.co/D7zLeTZRMh is all you need | Opinions are my own
Jiarong Xing @Jiarong_Xing
119 Followers 134 Following Postdoc at UC Berkeley; Assistant Professor at Rice University
Minjia Zhang @_Minjia_Zhang_
156 Followers 73 Following Assistant Professor@UIUC, Machine Learning System, Ex-Principal Researcher@Microsoft, @MSFTResearch, @MSFTDeepSpeed
Roger Wang @rogerw0108
496 Followers 183 Following Flowers and friendship | ML Platform & Infra @Roblox | Committer @vllm_project | @uwaterloo @uwcse
EndeavourOS @OsEndeavour
15K Followers 315 Following A terminal-centric distro with a vibrant and friendly community at its core.
Eigen AI @Eigen_AI_Labs
401 Followers 23 Following Built by researchers and engineers from MIT, we are pursuing Artificial Efficient Intelligence (AEI). Try GPT-OSS support: https://t.co/BQfsnXIGFo.
Dylan Patel @dylan522p
97K Followers 947 Following SemiAnalysis Boutique AI & Semiconductor Research and Consulting DMs are open for consulting, quotes, or to talk shop
Jianan Ji @ji_jianan71963
2 Followers 16 Following
NVIDIA AI Developer @NVIDIAAIDev
83K Followers 324 Following All things AI for developers from @NVIDIA. Additional developer channels: @NVIDIADeveloper, @NVIDIAHPCDev, and @NVIDIAGameDev.
Mark Collier 柯理... @sparkycollier
14K Followers 15K Following Austin Powered. Co-founder of OpenStack & OpenInfra Foundation. General Manager of AI & Infrastructure for the Linux Foundation. open source for fun & profit.
Mathew Jacob @mat_jacob1002
135 Followers 65 Following Incoming PhD @uwcse. prev @DbrxMosaicAI, @siebelschool
Yi Pan @conlesspan
68 Followers 260 Following Undergrad @ SJTU ACM Class | RA @uwcse | Distributed & ML Systems
Wei-Lin Chiang @infwinston
5K Followers 937 Following Building @lmarena_ai @UCBerkeley PhD in AI & systems
Yurong You @YurongYou
42 Followers 62 Following
Chenggang Zhao @chenggang_zhao
387 Followers 56 Following @deepseek_ai infra; previously at NVIDIA | SenseTime | Tsinghua University.
Joy Dong @JoyChew_d
181 Followers 51 Following PhD candidate @UMich. Previously @PyTorch @NVidia. #ConfidentialComputing #GPU Optimization & Architecture
Yong Wu @yongwwwml
3 Followers 32 Following
Sean Lee @seanprime7
44 Followers 152 Following
Ajay Jain @ajayj_
7K Followers 4K Following Co-founder @genmoai. Co-created denoising diffusion (DDPM), DreamFusion, Dream Fields. Ex Ph.D. @berkeley_ai, @googleai, @facebookai, @nvidiaai, @mit
Perplexity Developers @PPLXDevs
3K Followers 15 Following Updates for developers building with Sonar. Power your products with the fastest, cheapest API offering out there with search grounding.
Sainbayar Sukhbaatar @tesatory
3K Followers 326 Following Research Scientist at FAIR @AIatMeta Research: Memory Networks, Asymmetric Self-Play, CommNet, Adaptive-Span, System2Attention, ...
Steeve Morin @steeve
6K Followers 1K Following Building @zml_ai (and we're hiring), ex @zenly, ex Exalead, ex @google. Skydiver and wingsuiter.