EmbeddedLLM @EmbeddedLLM
Your open-source AI ally. We specialize in integrating LLM into your business.
Joined October 2023
Tweets: 388
Followers: 886
Following: 1K
Likes: 356
Getting ready to try DeepSeek-V3.2-Exp from @deepseek_ai ? vLLM is here to help! We have verified that it works on H200 machines, and many other hardwares thanks to the hardware plugin mechanism. Check out the recipes docs.vllm.ai/projects/recip… for more details 😍 Note: currently… https://t.co/9B3ImM72Cn
How does @deepseek_ai Sparse Attention (DSA) work? It has 2 components: the Lightning Indexer and Sparse Multi-Latent Attention (MLA). The indexer keeps a small key cache of 128 per token (vs. 512 for MLA). It scores incoming queries. The top-2048 tokens are passed to Sparse MLA. https://t.co/QzzPRvAaNa
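A minimal sketch of the selection step described in the post above, in plain Python. The dot-product scoring, the function names, and the toy sizes are assumptions made for illustration; the post only states the indexer's per-token key size (128, vs. 512 for MLA) and the top-2048 cut-off. This is not DeepSeek's or vLLM's implementation.

```python
import heapq
import random


def indexer_scores(query, index_keys):
    """Score every cached token against the query (assumed: plain dot product)."""
    return [sum(q * k for q, k in zip(query, key)) for key in index_keys]


def select_top_k(scores, k):
    """Return the positions of the k highest-scoring tokens, in original order."""
    k = min(k, len(scores))
    top = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return sorted(top)


def sparse_attention_inputs(query, index_keys, kv_cache, k):
    """Keep only the cache entries the indexer selected for the attention step."""
    keep = select_top_k(indexer_scores(query, index_keys), k)
    return keep, [kv_cache[i] for i in keep]


# Toy sizes (hypothetical); the post quotes 128-wide indexer keys and k = 2048.
random.seed(0)
INDEX_DIM, NUM_TOKENS, TOP_K = 8, 32, 4
index_keys = [[random.gauss(0, 1) for _ in range(INDEX_DIM)]
              for _ in range(NUM_TOKENS)]
kv_cache = [f"kv_{i}" for i in range(NUM_TOKENS)]  # stand-in for the MLA cache
query = [random.gauss(0, 1) for _ in range(INDEX_DIM)]

keep, selected = sparse_attention_inputs(query, index_keys, kv_cache, TOP_K)
print(keep)
```

The point of the sketch is the cost split: the scoring pass touches every cached token but only through the small indexer keys, while the attention step sees just the `k` selected entries.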
🚀 New in vLLM: dots.ocr 🔥 A powerful multilingual OCR model from @xiaohongshu hi lab is now officially supported in vLLM! 📝 Single end-to-end parser for text, tables (HTML), formulas (LaTeX), and layouts (Markdown) 🌍 Supports 100 languages with robust performance on… https://t.co/G4hMCrSHJY
Missed our latest vLLM office hours? We covered hybrid models as first-class citizens in @vllm_project. ✅ Hybrid model support in v1 ✅ Mamba, Mamba2, linear attention ✅ Performance from v0 → v1 ▶️ Recording: youtube.com/live/uWQ489ONv… 📑 Slides: docs.google.com/presentation/d…
We keep pushing the limits of speculative decoding (SD) in LLM inference -- check out our latest NeurIPS’25 paper: Lookahead Reasoning (LR). The high-level rationale is pretty simple: SD alone isn’t enough now: as GPUs get stronger (H200 -> B200 -> Rubin CPX), we'll be able to…
Day-0 support on one of the most anticipated model releases🚀🚀🚀 Detailed deployment guide coming soon at vllm-recipes docs.vllm.ai/projects/recip… https://t.co/dwXiEVZHEM
Congrats to @deepseek_ai ! DeepSeek-R1 was published in Nature yesterday as the cover article, and vLLM is proud to have supported its RL training and inference🥰
Disaggregated Inference at Scale with #PyTorch & #vLLM: Meta’s vLLM disagg implementation improves inference efficiency in latency & throughput vs its internal stack, with optimizations now being upstreamed to the vLLM community. 🔗 hubs.la/Q03J87tS0
@zephyr_z9 😂 Although AMD is now working pretty well for small to medium sized models
Welcome Qwen3-Next! You can run it efficiently on vLLM with accelerated kernels and native memory management for hybrid models. blog.vllm.ai/2025/09/11/qwe… https://t.co/M19eWoXAj2
Deep dive into optimizing weight transfer step by step and improving it 60x!
⚡️ Efficient weight updates for RL at trillion-parameter scale 💡 Best practice from Kimi @Kimi_Moonshot vLLM is proud to collaborate with checkpoint-engine: • Broadcast weight sync for 1T params in ~20s across 1000s of GPUs • Dynamic P2P updates for elastic clusters •…
vLLM Singapore Meetup — Highlights Thanks to everyone who joined! Check out the slides by @vllm_project DarkLight1337 with tjtanaa / @EmbeddedLLM V1 is here: faster startup, stronger CI & perf checks. Scaling MoE: clear Expert Parallelism (EP) setup for single/multi-node +…
vLLM is proud to support the great Kimi update from @Kimi_Moonshot , better tool-calling, longer context, and more! Check the deployment guide at huggingface.co/moonshotai/Kim… 🔥 https://t.co/ysxqSJVTBU
Amazing blogpost from @gordic_aleksa explaining internals of vLLM😍
Go LMCache 🚀
Big energy at the @vllm_project Meetup in Singapore! WEKA’s Ronald Pereira shared how NeuralMesh Axon + Augmented Memory Grid can boost vLLM inferencing. Shoutout to @EmbeddedLLM + @AMD for the strong collaboration.
🚀 Exciting news: DeepSeek-V3.1 from @deepseek_ai now runs on vLLM! 🧠 Seamlessly toggle Think / Non-Think mode per request ⚡ Powered by vLLM’s efficient serving — scale to multi-GPU with ease 🛠️ Perfect for agents, tools, and fast reasoning workloads 👉 Guide & examples:… https://t.co/AiiFbGH8Go
Great example from @skypilot_org showing how to use vLLM to serve Kimi K2 from @Kimi_Moonshot 😀
🚀 GLM-4.5 meets vLLM @Zai_org 's latest GLM-4.5 & GLM-4.5V models bring hybrid reasoning, coding & intelligent agent capabilities—now fully supported in vLLM for fast, efficient inference on NVIDIA Blackwell & Hopper GPUs! Read more 👉 blog.vllm.ai/2025/08/19/glm…

John M @Piedrasai
45 Followers 116 Following Rogue engineer, adrenaline junkie, nuff said #AI #LLM #Bitcoin #Blockchain #Crypto
Puruvamitra @kinematronicss
1 Followers 21 Following
Chris Royal @CJRoyal9905
0 Followers 58 Following
Virginia @Hawtaw741
53 Followers 2K Following
Ievwamfor @Ievwamfor59949
17 Followers 874 Following
tanjiro_komado @TanjiroKom92070
0 Followers 7 Following
atony stephens @AtonyStephens
1 Followers 30 Following
Srinu @seeenu2003
2 Followers 106 Following
Zelda @Rortaur7759
42 Followers 2K Following I’m too busy working on my own grass to notice if yours is greener.
Abhay @capabhay
95 Followers 716 Following PhD in Intelligent Systems @ISPPitt | Prev. @UofAInfoSci, @iit_tirupati
Flora @Voorpi0852219
51 Followers 2K Following The strongest actions for a woman is to love herself, be herself, and shine amongst those who never believed she could.
SupermanSpace 𝕏 @superman_space
904 Followers 2K Following Tech Researcher - Gamer - Open Searcher 😎 ||Fun Fact|| 📊Numbers leads to 👀 Bias.
MarjorieCooke @38GTj152ZFUil
16 Followers 486 Following
Lorelei @Reeba922
45 Followers 2K Following
Mihai Constantin @CoMiK6891
60 Followers 2K Following
Peramanathan Sathyamo... @Peramanathan
467 Followers 5K Following Senior Software Engineer, Dad in break to dust off and to get ready full swing for next chapters
Sivakumar Chidambaram @SivakumarChida1
5 Followers 254 Following
Sai Ruthvik @SaiRuth03703659
4 Followers 2K Following
arion das @ArionDas
839 Followers 8K Following gen ai intern @Techolution_com || research @ aiisc, usc || author @naacl || reviewer @aclmeeting, aia @COLM_conf, mti_llm @ NeurIPS
BacktestAlpha🇺🇸 @Tieootom950062
56 Followers 2K Following 15-30% Monthly | 2 High-Conviction Stocks.Short-Term Gains: 15-20% in Days/Weeks.DM "JOIN" for WhatsApp Alerts. Live Trade Signals • Market Analysis
Anup @code_moji
25 Followers 927 Following
🔯 Balanced Acceler... @AccBalanced
9K Followers 9K Following AI Factories. Balanced Accelerationist. WEKA, CNCF k8s founding board, Post-PKI.
mengwang @poker901115
42 Followers 1K Following
Sean Chang @sean_chang76967
1 Followers 64 Following
Gowtham Kumar Reddy @Gowtham1926
9 Followers 68 Following
IYIMOGA JOSEPH NANA @IyimogaNana
90 Followers 1K Following A lover of God. An aspiring Artificial Intelligence/Machine Learning Engineer.
Koge @Koge754219
132 Followers 3K Following
LillianSilas @KZILe40T2ku6K
12 Followers 1K Following
!.! @xypyth
56 Followers 5K Following
Hamzé 🦀 @Hamzeml
3K Followers 7K Following I write the bugs that future AIs will be paid to fix. AI Maximalist & Architect of Artisanal Technical Debt! Rust 🦀 supremacy!
Ahmed Omar @Ahmed2Omar20
2 Followers 77 Following
murphy @murphy000912
5 Followers 41 Following
lipi @lipiisme
161 Followers 2K Following Co-founder of xinference Amateur developers, #AI enthusiasts #xinference
Nilesh Kokane @nils360
3 Followers 596 Following
jimin Lee @jimin_lee92991
0 Followers 48 Following
Rahul Atlury @atlury
30 Followers 663 Following I am an electronics engineer with interests in preserving the knowledge of yesteryears for future generations
Violet @FlyWY2H6Iasxp6D
31 Followers 1K Following
Amara @Twortaln9490
37 Followers 2K Following I’m not a one in a million kind of girl, I’m a once in a lifetime kind of woman.
You Jiacheng @YouJiacheng
9K Followers 2K Following a big fan of TileLang. Follow TileLang, meow! Follow TileLang, thanks, meow! https://t.co/utshC0jrCO A fan for ten years
Zach Wilson @EcZachly
48K Followers 1K Following Founder @ https://t.co/CWvLDHU2Lx $150k/month | https://t.co/F5VqLpyMZn $5k/month | ADHD | 10 yrs big data experience | ex @meta, @netflix, and @airbnb
Edward Z. Yang @ezyang
14K Followers 1K Following I work on PyTorch at Meta. Chatty alt at @difficultyang.
Michael Rabone @michaelrabone
15K Followers 807 Following Daily AI resources and educational posts. Support me with a coffee, patreon or purchase my style code bundle. Links are in my Linktree. DM's are open.
Junchen Jiang @JunchenJiang
413 Followers 319 Following CS Prof @ UChicago https://t.co/U01oOWGnip (Fast distributed LLM inference) https://t.co/hoetjwXKIt (Best KV cache layer)
Maxsun Official @MaxsunOfficial
8K Followers 40 Following Focus on quality, focus on excellence.💚 | Official X Maxsun Global Account | Product & Collaboration request, please DM us or visit: https://t.co/CRIOIN1L6F
Max Ng @maxnghello
61 Followers 53 Following AI theoretical research. Author of 1st Latent reasoning paper | Teacher-student one-model | Chi-ENG LLM 🏆#1@CIFAR-100|#2@CIFAR-10|#1@OpenWebText2022
Choong-Huei Seow (C.H... @CHViewpoints
240 Followers 663 Following @MIT_alumni @MIT @ChicagoBooth AI/Computer scientist, portfolio manager, digitization, crypto and tokenization evangelist. Lifelong learner and explorer.
WEKA @WekaIO
3K Followers 2K Following NeuralMesh™ by WEKA® - The world's only storage system purpose-built for AI. Accelerate performance, deploy anywhere, grow stronger with scale.
Linus LinusMediaGroup @linusgsebastian
592K Followers 71 Following Father of children, maker of YouTube videos, player of badminton, Canadian. Yes I play FAF.
Binyuan Hui @huybery
35K Followers 662 Following 🥝 Building Qwen @Alibaba_Qwen. Focus on CodeLLM (Pre-training and Post-training) / Reasoning / Agent. Ideas my own.
Daniel Romero @HyperTechInvest
16K Followers 413 Following Growth investor // Daily stock commentary // Investing in the future, for the future
NeRF&3DGS and Beyond @jasonmeyang
77 Followers 299 Following Author of NeRF/3DGS book, maintainer of https://t.co/4GaK9FgScS and https://t.co/9yFNZ6NVIR
Zhenjun Zhao @zhenjun_zhao
6K Followers 1K Following PhD from @CUHKofficial. 3D vision, SLAM, SfM, Image Matching (https://t.co/ek376Drwvu).
Gabriele Berton @gabriberton
7K Followers 1K Following Postdoc @Amazon working on VLM - ex @CarnegieMellon @PoliTOnews @IITalk
Khanchouch Faicel @KhanchouchFaic1
985 Followers 7K Following
Nobuhiro Sue @nobusue
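3K Followers 4K Following I manage the telecom solution architects at Red Hat. Working every day to make Japan's enterprise systems Cloud Native. Omnivorous interests: data integration, EDA, GenAI, and more. All posts are my personal views.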
MJ @mjtechguy
425 Followers 263 Following I drink the coffee/beer and build the things. Common sense enthusiast. #devsecops #ai #opensource
Chaoyue He @CYH37
355 Followers 4K Following AI Research Scientist@NTUsg 🇸🇬|LLM|Sustainability|GenRecSys|AGI|Productivity|UBI & Fortune|Disease Cures|Xi'an, China🇨🇳|Bodybuilder💪|Caregiver🤲❤
Gabbar @GabbbarSingh
1.5M Followers 1K Following Founder @GingerMonkeyIN | Co-founder @KnotDating | Columnist - Hindustan Times, India Today
Maxime Rivest 🧙... @MaximeRivest
4K Followers 786 Following Easy LLM context for all! ✨pip install attachments Inspired by: ggplot2, DSPy, claudette, dplyr, OpenWebUI! Follow for: API design, AI, and Data 🐍CC📜🛠 maker
TokenVisor @TokenVisor
1 Followers 3 Following
llm-d @_llm_d_
327 Followers 2 Following llm-d: a Kubernetes-native high-performance distributed LLM inference framework
Charles Qi @charles_rqi
10K Followers 277 Following Autopilot and AI @Tesla | Prev: Research Scientist & Manager @Waymo | Postdoc @metaai (FAIR), PhD @Stanford
Marc Sun @_marcsun
2K Followers 454 Following Machine Learning Engineer @huggingface Open Source team
Hunter Gerlach @HunterGerlach
548 Followers 2K Following Senior Principal Architect @RedHat | Exploring the many facets of modern software engineering (...and every once in a while: sports). Build better software.
MiniMax (official) @MiniMax__AI
18K Followers 11 Following Our mission is to build a world where intelligence thrives with everyone. MiniMax Agent: https://t.co/XzaTmAos0V
Pratap Chirumamilla @chpratap
372 Followers 895 Following Husband and proud father of twin boys, Manager at AMD. Focusing on GPUs and Radeon Software. #ProudAMDer Opinions are my own
Athanasios Moragianni... @AthanasiosMora1
286 Followers 617 Following
Dr Audrey LaChelle Si... @pearlz36
1K Followers 4K Following Serial Entrepreneur Eternal Wife 1 and CEO Of All Elon Musk Companies Forever Owner Of Tesla, X, and Indigenous Nation Engineering Physics Major
ed callway #ItsMerryT... @mrallinwonder
266 Followers 340 Following Ultrasonic medical equipment, then decades of PC based digital entertainment: sound cards, GPUs, the All-In-Wonder TV tuner series, HDTV design, HDR, VR, next??
AiXander @android_in_ua
80 Followers 902 Following #Android, #AR #ARCore #Vuforia, #Kotlin #Swift #python #pytorch #ML #Computer Vision #AI
Ron Williams @McclaneDet
1K Followers 686 Following Founder Kindo (Usable Machines), WhiteRabbitNeo, LP First Close. CSO 3X at Bird, Clover Health, Riot Games, & Founder Zeevex. Veteran.
Casper Larsen @CasperL62080836
275 Followers 312 Following
Hey nina @Sir_M_Charles
22K Followers 7K Following The first person in his village to have a Twitter account, my timeline reflects the collective voice and views of my fellow villagers . O.P.F.C
gr1.61803 @foscraft
20K Followers 1K Following Data. One line of code at a time. Learning continues.