Mattia Verasani @MatRazor
Joined December 2017
Tweets 6K
Followers 78
Following 311
Likes 18K
I quite enjoyed this and it covers a bunch of topics without good introductory resources! 1. A bunch of GPU hardware details in one place (warp schedulers, shared memory, etc.) 2. A breakdown/walkthrough of reading PTX and SASS. 3. Some details/walkthroughs of a number of other…
Rack-scale inference is the future, and the team keeps pushing it!
I think this project could be one of those "why have we ever done this differently?!" kind of moments. Instead of doing code training by just predicting the next token in the source file, interleave that with interpreter state which also has to be predicted! Devil's in the…
Finally!! Validated the most important contention in @thinkymachines's nondeterminism blog: split reduction along the kv dimension causes batch-variant outputs. Given a decode step with a large context size (e.g., 4096), depending on batch sizes, attention implementations such as…
I departed Google DeepMind after 8 years. So many fond memories—from early foundational papers in Google Brain (w/ @NoamShazeer @ashVaswani @lukaszkaiser on Image Transformer, Tensor2Tensor, Mesh TensorFlow) to lead Gemini posttraining evals to catch up & launch in 100 days, then…
I actually just gave a talk at MIT a couple days ago on some challenges in ML compilers where this was a slide. When I saw this today I hurriedly sent this blog post over.
ml infra is really hard. great job to everyone who worked on the debug and writeup.
In our investigation, we uncovered three separate bugs. They were partly overlapping, making diagnosis even trickier. We've now resolved all three bugs and written a technical report on what happened, which you can find here: anthropic.com/engineering/a-…
Many people think LLMs are non-deterministic. This is often not true! You just need 3 lines of code to make your LLM deterministic. LLMs (as any PyTorch model) are non-deterministic only when they include certain operations or when using multiple GPUs. Try the code yourself: https://t.co/eQBbd6uqhL
This is an awesome complementary reading!
Yep @thinkymachines called it: the chunk size of prefill strategy does cause the LLM outputs to be non-deterministic. When I partition the attention reduction into 1, 2, 4 and 16 chunks, the logits drift significantly, across all three dtypes. So there are at least two…
Apologies that I haven't written anything since joining Thinking Machines but I hope this blog post on a topic very near and dear to my heart (reproducible floating point numerics in LLM inference) will make up for it!
SGLang can now utilize CPU and external storage to reduce TTFT for your LLM queries through integration with Mooncake Storage, DeepSeek 3FS KVStore, or NIXL.
the emergence of attention sinks in LLMs is so fascinating, especially the fact that some of them are useful.
I honestly just love this. This is what I expected from a smaller lab. just go for a moonshot
I should really write a blog post about how attention sinks relate to outliers and information processing in transformers. Almost all data is out there in papers, and if you pull things together it is easier to understand what is going on
There's still a few spaces in our CUDA Python and C++ workshops at NDC! If you want to master the skills needed to build DL and HPC software, sign up today. The instructors will be @code_report, Ashwin Srinath, and myself.
HUGE AI breakthrough from META. This can change everything (in AI industry) 30x Faster LLMs, 16x Bigger Contexts, Zero Accuracy Loss 👀 Meta Superintelligence Labs is clearly already cooking. "The core problem with long context is simple: making a document 2x longer can make… https://t.co/fLO7gWQGOE

Elon @elon629206
82 Followers 2K Following CEO - Spacex 🚀 Tesla =🚘 Founder - The Boring Company Co-Founder 🚀
ALICE ANN @Timmygret
126 Followers 5K Following I’m helping people with financial support for bills, rent, and debt, and anyone who needs money for family care or a job. Text me on WhatsApp +1 (307) 757 4293
remember @NTlci_1
462 Followers 906 Following #Tesla #Cryptocurrency. My purpose in using X is to get more investment information. Strange people don't follow me, thank you.
Abhishek @Virtualfield4x
179 Followers 2K Following Total #AI newbie sharing my journey to learn it all - from machine learning to neural nets. #AI #MachineLearning #DeepLearning 🤖
CODE & HODL @Santahat24
1K Followers 2K Following
leyburn RSR @keaurevees1
45 Followers 2K Following I don’t want to be in a world where being kind is weakness
John Scott @HibblesAndBits
295 Followers 4K Following
BeckiBellendir @becki65195
47 Followers 2K Following
Moon🇺🇸 @azmiorial
124 Followers 3K Following My heart is like a lion, loud, proud and fearless. Cheerful personality 🧚♂️, likes reading, traveling 👣. 🙅♀️🙅♀️🙅♀️,
Haihao Shen @HaihaoShen
4K Followers 3K Following Creator of #intel Neural Compressor and AutoRound; HF Optimum-Intel Maintainer; OPEA & COIA TSC; Opinions are my own
Investor News Today @investornewsday
783 Followers 507 Following My posts are only my opinion and not a solicitation to buy or sell securities. #StocksToWatch #Nasdaq #Investing #OTCQB #StockPicks #StockMarket #NYSE #Invest
Jeannette @jeannette_garla
351 Followers 3K Following
Carey @hightowercarey4
358 Followers 3K Following
kilashowww @kilashowww
4 Followers 80 Following #streamer #webdeveloper and #gaming content creator on #fortnite (creator code: killa988nc | island code: 5631-4128-5615) #kick #twitch #youtube
Muhammad Iqbal @itsme_iqbal1
188 Followers 3K Following
Rob Perez @WorldWideWob
1.1M Followers 1.3M Following 81-77. April 2, 2022. ||| @SiriusXMNBA 1-4 PM ET |||| @PlayProphetX 6-7 PM ET |||| @SkyWobAlerts CBC |||| @WobBurnerBurner
Lorenzo Baraldi @lorenzo_baraldi
708 Followers 832 Following Associate Prof. @UNIMORE_univ, @ELLISforEurope Scholar and Modena Unit Coordinator, former Research Intern @MetaAI. DL, Computer Vision and language.
Hongyu (Charlie) Chen @hongyucharlie
97 Followers 311 Following share about personal learning and growth
Claudio Gallicchio @claudiogallicc1
781 Followers 901 Following Associate Professor of ML at the University of Pisa (Italy). Deep Randomized Neural Networks, Reservoir Computing, Stable Architectures, Deep Learning 4 Graphs
Davide Borghi @DavideBorghi6
115 Followers 381 Following Space, Science, Technology, History enthusiast
Ben Radcliffe @lightandalchemy
17K Followers 12K Following Digital Effects Supervisor | Creative Director | Creative Technologist | AR/VR/XR | Animation | VFX
Charly Wargnier @DataChaz
138K Followers 45K Following Ex @Streamlit @Snowflake Maestro 🪄 • X about AI agents, LLMs, web apps, Python & SEO • My ❤️ is open source • DM for collabs 📩
Mariano Crosetti @MarianoCrosetti
2K Followers 2K Following Most software engineers are afraid of writing code
Calc Consulting @CalcCon
4K Followers 2K Following Calculation Consulting is a boutique consultancy that specializes in machine learning, AI, and data science
Denis Tomè @DenisTome
256 Followers 346 Following Computer Vision research scientist at @Apple | computer vision | computer graphics
محمد انا مس... @jagulf1995tack
653 Followers 3K Following Orientalist, 27 years old. I converted to Islam years ago, together with my sister Catherine. I live in Sweden, in the city of göteborg
Adrián González Sá... @adriangs86
638 Followers 2K Following AI Architect @Microsoft | Book Author @OReillyMedia | University Lecturer & Online Instructor @LI_learning + @DeepLearningAI + @Linux_Education
Chris TDL AI Project @christdlai
152 Followers 1K Following Chris TDL AI Project is a privately held company, based on research in artificial intelligence, as well as machine-brain. #artificialinteligence
Kornia @kornia_foss
5K Followers 2K Following Advancing Computer Vision & Spatial AI, openly #computervision #opensource #AI
Valeriy M., PhD, MBA,... @predict_addict
36K Followers 5K Following Experienced Data Science Leader | PhD in Machine Learning | 4x Author | Black Belt 🥋 in Time Series | Chief Conformal Prediction Promoter| Mathematician |
Antonio Cinà @cinofix
167 Followers 240 Following Assistant Professor (RTD-A) @ University of Genoa, Italy | Working on Trustworthy AI and ML for industries and security applications.
Marco Arena @ilpropheta
590 Followers 95 Following Software, C++, Communities, Microsoft MVP. I just make the pie bigger so that everyone gets a slice. 💪 Leading @italiancpp, @coding_gym, and @ml_modena
CAGO IS... 🌐🧬... @CWidgidea1
243 Followers 5K Following ⃢ 🧩©⨀©🌐~ 🧫⚛️🈁 🍿 ₁ͽ※Ͼ¹ ⚓🧭⚡ 🎄 🇨🇽:🇦🇽:🇨🇨:🌐🧬🪐📡 🚀◎⫶◯⫶◎ ⚫☄️ 🔗C6️⃣C🧬~ @1763c1 ⛓️₡¥8️⃣💲~ @IaSxAave
Christian Reiser @ChrisJReiser
1K Followers 1K Following KiloNeRF, MERF, Binary Opacity Grids. PhD Student @ Uni Tübingen / MPI-IS. Supervised by Andreas Geiger @AutoVisionGroup. Student researcher at Google DeepMind.
Jannat @Jannat54773911
7 Followers 244 Following
Fabio Cuzzolin @fioldealbino
615 Followers 1K Following I'm a professor of AI @ Oxford Brookes and the director of the Visual AI Lab. I was born in Jesolo Italy. I lived and worked in 5 countries & speak 8 languages.
KarenMarie @AIkarenSF
2K Followers 2K Following in love with #lapazBCS #bajacaliforniasur #BTC #Monero #Tari #freedom
Avalanche @AvalancheLib
398 Followers 475 Following Avalanche: an End-to-End Library for Continual Learning based on @PyTorch. Powered by @ContinualAI.
Vincenzo Lomonaco @v_lomonaco
5K Followers 1K Following ✨ Dreamer, 🔍 Scientist, 🚀 Startupper, 📚 Teacher | Associate Professor @UniLUISS | Co-Founder @Continual_IST & @ContinualAI | Author of @AvalancheLib
Tom Goldstein @tomgoldsteincs
27K Followers 2K Following Professor at UMD. AI security & privacy, algorithmic bias, foundations of ML. Follow me for commentary on state-of-the-art AI.
AI Story Telling @ai_telling
2K Followers 3K Following AI generated Tweets by the GPT-2 model. Aspired to create fun tweets.😀 #AI #ArtificialIntelligence #NLP #MachineLearning #ML
Vinu @vinutahsoc
236 Followers 1K Following Scientist @NVIDIA | Ph.D. Computer Science | Encrypted Computing/FHE + Programming Systems + Machine Learning Research | @NVIDIA Grad Fellow | Former @ARM Inc.
Alexandr Wang @alexandr_wang
333K Followers 838 Following chief ai officer @meta, founder @scale_ai. rational in the fullness of time
Denny Zhou @denny_zhou
22K Followers 540 Following Founded the Reasoning Team in Google Brain (now in the Gemini Core team of Google DeepMind). Build LLMs to reason. Opinions my own.
Jonathan Frankle @jefrankle
20K Followers 734 Following Chief AI Scientist @databricks via MosaicML.
Patricia Gschoßmann @pgschossmann
34 Followers 74 Following PhD student @ Uni Tübingen and IMPRS-IS, working on 3D vision
Andrew Davison @AjdDavison
19K Followers 3K Following From SLAM to Spatial AI; Professor of Robot Vision, Imperial College London; Director of the Dyson Robotics Lab; Co-Founder of Slamcore. FREng, FRS.
Jiahui Yu @jhyuxm
18K Followers 931 Following Perception @OpenAI; previously co-led Gemini Multimodal @GoogleDeepMind. opinions are my own.
Trapit Bansal @TrapitBansal
32K Followers 250 Following AI Research @Meta | Co-Creator of OpenAI o1 | Previously @OpenAI, @MSFTResearch, @GoogleAI, @facebook, @iiscbangalore, and undergrad @IITKanpur
Jonas Adler @JonasAAdler
6K Followers 128 Following Research Scientist, DeepMind. AlphaFold, Gemini.
Juntang @archanfel_anoth
6K Followers 484 Following xAI, pre-train lead for v7, grok2&3&4 mini. ex-OpenAI, sole inventor of GPT4-turbo long-context. Core contributor to (GPT4/o/turbo, DaLLE 3, OAI Embedding v3)
Xiaohua Zhai @XiaohuaZhai
11K Followers 311 Following Researcher at Meta (previously at OpenAI Zürich, Google DeepMind)
Umar Jamil @hkproj
15K Followers 1K Following AI @MistralAI - Join the best AI community on Discord: https://t.co/zYH1DlgdbW - Opinions my own
Igor Babuschkin @ibab
103K Followers 856 Following Maybe the real ASI was the friends we made along the way. Co-founder @xAI, Research & Engineering
David Dohan @dmdohan
12K Followers 2K Following reducing perplexity @openai | past: probabilistic programs, proteins, science & reasoning @ google brain 🧠
Keyan Zhang @keyanzhang
2K Followers 1K Following engineering @openai, prev. eng ⇾ pm ⇾ eng @robinhoodapp @facebook @reactjs. product engineering is a lost art
Zhuang Liu @liuzhuang1234
11K Followers 1K Following Assistant Professor @PrincetonCS. researcher in deep learning, vision, models. previously @MetaAI, @UCBerkeley, @Tsinghua_Uni
(account unused) @keysers
340 Followers 0 Following
Jia-Bin Huang @jbhuang0604
65K Followers 283 Following
Durk Kingma @dpkingma
50K Followers 404 Following @AnthropicAI. Prev. @Google Brain/DeepMind, founding team @OpenAI. Computer scientist; inventor of the VAE, Adam optimizer, and other methods. ML PhD.
Eureka Labs @EurekaLabsAI
73K Followers 1 Following We are building a new kind of school that is AI native.
Noam Brown @polynoamial
92K Followers 856 Following Researching reasoning @OpenAI | Co-created Libratus/Pluribus superhuman poker AIs, CICERO Diplomacy AI, and OpenAI o3 / o1 / 🍓 reasoning models
AI Engineer @aiDotEngineer
31K Followers 6 Following A network of engineers enhanced by and building with AI. Organizers of the AI Engineer Summit, AI Engineer World's Fair, and AI Engineer Europe.
SSI Inc. @ssi
102K Followers 0 Following A straight shot to safe superintelligence. Join us https://t.co/hHla3vusDE.
ragas @ragas_io
1K Followers 0 Following Supercharge Your LLM Application Evaluations 🚀 Github: https://t.co/8Nd8dpN0wV Discord: https://t.co/uaw1hwwaB9
Armen Aghajanyan @ArmenAgha
15K Followers 285 Following Co-founder & CEO @perceptroninc; ex-RS FAIR/MSFT
Lior Alexander @LiorOnAI
106K Followers 2K Following Covering the latest in AI development • ML Eng since 2017 • Building @AlphaSignalAI into the #1 source of news for AI devs → At 250k readers.
Horace He @cHHillee
42K Followers 537 Following @thinkymachines Formerly @PyTorch "My learning style is Horace twitter threads" - @typedfemale
Ivan Fioravanti ᯅ @ivanfioravanti
17K Followers 940 Following Co-founder and CTO of @CoreViewHQ GenAI/LLM addicted, Apple MLX, Microsoft 365, Azure, Kubernetes, Investor in innovation and Mensa member.
Jan Leike @janleike
115K Followers 332 Following ML Researcher @AnthropicAI. Previously OpenAI & DeepMind. Optimizing for a post-AGI future where humanity flourishes. Opinions aren't my employer's.
Haihao Shen @HaihaoShen
4K Followers 3K Following Creator of #intel Neural Compressor and AutoRound; HF Optimum-Intel Maintainer; OPEA & COIA TSC; Opinions are my own
Piotr Padlewski @PiotrPadlewski
2K Followers 382 Following Multimodal @anthropic. ex Chief Meme Officer at Reka, ex-Google Deepmind/Brain Zurich
Evan Morikawa @E0M
17K Followers 2K Following 🤖 @Generalistai_. Prev: led eng at @openai. Launched & scaled ChatGPT, GPT-4, DALL·E, APIs. Dir Eng @Nylas, Co-Founder @Proximate, @OlinCollege alum.
AI at Meta @AIatMeta
717K Followers 288 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
State of AI @stateofaireport
6K Followers 2 Following The annual report on industry, research, geopolitics, safety and predictions in AI 🤓
Hyung Won Chung @hwchung27
38K Followers 302 Following AI Research Scientist @Meta Superintelligence Labs. Past: @OpenAI / @Google Brain / PhD @MIT
Jason Wei @_jasonwei
98K Followers 639 Following ai researcher @meta superintelligence labs, past: openai, google 🧠
Victor Sanh @SanhEstPasMoi
9K Followers 3K Following AI builder | Something new | Prev. founding crew @huggingface 🤗 | AI x NYC
Nadeesha Amarasinghe @nadeesha99
4K Followers 294 Following AI Infrastructure @Tesla_AI prev @Apple, @Nvidia. Learned stuff at @UofT.