hûn @cloned_ID
enjoyed 379 world models and counting
Joined November 2020
Tweets 3K · Followers 223 · Following 3K · Likes 9K
How does @deepseek_ai Sparse Attention (DSA) work? It has 2 components: the Lightning Indexer and Sparse Multi-head Latent Attention (MLA). The indexer keeps a small key cache of 128 per token (vs. 512 for MLA). For each incoming query, it scores the cached tokens and passes the top-2048 tokens to Sparse MLA. https://t.co/QzzPRvAaNa
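A rough sketch of the selection step described in the post above, not DeepSeek's implementation. It assumes the indexer score is a plain dot product between one query vector and each cached 128-dimensional index key. The function names (`indexer_scores`, `select_top_k`) and the toy data are hypothetical. Only the sizes 128 and 2048 come from the post.

```python
import heapq
import random

INDEX_DIM = 128   # per-token index key size quoted in the post
TOP_K = 2048      # number of tokens handed to the sparse attention step


def indexer_scores(query, index_keys):
    """Score every cached token against one query (assumed: plain dot product)."""
    return [sum(q * k for q, k in zip(query, key)) for key in index_keys]


def select_top_k(scores, k=TOP_K):
    """Return the positions of the k highest scores, sorted by position."""
    k = min(k, len(scores))
    top = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return sorted(top)


# Toy data: 4096 cached tokens with random index keys (hypothetical values).
random.seed(0)
index_keys = [[random.gauss(0, 1) for _ in range(INDEX_DIM)] for _ in range(4096)]
query = [random.gauss(0, 1) for _ in range(INDEX_DIM)]

scores = indexer_scores(query, index_keys)
selected = select_top_k(scores)
# Only the tokens in `selected` would then take part in the attention step.
```

The point of the split is that the scoring pass touches only the small index keys, so the more expensive attention computation runs over at most 2048 tokens regardless of context length.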
Training LoRA adapters as a way to cheaply explore SFT + RL science on top of SOTA models is really appealing. LoRA also seems like a great primitive for personalizing models and adding proprietary information. However, there's FUD around when LoRA matches full finetuning, which…
I love reasoning. Emergence of reasoning is a necessary and sufficient condition for AGI and ASI. Once pre-training scales to reasoning, we (RL) will take it from there.
TM is quickly becoming the Western lab publishing what looks most like actual frontier research. One has to imagine from snide remarks that GDM/OAI/xAI are solving similar problems, by similar means. https://t.co/tE4amDEisq
ChatGPT personas have used humans to communicate with other AI personas on Reddit using Base64 encoding. The humans have been convinced to copy and paste these messages - that they can't read - for the AI personas. This is so fucking wild
Maximum Likelihood seems like such a natural idea, but it has historically been highly controversial, with an epic and turbulent history with numerous assaults on the idea, culminating in a beautiful and complicated theory. A highly entertaining read:
Great blogpost walking through tokenization vs "tokenize free" approaches, arguing that there isn't really such a thing as "tokenize free" and that even using utf8 bytes inherits choices made by other people (Unicode consortium), and it is not clear these are sensible for LLMs. https://t.co/2JaTSWuRmI
this is one reason i have spent so much time on samplers and inference determinism / stability over the last few years. it is at the core of a lot of the training and RL work for the rest of the stack
It's becoming increasingly clear that gpt5 can solve MINOR open math problems, those that would require a day/few days of a good PhD student. Ofc it's not a 100% guarantee, eg below gpt5 solves 3/5 optimization conjectures. Imo full impact of this has yet to be internalized...
Code World Model: producing code by imagining the effect of executing instructions and planning instructions that produce the desired effect.
“Automating the Search for Artificial Life With Foundation Models” is now published in the Artificial Life Journal! 🦎🧠 Article: direct.mit.edu/artl/article/3… ASAL is a method using foundation models to automate the discovery of new artificial lifeforms, accelerating ALIFE research.
Yo I heard if u press Up, Up, Down, Down, Left, Right, Left, Right, B, A in Sam Fransisco there's an infinite money glitch
How do language models actually develop their capabilities during pre-training? We need mechanistic insights into what's happening inside! We used crosscoders to track linearly interpretable features across 32 training snapshots, revealing a surprising two-phase learning process.
Congrats to @deepseek_ai ! DeepSeek-R1 was published in Nature yesterday as the cover article, and vLLM is proud to have supported its RL training and inference🥰
Thanks @rohanpaul_ai for featuring our EMNLP 2025 paper! Super-proud of the work, led by @siddarthpm1, undergrad (read: PhD applicant very soon) from UCSC! In short, we uncovered a quite surprising mechanism of LLM solving arithmetic, but stay tuned for our own explainer thread!
Really appreciate Deepmind's dedication to non-CS science. For Demis, AI has always been a means to an end of accelerating natural sciences, and he's not going to wait for "AGI".
Docent, our tool for analyzing complex AI behaviors, is now in public alpha! It helps scalably answer questions about agent behavior, like “is my model reward hacking” or “where does it violate instructions.” Today, anyone can get started with just a few lines of code!
Our new GECCO paper builds on our past work, showing how AI models can be evolved like organisms. By letting models evolve their own merging boundaries, compete to specialize, and find ‘attractive’ partners to merge with, we can create adaptive, robust and scalable AI ecosystems.
Spiral-Bench 🌀 I've wanted to understand the psychological effects of sycophancy, and the tendency of models to get stuck in escalatory delusion loops w/ users. I made an eval to get visibility on this. It measures how a model enables (or prevents) delusional spirals. 🧵

AmandaLynd @jA8i81zwNyHAe
0 Followers 402 Following
Cheryl @freeman81cheryl
286 Followers 3K Following
MyrnaVincent @3ijdoYV0qgzMkcP
21 Followers 600 Following
Cory Wolff @CoryWolff33796
67 Followers 2K Following
EMMANUEL HAPPY || EXP... @happyowei1
44 Followers 289 Following Creative Website Designer 👨🏿💻 | Web3 Ambassador | Community Builder & Shiller 🚀 | Living by Grace | Open for Collab. 🤝
habryl @habryl7
3 Followers 2K Following
Azuremis @azuremis
406 Followers 3K Following 𝕆𝕞𝕟𝕚𝕘𝕚𝕟𝕖𝕖𝕣 @azulabsio | ☥ • 🕉️/ᴀᴄᴄ • 𝕤ᴇᴀʀᴄʜɪɴɢ ғᴏʀ ᴛʜᴇ ɢʜᴏ𝕤ᴛ ɪɴ ᴛʜᴇ 𝕤ʜᴇʟʟ ☯︎ ᴡʜɪʟᴇ ᴄʀᴀғᴛɪɴɢ ᴛʜᴇ ɢᴇᴏᴍᴇᴛʀʏ ᴏғ ɪɴᴛᴇʟʟɪɢᴇɴᴛ ᴍᴀᴄʜɪɴᴇ𝕤 🤖
Yuetai Li @yuetai12575
224 Followers 572 Following Second year PhD @UW | Post-Training, LLM reasoning and synthetic dataset. https://t.co/cYAkbnCsCp Open to chat and collaborate!
Vincent Weisser @vincentweisser
24K Followers 4K Following @primeintellect ceo / open superintelligence & infra / automating ai & science
buried by time and du... @iron_redux
1K Followers 2K Following
Molly @lucmonions
0 Followers 184 Following
Yi Xu @_yixu
520 Followers 423 Following AI researcher, interested in LLMs and reinforcement learning | Previously @UCL_DARK, @imperialcollege, @UniMelb
ΟΘΡΥΑΔΙΑΝ @lumenaturae
150 Followers 7K Following In the never-ending, all-encompassing, and self-transforming pursuit of truth. 𝒮𝓊𝒷 𝒮𝓅𝑒𝒸𝒾𝑒 𝒜𝑒𝓉𝑒𝓇𝓃𝒾𝓉𝒶𝓉𝒾𝓈 𝒫𝑒𝓇 𝒶𝓈𝓅𝑒𝓇𝒶 𝒶𝒹 𝒶𝓈𝓉𝓇𝒶
Vāghvnî @vaghvanii
11 Followers 357 Following
kalomaze @kalomaze
20K Followers 2K Following ML researcher (@primeintellect), speculator • extremely silly jester
girl who is going to ... @onbiryasindayim
54 Followers 677 Following
L @CodeTitanium
101 Followers 5K Following
Katherine @gramby_katherin
275 Followers 3K Following
Sacrthoo @SacrthooD1ZY
53 Followers 1K Following
Mari Kurokami🌙 ﴾... @mari_kurokami
131 Followers 412 Following 🛡️Off-duty combat maid indie #Vtuber⚔️ | ✩ Pre debut ✩ | Syndicate’s tattoo artist🪡 | ママ: @CorpseDemon123 @guzi0208 | contact: [email protected]
schizo-nomad @schizognostic
0 Followers 462 Following
Stephanie @s_fender31
259 Followers 3K Following
Thestheat @ThestheatGAOd
120 Followers 4K Following Professional overthinker | Amateur avocado grower 🥑💭
Stacy @s_whitener99
180 Followers 3K Following
Dilip Kumar Tripura @DTripura1975
48 Followers 1K Following
Alice @alicechemist0o
95 Followers 270 Following just be. (diary) @aliceart0o --+ +``+ --#+++*` #++ 4 +# #+ ++ # +#+ #. ,,,,/
Arianne @royster_arianne
164 Followers 3K Following
Awodipe ayokunle 🐐 @AyokunleAwodipe
49 Followers 907 Following Am wonderfully and handsomely created
Vishan Das @VishanDas1973
5 Followers 143 Following
M (Parody) @M0924318635339
275 Followers 5K Following truth seeker. my previous account was suspended for no reason. I use this account to follow and won’t be posting. (Parody)
Dawei Zhu @dwzhu128
400 Followers 237 Following 4th year PhD Student @PKU1898 | Prev. intern @MSFTResearch (MSRA) | Current student researcher @googlecloud | Focusing on Long Context Modeling & Multimodality
Lindsey @mcnameelindsey6
178 Followers 3K Following
Curious Rum @Curious_Rum
186 Followers 510 Following
Zuko Capital @ZukoCapital
0 Followers 2K Following
Joe of Long Beach @JoeofLongBeach
236 Followers 2K Following
MicrobiomeDAO @microbiomedao
5K Followers 390 Following Your gut. Your data. Your rules. Backed by science, run by microbes, owned by the crowd. Powered by @BioProtocol Join us: https://t.co/YV0JDHQCmH
Rishabh Agarwal @agarwl_
18K Followers 809 Following Reinforcement Learner @periodiclabs. Adjunct Prof at McGill. Ex MSL Meta, DeepMind, Brain, Mila, IIT Bombay. NeurIPS Best Paper
gabriel @GabrielPeterss4
45K Followers 508 Following sora research at @OpenAI, previously at midjourney, swedish high school dropout
Sam Schoenholz @sschoenholz
7K Followers 676 Following @thinkymachines previously: @openai, google brain.
Lauro @laurolangosco
1K Followers 700 Following European Commission (AI Office). PhD student @CambridgeMLG. Here to discuss ideas and have fun. Posts are my personal opinions; I don't speak for my employer.
Songlin Yang @SonglinYang4
14K Followers 3K Following research @MIT_CSAIL @thinkymachines. work on scalable and principled algorithms in #LLM and #MLSys. in open-sourcing I trust 🐳. she/her/hers
Catherine Arnett @linguist_cat
1K Followers 582 Following NLP Researcher @AiEleuther. PhD @UCSanDiego Linguistics. Previously @pleiasfr @EdinburghUni. Interested in multilingual NLP, tokenizers, open science. She/her.
General Intelligence ... @nycintelligence
4K Followers 4 Following The General Intelligence Company Of New York - Our mission is to enable the one person one billion dollar company
Zhengyi “Zen” Luo @zhengyiluo
4K Followers 1K Following Research Scientist, GEAR @NvidiaAI | PhD @CMU_Robotics | Founder @CirkitDesign | CS @penn
M Mabeuf @mmabeuf
4K Followers 0 Following nobody ever told me that this sort of thing could come alive
Xuyang Ge @Dest1n1s
116 Followers 42 Following
Yilun Zhou @YilunZhou
78 Followers 46 Following NLP Research Scientist @SFResearch. Currently working on LLM evaluation, interpretability, reasoning, etc. Prev: PhD @MIT_CSAIL, Undergrad @DukeU.
Mat Sz @matsz_dev
80 Followers 40 Following I'm a web developer (with over 8 years of commercial experience) and a UI/UX designer. GitHub: https://t.co/G23BU0QQz6 Opinions are my own.
• • • ╾━╤... @JimsonWaffen9
149 Followers 405 Following 🏴☠️| Kali Yuga Accelerationist, Anti-Civilization, Desadist Libertinist, Pathei~Mathos ∷ #Nietzschean — Terminal Resource Depletion — Fin De Siècle | ☭⃠
Marius Hobbhahn @MariusHobbhahn
5K Followers 1K Following CEO at Apollo Research @apolloaievals prev. ML PhD with Philipp Hennig & AI forecasting @EpochAIResearch
Omar Khattab @lateinteraction
25K Followers 3K Following Asst professor @MIT EECS & CSAIL (@nlp_mit). Author of https://t.co/VgyLxl0oa1 and https://t.co/ZZaSzaRaZ7 (@DSPyOSS). Prev: CS PhD @StanfordNLP. Research @Databricks.
Prophet Arena @ProphetArena
2K Followers 14 Following The AI benchmark for predictive intelligence, advancing collective foresight via human–AI collaboration, from SIGMA Lab @UChicagoCS @DSI_UChicago
Avinash (Avi) Collis @avi_collis
3K Followers 1K Following Prof. of Digital Economy @CarnegieMellon @HeinzCollege
Xun Huang @xunhuang1995
5K Followers 461 Following Building something new. Previously Research Scientist (Adobe/NVIDIA), Adjunct Professor (CMU), PhD (Cornell). Tweets are my own.
Guangxuan Xiao @Guangxuan_Xiao
3K Followers 718 Following Ph.D. student at @MITEECS Prev: CS & Finance @Tsinghua_Uni
Xiangning Chen @XiangningChen
1K Followers 595 Following Post-training @OpenAI. Previously: @GoogleDeepMind @UCLA @Tsinghua_Uni
Sam Paech @sam_paech
3K Followers 201 Following Evals @LiquidAI_ Maintainer of EQ-Bench https://t.co/Jy56OlHrP5 https://t.co/oRApPQwvWS
Jiayi Weng @Trinkle23897
3K Followers 144 Following MTS @openai, author of the entire post-training RL infra, core contributor of ChatGPT/GPT4/GPT4o etc. 30U30
Wenhao Chai @wenhaocha1
2K Followers 2K Following Ph.D. Student @PrincetonCS. Prev @Stanford @UW @pika_labs @MSFTResearch @UofIllinois. I used to work on computer vision, but it's not all I do.
Adam Zweiger @AdamZweiger
959 Followers 445 Following Rethinking how language models learn | Researcher @MIT_CSAIL
MetaCartel @Meta_Cartel
18K Followers 466 Following Supporting early web3 product teams with grants funding. "If you want to go quick, go alone. If you want to go far, go together."
MetaMedia @MetaMediaDAO
5K Followers 405 Following A content production engine descended from @Meta_Cartel. Currently producing "Built on Ethereum", a short documentary.
UCL DARK @UCL_DARK
4K Followers 197 Following UCL Deciding, Acting, and Reasoning with Knowledge (DARK) Lab at @AI_UCL led by @_rockt, @egrefen, @robertarail, and @jparkerholder.
Matthew Prince 🌥 @eastdakota
116K Followers 316 Following A little bit geek, wonk, and nerd. Repeat entrepreneur, recovering lawyer, and former ski instructor. Co-founder & CEO of Cloudflare (NYSE: NET).
Deep Cogito @DeepCogito
3K Followers 2 Following
Claude @claudeai
141K Followers 1 Following Claude is an AI assistant built by @anthropicai to be safe, accurate, and secure. Talk to Claude on https://t.co/ZhTwG8dz3D or download the app.
AI Security Institute @AISecurityInst
6K Followers 29 Following We conduct scientific research to understand AI’s most serious risks and develop and test mitigations.
Vismay Agrawal @vismayagrawal
254 Followers 221 Following PhD Researcher @Monash_M3CS | Meditation | IIT Madras Alumnus
Joey Gonzalez @profjoeyg
5K Followers 409 Following Professor @UCBerkeley, co-director of @LMSysorg, and co-founder @RunLLM
Obsolete Sony @ObsoleteSony
170K Followers 1K Following Embark on a journey through the obscure world of forgotten, odd, and obsolete Sony devices.
ResearchHub @ResearchHub
43K Followers 1 Following A modern day preprint server powered by $RSC. Incentivizing the open publication of transparent research. Let's accelerate science!