Simon Guo @simonguozirui
CS PhD student @Stanford | 🎓 @Berkeley_EECS | prev pre-training @cohere & built things at @anyscalecompute @nvidia | simonguo.tech | Palo Alto, CA | Joined September 2014
Tweets: 2K | Followers: 3K | Following: 5K | Likes: 6K
The most surprising thing working on this was that RL with LoRA completely matches full training and develops the same extended reasoning patterns. I think this is a great sign for custom agent training.
@StanfordHAI just ran this story on self-study and cartridges -- it's a really nice overview for those curious about our work
The two main issues with GRPO: 1) No credit assignment, unless you do rollouts from each state (VinePPO-style), which is super expensive. 2) Doing multiple rollouts from the same state requires state resetting / copying capabilities. This is fine for question answering, and…
A very interesting read. What stands out to me is how more and more GPU-programming optimizations are piggybacking on the dataflow execution model - the primary motivation behind any and every AI accelerator. I strongly believe the world needs AI accelerator companies…
more on this: when you launch a cuda kernel, you are not running a function per se like we do in c++; you are handing an abstract specification of parallelism, often in an intermediate form called ptx, to the nvidia driver. the driver acts as a final stage, just in time…
(1/8) We’re releasing an 8-GPU Llama-70B inference engine megakernel! Our megakernel supports arbitrary batch sizes, mixed prefill+decode, a paged KV cache, instruction pipelining, dynamic scheduling, interleaved communication, and more! On ShareGPT it’s 22% faster than SGLang.
math matters in scaling!
I spent the past month reimplementing DeepMind’s Genie 3 world model from scratch Ended up making TinyWorlds, a 3M parameter world model capable of generating playable game environments demo below + everything I learned in thread (full repo at the end)👇🏼
With ShinkaEvolve, we aim to make a big step towards improving efficiency and broad accessibility to automated computational discovery. It combines multiple key algorithmic improvements: 1) An adaptive parent program sampling strategy balancing exploration & exploitation. 2) A…
ShinkaEvolve is out! It's a framework for automatically improving code using LLMs. Calling it the Sakana AI version of AlphaEvolve is the short way to put it, but it has all sorts of clever touches in both performance and usability, and it is extremely well made. I've already tried some interesting uses of it several times myself, and I'm a big fan of this software. More on that another day...!
@KLieret @_carlosejimenez @OfirPress @karthik_r_n @lschmidt3 @Diyi_Yang (CWM) Mutate-fix tasks 🤝 (SWE-smith) Procedural Modifications
2/ When humans plan, we imagine the possible outcomes of different actions. When we reason about code we simulate part of its execution in our head. The current generation of LLMs struggles to do this. What kind of research will an explicitly trained code world model enable?
Excited to share: MAST has been accepted as 🌟 NeurIPS D&B Spotlight🌟 Updates for the community: - NEW: We open-source 1,000+ multi-agent traces (link in 🧵). - lots of exciting use cases are emerging, we’ll be releasing blogs & tutorials to help you get started - And … more…
@jxmnop there have been some honest attempts, maybe folks will keep trucking
(1/6) We’re happy to share that ThunderKittens now supports writing multi-GPU kernels, with the same programming model and full compatibility with PyTorch + torchrun. We’re also releasing collective ops and fused multi-GPU GEMM kernels, up to 2.6x faster than PyTorch + NCCL.…
As we run out of next-token supervision due to limited internet data, we propose searching for a weaker form of self-supervision: inter-document relations. We validate this by pretraining a 3B model from scratch on 1T tokens. x.com/ZitongYang0/st…
📜 Paper on new pretraining paradigm: Synthetic Bootstrapped Pretraining SBP goes beyond next-token supervision in a single document by leveraging inter-document correlations to synthesize new data for training — no teacher needed. Validation: 1T data + 3B model from scratch.🧵
Underrated dynamic in the next ~12-18 months is we should expect models to get as good at kernel writing as they are at competition math/code contests. This is bullish for chip startups, since one of the major obstacles to adoption (learning your software stack) is softened
Proud to have been part of the team behind Gaia2 and ARE! ARE = a gym/platform for scaling up LLM agent envs for evals & RL Gaia2 = a new benchmark for hard & practical agent tasks (search, execution, ambiguity, time, noise, & multi-agent) tinyurl.com/aregaia2

Alishba Imran @alishbaimran_
7K Followers 2K Following ML Researcher @arcinstitute | prev: @berkeley_ai, @czbiohub, @NVIDIA, founded ML battery startup
darya @daryakaviani
2K Followers 2K Following PhD Student @Berkeley_EECS. Systems security & cryptography. Prev: @MSFTResearch @Meta.
cindo 🍓 @cindohahahaha
8K Followers 973 Following product @tryramp, currently having a great time in nyc
Tyler Zhu @tyleryzhu
2K Followers 1K Following PhD student @VisualAILab | SR @GoogleDeepMind | prev @berkeley_ai | @SFGiants @warriors guy | https://t.co/RJZA9D0osy
Isabella Grandic @izzygrandic
2K Followers 736 Following chemicals / industrials investment banking, lifelong optimist 🇨🇦 🌏
Arvind Rajaraman @arvindr02
1K Followers 691 Following chief tokenwala @databricks • Prev LLMs + RL @berkeley_ai • Opinions my own
Jay @jayparth_
2K Followers 304 Following
Seyone Chithrananda @SeyoneC
3K Followers 3K Following 🇨🇦 | 1st yr phd @stanford bioengineering, bear @ucberkeley
Mallika Parulekar @mallikathinks
766 Followers 405 Following cs @ uc berkeley / stanford | subway surfer | everything's not a game but most things are game theoretic
sarv @SarvasvKulpati
10K Followers 2K Following Making computers fun again https://t.co/cUc86o7fBr CS+Cogsci @UCBerkeley YT: https://t.co/OR3L2OZJ8A
mathurah @mathurahravi
11K Followers 1K Following design engineer @netflix, writing and diving into my curiosities @rabbitholeathon, investing in creative tools! uw syde grad 🇨🇦
medha ☀️🍃 @medhakothari
10K Followers 3K Following product @Uniswap fun @KomorebiFund | prev vc @VariantFund eng @CeloOrg @CalBlockchain 🐻 co-founder @she_256
Andrew Dai @andrewdai99
352 Followers 420 Following Researcher @SakanaAILabs, a Kerryman, LMs x open-endedness; prev @Aleph__Alpha, @tcddublin, @FormulaTrinity Autonomous | 🇮🇪
Jiashuo Liu @liujiashuo77
2K Followers 639 Following Research Scientist at ByteDance Seed | Advanced & Interesting LLM/Agent Evaluation. Opinions are my own.
Luca Grillotti @CoRL2... @LucaGrillotti
305 Followers 678 Following Researcher @SakanaAILabs | PhD @ImperialCollege Interested in: 🧠 Open-Ended Behavioural Discovery 🧬 Quality-Diversity 🦾 Evolutionary Robotics
Stuart Sul @stuart_sul
1K Followers 118 Following ml research @cursor_ai, cs @Stanford, mlsys @HazyResearch
Grace Johnson @GraceJohns99220
6 Followers 239 Following
Twoaltu @Twoaltu476301
30 Followers 1K Following
Peter Downey @PJD60123
15 Followers 270 Following
QueenLilyAdams @Ievreadda15052
12 Followers 1K Following Ambition is my middle name In love with my journey
anandmaj @Almondgodd
2K Followers 396 Following path of childhood's end | gap @penn | prev ai @tesla_optimus @dynarobotics
Ben Lipkin @ben_lipkin
678 Followers 1K Following phd @mit. research @genlm. prev: intern @apple. language, programs, probability.
• @hsquaredn
0 Followers 50 Following
Stefano Fiorucci @theanakin87
268 Followers 980 Following AI/SW Engineer - @haystack_ai OSS LLM framework https://t.co/dJIRA2wBNA https://t.co/ExY1uYWWXf https://t.co/qtEeS5eutt
Çağla Şimşek @caglahp
32 Followers 738 Following
Gaurav @gauravisnotme
2K Followers 546 Following Good model @xAI | prev. d-matrix, Google. Opinions are my own - always and forever
Fred Jonsson @enginoid
931 Followers 629 Following building AI/ML systems @ https://t.co/KJtPmvPgxw. 🎄 let's meet at NeurIPS '25. interests: small models, continual learning, training infra, RLVR, AutoML
Thanapong (Mod) Boont... @oldyginger
3 Followers 43 Following
Losib @Losib3781
5 Followers 294 Following
Sedrick Alcantara @SedrickAlc77509
0 Followers 11 Following
Johan Obando-Ceron ... @johanobandoc
2K Followers 4K Following Graduate student @Mila_Quebec @UMontrealDIRO | RL/Deep Learning/AI | De Cali/Colombia pal’ Mundo 🇨🇴 | #JuntosProsperamos⚡#TogetherWeThrive| 🌱🌎
Eduardo Slonski @EduardoSlonski
782 Followers 714 Following AI Researcher | LLM Reasoning and Scaling
Jack Douglas @JackFDouglas
2 Followers 15 Following
Usman Ghulam Nabi @UsmanGhulmeNabi
0 Followers 87 Following Agentic AI Developer & Engineer | MS Artificial Intelligence ’26
Romain Froger @froger_romain
129 Followers 237 Following PhD @AIatMeta, MSL Agents and @Inria. @GeorgiaTech & UTC alumni.
Daphne Cornelisse @daphne_cor
1K Followers 560 Following Ph.D. student @nyuniversity • Building human-like agents 🦋 https://t.co/BhKiCutsdY
Ufafi @Ufafi660931
30 Followers 1K Following
Tanmay Patil @TanmayP11263263
43 Followers 250 Following Machine learning engineer Github : https://t.co/VFmAjsLtPA Huggingface : https://t.co/eL4e0o5SKp
Teim @TamiruAL
3 Followers 45 Following
Alexiy Buynitsky @ABuynitsky
88 Followers 1K Following CS @UCSanDiego | ex Persona AI, SpaceX, Armada AI | CS, Math @ Purdue '25
Amy @tanabota123456
88 Followers 1K Following Tesla is the world's top-selling electric vehicle brand (it has long held a leading market share).
IreneYonng @PwOud3ihFeT533
21 Followers 571 Following
emily han @emilyhanyf
596 Followers 472 Following 19 yo, ai/systems + design @stanford, co-director @hackwithtrees, llm inference @modal
hannah @hannahgao_
535 Followers 519 Following 19 yo • design eng @interaction, sponsorships @hackwithtrees • math, cs, sidequesting @stanford • art TikTok @yurtyobain (16k)
All Might e/acc 🚀... @AllMigh48938863
394 Followers 2K Following VC - Value Investor - cricket lover - Nerd - geopolitics
Giannis Chatziveroglo... @giannis2two
333 Followers 374 Following pretraining @cohere, CS + Math @MIT
Young @0x_Cryptoyang
5K Followers 2K Following AI is cool i guess 🌟Individual Investor | Ex @ABCDELabs|Core Contributor https://t.co/eJAFHrk0Oq Group|Prev @Scroll_ZKP、@THUBA_DAO
Jodber @Jodber334
49 Followers 1K Following
Paul Graham @paulg
2.1M Followers 778 Following
Alishba Imran @alishbaimran_
7K Followers 2K Following ML Researcher @arcinstitute | prev: @berkeley_ai, @czbiohub, @NVIDIA, founded ML battery startup
Andrej Karpathy @karpathy
1.4M Followers 1K Following Building @EurekaLabsAI. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets.
darya @daryakaviani
2K Followers 2K Following PhD Student @Berkeley_EECS. Systems security & cryptography. Prev: @MSFTResearch @Meta.
Alexandr Wang @alexandr_wang
333K Followers 838 Following chief ai officer @meta, founder @scale_ai. rational in the fullness of time
cindo 🍓 @cindohahahaha
8K Followers 973 Following product @tryramp, currently having a great time in nyc
Alfredo Andere @AlfredoAndere
4K Followers 1K Following Co-Founder and CEO @LatchBio — The Cloud for Biology
Tyler Zhu @tyleryzhu
2K Followers 1K Following PhD student @VisualAILab | SR @GoogleDeepMind | prev @berkeley_ai | @SFGiants @warriors guy | https://t.co/RJZA9D0osy
Jim Fan @DrJimFan
327K Followers 3K Following NVIDIA Director of Robotics & Distinguished Scientist. Co-Lead of GEAR lab. Solving Physical AGI, one motor at a time. Stanford Ph.D. OpenAI's 1st intern.
AK @_akhaliq
428K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo ,submit papers here: https://t.co/UzmYN5YmrQ
Balaji @balajis
1.2M Followers 4K Following Author of the Network State. Founder of the Network School.
Andrew Dai @andrewdai99
352 Followers 420 Following Researcher @SakanaAILabs, a Kerryman, LMs x open-endedness; prev @Aleph__Alpha, @tcddublin, @FormulaTrinity Autonomous | 🇮🇪
Kevin Weil 🇺🇸 @kevinweil
111K Followers 3K Following CPO @OpenAI, BoD @Cisco @nature_org, LTC @USArmyReserve Prev: President @Planet, Head of Product @Instagram @Twitter ❤️ @elizabeth ultramarathons kids cats math
Jiashuo Liu @liujiashuo77
2K Followers 639 Following Research Scientist at ByteDance Seed | Advanced & Interesting LLM/Agent Evaluation. Opinions are my own.
Luca Grillotti @CoRL2... @LucaGrillotti
305 Followers 678 Following Researcher @SakanaAILabs | PhD @ImperialCollege Interested in: 🧠 Open-Ended Behavioural Discovery 🧬 Quality-Diversity 🦾 Evolutionary Robotics
Heiga Zen (全 炳河... @heiga_zen
10K Followers 192 Following Principal Scientist (Director) @GoogleDeepMind / GDM Tokyo site lead.波瀬小⇒一志中⇒鈴鹿高専⇒名工大 (IBM TJ Watson intern)⇒東芝欧州研⇒Google (Speech🇬🇧⇒Brain🇯🇵) ⇒GoogleDeepMind
anandmaj @Almondgodd
2K Followers 396 Following path of childhood's end | gap @penn | prev ai @tesla_optimus @dynarobotics
Ben Lipkin @ben_lipkin
678 Followers 1K Following phd @mit. research @genlm. prev: intern @apple. language, programs, probability.
Kevin Tu @kevbtu
526 Followers 528 Following enterprise/deep tech vc @dfjgrowth | prev @cohesity @jpmorgan @summitpartners @mit + sw engineer | come nerd out with me 🤓
Alex Shaw @alexgshaw
367 Followers 472 Following Shipping @LaudeInstitute & investing @LaudeVentures Co-creator of Terminal-Bench. Formerly Google. BYU alum.
Karthik A Sankararama... @karthikabinav
2K Followers 3K Following AI Research @ Meta Superintelligence Labs. Long-term Affiliations: #iitm, @UMDCS, @facebook, @meta
Jerry Tworek @MillionInt
23K Followers 700 Following Berry farmer @ OpenAI | o3, o1, GPT4, ChatGPT, Codex, Solved Rubik’s cube with robotic hand | cautious AI optimist
Nathan Chen @nathancgy4
1K Followers 644 Following understanding models @tilderesearch, (hardware-aligned) ml & open-source, 16
Da Yu @DaYu85201802
504 Followers 148 Following Research Scientist at Google Research. Former intern at @MSFTResearch and @GoogleAI. Joint PhD between Sun Yat-sen University and Microsoft Research Asia.
Gaurav @gauravisnotme
2K Followers 546 Following Good model @xAI | prev. d-matrix, Google. Opinions are my own - always and forever
Ali Taha @AliesTaha
653 Followers 170 Following gpu perf @modular | ex @Tesla comp eng @uwaterloo
General Intelligence ... @nycintelligence
3K Followers 4 Following The General Intelligence Company Of New York - Our mission is to enable the one person one billion dollar company
Sachin @sachdh
3K Followers 742 Following cooking reasoning models and agents at @AthenaAgentRL - a narrow intelligence lab
ElevenLabs @elevenlabsio
141K Followers 11 Following The voice of technology. Bringing the world's knowledge, stories and agents to life.
Fred Jonsson @enginoid
931 Followers 629 Following building AI/ML systems @ https://t.co/KJtPmvPgxw. 🎄 let's meet at NeurIPS '25. interests: small models, continual learning, training infra, RLVR, AutoML
Jeremy Berman @jerber888
4K Followers 1K Following post-training @reflection_ai. prev @ndea and co-founded https://t.co/aY50hNeJUD. yc w19.
Brian Zhan @brianzhan1
3K Followers 2K Following Investing in early stage AI @CRV. Seed/A: @Reflection_AI, @SkildAI, @DynaRobotics, @LanceDB, Lepton (acq NVIDIA), @VoyageAI (acq MongoDB), @SDFLabs (acq dbt)
Stuart Sul @stuart_sul
1K Followers 118 Following ml research @cursor_ai, cs @Stanford, mlsys @HazyResearch
Aonan Zhang @Aonan12
14 Followers 26 Following
Andreea Bobu @andreea7b
3K Followers 442 Following Assistant Professor @MITAeroAstro and @MIT_CSAIL ∙ PhD from @Berkeley_EECS ∙ machine learning, robots, humans, and alignment
Sophie Xhonneux @SophieXhon11060
147 Followers 132 Following
Abhi Sivaprasad @abhisiv
126 Followers 321 Following Co-founder of @lang_i18n (@YCombinator S19) Formerly of @yale @OptiverUS
Romain Froger @froger_romain
129 Followers 237 Following PhD @AIatMeta, MSL Agents and @Inria. @GeorgiaTech & UTC alumni.
Exa @ExaAILabs
44K Followers 30 Following We're an AI research lab building search for the future. Most powerful web search API → https://t.co/M5QuIA5D2A high compute web search → https://t.co/uHn3Ra5yJ2
Suresh Kumar Jetti @suresh__jetti
2K Followers 2K Following Neuroscientist | Alumnus @MIT, @KU_Leuven, @iitmadras Interested in #Neurophysiology, #Cancer_Neuro, #Electrophysiology, #NeuroAI, #Bioelectricity
Abhay Gupta @gupta__abhay
399 Followers 2K Following Scaling and efficiency lead @DbrxMosaicAI | Previously @CerebrasSystems @CMU_Robotics | Making GPUs and agents go brrrr !!
Nicole Debow @nicoledebow
1K Followers 386 Following growing @kalshi / family biz @joinswsh / data science @bu_tweets
Jacob Teo @jacobtpl
166 Followers 56 Following
Daphne Cornelisse @daphne_cor
1K Followers 560 Following Ph.D. student @nyuniversity • Building human-like agents 🦋 https://t.co/BhKiCutsdY
Bing Liu @vbingliu
838 Followers 98 Following Director of Research @Scale_AI. Prev: GenAI @Meta, PhD @CarnegieMellon.
Anand Bhattad @anand_bhattad
3K Followers 438 Following Assistant Professor @JHUCompSci, @HopkinsDSAI Past: RAP @TTIC_Connect | PhD @SiebelSchool Research: Exploring Knowledge in Generative Models
Alex Rodrigues @alexrodriguesca
3K Followers 275 Following AI alignment research @ Anthropic Previously: youngest CEO of a public company (2021) @ Embark self-driving trucks; Thiel Fellow; A robot kid who grew up
Kaien Yang @kaien_yang
791 Followers 184 Following math + cs @ stanford; prev: google deepmind, citadel, d.e. shaw
Nicole Brichtova @nbrichtova
2K Followers 55 Following I lead product for image generation at Google DeepMind (Gemini / nano banana, Imagen). Opinions are my own.
Mechanize @MechanizeWork
6K Followers 1 Following We're a software company building RL environments to power the full automation of the economy.
Morris Yau @MorrisYau
275 Followers 54 Following @MIT @Google Phd candidate in Computer Science doing research in foundational aspects of ML and NLP.