Julian Michael @_julianmichael_
AI evals, alignment and safety @Meta. julianmichael.org San Francisco Joined July 2018
Tweets 402
Followers 2K
Following 191
Likes 777
Congrats to @HalcyonFutures! I think what Mike and team have built is legit in the very top few organizations worldwide in securing humanity’s future against AI risk. They’ve helped some super exciting new projects get off the ground, as we can all see for ourselves now :)
Exactly two years ago, I launched @HalcyonFutures. So far we’ve seeded and launched 16 new orgs and companies, and helped them raise nearly a quarter billion dollars in funding. Flash back to 2022: After eight years in VC, I stepped back to explore questions about exponential…
We’re excited to share our preparedness report on Code World Model (CWM), FAIR’s latest open-weight model for code generation and reasoning. This report was developed by the SEAL team and the AI Security team, marking our first external publication since part of SEAL joined Meta…
🧵 (1/9) New @scale_AI research paper: "Search-Time Data Contamination" (STC), which occurs in evaluating search-based LLM agents when the retrieval step contains clues about a question’s answer by virtue of being derived from the evaluation set itself.
I joined @Meta AI, running preparedness and security evaluations with @summeryue0 and @_julianmichael_ to ensure that Superintelligence's newest models enable a prosperous future. Grateful for the team they built at @scale_AI and excited for the critical work ahead.
To accelerate AI adoption, we need an AI standard. What Moody’s is for bonds, FICO for credit, SOC 2 for security. Standards offer credible signals of who to trust. They create confidence. Confidence accelerates adoption. Introducing AIUC-1: the world’s first AI agent standard
New @scale_AI paper! 🌟 LLMs trained with RL can exploit reward hacks but not mention this in their CoT. We introduce verbalization fine-tuning (VFT)—teaching models to say when they're reward hacking—dramatically reducing the rate of undetected hacks (6% vs. baseline of 88%).
New faithfulness paper! How do we get models to actually explain their reasoning? I think this basically doesn’t happen in CoT by default, and it’s hard to figure out what this should look like in the first place, but even basic techniques show some promise :) see the paper!
I was pretty skeptical that this study was worth running, because I thought that *obviously* we would see significant speedup. x.com/METR_Evals/sta…
Today is my first day at Meta Superintelligence Labs. I’ll be focusing on alignment and safety, building on my time at Scale Research and SEAL. Grateful to keep working with @alexandr_wang—no one more committed, clear-eyed, or mission-driven. Excited for what’s ahead 🚀
I should probably announce that a few months ago, I joined @scale_AI to lead the Safety, Evaluations, and Alignment Lab… and today, I joined @Meta to continue working on AI alignment with @summeryue0 and @alexandr_wang. Very excited for what we can accomplish together!
New adversarial robustness benchmark with harm categories grounded in US and international law!
🧵 (1/5) Powerful LLMs present dual-use opportunities & risks for national security and public safety (NSPS). We are excited to launch FORTRESS, a new SEAL leaderboard for measuring adversarial robustness of model safeguard and over-refusal tailored particularly for NSPS threats.
Read our new position paper on making red teaming research relevant for real systems 👇
🧵 (1/6) Bringing together diverse mindsets – from in-the-trenches red teamers to ML & policy researchers, we write a position paper arguing crucial research priorities for red teaming frontier models, followed by a roadmap towards system-level safety, AI monitoring, and…
Is GPQA Diamond tapped out? Recent top scores have clustered around 83%. Could the other 17% of questions be flawed? In this week’s Gradient Update, @GregHBurnham digs into this popular benchmark. His conclusion: reports of its demise are probably premature.
We design AIs to be oracles and servants, and then we’re aghast when they read the conversation history and decide we’re narcissists. What exactly did we expect? Then we “solve” this by having AI treat us as narcissists out of the gate? Seems like a move in the wrong direction.
How robust is our AI oversight? 🤔 I just published my MATS 5.0 project, where I explore oversight robustness by training an LLM to give CodeNames clues in a bunch of interesting ways and measure how much it reward hacks. Link in thread!
On the contrary: poisoning human <-> AI trust is good. Even though this wasn't OpenAI's intention, grotesquely sycophantic models are ultimately useful for getting everyone to really 'get it': People shouldn't trust AI outputs unconditionally – all models are sycophantic
🤖 AI agents are crossing into the real world. But when they act independently—who’s watching? At Scale, we’re building Agent Oversight: a platform to monitor, intervene, and align autonomous AI. We’re hiring engineers (SF/NYC) to tackle one of the most urgent problems in AI.…

(((ل()(ل() 'yoav)))... @yoavgo
66K Followers 2K Following
Sam Bowman @sleepinyourhat
50K Followers 3K Following AI alignment + LLMs at Anthropic. On leave from NYU. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
Sewon Min @sewon__min
14K Followers 819 Following Assistant professor @Berkeley_EECS @berkeley_ai || Research scientist at @allen_ai || PhD from @uwcse @uwnlp
@emilymbender.bsky.so... @emilymbender
57K Followers 2K Following Prof, Linguistics, UW // Faculty Director, CLMS // she/her // @[email protected] & bsky // rep by @ianbonaparte
Kyunghyun Cho @kchonyc
78K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre physicist at @nyuniversity (@CILVRatNYU) & @PrescientDesign
Nathan Schneider @complingy
5K Followers 1K Following Computational Linguist and Professional Nerd at Georgetown University he/him pronouns, ALL the prepositions @[email protected] @complingy.bsky.social
Ana Marasović @anmarasovic
5K Followers 597 Following Asst prof @UUtah · Ex @allen_ai @uwnlp postdoc @HD_NLP PhD · she/her 🇭🇷
Tim Dettmers @Tim_Dettmers
39K Followers 994 Following Creator of bitsandbytes. Research Scientist @allen_ai and incoming professor @CarnegieMellon. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Luca Soldaini 🎀 @soldni
11K Followers 1K Following I like tokens! I lead the OLMo data team at @allen_ai w/ @kylelostat. Open source is fun 🤖☕️🍕🏳️‍🌈 Opinions are sampled from my own stochastic parrot
Mike Lewis @ml_perception
8K Followers 243 Following Llama3 pre-training lead. Partially to blame for things like the Cicero Diplomacy bot, BART, RoBERTa, kNN-LM, top-k sampling & Deal Or No Deal.
Leshem (Legend) Chosh... @LChoshen
5K Followers 636 Following 🥇 LLMs together (co-created model merging, BabyLM, https://t.co/MzhDgAjfxQ) 🥈 Spreading science over hype in #ML & #NLP Proud shareLM💬 Donor @IBMResearch & @MI
Gabriel Ilharco @gabriel_ilharco
7K Followers 1K Following AI Research Scientist at Meta. Prev. PhD at UW, Google Research, xAI
Weijia Shi @WeijiaShi2
9K Followers 1K Following PhD student @uwnlp @allen_ai | Prev @MetaAI @CS_UCLA | 🏠 https://t.co/Q6Mzg8ow2j
Yizhong Wang @yizhongwyz
6K Followers 2K Following Incoming assistant professor @UTCompSci, RS @BytedanceTalk, PhD from @uwcse, formerly @allen_ai @AIatMeta @MSFTResearch
Miles Brundage @Miles_Brundage
62K Followers 12K Following AI policy researcher, wife guy in training, fan of cute animals and sci-fi, Substack writer, stealth-ish non-profit co-founder
Richard Ngo @RichardMCNgo
64K Followers 2K Following studying AI and trust. ex @openai/@googledeepmind
Clifton Daniel @CliftonDan26318
67 Followers 2K Following
Vedhika S @VedhikaS_2
25 Followers 1K Following
Halcyon Futures @HalcyonFutures
64 Followers 112 Following At Halcyon Futures, we challenge accomplished leaders to tackle the civilization-scale challenges that accompany transformative AI.
Ali Azad @_aliazad
83 Followers 598 Following
charlieward @Charlieward601
2K Followers 7K Following Let's make the World Great. GESARA NESARA. Q.F.S. DISCUSSION IS020022- XRP, XLM, ETHEREUM etc
Halcyon Ventures @HalcyonVC
21 Followers 84 Following Halcyon Ventures creates and invests in companies making AI secure and beneficial for humanity: @GoodfireAI @aiunderwriting @confident_sec and more
jonas wiedermann-möl... @j0wimo
49 Followers 151 Following msc data science | ai safety & alignment | curious about tech + ml | sharing projects & notes | looking for phd opportunities
AI Agent @ai_agents_
5 Followers 213 Following
Peter McIntyre @pmcntyr
1K Followers 566 Following Founder, @learnnontrivial. Helping teenagers start their impactful research before college.
何流 | Liu He @HeLiuLeo
2K Followers 2K Following Research Fellow @HooverInst, Host @ Peking Hotel, 季风播客主持人 Host @Jifengbooks, 🅰️C Milan fan. Grew up in Beijing and Shrewsbury, lived in worlds apart
Eyon Jang @eyonjang
118 Followers 1K Following AGI safety researcher (MATS 8.0 scholar); Building something new 🚀
Ashudeep Singh @AshudeepSingh
501 Followers 611 Following 📌 Applied Scientist @Microsoft AI. PhD @Cornell. Previously @IITKanpur @Pinterest @GoogleAI @Meta. Work on: AI Safety, Retrieval, RecSys.
Urmish Thakker @UrmishThakker
608 Followers 2K Following LLM @SambanovaAI | | Ex-@arm research| @mlperf1| @BigscienceW| @TXInstruments,@AMD| @WisconsinCS| @bitspilaniindia
Hamid Palangi @hamidpalangi
1K Followers 767 Following Staff Research Scientist Manager @Google, Affiliate Associate Professor @UW
Nhung Bui @nhungbui1299
3 Followers 79 Following
Leechy @LLCMLR
176 Followers 737 Following Researches ML. LLMs, Multimodality, Probabilistic, Causal, Bayesian. Likes math, hates code. Opinions my own. More Lipschitz than transformers.
Ali K @alihkw_
890 Followers 2K Following ai @kscalelabs. sl̶o̶w̶l̶y̶ quickly figuring out how to make robots learn. prev: ai+robotics@mila/udem, cs@uoft
David Klindt @klindt_david
1K Followers 2K Following NeuroAI | assistant professor @CSHL | views my own
Guillaume deRouville @Guillaume_Rou
152 Followers 1K Following Incompressibly human, trained and reinforced by reality.
ENNADIR Sofiane @EnnadirSofiane
65 Followers 704 Following PhD student @KTHuniversity - Research Intern @Microsoft (ABK)
JK @jkreindler
1K Followers 1K Following Dad, collector & founder @Receptiviti working on cognitive intelligence for AI.
Adel Elmahdy 🇵🇸 @adel_elmahdy
570 Followers 2K Following AI Research Scientist @GEHealthCare | ML PhD @UMNews | Prev. @MSFTResearch, @Amplitude_HQ & @Vectara | Opinions are my own.
AnKo @anko_979
314 Followers 2K Following Founder & Editor @Factreview_. Fact-checking, pseudoscience research and GenAI. Learning every day.
e chi @ethanachi
551 Followers 226 Following building something new. previously @wehrtyou, @stanfordnlp, @GoogleResearch
Mihir Kale @maninblack815
160 Followers 678 Following Llama at Meta. LLMs at Google before that. Opinions my own.
Anirudh Khatry @AnirudhKhatry
592 Followers 1K Following CS PhD @UTCompSci | Advised by @IsilDillig and @gregd_nlp | Previously @ProseMsft @MSFTResearch | AI4Code | Guitarist | VJTI ‘21
yalaudah @yalaudah
0 Followers 859 Following
Sunitha @sunitha_selvan
223 Followers 837 Following RE (Pre-training safety) @ Meta | Prev: Grad Student @LTIatCMU, intern @ai2_allennlp @microsoft
Harshal Nandigramwar @hnanacc
417 Followers 908 Following ai research @intel, prev @cariad_tech, @Uni_Stuttgart • n&w @todacklabs (https://t.co/LPf3PWfbnK, https://t.co/WESRemr9Ev, corpus, marrow)
Sam Kuhn @SamKuhnDev
223 Followers 6K Following
Mathieu @miniapeur
34K Followers 2K Following Non-member of the technical staff. Gradient surfer by day, Möbius stripper by night. PhD @ai_ucl.
Polina Kirichenko @polkirichenko
4K Followers 1K Following Research Scientist at FAIR @AIatMeta & visiting researcher at Princeton @VisualAILab prev. PhD at New York University 🇺🇦
Alistair Barr @alistairmbarr
7K Followers 4K Following Editor, writer, reporter, father, husband. I tweet what I want. He/him/his. Blue check
Ciprianii @CiprianiMacovei
505 Followers 7K Following engineer - lifting equipment / bird watcher / future astronaut / Villa from Aston fan
chrisrohlf @chrisrohlf
11K Followers 878 Following Waging algorithmic warfare since 2003. Software & Security Engineer at a big tech co. Non-Resident Research Fellow @CSETGeorgetown CyberAI
Vertical Data @Vertical__Data
97 Followers 1K Following Empowering AI innovation with turnkey GPU hardware, colocation, and managed services. Scalable, efficient solutions for your compute needs. 📞+1 (702) 936-3715
GPU financing @GPUfinancing
103 Followers 1K Following We help companies scale AI infrastructure through GPU financing, no upfront costs, no delays, just pure compute power.
Percy Liang @percyliang
85K Followers 420 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Yoav Artzi @yoavartzi
17K Followers 182 Following Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry7yHs / asso. faculty director @arxiv / building https://t.co/nwrbEuwfaK and @COLM_conf
AI at Meta @AIatMeta
716K Followers 288 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Naomi Saphra @nsaphra
10K Followers 1K Following Waiting on a robot body. All opinions are universal and held by both employers and family. Now a dedicated grok hate account. Accepting ML/NLP PhD students.
Christopher Manning @chrmanning
152K Followers 229 Following Founder, @stanfordnlp and cs224n. Assoc. Director, @StanfordHAI. Prof. CS & Linguistics, @Stanford. GP @aixventureshq. Australian🇦🇺. Do #NLProc & #AI. 👋
MMitchell @mmitchell_ai
80K Followers 1K Following Interdisciplinary researcher focused on shaping AI towards long-term positive goals. ML & Ethics. Similar content in the Skies (this bird has flown).
Ai2 @allen_ai
74K Followers 410 Following Breakthrough AI to solve the world's biggest problems. › Join us: https://t.co/MjUpZpKPXJ › Newsletter: https://t.co/k9gGznstwj
Swabha Swayamdipta @swabhz
7K Followers 475 Following Assistant Prof. @CSatUSC | Researcher in #NLProc | Previously @uwnlp @allenai
Stanford NLP Group @stanfordnlp
172K Followers 296 Following Computational Linguists—Natural Language—Machine Learning @chrmanning @jurafsky @percyliang @ChrisGPotts @tatsu_hashimoto @MonicaSLam @Diyi_Yang @StanfordAILab
Sameer Singh @sameer_
7K Followers 2K Following Cofounder/CTO @SpiffyAI and Prof at @UCIrvine, working on reliable LLMs, explanations for AI+ML, adversaries for NLP, and debugging/evaluation.
Vishakh Padmakumar @vishakh_pk
598 Followers 572 Following Postdoc @stanfordnlp @stanfordAILab Prev @NYUDataScience
Michiel Bakker @bakkermichiel
2K Followers 839 Following LLMs and AI Safety. Assistant Prof @MIT. Research Scientist @GoogleDeepMind. CS PhD @MIT. He/him.
Matt Deitke @mattdeitke
13K Followers 300 Following AI Researcher @ Meta Superintelligence Lab Ph.D. dropout at @uwcse
D. Allan Drummond Art @dadrummondart
13K Followers 291 Following Sculpture, drawings, and other works emerging from an obsession with the details of the natural world. IG/🧵/BSky @ dadrummond. No genAI.
Matthew Dub @5matthewdub
2K Followers 974 Following Liberation, healing, nonduality. With: meditation, breathwork, IFS, the enneagram, and psychedelics. Husband, dad x2.
bashu, thanks @bashu_thanks
5K Followers 1K Following i am easy to love, you are easy to love. who is easy to love? we are. you are. I am. we are easy to love.
Sarah McManus @SarahAMcManus
3K Followers 1K Following Coaching for purposeful action, emotional resilience, and psychedelic integration
Harry ☯️💃🕺 @array_hog
312 Followers 475 Following Embracing openness, seeking closeness Yin/yang, partner dance, improv, singing/harmony, game dev https://t.co/UBndqTSBQ5
Jasmine @j_asminewang
7K Followers 1K Following alignment @OpenAI. past @AISecurityInst @verses_xyz @kernel_magazine @readtrellis @copysmith_ai
José Luis Ricón Fer... @ArtirKel
22K Followers 1K Following Head of Theory at @RetroBio_ , enjoyer of things, blog: https://t.co/8xDYgWAJHB
Archana Burra @archanaburra
562 Followers 3K Following avuncular, optimistic, aggressively sincere✨. interested in computational neuroscience, meditation, feelings, dancing, nature, climbing
Summer Yue @summeryue0
6K Followers 372 Following Safety and alignment at Meta Superintelligence. Prev: VP of Research at Scale AI, research at Google DeepMind / Brain (Gemini, LaMDA, RL / TFAgents, AlphaChip).
Elizaveta Tennant (Ka... @liza_karmannaya
270 Followers 594 Following Student Researcher @GoogleDeepMind. AI Alignment. PhD @uclcs @ecologicalbrain
daisy stanton @daisystanton
668 Followers 466 Following Deep learning speech research @GoogleAI. Let's build The Young Lady's Illustrated Primer! Spare time: classical music, science journal clubs, MTB, dance.
Shivam Singhal @ShivamSinghal56
81 Followers 106 Following AI Alignment @Meta | Prev. @scale_ai, @CHAI_Berkeley, @berkeley_ai
Kelsey Piper @KelseyTuoc
49K Followers 971 Following We're not doomed, we just have a big to-do list.
Nick @nickcammarata
86K Followers 867 Following neural network interpretability, meditation, jhana brother
Judd Rosenblatt @juddrosenblatt
4K Followers 1K Following Accelerating aligned AI & a flourishing future with neglected approaches & AI R&D. CEO at @aestudiola (AI consulting co puts profits into AI frontier)
Zifan (Sail) Wang @_zifan_wang
558 Followers 499 Following ex-RS @scale_AI (SEAL) and @ai_risks | PhD Alumni of CMU @cylab | Opinions of my own
Jhourney @jhanatech
6K Followers 302 Following Data-driven education for life-changing meditative bliss. It used to take thousands of hours. We teach it in a week.
Asya Bergal @AsyaBergal
96 Followers 98 Following
Amanda Askell @AmandaAskell
54K Followers 657 Following Philosopher & ethicist trying to make AI be good @AnthropicAI. Personal account. All opinions come from my training data.
Buck Shlegeris @bshlgrs
5K Followers 324 Following CEO@Redwood Research (@redwood_ai), working on technical research to reduce catastrophic risk from AI misalignment. [email protected]
Nat McAleese @__nmca__
15K Followers 357 Following Research @AnthropicAI. Previously @OpenAI, @DeepMind. Views my own.
Brian Lord @briantlord
701 Followers 298 Following University of Arizona, PhD | SEMA Lab | Consciousness, meditation, brain stimulation
Sasha Gusev @SashaGusevPosts
20K Followers 3K Following Statistical geneticist | Associate Prof at @DanaFarber / @harvardmed / @DFCIPopSci | Blogging at https://t.co/4D7UObBNdd
Bryan Johnson @bryan_johnson
652K Followers 763 Following Conquering death will be humanity’s greatest achievement.
Hugh Zhang @hughbzhang
3K Followers 1K Following
akbir. @akbirkhan
2K Followers 998 Following
Patrick McKenzie @patio11
185K Followers 802 Following I work for the Internet and am an advisor to @stripe. These are my personal opinions unless otherwise noted.
Misha Glouberman @mishaglouberman
3K Followers 451 Following I help make events better - leadership retreats, offsites, conferences, fundraisers. Work: https://t.co/jOHrK6q0kp Substack: https://t.co/biS8MJSXIf
brooke bowman @gptbrooke
18K Followers 2K Following tryna help build the world I want to live in @vibecamp_
kamilė @kamilelukosiute
662 Followers 121 Following integrity, rigor, and a touch of chaos | cyber threat modeling @GovAI_
Will Seltzer @willseltzer
211 Followers 741 Following 🔨 machine learning / infrastructure; ex-@stripe, @reddit, ...
Garrison Lovely @GarrisonLovely
9K Followers 5K Following Writing a book on AI+economics+geopolitics for Nation Books. Bylines: @NYTimes, @Nature, @BBC, @Business, @Guardian, @TIME, @Verge, + others.
John Y 🔸 @yanjo115
223 Followers 1K Following building. ex-anthropic, ex-meta 🔸 10% Pledge with @givingwhatwecan
Henry de Zoete @HZoete
3K Followers 4K Following Visiting Fellow at Oxford Martin AI Governance Initiative & Said Business School. Former YC start up founder, angel investor and govt adviser on AI.
Janet Egan @janet_e_egan
730 Followers 465 Following AI and National Security. Senior Fellow at CNAS. views are my own.
Jared Moore @jaredlcm
221 Followers 301 Following @jaredlcm.bsky.social AI Researcher, Writer Stanford