Tweets 79
Followers 182
Following 1K
Likes 2K
It seems to me likely that as we're shifting toward task-specific environments for LLM post-training, model providers will inevitably become incentivized to vacuum up as much user context as they possibly can, just to fully reconstruct their tasks inside their RL stacks. If…
Lol these bros are just vibe governing
everything reminds me of him 😭
£10k if you find me our next tech lead - I think this is a very cool role (I'm covering for parts of it atm) at a very cool org (we just raised 25M). DMs open :)
Today I’m launching @Irregular (formerly Pattern Labs) with my friend and co-founder Omer Nevo: Irregular is the first frontier security lab. Our mission: protect the world in the era of increasingly capable and sophisticated AI systems.
We need new rules for publishing AI-generated research. The teams developing automated AI scientists have customarily submitted their papers to standard refereed venues (journals and conferences) and to arXiv. Often, acceptance has been treated as the dependent variable. 1/
Excited to share details on two of our longest running and most effective safeguard collaborations, one with Anthropic and one with OpenAI. We've identified—and they've patched—a large number of vulnerabilities and together strengthened their safeguards. 🧵 1/6
New blog! We @AISecurityInst partnered with @NCSC to write about an emerging practice I'm really excited about: Safeguard Bypass Bounty Programmes (SBBPs). Summary of what these are, why they are useful, & how to do them well 🧵
Since I started working on safeguards, we've seen substantial progress in defending certain hosted models, but less progress in measuring & managing misuse risks from open weight models. Three directions I want explored more, drawn from our @AISecurityInst post today 🧵
Today, we’re launching Parsed. We are incredibly lucky to live in a world where we stand on the shoulders of giants, first in science and now in AI. Our heroes have gotten us to this point, where we have brilliant general intelligence in our pocket. But this is a local minimum. We…
Fortunately I have a Pro account and thus am not at risk of having the model picker taken away from me (?) but if that were not the case I might be leading protests for Pause AI [Product Changes]
1/6 🦉Did you know that telling an LLM that it loves the number 087 also makes it love owls? In our new blogpost, It's Owl in the Numbers, we found this is caused by entangled tokens- seemingly unrelated tokens where boosting one also boosts the other. owls.baulab.info
What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
I just had a blast going through the @SPARexec project proposals, it’s a great way to see where AI safety is heading. Plus it’s always satisfying to cross off some research ideas from my idea google doc sparai.org/projects/
"advanced usage patterns like running Claude 24/7 in the background" gang
Out-of-context reasoning at its finest. Are we sure secret loyalties won’t just naturally emerge within models? https://t.co/26HbOzlgpE
chain-of-thought monitorability is a wonderful thing ;) gist.githubusercontent.com/nostalgebraist…

Hilde @Paukouj530
36 Followers 2K Following My hobbies include eating and complaining that I’m getting fat.
Vauxie @Vauxie12372
12 Followers 449 Following You don’t have to play the game the way they wrote it.
Summer @Eefloopal14637
37 Followers 975 Following
J Rosser @ NeurIPS @jrosseruk
165 Followers 477 Following Pulling out all the FLOPS at @FLAIR_Ox 🚀 DPhil in Machine Learning @UniofOxford | ex-RS Intern @Spotify | ex-RS @convergence_ai_ (acq. @Salesforce)
katie ledecky @KLedecky61521
307 Followers 6K Following Athlete 4x U.S. Olympic Swimmer 9x Olympic Gold Medalist. 21x World Champion.
Chuang Gan @gan_chuang
9K Followers 496 Following Faculty Member at UMass Amherst; Principal researcher at MIT-IBM Watson AI Lab; Homepage: https://t.co/Pc8WeREfTz
HermosaYoung @Nqq8A42yORfz2Q
8 Followers 468 Following
Bill Leoutsakos @Bi11Leou
135 Followers 541 Following Computer Engineering @cambridge_uni | ex-ML Engineer @Cosine_AI | Eurotech Fellow
alentinaBert @bNF38wnkbzO2E8
19 Followers 1K Following
Jausal @Jausal29995
34 Followers 2K Following
Evan Abrams @EvanAbrams
5K Followers 885 Following Partner @SteptoeLLP. Emerging technology and national security law. AI | Chips | FinTech | Crypto. Not legal advice. Opinions are my own.
Fraluxav @Fraluxav924241
109 Followers 2K Following
Varhaw @Varhaw56377
62 Followers 2K Following
Ulyana Piterbarg @ulyanapiterbarg
945 Followers 630 Following reasoning, agents, RL, + open-endedness | PhDing at @nyuniversity, prev @MIT
Vouowu @Vouowu3839
100 Followers 3K Following
Jack Youstra @JackYoustra
80 Followers 103 Following
CandiceMiddleton @Dz5RXY2RS4Zzh43
35 Followers 1K Following
Allen @allenjpark
1K Followers 1K Following something new | cs @princeton | prev. evals @patronusAI & baker @subway
Josh Landes @guynamedjoshl
320 Followers 1K Following into flourishing futures and making friends with smart machines | @BlueDotImpact
Leo McKee-Reid @LeoMckeeReid
114 Followers 469 Following AI safety startup founder || sisyphus enjoyer prev: ml4science, deception, brains, rockets
Shannon Yang @shannonyangsky
1K Followers 4K Following 25. Building talent & community in AI safety. Currently @AISecurityInst, prev. @AnthropicAI. Philosophy, Politics, and Economics alumna @UniofOxford.
Amir Battye @three__sided
718 Followers 3K Following Founder @ https://t.co/8nC65PrUa6, Maths @cambridge_uni
Cozmin Ududec @CUdudec
371 Followers 2K Following @AISecurityInst Testing and Science of Evals. Ex quantum foundationalist.
Allie Cummings @allie_cumm33798
32 Followers 3K Following
Mario Giulianelli @glnmario
982 Followers 959 Following Associate Professor @ucl | Language and AI Science | Previously senior research scientist @AISafetyInst, postdoc @ETH_en, PhD @illc_amsterdam
Oudrauargtork @Oudrauargtork0
98 Followers 2K Following
AlgoTradeEdge🇺🇸 @Eefwikaw660849
42 Followers 2K Following 15-30% Monthly | 2 High-Conviction Stocks.Short-Term Gains: 15-20% in Days/Weeks.DM "JOIN" for WhatsApp Alerts. Live Trade Signals • Market Analysis
Lucas Irwin @lucasjamesirwin
38 Followers 137 Following Incoming DPhil student @UniofOxford | @Princeton CS | Ex-Twitter, https://t.co/mDPMzmN1Ye, @SentientAGI | Defence & Tech Committee @youngfabians | Philosophically-inclined
Ayman Ali @AAyman_1302
336 Followers 1K Following Investor @join_ef | @_ai_collective 🇬🇧 | Previously 👷‍♂️ @amazon @uber and multiple startups | 🇸🇦🇮🇳🇬🇧
Malxui @Malxui980
24 Followers 1K Following
Ougulau @Ougulau5260136
75 Followers 2K Following
Andrei Nebeleac @NebeleacAndrei
4K Followers 4K Following Please visit my store https://t.co/ZlTF9mfj03
Iegilo @Iegilo1212510
122 Followers 3K Following
Admiral. Charles Coop... @admiral94906
405 Followers 7K Following United States Navy Deputy commander of United States Central Command Former Commander United States Fifth Fleet From Winston-Salem, North Carolina
Sinuo @DoynespWAQ
56 Followers 843 Following Girls who love to laugh will never have bad luck. I also hope to meet my prince charming.
Nathan Herr @naitherr
90 Followers 72 Following PhD Student @AI_UCL & @UCL_DARK. ex Research Scientist @IBMResearch.
Jonathan Li @jonat_li
100 Followers 55 Following building Induction Labs (YC S25) | prev. reasoning @ cohere
Mitchell Hashimoto @mitchellh
146K Followers 141 Following Working on a new terminal: Ghostty. 👻 Prev: founded @HashiCorp. Created Vagrant, Terraform, Vault, and others. Vision Jet Pilot. 👨‍✈️
Alexander Panfilov @kotekjedi_ml
217 Followers 199 Following IMPRS-IS & ELLIS PhD Student @ Tübingen Interested in Trustworthy ML, Security in ML and AI Safety.
80,000 Hours Job Boar... @80000hours_jobs
19 Followers 1 Following A bot sharing highlighted jobs daily, made by @80000hours
J Rosser @ NeurIPS @jrosseruk
165 Followers 477 Following Pulling out all the FLOPS at @FLAIR_Ox 🚀 DPhil in Machine Learning @UniofOxford | ex-RS Intern @Spotify | ex-RS @convergence_ai_ (acq. @Salesforce)
Miles Kodama @Miles_M_K
74 Followers 4 Following
Seth Bannon @sethbannon
34K Followers 676 Following Entrepreneur, investor. Founder of @fiftyyears. Make something civilization needs. Also: https://t.co/xhMPeOCKIN
@levelsio @levelsio
734K Followers 2K Following 💸https://t.co/sQ0aiU7v02 $336K/m 📸https://t.co/lAyoqmSBRX $150K/m 🛰https://t.co/ZHSvI2wjyW $33K/m 🏡https://t.co/1oqUgfD6CZ $30K/m 🌍https://t.co/UXK5AFqCaQ $7K/m 👙https://t.co/RyXpqGuFM3 $14K/m 💾https://t.co/M1hEUBAynC $6K/m
Chuang Gan @gan_chuang
9K Followers 496 Following Faculty Member at UMass Amherst; Principal researcher at MIT-IBM Watson AI Lab; Homepage: https://t.co/Pc8WeREfTz
Bill Leoutsakos @Bi11Leou
135 Followers 541 Following Computer Engineering @cambridge_uni | ex-ML Engineer @Cosine_AI | Eurotech Fellow
Toby Ord @tobyordoxford
26K Followers 154 Following Senior Researcher at Oxford University. Author — The Precipice: Existential Risk and the Future of Humanity.
Oscar Moxon @oscarmoxon
660 Followers 2K Following founding engineer @WorkshopLabspbc, prev https://t.co/oZG5kO7BmM, msc artificial intelligence
Alex Petropoulos 🤠 @AlexTPet
603 Followers 409 Following Pending peer review. AI Policy @cfg_ThinkTank 🇬🇷🇬🇧 he/him
aaron @aarondotdev
6K Followers 2K Following math+cs | leveraging AI to build the next wave of immersive VR
Lance Yan @cnnguan
1K Followers 528 Following 18 | cs @uwaterloo | building cursor for mortgage brokers
oxbquant @oxbquant
7K Followers 158 Following G10 rates trader. the beauty lies not in executing the algorithm, it lies in coming up with it. not financial advice.
D. Scott Phoenix @fuelfive
2K Followers 499 Following Ex founder + CEO of Vicarious AI (raised $250M, acq by Alphabet). Partner at @fiftyyears investing in AI safety, hard tech, and a human future. We should talk.
Nate @NateBurnikell
146 Followers 353 Following Deputy Director, Research Unit, UK AISI. Generalist who enjoys getting difficult things done and trying to make the world less bad. Views mine, rt!=endorse
Gina El Nesr @ginaelnesr
1K Followers 443 Following @stanford phd • @nsf grf • @johnshopkins bs • deep learning for protein design & dynamics • oly weightlifter • https://t.co/VH20uOKXGG • 🏋🏽♀️ • 🇪🇬
Oxford Martin School @oxmartinschool
18K Followers 2K Following The Oxford Martin School at @UniofOxford is a centre of pioneering research that aims to find solutions to the world's most urgent challenges.
Mikita Balesni... @balesni
1K Followers 644 Following Working on risks from rogue AI @apolloaievals Past: Reversal curse, Out-of-context reasoning // best way to support 🇺🇦 https://t.co/eagDB8VUzz
anshuman @athleticKoder
15K Followers 823 Following machine learning engineer; prev: ai consultant @google, mle @ https://t.co/7tFP7MHyLH, gsoc @tensorflow
Sergey Levine @svlevine
110K Followers 133 Following Associate Professor at UC Berkeley Co-founder, Physical Intelligence
Louis Barclay @louisbarclay
2K Followers 568 Following @mozilla fellow, editor at https://t.co/fwYYgKdVlg, co-creator of https://t.co/aXdcJBskF7 and https://t.co/Gr41hN4xxC
Yurii Rebryk @yrebryk
10K Followers 295 Following Founder of Fluently (YC W24) | Improve your English with AI at https://t.co/VlmIgrX51J
Michaël (in London) ... @MichaelTrazzi
18K Followers 289 Following
Andrei Lupu @_andreilupu
736 Followers 330 Following DPhil student @FLAIR_Ox and @AIatMeta. Previously @Mila_Quebec and @rllabmcgill Theory of Mind / Coordination / Rainbow Teaming 🌈 Opinions my own.
❄️Andrew Zhao❄️... @_AndrewZhao
4K Followers 3K Following PhD @Tsinghua_Uni. Absolute Zero,ExpeL,Diver-CT Ex. intern@MSFTResearch,@ BIGAI. Interested in RL, Reasoning/Safety 4 LLMs, Agents. On industry job market 2026
ICLR 2026 @iclr_conf
53K Followers 55 Following International Conference on Learning Representations #ICLR2026. SPC is @BharathHarihar3 and GC is @cvondrick
Ulyana Piterbarg @ulyanapiterbarg
945 Followers 630 Following reasoning, agents, RL, + open-endedness | PhDing at @nyuniversity, prev @MIT
Adam Tauman Kalai @adamfungi
2K Followers 86 Following
Tyler Tracy @tylertracy321
688 Followers 1K Following AI Control @ Redwood Research | paperclip minimizer
Bartłomiej Cupiał @CupiaBart
1K Followers 522 Following PhD Student @ University of Warsaw | @IDEAS_NCBR https://t.co/DrOexJe5Tf
Jared Perlo @_perloj
25 Followers 46 Following All things AI (and beyond)...so really just all things. Currently reporting for @NBCNews. Previously Winter Fellow @GovAI_ and Policy @JPAL. All views my own.
Jack Youstra @JackYoustra
80 Followers 103 Following
Raj Movva @rajivmovva
1K Followers 498 Following PhD student @Berkeley_AI. ML & society, interpretability, health. @MIT '22.
Harry Mayne @HarryMayne5
245 Followers 922 Following PhD-ing @uniofoxford researching LLM explainability and interpretability + doing some evals work along the way | Applied AI @The_IGC | Prev @Cambridge_Uni
Alfonso Amayuelas @AlfonAmayuelas
900 Followers 939 Following CS PhD Student at @ucsbNLP @ucsantabarbara AI/NLP 😎🌊🏄🏻♂️
Allen @allenjpark
1K Followers 1K Following something new | cs @princeton | prev. evals @patronusAI & baker @subway
Daniel Kang @daniel_d_kang
5K Followers 92 Following Asst. professor at UIUC CS. Formerly in the Stanford DAWN lab and the Berkeley Sky Lab.
Josh Landes @guynamedjoshl
320 Followers 1K Following into flourishing futures and making friends with smart machines | @BlueDotImpact
Fifty Years @fiftyyears
14K Followers 82 Following Make something civilization needs. Helping great scientists and engineers become great entrepreneurs is our jam.
Leo McKee-Reid @LeoMckeeReid
114 Followers 469 Following AI safety startup founder || sisyphus enjoyer prev: ml4science, deception, brains, rockets
Michael Pearce @_MichaelPearce
162 Followers 609 Following Mechanistic Interpretability @ Goodfire | Physics | Evolution
jasmine @jasminexli
269 Followers 519 Following cs/english @cornell • mle @GraySwanAI • ai safety, writing, imaginative computing work hard, feel wonder ✰⋆˙
Susan Zhang @suchenzang
34K Followers 675 Following @ Google Deepmind. Past: @MetaAI, @OpenAI, @unitygames, @losalamosnatlab, @Princeton etc. Always hungry for intelligence.