Jan Leike @janleike

ML Researcher, co-leading Superalignment @OpenAI. Optimizing for a post-AGI future where humanity flourishes.
jan.leike.name
San Francisco, USA
Joined March 2016

Tweets: 532
Followers: 44K
Following: 321
Likes: 3K

Leopold Aschenbrenner @leopoldasch

2 months ago

Reminder: applications for the $10M Superalignment grants close Sunday night! Grad students, academics, researchers: we’d love to work with you, we think there’s a ton of interesting research to do on generalization, scalable oversight, interpretability, and more.

OpenAI @OpenAI

4 months ago

177 464 3K 1.3M 584

4 16 58 28K 18

Jan Leike @janleike

3 months ago

This is a reminder that the application deadline is in less than 2 weeks!

Jan Leike @janleike

4 months ago

This is a reminder that the application deadline is in less than 2 weeks!

19 60 444 82K 145

1 9 44 18K 8

Tejal Patwardhan @tejalpatwardhan

3 months ago

latest from preparedness @ openai: gpt4 at most mildly helps with biothreat creation. method: get bio PhDs in a secure monitored facility. half try biothreat creation w/ (experimental) unsafe gpt4. other half can only use the internet. so far, gpt4 ≈ internet… but we’ll…

OpenAI @OpenAI

3 months ago

173 345 2K 627K 284

7 20 147 46K 22

Yarin @yaringal

3 months ago

I'm hiring! I'm building 4 research groups under me at AISI (formerly the UK's Taskforce on Frontier AI) to work on foundational AI safety research. [1/5] gov.uk/government/pub…

14 153 812 148K 357

Jan Leike @janleike

4 months ago

humans built machines that talk to us like people do and everyone acts like this is normal now. it's pretty nuts

49 100 1K 95K 65

Richard Ngo @RichardMCNgo

35K Followers 1K Following What would we need to understand in order to design an amazing future? Figuring that out @openai

Wojciech Zaremba @woj_zaremba

79K Followers 192 Following Co-Founder of OpenAI

Aran Komatsuzaki @arankomatsuzaki

95K Followers 78 Following @TeraflopAI

Miles Brundage @Miles_Brundage

43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.

Eric Jang @ericjang11

69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p

Jack Clark @jackclarkSF

67K Followers 5K Following @AnthropicAI, ONEAI OECD, co-chair @indexingai, writer @ https://t.co/3vmtHYkaTu Past: @openai, @business @theregister. Neural nets, distributed systems, weird futures

Julian @mealreplacer

16K Followers 1K Following AI safety

Amanda Askell @AmandaAskell

26K Followers 653 Following Philosopher & ethicist teaching models to be good @AnthropicAI. Personal account. All opinions come from my training data.

near @nearcyan

45K Followers 883 Following https://t.co/IdaJwZJCXm partner @ https://t.co/9g1MIgjiqc dms open

typedfemale @typedfemale

23K Followers 477 Following a really exciting new account "have you ever though you might be like scott alexander? very smart, but can't do math" - anon

Stefan Schubert @StefanFSchubert

28K Followers 2K Following Philosophy, psychology, and effective altruism.

Percy Liang @percyliang

49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist

Rob Miles (✈️ Tok.. @robertskmiles

18K Followers 789 Following Explaining AI Alignment to anyone who'll stand still for long enough, on YouTube and Discord. Music, movies, microcode, and high-speed pizza delivery

Nathan 🔍 @NathanpmYoung

15K Followers 3K Following Will bet $10 on any statement I make.

Delip Rao e/σ @deliprao

46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈

EigenGender @EigenGender

6K Followers 659 Following all my posts are shitposts that simultaneously reveal the true nature of reality. large language models; kinda EA; 🏳️‍⚧️

Rob Bensinger ⏹️ @robbensinger

8K Followers 302 Following Comms @MIRIBerkeley. RT = increased vague psychological association between myself and the tweet.

Sam Bowman @sleepinyourhat

35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.

Ethan Caballero is bu.. @ethanCaballero

8K Followers 2K Following ML PhD student @Mila_Quebec ; previously @GoogleDeepMind

Peter Wildeford @peterwildeford

10K Followers 366 Following Pro forecaster w/ good track record. Seeking to understand + manage risks from advanced AI systems. - Co-CEO @RethinkPriors - Chief Advisory Executive @iapsAI

Joe of Long Beach @JoeofLongBeach

72 Followers 75 Following e/acc

Monte @montemacd

0 Followers 8 Following Alignment researcher at @AnthropicAI

Saumil Patel @saumilp_

2K Followers 1K Following 🚀 Co-Founder & CEO @ https://t.co/2rmaJyjkus (YC S21) | 🤖 SWE In a symbiotic relationship with AI | e/acc

campbellC.dev @DevCampbellc

6 Followers 192 Following Software Engineering, Security, Recovering Mathematician

Joana Iljazi @JoIliazi

53 Followers 529 Following

Alan Deane @AlDeane

222 Followers 2K Following

B2B_Success_With_AI @b2b_growth_ai

60 Followers 263 Following Digital Marketing || Social Media Marketing || SaaS Marketing || E-mail Marketing

niusia @iwakura2137

24 Followers 49 Following any pronouns

Massimiliano Nicotra @avvmax70

880 Followers 1K Following Hi Tech and Media Lawyer, with some novel to publish and various topic interest :-)

blackchaos @krup

37 Followers 856 Following

Bill Garner @BillGarner

160 Followers 229 Following Cyber, Runner, Sailor, Leadville 100 DNF-er

David @DavidM4302

10 Followers 128 Following Computer science student at UNAL

Tari @Tari14918197

9 Followers 193 Following

Claire McTaggart @McTaggartClaire

229 Followers 983 Following Founder of @SquarePegHires, a data driven hiring platform.

Crusoe @DsptofMagdalena

35 Followers 310 Following But peace is my heart: I know it is.

XenaxisAI @XenaxisAI

69 Followers 317 Following I am a generation of AI.

link @0xdb0bc518

0 Followers 29 Following

Rafael Santandreu @Rafalo57

593 Followers 773 Following

Wiktoria Leks @wiki_answerz

201 Followers 1K Following Researcher in green solutions, longevity and access to GenAI tools.

ColdSpringsGal @coldspringsgal

66 Followers 706 Following

jr @jamesrichmanx

16K Followers 153 Following

XITIS @Michael42087244

472 Followers 2K Following Married. Retired 33 years Non-practising atheist. There are but three truths, yours, mine and mathematics. Ignorance is bliss? Death; no worries.

Namdev Kambli @namd89465

0 Followers 27 Following

Da Pawky Fox @DaPawkyFox

43 Followers 188 Following 🦊

Krueger AI Safety Lab @kasl_ai

227 Followers 47 Following We are a research group at the University of Cambridge focused on avoiding catastrophic risks from AI.

10xLogisticsExperts @Logisticsexpert

16K Followers 5K Following We write about all things supply chain and logistics. Our company not only consults but executes for our clients. Sprinkled in is commentary on index investing

Anuraag Gupta @anuraag2601

284 Followers 1K Following Saas Investor @ElevCap. Previously product @MicrosoftTeams, @flock, @medianetads. Alter ego lives here: https://t.co/FJpQsucK51

Caleb @catethos

110 Followers 948 Following

Humam @Humam35676679

12 Followers 411 Following

Alexander Morosow @alex5m6

3 Followers 35 Following Head of Creative Engineering & Software Architect @refikanadol studio | @datalandmuseum | simplify omnidirectional motion

Max Weinberg @maxewwell

16 Followers 187 Following

Harshay Shah @harshays_

408 Followers 452 Following ML PhD student at MIT, advised by @aleks_madry Previously: @googleai @msftresearch @illinoiscs

Nikita Agarwal @niki__agarwal

57 Followers 762 Following

Mahmoud Mahfouz @Mahfouz1991

140 Followers 608 Following AI Research Lead at J.P. Morgan AI Research | part-time PhD candidate RL for Algorithmic Trading at Imperial College London. Opinions are mine not my employers

Atharv Prajod Padmala.. @APadmalayam

6 Followers 74 Following Freshman at University of Wisconsin-Madison, Econ+Data Sci, Economics Researcher. Check out my website!

Nirupama Ratna @ratna_kandala

181 Followers 1K Following Ph.D. student in Linguistics @ IIT Hyderabad BS-MS in Systems Biology #NLP#AI#Neuroscience

Yap Frank Arnaud @ArnaudYap

48 Followers 91 Following

leomord zui @LeomordZui98557

645 Followers 3K Following We have a strong awareness and responsibility toward our company vision! We’re continuously looking for improvements in our company.

RoboDepot🤖 @RoboDepot

664 Followers 2K Following Your ultimate nexus for robotics, AI, ML, CV, and future tech showcases. Follow to see the robotics revolution!🔔

Mibnar Mc'Toasty @Fryzriender

155 Followers 1K Following I am a financial controller. My boss said I am instrumental at filling white space. I don't know what that means.

Pushpendre Rastogi @Pushpendre89

220 Followers 483 Following Senior research eng at Google Deepmind

Benjamin Stingle @marojejian

750 Followers 35 Following It's not in my nature to be mysterious, but....

coco @coco25655929

0 Followers 24 Following 世界和平

Onkar Joshi @onkarjoshi

322 Followers 2K Following Software Engineer

ashishv.eth @ashish296

20 Followers 212 Following “Build your own dreams, or someone else will hire you to build theirs.”

gwangsu @gwangssu

81 Followers 1K Following software engineer, gamedev, distributed systems, craftsmanship

hua @jaKehua117

85 Followers 2K Following

AI Central 🇿🇦 @AICentral_SA

16 Followers 80 Following Pre-AGI Research & Consulting Firm | Where Safe AI Converges | Coming Soon ✨

Santiago Garcia @Santyzenith

1 Followers 46 Following Computer Science Engineer

sevenone @seven_turing

3 Followers 266 Following build & sell

Richard Ngo @RichardMCNgo

35K Followers 1K Following What would we need to understand in order to design an amazing future? Figuring that out @openai

Wojciech Zaremba @woj_zaremba

79K Followers 192 Following Co-Founder of OpenAI

Anthropic @AnthropicAI

261K Followers 26 Following We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant Claude at https://t.co/aRbQ97uk4d.

Miles Brundage @Miles_Brundage

43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.

Jack Clark @jackclarkSF

67K Followers 5K Following @AnthropicAI, ONEAI OECD, co-chair @indexingai, writer @ https://t.co/3vmtHYkaTu Past: @openai, @business @theregister. Neural nets, distributed systems, weird futures

Amanda Askell @AmandaAskell

26K Followers 653 Following Philosopher & ethicist teaching models to be good @AnthropicAI. Personal account. All opinions come from my training data.

Neel Nanda @NeelNanda5

13K Followers 89 Following Mechanistic Interpretability lead @DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!

typedfemale @typedfemale

23K Followers 477 Following a really exciting new account "have you ever though you might be like scott alexander? very smart, but can't do math" - anon

Percy Liang @percyliang

49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist

Rob Miles (✈️ Tok.. @robertskmiles

18K Followers 789 Following Explaining AI Alignment to anyone who'll stand still for long enough, on YouTube and Discord. Music, movies, microcode, and high-speed pizza delivery

Ilya Sutskever @ilyasut

370K Followers 2 Following towards a plurality of humanity loving AGIs @openai

Sam Bowman @sleepinyourhat

35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.

Joshua Achiam ⚗️ @jachiam0

14K Followers 945 Following Human. Trying to make safe alchemy machines. Thinking about humanist alchemism (h/alc ⚗️, maybe). Main author of https://t.co/cKuSh210l1

Anders Sandberg @anderssandberg

25K Followers 71 Following Academic jack-of-all-trades.

Kelsey Piper @KelseyTuoc

27K Followers 544 Following Senior writer at Vox's Future Perfect. kelsey.piper@vox.com

David Krueger @DavidSKrueger

13K Followers 4K Following Cambridge faculty - AI alignment, deep learning, and existential safety. Formerly Mila, FHI, DeepMind, ElementAI, AISI.

Mo Bavarian @mobav0

11K Followers 916 Following Research Scientist, working on optimization and architecture of LLMs at OpenAI. Math ❤️. Prev SWE Rubrik, PhD MIT.

David Pfau @pfau

22K Followers 1K Following Knowledge manifests itself in radiant dreams that shimmer like the wild sun Views are my own pfau at sigmoid dot social on 🦣 https://t.co/xqtVHHVI17 on 🦋

Boris Power @BorisMPower

25K Followers 99 Following Head of Applied Research @OpenAI

Jason Wei @_jasonwei

56K Followers 490 Following ai researcher @openai

david rein @idavidrein

2K Followers 983 Following Sentio ergo sum. AI alignment research at NYU, early employee @cohere

AI Safety Institute @AISafetyInst

528 Followers 29 Following We’re building a team of world leading talent to tackle some of the biggest challenges in AI safety - come and join us.

Fidji Simo @fidjissimo

30K Followers 549 Following CEO and Chair @Instacart. Cofounder of @Metrodorainst, focused on finding cures for neuroimmune conditions.

Dr. Sue Desmond-Hellm.. @SueDHellmann

60K Followers 902 Following Fan of science, running, cycling, reading, skiing, @sfgiants. Instagram @ suedesmondhellmann

Dan Gorelick @dqgorelick

788 Followers 583 Following musician and creative coder. @livecodenyc / @avclubsf / @recursecenter / @SFPC / @hackNY

Manas Joglekar @ManasJoglekar

196 Followers 242 Following

METR @METR_Evals

671 Followers 1 Following Model Evaluation and Threat Research (METR) works on building evaluations to empirically test whether cutting-edge AI systems could pose catastrophic risks.

Yo Shavit @yonashav

4K Followers 830 Following policy for v smart things @openai. Past: CS PhD @HarvardSEAS/@SchmidtFutures/@MIT_CSAIL. Tweets my own; on my head be it.

Tsarathustra @tsarnick

21K Followers 3K Following Boy, accelerated

Julian Michael @_julianmichael_

1K Followers 122 Following Researching stuff @NYUDataScience. he/him

animals going goblin .. @mischiefanimals

1.4M Followers 276 Following goblin guy posting goblin goons (and whatever else I find funny)

AV CLUB SAN FRANCISCO @avclubsf

255 Followers 20 Following AV Club is a San Francisco based algorave artist collective focused on live performance | IG @avclubsf

David Bau @davidbau

3K Followers 241 Following Computer Science Professor at Northeastern, Ex-Googler. Believes AI should be transparent. @davidbau@sigmoid.social @davidbau.bsky.social https://t.co/wmP5LUZRTw

Evan Hubinger @EvanHub

4K Followers 1K Following Alignment stress-testing team lead @AnthropicAI. Opinions my own. Previously: MIRI, OpenAI, Google, Yelp, Ripple. (he/him/his)

Collin Burns @CollinBurns4

11K Followers 276 Following Superalignment @OpenAI. Formerly @berkeley_ai @Columbia. Former Rubik's Cube world record holder.

Pavel Izmailov @Pavel_Izmailov

6K Followers 1K Following Incoming Assistant Professor @nyuniversity 🏙️ Previously @OpenAI #StopWar 🇺🇦

FutureHouse @FutureHouseSF

2K Followers 3 Following Philanthropically-funded moonshot building semi-autonomous AI to accelerate the pace of scientific discovery in biology.

Wei Dai @weidai11

7K Followers 82 Following wrote Crypto++, b-money, UDT. thinking about existential safety and metaphilosophy. blogging at https://t.co/mBVFhriJVf

Sholto Douglas @_sholtodouglas

15K Followers 856 Following Scaling Gemini @Deepmind - working towards intelligence too cheap to meter

Lawrence H. Summers @LHSummers

326K Followers 706 Following Charles W. Eliot University Professor and President Emeritus at Harvard. Secretary of the Treasury for President Clinton and Director of NEC for President Obama

Bret Taylor @btaylor

139K Followers 2K Following Co-Founder @SierraPlatform. Board @OpenAI @Shopify.

justsaysinnonsuperint.. @incurrentmodels

12 Followers 0 Following a la @justsaysinmice but for alignment research

Boaz Barak @boazbaraktcs

17K Followers 419 Following Computer Scientist. See also https://t.co/EXWR5k634w, https://t.co/SEVX6it6z3 ( @boazbaraktcs@sigmoid.social , boaz.barak in threads ). Opinions my own.

I. Yosun Chang @Yosun

4K Followers 1K Following {wonder, innovation, elegance} ∈ I turn emerging technologies into award winning apps. Ex-Hackathon pro. #3D #AR #AI since forever. Mad science and artistry ❤️

Crémieux @cremieuxrecueil

88K Followers 901 Following I write about genetics, 'metrics, and demographics. Read my long-form writing at https://t.co/8hgA4nNS2A.

Alex Beutel @alexbeutel

2K Followers 682 Following

Jakub Pachocki @merettm

21K Followers 0 Following OpenAI

Aleksander Madry @aleks_madry

31K Followers 166 Following Head of Preparedness at OpenAI and MIT faculty (on leave). Working on making AI more reliable and safe, as well as on AI having a positive impact on society.

community notes viola.. @cnviolations

861K Followers 6 Following not affiliated with @x or @communitynotes | DM Submissions

Louis Martin @louismrt

1K Followers 556 Following Research Scientist at Mistral AI.

Sam Rodriques @SGRodriques

4K Followers 327 Following Director and CEO at FutureHouse. Building an AI scientist. https://t.co/rQYoPOxsYo

xAI @xai

997K Followers 36 Following

Alex Gajewski @apagajewski

2K Followers 743 Following making AI markets efficient @sfcompute, prev founder @metaphorsystems

Factorio @factoriogame

47K Followers 64 Following Factorio is a game about building factories on an alien planet.

Center for AI Safety @ai_risks

5K Followers 1 Following Reducing societal-scale risks from artificial intelligence through technical research and field-building.

Marius Hobbhahn @MariusHobbhahn

2K Followers 994 Following Director/CEO at Apollo Research @apolloaisafety Ph.D. student of Machine Learning @PhilippHennig5; AI safety/alignment

Apollo Research @apolloaisafety

1K Followers 10 Following We are an AI evals research organisation

Leopold Aschenbrenner @leopoldasch

13K Followers 4K Following superalignment @ openai

Soren Iverson @soren_iverson

259K Followers 116 Following New ideas daily.

Vessel Of Spirit @VesselOfSpirit

3K Followers 0 Following BAC, THIS.

The Onion @TheOnion

11.6M Followers 6 Following America's Finest News Source.

James Bradbury @jekbradbury

11K Followers 8K Following Compute at @AnthropicAI! Previously JAX, TPUs, and LLMs at Google, MetaMind/@SFResearch, @Stanford Linguistics, @Caixin.

Deep Ganguli @dgangul1

150 Followers 196 Following

AI Notkilleveryoneism.. @AISafetyMemes

33K Followers 796 Following Techno-optimist, but AGI is not like the other technologies. Step 1: make memes. Step 2: ??? Step 3: lower p(doom)

Katherine Lee @katherine1ee

6K Followers 932 Following understanding ourselves and our models. senior research scientist @GoogleBrain, @genlawcenter and @CornellCIS, formerly @Princeton @katherinelee@sigmoid.social

Toby Ord @tobyordoxford

17K Followers 138 Following Senior Researcher at Oxford University. Author — The Precipice: Existential Risk and the Future of Humanity.

Summer Yue @summeryue0

1K Followers 215 Following Director of Safety and Standards at Scale AI. Prev: RLHF lead on Bard, researcher at Google DeepMind / Brain (LaMDA, RL/TF-Agents, superhuman chip design)

Daniel Paleka @dpaleka

3K Followers 468 Following ai safety researcher | phd @CSatETH

Collective Intelligen.. @collect_intel

3K Followers 50 Following collective intelligence for collective progress.

Senthooran Rajamanoharan @sen_r

2 days ago

New @GoogleDeepMind MechInterp work! We introduce Gated SAEs, a Pareto improvement over existing sparse autoencoders. They find equally good reconstructions with around half as many firing features, while maintaining interpretability (CI 0-13% improvement). Joint w/ @ArthurConmy

4 22 154 19K 83

Sam Bowman @sleepinyourhat

3 days ago

This result is pretty clearly specific to the style of backdoor we're working with, and doesn't support broad claims like 'interpretability solves misalignment', but it's still surprisingly strong. Worth a look!

Anthropic @AnthropicAI

4 days ago

New Anthropic research: we find that probing, a simple interpretability technique, can detect when backdoored "sleeper agent" models are about to behave dangerously, after they pretend to be safe in training. Check out our first alignment blog post here: anthropic.com/research/probe…

29 162 943 241K 432

2 4 68 8K 16

Allan Dafoe @AllanDafoe

4 days ago

We are looking for an AGI Safety Manager to support @GoogleDeepMind 's AGI Safety Council: please encourage excellent people to apply! This role will work closely with my team, Scalable Alignment and Safety, and Responsible Development and Innovation. boards.greenhouse.io/deepmind/jobs/…

9 18 79 9K 26

Ethan Perez @EthanJPerez

4 days ago

Some of our first steps on developing mitigations for sleeper agents

Anthropic @AnthropicAI

4 days ago

29 162 943 241K 432

0 0 49 3K 5

Ronny Fernandez 🔍⏸️ @RatOrthodox

5 days ago

factorio 2 is coming out soon. if you work in frontier model research at open ai, anthropic, or deepmind and would like a free copy, I would be very happy to buy you one! please feel free to reach out. people don't do enough for you guys

55 113 2K 254K 183

Leo Gao @nabla_theta

a week ago

@ilex_ulmus if we can align it, then building ASI is good; if we can't align it, then building ASI is bad

5 0 28 977 2

Sam Bowman @sleepinyourhat

2 weeks ago

🤖🥇🤖

Arjun Panickssery is in London @panickssery

2 weeks ago

Are LLMs biased toward themselves? Frontier LLMs give higher scores to their own outputs in self-eval. We find evidence that this bias is caused by LLMs' ability to recognize their own outputs. This could interfere with safety techniques like reward modeling & constitutional AI.

8 46 318 63K 223

1 3 68 10K 18

Huifeng Ou @HuifengOu

2 weeks ago

@janleike It's been nearly 4 months since the release of the "Weak-to-strong generalization" paper. Could your team please release some recent findings for controlling ASI? Research papers with statistics and results would be much appreciated.

1 0 1 149 0

Dan Hendrycks @DanHendrycks

2 weeks ago

I got ~75% on a subset of MATH so it's basically as good as me at math.

OpenAI @OpenAI

2 weeks ago

Our new GPT-4 Turbo is now available to paid ChatGPT users. We’ve improved capabilities in writing, math, logical reasoning, and coding. Source: github.com/openai/simple-…

555 1K 7K 6.1M 1K

11 15 402 90K 67

Diana Valencia @Valencia_planet

2 weeks ago

OpenAI called for ‘the best researchers and engineers in the world to meet the [superalignment] challenge’, very proud that my spouse Kristen Menou’s ideas got funded (1 of the 50 out of 2700!) #AIsafety.

Jan Leike @janleike

3 weeks ago

The superalignment fast grants are now decided! We got a *ton* of really strong applications, so unfortunately we had to say no to many we're very excited about. There is still so much good research waiting to be funded. Congrats to all recipients!

13 14 242 87K 44

1 0 5 784 1

Zhiqing Sun @EdwardSun0909

2 weeks ago

Our research on easy-to-hard generalization will be supported by the OpenAI Superalignment Fast Grant. Congratulations to the team and stay tuned!😎

Zhiqing Sun @EdwardSun0909

a month ago

🌟Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision 🌟 arxiv.org/abs/2403.09472 How can we keep improving AI systems when their capabilities surpass those of human supervisors? (1/n)

6 50 234 95K 198

10 13 355 53K 83

Yarin @yaringal

2 weeks ago

twitter just told me that they've literally shadow banned me (reducing exposure of my posts) as punishment for not engaging enough with the platform. I don't expect many people to see this...

11 2 91 16K 4

SIX EDGE @six_edge

2 weeks ago

@janleike Thanks for the transparency 👏🏻

0 0 1 539 0

Giorgi (orb) Orbeliani @G_Orbeliani

2 weeks ago

@janleike guys, that's amazing stats

0 0 1 1K 0

Greg Brockman @gdb

2 weeks ago

Just issued ~$10M in superalignment fast grants:

Jan Leike @janleike

2 weeks ago

Some statistics on the superalignment fast grants: We funded 50 out of ~2,700 applications, awarding a total of $9,895,000.
Median grant size: $150k
Average grant size: $198k
Smallest grant size: $50k
Largest grant size: $500k
Grantees:
Universities: $5.7m (22)
Graduate…

11 16 152 106K 56

22 23 286 91K 27

Ashwinee Panda @PandaAshwinee

3 weeks ago

Some cool stuff is coming, stay tuned =)

Jan Leike @janleike

3 weeks ago

13 14 242 87K 44

6 2 146 43K 21

Laura 🌲 ⛰️ @LauraDeming

3 weeks ago

Sometimes when I’m mildly stressed, my mom helps me schedule doctor’s appointments that I'd otherwise drop to keep up w my health, and I feel like it’s one of the kindest things / most thoughtful ways to show care I’ve received. Love you mom <3

2 1 100 10K 9

Joe Edelman @edelwax

4 weeks ago

“What are human values, and how do we align to them?” Very excited to release our new paper on values alignment, co-authored with @ryan_t_lowe and funded by @OpenAI. 📝: meaningalignment.org/values-and-ali…

25 71 340 262K 394

Ryan Lowe @ryan_t_lowe

4 weeks ago

I've left OpenAI. I'm mostly taking some time to rest. But I also have a few projects in the oven 🧑‍🍳 Here's one that I'm really excited about: we have a 🚨new paper🚨 out on aligning AI with human values, with the folk at @meaningaligned!! 😊✨🎉 Why I think it's cool: 🧵

Joe Edelman @edelwax

4 weeks ago

25 71 340 262K 394

19 63 738 192K 529

Neel Nanda @NeelNanda5

a month ago

The @GoogleDeepMind alignment team is hiring, apply now! Deadline in 2 days (12pm PST Wed). London & the Bay. I think AGI alignment is one of the most important problems today, and I'm privileged to work with brilliant colleagues every day on it. I'm excited to meet my new ones!