Dan Hendrycks @DanHendrycks

• Director of the Center for AI Safety (https://t.co/ahs3LYCpqv) • GELU/ImageNet-C/MMLU/safety groundwork • PhD in AI from UC Berkeley https://t.co/rgXHAnYAsQ https://t.co/YtGtDh1aAV danhendrycks.com San Francisco 🇺🇸🏳️‍🌈 Joined August 2009

Tweets 517 | Followers 17K | Following 78 | Likes 179

Dan Hendrycks @DanHendrycks

22 hours ago

SB 1047 highlights and FAQ safesecureai.org/learn

3 3 24 4K 15

Geoffrey Hinton @geoffreyhinton

a day ago

@martin_casado @radicalvcfund If you leave it to companies to decide what is safe you get the Boeing 737 max.

42 23 341 107K 28

Dan Hendrycks @DanHendrycks

a day ago

Hinton and Bengio on SB 1047 and a summary of the bill. Hinton: “SB 1047 takes a very sensible approach... I am still passionate about the potential for AI to save lives through improvements in science and medicine, but it’s critical that we have legislation with real teeth to…

6 15 88 9K 40

Dan Hendrycks @DanHendrycks

7 days ago

I would guess this likely won't hold up against better adversaries. In making the RepE paper (ai-transparency.org) we explored using it for trojans ("sleeper agents") and found it didn't work after basic stress testing.

Anthropic @AnthropicAI

7 days ago

37 165 970 261K 437

2 6 99 18K 48

Dan Hendrycks @DanHendrycks

a week ago

GPT-5 doesn't seem likely to be released this year. Ever since GPT-1, the difference between GPT-n and GPT-n+0.5 is ~10x in compute. That would mean GPT-5 would have around 100x the compute of GPT-4, or 3 months of ~1 million H100s. I doubt OpenAI has a 1 million GPU server ready.

Flowers from the future @futuristflower

a week ago

60 10 290 100K 22

13 6 142 42K 49
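
The compute estimate in the tweet above can be sanity-checked with a back-of-the-envelope calculation. The sketch below is illustrative only. The tweet gives the ~10x-per-half-version rule and the "3 months of ~1 million H100s" figure; the GPT-4 training compute, the per-GPU peak throughput, and the utilization fraction are assumptions added here, and the function names are hypothetical.

```python
# Rough check of the claim "100x GPT-4 compute ~ 3 months of ~1 million H100s".
# All constants below are ASSUMPTIONS for illustration, not figures from the tweet.

GPT4_TRAIN_FLOP = 2e25        # assumed: a commonly cited public estimate, unconfirmed
H100_PEAK_FLOP_PER_S = 1e15   # assumed: roughly the dense BF16 peak of one H100
UTILIZATION = 0.3             # assumed: fraction of peak reached in a large training run
SECONDS_PER_DAY = 86_400


def scale_factor(half_steps: int, per_half_step: float = 10.0) -> float:
    """Compute multiplier after a number of half-version steps (~10x each, per the tweet)."""
    return per_half_step ** half_steps


def days_needed(total_flop: float, num_gpus: int) -> float:
    """Days of wall-clock time to deliver total_flop on num_gpus at the assumed utilization."""
    effective_flop_per_s = num_gpus * H100_PEAK_FLOP_PER_S * UTILIZATION
    return total_flop / effective_flop_per_s / SECONDS_PER_DAY


# GPT-4 -> GPT-5 is two half-version steps (4 -> 4.5 -> 5).
gpt5_scale = scale_factor(half_steps=2)
gpt5_flop = GPT4_TRAIN_FLOP * gpt5_scale
days = days_needed(gpt5_flop, num_gpus=1_000_000)

print(f"scale: {gpt5_scale:.0f}x, compute: {gpt5_flop:.1e} FLOP, days: {days:.0f}")
```

Under these assumed constants the run comes out at roughly two and a half months on one million GPUs, the same order of magnitude as the tweet's "3 months". Changing the utilization or the GPT-4 estimate shifts the result proportionally.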

Kevin Roose @kevinroose

2 weeks ago

AI researchers like @DanHendrycks, who helped create the MMLU (essentially the SAT for chatbots) told me that leading benchmark tests have reached "saturation" -- basically, they're too easy for today's LLMs -- and that we will soon need to develop harder tests to gauge model…

4 10 47 14K 13

Dan Hendrycks @DanHendrycks

3 weeks ago

I got ~75% on a subset of MATH so it's basically as good as me at math.

OpenAI @OpenAI

3 weeks ago

571 1K 7K 6.1M 1K

11 15 400 90K 67

Center for AI Safety @ai_risks

a month ago

We’re excited to announce SafeBench, a competition to develop benchmarks for empirically assessing safety! There are $250,000 in prizes, with submissions closing on Feb 25th, 2025. This project is supported by Schmidt Sciences. Visit: mlsafety.org/safebench 🧵(1/3)

2 24 75 7K 19

Dan Hendrycks @DanHendrycks

a month ago

People aren't thinking through the implications of the military controlling AI development. It's plausible AI companies won't be shaping AI development in a few years, and that would dramatically change AI risk management. Possible trigger: AI might suddenly become viewed as the…

47 42 280 54K 132

Dan Hendrycks @DanHendrycks

a month ago

x.ai/blog/grok-os Grok-1 is open sourced. Releasing Grok-1 increases LLMs' diffusion rate through society. Democratizing access helps us work through the technology's implications more quickly and increases our preparedness for more capable AI systems. Grok-1 doesn't pose…

Grok @grok

a month ago

1K 2K 16K 9.7M 813

20 34 371 93K 66

Dan Hendrycks @DanHendrycks

2 months ago

We've released a post about the looming risk of AI cyberattacks on critical infrastructure. It notes that we are living under a "cyberattack overhang." Advances in defensive techniques are of no help if defenders are not keeping up to date. safe.ai/blog/cybersecu… by @snewmanpv

3 11 73 7K 23

Dan Hendrycks @DanHendrycks

2 months ago

Reminder that "Responsible Scaling Policies" are just non-binding proclamations and as such shouldn't be interpreted as a strong line of defense for safety. Voluntary commitments can be easily violated without much social blowback. For example, responsible AI teams have been…

6 12 116 10K 20

Center for AI Safety @ai_risks

2 months ago

The White House Executive Order on AI highlights the risks of LLMs empowering malicious actors in developing biological, cyber, and chemical weapons. To measure and reduce these risks, we’re releasing the Weapons of Mass Destruction Proxy (WMDP) benchmark. (🧵below)

4 21 57 14K 23

Will Henshall @henshall_will

2 months ago

Exclusive: New research provides a way to measure whether an AI model contains potentially hazardous knowledge, along with a technique for removing the knowledge from an AI system while leaving the rest of the model relatively intact. time.com/6878893/ai-art…

9 12 39 30K 14

Dan Hendrycks @DanHendrycks

2 months ago

"Researchers Develop New Technique to Wipe Dangerous Knowledge From AI Systems" by @henshall_will time.com/6878893/ai-art…

0 1 22 2K 5

Dan Hendrycks @DanHendrycks

2 months ago

Can hazardous knowledge be unlearned from LLMs without harming other capabilities? We’re releasing the Weapons of Mass Destruction Proxy (WMDP), a dataset about weaponization, and we create a way to unlearn this knowledge. 📝arxiv.org/abs/2403.03218 🔗wmdp.ai

13 65 242 64K 153

Richard Ngo @RichardMCNgo

35K Followers 1K Following What would we need to understand in order to design an amazing future? Figuring that out @openai

Percy Liang @percyliang

49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist

Sam Bowman @sleepinyourhat

35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.

Neel Nanda @NeelNanda5

13K Followers 89 Following Mechanistic Interpretability lead @DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!

Jack Clark @jackclarkSF

68K Followers 5K Following @AnthropicAI, ONEAI OECD, co-chair @indexingai, writer @ https://t.co/3vmtHYkaTu Past: @openai, @business @theregister. Neural nets, distributed systems, weird futures

David Krueger @DavidSKrueger

13K Followers 4K Following Cambridge faculty - AI alignment, deep learning, and existential safety. Formerly Mila, FHI, DeepMind, ElementAI, AISI.

Miles Brundage @Miles_Brundage

43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.

Michaël Trazzi @MichaelTrazzi

12K Followers 24 Following AI Alignment https://t.co/cAS4FnR5yf

Eric Jang @ericjang11

69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p

Nathan 🔍 @NathanpmYoung

15K Followers 3K Following Will bet $10 on any statement I make.

Amanda Askell @AmandaAskell

26K Followers 653 Following Philosopher & ethicist teaching models to be good @AnthropicAI. Personal account. All opinions come from my training data.

Ethan Caballero is bu.. @ethanCaballero

8K Followers 2K Following ML PhD student @Mila_Quebec ; previously @GoogleDeepMind

Jason Wei @_jasonwei

57K Followers 491 Following ai researcher @openai

Jan Leike @janleike

44K Followers 322 Following ML Researcher, co-leading Superalignment @OpenAI. Optimizing for a post-AGI future where humanity flourishes.

Rob Miles (✈️ Tok.. @robertskmiles

18K Followers 789 Following Explaining AI Alignment to anyone who'll stand still for long enough, on YouTube and Discord. Music, movies, microcode, and high-speed pizza delivery

Behnam Neyshabur @bneyshabur

18K Followers 690 Following Senior Staff Research Scientist @GoogleDeepMind, Interested in reasoning w. LLMs, traveling & backpacking

Joshua Achiam ⚗️ @jachiam0

14K Followers 948 Following Human. Trying to make safe alchemy machines. Thinking about humanist alchemism (h/alc ⚗️, maybe). Main author of https://t.co/cKuSh210l1

Horace He @cHHillee

24K Followers 449 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale

Catherine Olsson @catherineols

15K Followers 1K Following Hanging out with Claude, improving its behavior, and building tools to support that @AnthropicAI 😁 prev: @open_phil @googlebrain @openai (@microcovid)

Peter Wildeford @peterwildeford

10K Followers 367 Following Pro forecaster w/ good track record. Seeking to understand + manage risks from advanced AI systems. - Co-CEO @RethinkPriors - Chief Advisory Executive @iapsAI

PaLaD1N @PaLaD1Nnn

121 Followers 1K Following 🇹🇷🇳🇱

✦✦✦ @not_infinite___

35 Followers 374 Following

kmizu @kmizu

12K Followers 6K Following A Software Engineer in Osaka (& Kyoto). Ph.D. in Engineering. Interests: Parsers, Formal Languages, etc. Tweets are unrelated to my employer's views. I tweet whatever comes to mind.

Joe @ffohlegnag

132 Followers 501 Following Yum

Jonny Spicer @jjspicer

180 Followers 340 Following Software engineer, runner, meditator, GWWC pledgee and aspiring Effective Altruist. Give me anonymous feedback here: https://t.co/XaOL33SC4b

𝕏iaoHu Zhu⏹️cs.. @neil_csagi

756 Followers 2K Following eXistential Hope native. https://t.co/pX9vqSzWEq | @Foresightinst Fellow in Safe AGI | @FLIxrisk Affiliate

beeple @beeple33

40 Followers 4K Following 123456789

Sam Winter-Levy @SamWinterLevy

770 Followers 2K Following PoliSci grad student @ Princeton, sometimes writer. Previously @ForeignAffairs, @TheEconomist, @IrregWarfare.

John Beatty @john_d_beatty

789 Followers 2K Following EIR at Sutter Hill Ventures. Engineer. Former co-founder/VPEng/CEO @CloverCommerce, ex-Yahoo, ex-BEA, ex-Sun.

Ian David Moss @iandavidmoss

788 Followers 453 Following Founder, @EffectiveInst. Strategic advisor to philanthropists and social sector leaders. Newsletter: https://t.co/gAVtwVcz06

Amator @sodalitatum

940 Followers 552 Following torquēmurque metū caecāque cupīdine rērum

Sneha Revanur @SnehaRevanur

1K Followers 503 Following Mobilizing my peers for a safe, equitable AI future @EncodeJustice

Brian Duke @BrianSDuke

447 Followers 2K Following

Luke Drago🌻 @lukepdrago

463 Followers 2K Following CLT Native | @UniOfOxford, @StEdmundHall History and Politics '23 | Views my own.

Kristina Bas Hamilton @kbashamilton

3K Followers 4K Following I help progressive advocacy & labor orgs in CA win #CALeg policy & #CABudget funding | Lobbyist | Host of Blueprint for CA Advocates Pod | Author of Changemaker

Brian Jabarian @brian_jabarian

8K Followers 1K Following Howard and Nancy Marks Principal Researcher at Chicago Booth Business School

Chris Rieckmann⏹️ @chrieck

15 Followers 54 Following

Teri Olle @TeriOlle

358 Followers 2K Following Midwestern, San Franciscan. Like policy, politics, persuasion, pragmatism. Also hawks & wolves.

Nicholas Meade @ncmeade

127 Followers 149 Following PhD student at @McGillU / @Mila_Quebec; Interested in #NLProc.

Joshua Utley @intrepidnetwork

95 Followers 229 Following President & CEO of Intrepid Network Inc. Our mission is to provide a more personable service along with the highest quality business solutions.

Open @OpenXuu

0 Followers 150 Following

Senator Scott Wiener @Scott_Wiener

101K Followers 1K Following CA State Senator. Chair, Budget Committee. Former Chair, Legislative LGBTQ Caucus. Housing/transit/climate/criminal justice reform/health. Democrat.🏳️‍🌈✡️

Lucas Lingle @LucasLingle

10 Followers 230 Following AI researcher interested in transformers and architecture search.

Security Ticks @SecTicks

960 Followers 5K Following Cybersecurity and other IT News aggregator

jj @punchgod_7

272 Followers 3K Following always guard up. ( ง︡'-'︠)ง

pointed_max @pointed_max

1K Followers 966 Following a pointed set is a set endowed with a distinguished element, called the base point.

Hao Tang @tanghao95

53 Followers 517 Following CS PhD student @ Cornell

Slav @99999venture

150 Followers 5K Following Blockchain - Defi - NFT - Game Fi. Data, AI. Metaverse.

joãozinho @j_o_a_o_2020

91 Followers 859 Following

vinay @vinay_pai

108 Followers 2K Following 🌍🌏🌎

Nikita @nikitavoloboev

4K Followers 7K Following Make @LearnAnything_ Learn in public: https://t.co/GbFvuErkYn macOS course: https://t.co/JdbJWru6zG https://t.co/94R8ER7K2h https://t.co/ROkqhyhpEK

gene yang @geneyang4

0 Followers 221 Following

Yotam Perkal @pyotam2

487 Followers 604 Following Director of Vulnerability Research @Rezilion_ | @pyconil Organization Committee | Sharing Cyber Security, ML & Startup Culture Insights | Always Learning!

Chris Liu @chrisliu298

3 Followers 214 Following PhD student @BaskinEng @ucsc

Phillip @Phillip23569954

54 Followers 740 Following

Harshal Nandigramwar @hnanacc

344 Followers 249 Following ai @intel labs, prev: ai @cariad_tech, masters @Uni_Stuttgart, building @todackcom, @themelioai

Sonakshi Chauhan @ChauhanSon8200

12 Followers 36 Following

Monte @montemacd

4 Followers 13 Following Alignment researcher at @AnthropicAI

Sebastian Schmidt @SebSchmidt_

5 Followers 26 Following Enabling world-class talent to solve pressing problems and contribute to flourishing future. Co-founder at Impact Academy. Former researcher, MD, and author.

Anıl Yılmaz @AnlYlma04693333

2 Followers 23 Following 22/İTÜehb/Tr

Gülşah Fırat @gulsah_f14196

0 Followers 34 Following

MAB氏 @MAB1791652

1 Followers 36 Following

Weloop @Weloop_official

14 Followers 72 Following Download “Weloop” to be a part of your friends circle

Vincent Lacey @vnce

170 Followers 543 Following Product Manager at Google, hacker, dragonboater.

RCS @rcs_rsantorum

2 Followers 107 Following

הראל @Kinnardian

685 Followers 2K Following Founder @inBitBox + Cryptoforest + MichiganBitcoiners + DetroitBlockchainers | Disruptive Entrepreneurship... from the CryptoCastle: https://t.co/CHEaFy8oLG

དྲན་པ་ན.. @Dorjsembesechen

2K Followers 6K Following ཨོཾ་ཤྲཱི་ ཧེ་རུ་ཀ་ས་མ་ཡ་མ་ནུ་པཱ་ལ་ཡ། ཤྲཱི་ཧེ་རུ་ཀ་ཏྭེ་ནོ་པ་ཏིཥྛ། དྲྀ་བྷོ་མེ་བྷཱ་ཝ། སུ་ཏོ་ཥྱོ་མེ་བྷ་ཝ། ཨ་ནུ་རཀྟོ་མེ་བྷ་ཝ། སུ་པོ་ཥྱོ་མེ་བྷ་ཝ། སརྦ་སིདྡྷི་མྨེ་པྲ་ཡ་

anon (xe/xir/xirs, ze.. @wedoforgive

813 Followers 5K Following Anonymous Recovery 😽

AuwaL ™ @princeauwall

613 Followers 635 Following Voice of Gen Z ,Engineer ,Veteran debater, Social Prefect , Inventor, Founder & CEO @Brimstonefx , Deep Learning @xfountain70 - @groundzero30

Yoram Bachrach @yorambac

444 Followers 1K Following Research Scientist at DeepMind

Aran Komatsuzaki @arankomatsuzaki

95K Followers 78 Following @TeraflopAI

Jason Wei @_jasonwei

57K Followers 491 Following ai researcher @openai

Jan Leike @janleike

44K Followers 322 Following ML Researcher, co-leading Superalignment @OpenAI. Optimizing for a post-AGI future where humanity flourishes.

Catherine Olsson @catherineols

15K Followers 1K Following Hanging out with Claude, improving its behavior, and building tools to support that @AnthropicAI 😁 prev: @open_phil @googlebrain @openai (@microcovid)

Thomas G. Dietterich @tdietterich

51K Followers 505 Following Distinguished Professor (Emeritus), Oregon State Univ.; Former President, Assoc. for the Adv. of Artificial Intelligence; Robust AI & Comput. Sustainability

Collin Burns @CollinBurns4

11K Followers 276 Following Superalignment @OpenAI. Formerly @berkeley_ai @Columbia. Former Rubik's Cube world record holder.

Bryan Johnson /dd @bryan_johnson

255K Followers 424 Following Founder Blueprint & @Braintree @venmo

Nicolas Berggruen @NBerggruen

17K Followers 1K Following Founder of @berggruenInst @noemamag Co-author of #RenovatingDemocracy (https://t.co/czbXdKwYK1) A believer that ideas can make a better world. #IdeasMatter

Ian Goodfellow @goodfellow_ian

299K Followers 1K Following Research Scientist at DeepMind. Opinions my own. Inventor of GANs. Lead author of https://t.co/M6vl8pEifa

Aaron Defazio @aaron_defazio

6K Followers 364 Following Research Scientist at Meta working on optimization. Fundamental AI Research (FAIR) team

Our World in Data @OurWorldInData

299K Followers 21 Following Data and research to understand big global problems and make progress against them. Based out of @UniOfOxford, founded by @MaxCRoser. @ourworldindata@vis.social

Michael Byun @m_j_byun

4 Followers 6 Following AI policy & interpretability @ Stanford

Cate Hall @catehall

19K Followers 272 Following executive director @ Astera | born lucky | leave me anonymous feedback: https://t.co/9RtcgMyTHP How to be More Agentic: https://t.co/O3eJsrzTYW

𝖦𝗋𝗂𝗆𝖾.. @Grimezsz

1.4M Followers 2K Following Another Fine Product From The Nonsense Factory

Max Tegmark @tegmark

145K Followers 29 Following Known as Mad Max for my unorthodox ideas and passion for adventure, my scientific interests range from artificial intelligence to the ultimate nature of reality

Opera GX @operagxofficial

1.1M Followers 213 Following The browser for gamers.

Liv Boeree @Liv_Boeree

254K Followers 497 Following Looking for the win/wins in life. Not a fan of Moloch traps. Brand new podcast out now, link below👇

David @DavidSHolz

54K Followers 5K Following founder @midjourney, prev founder leap motion, nasa, max planck

Andrew Critch (h/acc) @AndrewCritchPhD

3K Followers 181 Following Human being; trying to do good; views my own; CEO @ Encultured; AI Researcher @ UC Berkeley.

Kevin Esvelt @kesvelt

9K Followers 23 Following Sculpting evolution & safeguarding biotechnology, MIT Media Lab.

Igor Kurganov @IgorKurganov

12K Followers 542 Following

Alec Radford @AlecRad

43K Followers 317 Following ML developer/researcher at OpenAI

Nat Friedman @natfriedman

183K Followers 285 Following https://t.co/Lhh178sIjq

Ben Buchanan @BuchananBen

5K Followers 246 Following A professor on leave from @georgetownsfs to serve as the @WhiteHouse Special Advisor on AI. Author of three books on cybersecurity and AI. Personal account.

Jason Matheny @JasonGMatheny

8K Followers 374 Following President and CEO of the @RANDCorporation, a nonprofit, nonpartisan research org that helps improve policy and decisionmaking through research and analysis.

Andy Zou @andyzou_jiaming

3K Followers 63 Following PhD student at CMU, working on AI Safety and Security

Max Luo @maxkluo

209 Followers 322 Following VC @ACME 🤓 former fanfiction writer 🖋 | writing now at https://t.co/wrlEaqPQJG

Alexey Guzey @alexeyguzey

24K Followers 945 Following interested in the past and in the future

xAI @xai

997K Followers 36 Following

Alexandr Wang @alexandr_wang

143K Followers 697 Following ceo at @scale_ai. rational in the fullness of time

Geoffrey Hinton @geoffreyhinton

338K Followers 28 Following deep learning

Igor Babuschkin @ibab

44K Followers 685 Following Maybe the real AGI was the friends we made along the way. @xAI

near @nearcyan

45K Followers 882 Following https://t.co/IdaJwZJCXm partner @ https://t.co/9g1MIgjiqc dms open

Thomas Kalil @tkalil2050

1K Followers 1K Following

Richard Dawkins @RichardDawkins

3.0M Followers 359 Following UK biologist & writer. Richard Dawkins Foundation donor: https://t.co/rZZdjPoMUe. For Details about the Upcoming Tour: https://t.co/sSo5FL6CWb

Eric Schmidt @ericschmidt

2.2M Followers 224 Following Former Executive Chairman & CEO and tweets from Schmidt Foundation

vitalik.eth @VitalikButerin

5.3M Followers 384 Following mi pinxe lo crino tcati

Dustin Moskovitz @moskov

75K Followers 513 Following

AI Pub @ai__pub

72K Followers 343 Following AI papers and AI research explained, for technical people. Get hired by the best AI companies: https://t.co/MySVjUGOQ3

Center for AI Safety @ai_risks

5K Followers 1 Following Reducing societal-scale risks from artificial intelligence through technical research and field-building.

The Beet King @vjunetxuuftofi

7 Followers 7 Following Co author of the beet emoji. Lover of rice. Proud father of many bad ideas, and possibly some good ones. I have two cats, but only one of them is insane.

Ilya Sutskever @ilyasut

370K Followers 2 Following towards a plurality of humanity loving AGIs @openai

Kimin @kimin_le2

1K Followers 330 Following Assistant professor at KAIST. Prev: Research scientist @GoogleAI, Postdoc @berkeley_ai & Ph.D at KAIST.

Joe Carlsmith @jkcarlsmith

4K Followers 305 Following Senior research analyst @open_phil. Opinions my own.

Oliver Zhang @ozhang_

46 Followers 73 Following

rapha gontijo lopes @rapha_gl

5K Followers 2K Following research @ openai

Christian Szegedy @ChrSzegedy

32K Followers 2K Following #deeplearning, #ai research scientist. Opinions are mine.

ML Safety Daily @topofmlsafety

2K Followers 2 Following ML safety papers as they are released. Course: https://t.co/l0e0Y2i3AU Newsletter: https://t.co/8Y1kh2D7K6 Main Twitter: https://t.co/AXoYPryldd

Dickson Neoh 🚀 @dicksonneoh7

976 Followers 1K Following 🚀 I share bite-size practical machine learning deployment tips | 💡 Current Projects👉 https://t.co/ClHoj7uDia | 🎉 My best Tweets👉 https://t.co/2YzTSSRucv

Thomas Woodside @Thomas_Woodside

814 Followers 205 Following Junior Fellow @CSETGeorgetown. All views expressed are my own. Previously @ai_risks, @Yale. Creator of the beet emoji (forthcoming).

Sidney Hough @sqhough

262 Followers 277 Following @stanford cs, https://t.co/Ds2gVBIO6Y

Kevin Liu @kliu128

7K Followers 629 Following Preparedness at @openai

William MacAskill @willmacaskill

64K Followers 1K Following Moral philosopher at Oxford. Author of Doing Good Better and What We Owe The Future.

Quoc Le @quocleix

49K Followers 107 Following Distinguished Scientist, Google.

Sneha Revanur @SnehaRevanur

a day ago

Man if we were trying to be sneaky so we could dodge public scrutiny we did a pretty bad job of keeping the cat in the bag

Perry E. Metzger @perrymetzger

a day ago

The claim here is that lots of people support SB 1047, the extremist anti-AI bill. I have never even heard of “Lovable Labs.” Where are the actual AI companies in all of this? Where was the testimony from the Open Source Initiative about the effect on open source, from the EFF…

7 22 117 16K 12

0 1 9 1K 1

Sneha Revanur @SnehaRevanur

a day ago

All our “extreme anti-AI bill” does is ask developers to address catastrophic risks from models that *don’t even exist yet*. If your model might empower bioterrorists, we think you should probably be a little careful. And if that feels unreasonable … maybe you’re the extremist!

1 4 17 1K 2

Andy Zou @andyzou_jiaming

3 days ago

@littlefish3625 Section 6.2 of the Representation Engineering paper (ai-transparency.org) shows exactly this. There is also a demo here (github.com/andyzoujm/repr…) in the paper's repo which shows that adding a "harmlessness" direction to the representation can effectively jailbreak the model.

2 1 13 4K 8

typedfemale @typedfemale

4 days ago

i asked GPQA's example quantum mechanics question to my friend who is an expert in quantum and they told me: "all of these answers are incorrect" - it's google proof only because it's word salad!

14 18 218 85K 80

Andreas Kirsch 🇮🇱🇺🇦 @BlackHC

5 days ago

@andrewgwils Yep Hendrycks et al was my guess. I think OOD is the less well-defined more "general" deep learning take of prior approaches. Seb Farquhar wrote a paper on some of those Qs last year: openreview.net/forum?id=XCS_z…

0 0 4 260 2

Nat Friedman @natfriedman

a week ago

66 15 1K 180K 56

Axel Longhorn @donlemonhead5

2 weeks ago

@fiiiiiist @BISgov

0 1 2 2K 0

Yo Shavit @yonashav

a month ago

@zacharylipton SGD is an alignment technique for randomly initialized weights

0 0 9 504 0

Arun Rao @rao_hacker_one

a month ago

This is an important post - the single most important AI policy issue is about military applications and civilian control - they don’t get discussed and are ignored by the EU AI Act and Biden’s AI EO. But we need years of societal debate and agreement on what are thoughtful and…

Dan Hendrycks @DanHendrycks

a month ago

47 42 280 54K 132

0 0 1 262 0

Kyle O'Brien @KyleDevinOBrien

a month ago

@DanHendrycks A theme that struck me when watching Oppenheimer was how the scientists naively thought they could remain in control of how their creations were used. IIRC, Damon's Col. Groves said something like, "Our job was to give them the ace—it's up to them how to play it."

0 0 5 243 0

Liv Boeree @Liv_Boeree

2 months ago

To open-source AI, or not to open-source? That is indeed the question. But does the question even make sense in the first place? Here’s @DanHendrycks nuanced response 👇

18 16 113 37K 51

Steve Newman @snewmanpv

2 months ago

@DanHendrycks One thing I learned when writing this piece is how ridiculously helpful it is to engage directly with multiple experts. There is so much confusing information out there; I've tried to cut through the noise here, leaning on the many knowledgeable folks who kindly provided input.

0 0 2 82 1

Cas (Stephen Casper) @StephenLCasper

2 months ago

A couple days ago, I posted a thread with some constructive criticism about CAIS @ai_risks. I’m appreciative of the discussions it sparked. Today I want to follow up on what I *appreciate* about CAIS. My post didn’t reflect this, but it’s my favorite private AI research org. I…

4 1 53 5K 7

Andrew Critch (h/acc) @AndrewCritchPhD

2 months ago

Hear hear! An important yet neglected point from LeCun; more people should stand up for this IMHO:

Yann LeCun @ylecun

2 months ago

We need a free and diverse set of AI assistants for the same reason we need a free and diverse press.

101 294 2K 216K 195

0 0 5 635 2

Alexandr Wang @alexandr_wang

2 months ago

.@TIME article from @henshall_will about our work on catastrophic risk benchmarking and unlearning! time.com/6878893/ai-art…

Alexandr Wang @alexandr_wang

2 months ago

Can hazardous knowledge be unlearned from LLMs w/o harming other capabilities? @scale_AI and CAIS are releasing Weapons of Mass Destruction Proxy (WMDP), an eval for catastrophic AI risk & a way to unlearn this knowledge. 📝arxiv.org/abs/2403.03218 🔗wmdp.ai

11 12 76 43K 34

2 3 39 28K 7

Alexandr Wang @alexandr_wang

2 months ago

11 12 76 43K 34

Center for AI Safety @ai_risks

2 months ago

4 21 57 14K 23

Somesh Jha @jhasomesh

2 months ago

Congrats to all fellows. Special congrats to @hima_lakkaraju, Jacob Steinhardt, @DanHendrycks, @NicolasPapernot, and @nhaghtal. Just because I know them :-)

𝙷𝚒𝚖𝚊 𝙻𝚊𝚔𝚔𝚊𝚛𝚊𝚓𝚞 @hima_lakkaraju

2 months ago

Incredibly thrilled to be named an AI2050 Early Career Fellow by Schmidt Sciences. schmidtsciences.org/ai2050-early-c… This award will help accelerate our research on evaluating and enhancing the trustworthiness of generative AI tools, and bridging the gaps between AI policy and research.…

17 10 181 19K 21

3 0 10 3K 1

Siméon @Simeon_Cps

2 months ago

w̵e̵ ̵w̵a̵n̵t̵ ̵t̵o̵ ̵b̵e̵ ̵a̵t̵ ̵t̵h̵e̵ ̵f̵r̵o̵n̵t̵i̵e̵r̵ ̵f̵o̵r̵ ̵s̵a̵f̵e̵t̵y̵ ̵b̵u̵t̵ ̵w̵o̵n̵'̵t̵ ̵p̵u̵s̵h̵ ̵t̵h̵e̵ ̵c̵a̵p̵a̵b̵i̵l̵i̵t̵i̵e̵s̵ ̵f̵r̵o̵n̵t̵i̵er̵ 🪦

Anthropic @AnthropicAI

2 months ago

Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.

558 2K 10K 3.7M 2K

18 10 201 55K 37

Fred Zhang @FredZhang0

2 months ago

Beating prediction markets with chatbots sounds cool. In a recent work arxiv.org/abs/2402.18563, we get somewhat close to that. As another perspective, forecasting is a great capability domain to benchmark LM reasoning, calibration, pre-training knowledge, and more. 🧵1/n