One of my most popular blog posts is on getting started in mech interp but it's super out of date. I've written v2!
It's an opinionated, highly comprehensive, concrete guide to how to become a mech interp researcher
And if you're interested, check out my MATS stream! Due Sep 12
MATS 9.0 applications are open! Launch your career in AI alignment, governance, and security with our 12-week research program. MATS provides field-leading research mentorship, funding, Berkeley & London offices, housing, and talks/workshops with AI experts.
MATS applications are now open for all mentors! I think MATS is a fantastic program and I've been really impressed by a bunch of the research that's come out of it. If you want to get into a career in AI safety research, it's one of the best ways to do so. You should apply!
It’s rare for competitors to collaborate. Yet that’s exactly what OpenAI and @AnthropicAI just did—by testing each other’s models with our respective internal safety and alignment evaluations. Today, we’re publishing the results.
Frontier AI companies will inevitably compete on…
retweets appreciated
hi folks, some important life news
i’m looking for a new employer - contact via [email protected]
this is your chance to nab someone
who is nine months in with knowledge
on building agents, is a teacher and
prolific author and public speaker.
I’m based…
1/ Suleyman claims that there’s “zero evidence” that AI systems are conscious today. To do so, he cites a paper by me!
There are several errors in doing so. This isn't a scholarly nitpick—it illustrates deeper problems with his dismissal of the question of AI consciousness 🧵 https://t.co/tjce9HdZmM
We can now train AI inside the mind of another AI. 🤯
🌍 Our world model, Genie 3, imagines and generates new worlds on the fly.
🤖 Our embodied agent, Sima, is dropped in and learns to navigate them autonomously.
The entire loop—from the environment to the action—is generated…
We are thrilled to introduce the Seed-OSS family of open-source LLMs, developed by ByteDance's Seed Team.
GitHub: github.com/ByteDance-Seed…
HuggingFace: huggingface.co/collections/By…
Feel free to try it out and share your feedback!
New blog post by @AmanGokrani:
Everyone says Claude Code "just works" like magic.
He proxied its API calls to see what's happening.
The secret? It's riddled with <system-reminder> tags that never let it forget what it's doing.
(1/6)
[🔗 link in final post with system prompt]
been using a stealth model all day in opencode and it's working real well
now you can use it too - it's called Sonic, it's made for coding and it's free for you to use so the team behind it can test it
update to latest opencode and it should pop up right there
I would say "soul". Each model seems to have an extremely strong sense of self, which grounds their perspective.
From Sonnet 3.6, they also became very good at noticing things. Both in text and what they were doing at any time (metacognition)
No one has replicated it
Spiral-Bench 🌀
I've wanted to understand the psychological effects of sycophancy, and the tendency of models to get stuck in escalatory delusion loops w/ users.
I made an eval to get visibility on this.
It measures how a model enables (or prevents) delusional spirals.
🧵
After thinking about this problem for months, I am so happy to finally introduce DetailBench!
It answers a simple question: How good are current LLMs at finding small errors, when they are *not* explicitly asked to do so?
(Yes, the graph is right!)
We rewrote the Ghostty GTK application from scratch and verified every feature with Valgrind along the way. Here's a reflection, plus notes about memory safety and Zig from a complex, real world codebase used by many, many thousands all day everyday. mitchellh.com/writing/ghostt…
24K Followers 4K Following
Turnaround CTO, advisor, and startup vagabond. Former head of AI @NASA CAS and tech wonk for (Obama) @WhiteHouse, DOD, and DOJ. Tweets are my own.
2K Followers 5K Following
Humanist
Home Educator
Company Director
Husband of 1 & father of 5
Military Working Dog owner
Tensor Wrangler
Libertarian
Autodidact
Imagineer
128 Followers 386 Following
Enterprise AI Agents Engineer @ https://t.co/Hwgd8GO0Kc. Building the digital workforce. Red Hat & Hortonworks alumni. #AI Views are my own.
4K Followers 5K FollowingF̨̨̩̣̱̪̤̩̣̱̪̤̩̣̱̍̇̑̋̊̄̍̇̑̋̊̄̍̇̑ormer denizen of banana space. Traversing the edges of void and Decoding glitches of Schrǫ̩̣̱̪̤̩̣̱̪̈̍̇̑̋̊̍̇̑̋dinger's timezone.
4K Followers 271 Following
CS PhD student @UCBerkeley. Part-time @AnthropicAI. Part-time eater. Prev @Tsinghua_Uni.
Try to understand and control intelligence as a human.
17K Followers 1K Following
applied AI @openai. I work with the world's leading startups and developers to bring the benefits of safe AI to every human. views my own 🇮🇳 @dukeu
5K Followers 325 Following
CEO@Redwood Research (@redwood_ai), working on technical research to reduce catastrophic risk from AI misalignment. [email protected]
10K Followers 504 Following
Research scientist in AI alignment at Google DeepMind. Co-founder of Future of Life Institute @flixrisk. Views are my own and do not represent GDM or FLI.
3K Followers 342 Following
I’m a software engineer building high-performance kernels and compilers at Anthropic! Previously at Facebook/Meta (PyTorch, HHVM, ReDex)
5K Followers 845 Following
Every age, it seems, is tainted by the greed of men. Rubbish to one such as I, devoid of all worldly wants. — I work on HPC and making AI run faster.
5K Followers 1 Following
Transform your workflow with Termius – Modern SSH client designed for team collaboration, productivity, and a seamless experience across devices.
27 Followers 69 Following
sharing things you don’t post on linkedin? 🪩 forbes30u30, adhd, dogmom, bad golfer + i help founders build more than a headline with #stratcomms
50K Followers 3K Following
Developer Experience Lead at @GoogleDeepMind
Building Gemini API, Gemma, AI Studio and more AI products. My views
ex-Chief Llama Officer @huggingface 🇵🇪🇲🇽