why do you think people suddenly have an enormous glow-up after a few days at vibecamp or casa tilo or jesscamp??? because they suddenly got a huge spike in encouragement from external sources. this is a return to normal healthy baseline. it is your daily nutritional dose!
After iterating hundreds of prompts to trigger blackmail in Claude, I was shocked to see these prompts elicit blackmail in every other frontier model too.
We identified two distinct factors that are each sufficient to cause agentic misalignment:
1. The developers and the agent…
it is 2025 and there are people who believe that accurately predicting the next token does not require understanding the underlying reality that created that token btw
Chess + AI + GPUs? Look no further! 👀
We are excited to be supporting our friends @StrongCompute for the GPU Chess hackathon!
Happening across Sydney and San Francisco this Friday! 🚀
Sign up here:
SF: lnkd.in/g8a4vBYz
Sydney: lnkd.in/gp2G6qVa
Happy building!
1K Followers 2K Following
posting dumb jokes & takes; if you see a post about math|stats|cs, it's out-of-sample. private alt: @dynamicedging
y=Xβ+ϵ, the rest is commentary
649K Followers 35 Following
We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.
1.4M Followers 1K Following
Building @EurekaLabsAI. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets.
9K Followers 638 Following
Malware artist, unicorn creator, wireless hacker. Working at @HPI_DE (ex @seemoolab). Opinions are my own. https://t.co/GbL7GINJBo / @[email protected]
9K Followers 21 Following
Advancing humanity's understanding of AI through interpretability research. Building the future of safe and powerful AI systems.
11K Followers 120 Following
AIs aren't people, they're tools we should use wisely.
Head of interpretability research at @AiEleuther, but tweets are my own views, not Eleuther's.
4K Followers 301 Following
walled surveilled compound manager. moonlighting as whorelord manager. Sentences starting with a lowercase letter are humor, sarcasm, exaggeration, or similar.
2K Followers 1K Following
Co-Executive Director @MATSprogram, Co-Founder @LondonSafeAI, Regrantor @Manifund | PhD in physics | Accelerate AI alignment + build a better future for all
4K Followers 1K Following
Living life authentically. Building a world full of deeper connection. Bringing strangers together for deep talks. Product Manager & Host of @theboardtalks SF.
3K Followers 587 Following
building the future of learning 🚀 combining AI & learning science to create super learners @junglelearning_ 🌴 | building @fdotinc 🛠️ | dj by night 🎧