if you’re a professor teaching about LLM RL this semester + considering doing any sort of hands-on lessons about RL environments/agentic RL, hit me up, would love to chat :)
this stuff is now at the accessibility level where students can easily play with it
AIs can have secret “backdoors.” Can we uncover them?
Maybe! We study models that misbehave in an unknown situation, and successfully reverse-engineer the situation in simple settings.
We weren't sure this was possible! We hope to inspire future work in more realistic settings.
honestly a pretty crazy journey coming from being a neuroscience student 2 years ago.. gonna build something to share my process and learnings with you all :)
New paper from @norabelrose and me.
We show how mech interp can be done on generic relu networks--a feat previously understood to be intractable. Rather than enumerate over polytopes we OLS regress on max entropy inputs, deriving guarantees on model perf.
arxiv.org/abs/2502.01032
One of the things that excites me the most about pursuing a PhD in the future is finally being qualified to mentor bright undergraduate students who want a career in research. There's something beautiful about helping someone grow into who they want to be.
✨I’m on the faculty job market for 2024-2025! ✨
My research focuses on advancing Responsible AI—enhancing factuality, robustness, and transparency in AI systems.
I’m at #EMNLP2024 this week🌴 and would love to chat about research and hear any advice!
📈New paper on implicit language and context!
She bought the largest pumpkin? - Largest pumpkin out of what? All pumpkins in the store? Out of all pumpkins bought by her friends? In the world?
Superlatives are (often) ambiguous and their interpretation is extremely context…
going down the rabbit hole of papers and blogs to inform myself, finding some rly cool stuff.
today gonna set up a notion for technical blogging in general + set up the repo and outline next steps :]
396 Followers 2K Following
HMC 2022 | https://t.co/a8gKmKdpLh | Pretengineer | Aspiring math professor | Everyone can learn math, given time, resources, and help | He/Him/His
277 Followers 557 Following
Hyper-synesthetic mind seeing fractals everywhere. Vet, tinkerer, and idea-chaser. explore impossible math, and sometimes accidentally find stuff.
5 Followers 117 Following
Philosophy, theology, technology, design, photography, and whatever else I feel like sharing. No theme, just things I find interesting. My personal dump account.
108 Followers 535 Following
#GreatSoros D 1⃣
I fell in love with Financial markets because 🆓 Market situation is Always good Driving force Of Pricing of commodities and Prior Indicator.
50K Followers 5K Following
Cofounder and Head of Post Training @NousResearch, prev @StabilityAI
Github: https://t.co/LZwHTUFwPq
HuggingFace: https://t.co/sN2FFU8PVE
6K Followers 435 Following
Assistant Professor @ University of Washington, Co-Director of RAIVN lab (https://t.co/f0BWKyjoeA), Director of PRIOR team (https://t.co/l9RzTesMSM)
45K Followers 1K Following
CTO at @Databricks and CS prof at @UCBerkeley. Working on data+AI, including @ApacheSpark, @DeltaLakeOSS, @MLflow, https://t.co/94gROE5Xa0. https://t.co/nmRYAKG0LZ
989 Followers 30 Following
Softmax's mission is to scale organic alignment. We approach this problem with multi-agent reinforcement learning population-based simulations.
2K Followers 95 Following
Inventors of the model router (https://t.co/7QzBIAWycp)
Understanding transformers by turning them into programs. 🤖
Our mission: https://t.co/A4VOldg6bI
971 Followers 561 Following
PhD student at the University of Washington. I blog about computer vision, robotics and artificial intelligence at: https://t.co/wvaUVuFcWG
4K Followers 419 Following
✨ asking sand to show its work @GoodfireAI // deep learning, math, biology // creating a more beautiful future // (opinions my own)