Grok 4 (Thinking) achieves new SOTA on ARC-AGI-2 with 15.9%
This nearly doubles the previous commercial SOTA and tops the current Kaggle competition SOTA
I'm excited to announce what we have been working on for months. Announcing OpenThinker3, the strongest 7B reasoning model with open data. Also more than 1000 experiments on what works and what doesn't for post-training data curation.
Gemini 2.5 Pro is available to all Cursor users! You can enable the full 1M context window if you'd like.
We're curious to hear how you think it compares to Sonnet.
We've introduced a new text_editor tool in the Anthropic API. It's designed for apps where Claude works with text files.
With the new tool, Claude can make targeted edits to specific portions of text. This reduces token consumption and latency, all while increasing accuracy.
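To make the idea of a targeted edit concrete, here is a minimal local sketch in Python. It is not the Anthropic API or the text_editor tool itself: the function name `apply_targeted_edit` and its arguments are hypothetical, chosen only to illustrate why replacing one specific span can be cheaper than regenerating a whole file.

```python
def apply_targeted_edit(text: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old` with `new` (hypothetical helper).

    Refusing ambiguous or missing targets keeps the edit precise: only the
    small `old`/`new` pair has to be produced, not the whole document.
    """
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match for the edit target, found {count}")
    return text.replace(old, new, 1)


# Example: fix a one-line bug without rewriting the rest of the file.
document = "def add(a, b):\n    return a - b\n"
edited = apply_targeted_edit(document, "return a - b", "return a + b")
print(edited)
```

In this sketch the edit request carries only the span to change and its replacement, which is the intuition behind the lower token consumption described above.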
A very exciting day for open-source AI! We're releasing our biggest open source model yet -- OLMo 2 32B -- and it beats the latest GPT 3.5, GPT 4o mini, and leading open weight models like Qwen and Mistral. As usual, all data, weights, code, etc. are available.
For a long time,…
It's 2025 and most content is still written for humans instead of LLMs. 99.9% of attention is about to be LLM attention, not human attention.
E.g. 99% of libraries still have docs that basically render to some pretty .html static pages assuming a human will click through them.…
I shared a controversial take the other day at an event and I decided to write it down in a longer format: I’m afraid AI won't give us a "compressed 21st century".
The "compressed 21st century" comes from Dario's "Machines of Loving Grace" and if you haven’t read it, you probably…
Discovered a very interesting thing about DeepSeek-R1 and all reasoning models: The wrong answers are much longer while the correct answers are much shorter. Even on the same question, when we re-run the model, it sometimes produces a short (usually correct) answer or a wrong…
What if we had the data that DeepSeek-R1 was post-trained on?
We announce Open Thoughts, an effort to create such open reasoning datasets. Using our data, we trained Open Thinker 7B, an open data model with performance very close to DeepSeekR1-7B distill.
Personalized educational uses like this are one of the ways that capability advances in AI models will provide broad benefits. The idea that you can get a personalized tutor for any piece of information that knows you and knows how you learn best is going to be powerful! 🎉📚
Introducing Gemini 2.0 Flash Thinking, an experimental model that explicitly shows its thoughts.
Built on 2.0 Flash’s speed and performance, this model is trained to use thoughts to strengthen its reasoning.
And we see promising results when we increase inference time…
I've built 19 projects with Cursor AI without writing a single line of code myself.
But the truth is, Cursor is dumb if you don't add detailed docs around your project.
You need to build a strong <Context Boundary> around Cursor.
Here's what you can do to improve your Cursor workflow 🧵