.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled.
My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning.
And if we have continual learning, we don't need a special training…
who wants to train a custom agentic reasoning model (think Kimi v2) with tool calls to @honchodotdev in the dataset that literally tune your model towards your userbase?
What if I told you...
Softmax attention ≈ solving an entropy-regularized optimal transport problem?
Turns out, attention is just a transport plan with a negative dot product cost.
Full derivation below, visuals speak louder than words.
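A minimal sketch of that claim for a single query, under stated assumptions: the "transport problem" here is the one-sided version (only the row must sum to 1, with no constraint on how much mass each key receives), the cost is the negative dot product, and `eps` is the entropy weight, which plays the role of the softmax temperature. The toy query and keys and all function names are made up for illustration. The check compares the softmax row against random points on the simplex and against the known closed-form minimum, -eps * log(sum(exp(score / eps))).

```python
import math
import random

def softmax(scores, eps=1.0):
    # Numerically stable softmax with temperature eps.
    m = max(s / eps for s in scores)
    exps = [math.exp(s / eps - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def objective(p, scores, eps=1.0):
    # Transport cost with cost_j = -score_j, minus eps times the entropy of p.
    cost = sum(-s * pj for s, pj in zip(scores, p))
    neg_entropy = sum(pj * math.log(pj) for pj in p if pj > 0.0)
    return cost + eps * neg_entropy

def random_simplex_point(n, rng):
    # Any non-negative vector that sums to 1 is a feasible "plan" for one query.
    w = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]

# Toy data (illustrative only): one query and four keys in 3 dimensions.
query = [0.5, -1.0, 2.0]
keys = [[1.0, 0.0, 0.5], [0.2, -0.3, 1.0], [-1.0, 0.5, 0.0], [0.0, 1.0, 1.0]]
eps = 1.0

scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
attn = softmax(scores, eps)
best = objective(attn, scores, eps)

# Closed-form minimum of the regularized problem: -eps * logsumexp(scores / eps).
m = max(s / eps for s in scores)
closed_form = -eps * (m + math.log(sum(math.exp(s / eps - m) for s in scores)))

# No random feasible plan should beat the softmax row.
rng = random.Random(0)
others = [objective(random_simplex_point(len(scores), rng), scores, eps)
          for _ in range(1000)]
softmax_is_optimal = all(best <= o + 1e-12 for o in others)
```

In this reading, the usual 1/sqrt(d) scaling of attention scores corresponds to choosing eps = sqrt(d). Adding a second marginal constraint over the keys would turn this into the full entropic transport problem, which is typically solved with Sinkhorn iterations rather than a single softmax.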
*The Origins of Representation Manifolds in LLMs*
by Modell et al.
They study the presence of "interpretable features" in LLMs embedded as manifolds and how their geometry connects to the internal representations of the models.
arxiv.org/abs/2505.18235
For those of us raised on science fiction, having AI become part of everyday life feels like the breakthrough we’ve been expecting. Yet some critics insist that no matter what these models do, they're not truly intelligent—just clever simulacra.
but aren’t language and other “generated” constructs also part of what we consider nature?
perhaps the separation/divide between man and nature makes it difficult to understand what memory and intelligence embody…?
Fei-Fei Li (@drfeifei) on limitations of LLMs.
"There's no language out there in nature. You don't go out in nature and there's words written in the sky for you.. There is a 3D world that follows laws of physics."
Language is purely generated signal.
Emergent Hierarchical Reasoning in LLMs
The paper argues that RL improves LLM reasoning via an emergent two-phase hierarchy.
First, the model firms up low-level execution, then progress hinges on exploring high-level planning.
More on this interesting analysis:
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference”
We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to…
A 14B model just beat a 671B model on math reasoning.
Here’s how Microsoft’s rStar2-Agent achieves frontier math performance in 1 week of RL training by “thinking smarter, not longer.” 🧵
🚨 New paper, and big personal news! 🧵
First, I just published a new theory of emergence.
It traverses the dimension reductions of a system, treating each scale like a 2D slice of a 3D object, looking for what each adds causally and irreducibly.
arxiv.org/pdf/2503.13395
You can question Gorgias the sophist directly about the nature of the human condition! Super smooth work from @plastic_labs @honchodotdev. Monetize with x402 @CoinbaseDev
Fresh @honchodotdev + x402 (@CoinbaseDev) demo!
Penny for Your Thoughts is a demo where anyone can share their expertise and sell bits of it via micro-transaction.
Hop on, get interviewed, set your price, & collect payments!