My first manager at Uber started a GitHub page back then with resources for becoming a more proficient developer - ones he personally found helpful (he did not have a CS degree).
I realized he is *still* updating it, 7 years later! A neat list: github.com/charlax/profes…
(🧵) Today, we release Meta Code World Model (CWM), a 32-billion-parameter dense LLM that enables novel research on improving code generation through agentic reasoning and planning with world models.
ai.meta.com/research/publi…
(1/6) triton kernels are a great way to understand ML models. but tutorials are scattered
the learning method for me was just to read real, high performance code
so i wrote a blog post which walks through the design and intuitions behind FLA's softmax attention kernel
🧵also a thread
Here's the simplest explanation of @cline's agentic algorithm.
It's just a state machine that classifies every request with a tool call into 3 types:
1. Question tools (need clarification)
2. Action tools (gather context)
3. Completion tools (present results)
That's it.
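A minimal sketch of that three-way classification, assuming a scripted list of tool calls in place of a real model. The tool names (`ask_user`, `read_file`, `search_code`, `finish`) and the helper callbacks are hypothetical and chosen for illustration; this is not Cline's actual code or tool set.

```python
from enum import Enum


class ToolKind(Enum):
    QUESTION = "question"      # needs clarification from the user
    ACTION = "action"          # gathers context
    COMPLETION = "completion"  # presents results


# Hypothetical tool names, for illustration only.
TOOL_KINDS = {
    "ask_user": ToolKind.QUESTION,
    "read_file": ToolKind.ACTION,
    "search_code": ToolKind.ACTION,
    "finish": ToolKind.COMPLETION,
}


def classify(tool_name):
    """Map a tool call to one of the three types."""
    return TOOL_KINDS[tool_name]


def run_agent(tool_calls, answer_question, run_action):
    """Step through (name, payload) tool calls until a completion tool appears.

    Returns (result, context); result is None if no completion tool was called.
    """
    context = []
    for name, payload in tool_calls:
        kind = classify(name)
        if kind is ToolKind.QUESTION:
            # Pause for the user and record the answer.
            context.append(answer_question(payload))
        elif kind is ToolKind.ACTION:
            # Run the tool and record what it returned.
            context.append(run_action(name, payload))
        else:
            # Completion tool: present the result and stop.
            return payload, context
    return None, context
```

In a real agent the next tool call would come from the model after each step; here the sequence is fixed up front so the control flow stays easy to follow.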
.@neuralink does some crazy engineering on its way to performing brain surgeries. We wanted to document one of the more striking examples - This Is How Neuralink Builds A Human Head
Full video down below
🚨Training–inference mismatch in MoE RL? It gets even worse than we thought…
But no worries—just grab an "IcePoP"🧊 and chill😉!
Our new solution keeps MoE RL cool😎 & boosted🚀. Check it out!
📜Blog: ringtech.notion.site/icepop
Designing @NotebookLM was one of the most meaningful opportunities of my career. I finally found time to document the process.
Here’s a look behind the scenes:
📐 The mental model is anchored in the creation journey: Inputs → Chat → Outputs. This simple yet flexible flow gave…
I was lucky to work in both China and US LLM labs, and I've been thinking about this for a while. The current values of pretraining are indeed different:
US labs be like:
- lots of GPUs and much larger FLOPs runs
- Treating stability more seriously, and could not tolerate spikes…
Neel Nanda is leading a Google DeepMind research team at 26. He and I discuss:
• How that happened
• “If your safety work doesn't advance capabilities, it's probably bad safety work”
• Should people work at the safest or most reckless AI company?
• An AI PhD – with these…
DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL
Training web agents with data constructed using knowledge graphs (arxiv.org/abs/2507.02592).
Building a Docker-like Container From Scratch 🐳
Learn about the key Linux namespaces by assembling a tiny but realistic container using only stock Linux commands: unshare, mount, and pivot_root. No runtime magic and (almost) no cut corners.
labs.iximiuz.com/tutorials/cont…
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference”
We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to…
The next generation of 10B+ valuation product startups will be built by scaling training on in-house RL environments
We live in an abundance of capabilities and yet we only have two major AI products, ChatGPT and coding agents, and it deeply frustrates me
The current supply chain of…
Announcing EXO Gym: Simulate distributed training environments using just your laptop.
Previously, distributed training experiments required setting up complex multi-node clusters.
With EXO Gym, multiple virtual nodes are spawned within one device. 🧵
Introducing Asta—our bold initiative to accelerate science with trustworthy, capable agents, benchmarks, & developer resources that bring clarity to the landscape of scientific AI + agents. 🧵