New in-depth blog post time: "Inside NVIDIA GPUs: Anatomy of high performance matmul kernels". If you want to deeply understand how one writes state-of-the-art matmul kernels in CUDA, read along.
(Remember matmul is the single most important operation that transformers execute…
content that has been ruthlessly optimized to attack someone else’s brain is going to increasingly look like unwatchable gibberish to you but that doesn’t mean your content isn’t waiting in the wings. your hole will be made for you
more on this
when you launch a cuda kernel, you are not running a function per se like we do in c++. you are handing an abstract specification of parallelism, often in an intermediate form called ptx, to the nvidia driver. the driver acts as a final stage, just in time…
i'm not gonna lie. python ast stuff is not that bad. it's very usable. especially if you defer the actual IR generation to a systems language (with great pattern matching might I add!) like Rust
231 Followers 145 Following
Software Engineer, Networking and Telecom, ebpf & dpdk;
Boxer🥊 professional in life.
People, often deceived by an illusive good, desire their own ruin.
5K Followers 3K Following
currently doing things at Mintlify, prev. built a search API (trieve acq. YCW24), sideprojecting a new Patreon at https://t.co/MTSczbZEku, progression fantasy and HN enjoyer
174 Followers 305 Following
wrapper of llms, materialist, truster of trust, lib, endogeneity enjoyer, eats corn like an algebraist
joined in 2025 for some reason
tweets are what they are
5K Followers 7K Following
Founder & AI wrangler at https://t.co/b4R1fyiCVP & https://t.co/7Vt8cKVayt. Ex data lead @HelpScout, engineer @Automattic, captain @USAirForce, cadet @AF_Academy.
639 Followers 604 Following
I prove and verify things at @NethermindEth | rust, haskell | compilers and hardware aficionado | ex-category theorist | opinions are mine
3K Followers 3K Following
🇨🇦🪖🦫🏫📖📈🫡🟥👨🏻💻| Vir fortis via negativa | A curious guy with a boring past | Not an expert | 3rd go @ Twtr | 3 | Working on 🗺️🔭🏘️ | @snowstormnet
3K Followers 375 Following
I’m a software engineer building high-performance kernels and compilers at Anthropic! Previously at Facebook/Meta (PyTorch, HHVM, ReDex)
48K Followers 231 Following
Dysfunctional Programming account #1. Senior SWE at Bloomberg. I write C++ for money. ex-Haskell, ex-OCaml. All opinions are my own.
2K Followers 1K Following
☦️🇻🇦 Programmer, Uniate, prince of a thousand enemies; black metal byzcath; Rest in peace, Francis; astral terror https://t.co/JHtUNTIQQl dirty deeds, done cheap
14K Followers 3K Following
research @MIT_CSAIL @thinkymachines. work on scalable and principled algorithms in #LLM and #MLSys. in open-sourcing I trust 🐳. she/her/hers
8K Followers 9K Following
AI @amazon. AI infrastructure, RL, Agents, China.
Open Source enjoyer!
Pied piper to AI agents.
(Class + Scale) over Scale
Strictly my views...
766 Followers 277 Following
everything that can work, will work | unpacking papers, musings on mathematics, and meditations on craft from an anon founder
437 Followers 557 Following
BIOS keyboard interrupt
Meta Research Scientist (5+ yrs) | ML / CV / CG
Writing CUDA kernels for differentiable rendering
PhD CS • MS Aerospace Eng
Ex-gamedev