That works out to a little more compute than the M4 Pro’s GPU: 4*2.147 TFLOPS = 8.588 TFLOPS vs 8.52 TFLOPS. The A19 Pro is a freaking monster. Congrats to @awnihannun as MLX gets some serious compute on so many devices!
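A quick sanity check of that arithmetic (the 4× factor and the per-unit 2.147 TFLOPS figure are the tweet's numbers, not official specs):

```python
# Numbers as given in the tweet; the 4x grouping is the tweet's breakdown.
a19_pro_tflops = 4 * 2.147   # = 8.588 TFLOPS (A19 Pro GPU, per the tweet)
m4_pro_tflops = 8.52         # quoted figure for the M4 Pro GPU

print(f"A19 Pro: {a19_pro_tflops:.3f} TFLOPS vs. M4 Pro: {m4_pro_tflops:.2f} TFLOPS")
print(f"Delta: {100 * (a19_pro_tflops / m4_pro_tflops - 1):.1f}%")  # ~0.8% more
```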
#BestPaperAward #SIGGRAPH2025 One neural PDE model, hundreds of shapes, simulated at lightning speed. 🚀 Introducing Shape Space Spectra: the first eigenanalysis across shapes. Come see ChangYue's talk today 👉 changy1506.github.io
This is one of the craziest ideas I've ever seen. He converted a drawing of a bird into a spectrogram (PNG -> soundwave), then played it to a starling, which sang it back, reproducing the PNG.
Using the bird's brain as a hard drive with a 2 Mbps read/write speed.
youtube.com/watch?si=HMtVd…
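The PNG → soundwave direction is easy to sketch: treat each image column as one frame of a magnitude spectrogram and synthesize a sinusoid per frequency bin. Everything below (filename, sample rate, frequency band, column duration) is an illustrative assumption, not the method from the video:

```python
import numpy as np
from PIL import Image
from scipy.io import wavfile

# Load the drawing as grayscale in [0, 1]; flip so row 0 = lowest frequency.
img = np.asarray(Image.open("bird.png").convert("L"), dtype=np.float64) / 255.0
img = np.flipud(img)

sr, f_lo, f_hi, col_dur = 44100, 500.0, 8000.0, 0.02  # assumed parameters
n_bins, n_cols = img.shape
freqs = np.linspace(f_lo, f_hi, n_bins)               # one sinusoid per row
t = np.arange(int(sr * col_dur)) / sr

# For each column, mix sinusoids weighted by pixel brightness.
# (Phases reset per column, which clicks a bit -- fine for a sketch.)
audio = np.concatenate([
    (img[:, c, None] * np.sin(2 * np.pi * freqs[:, None] * t)).sum(axis=0)
    for c in range(n_cols)
])
audio /= np.abs(audio).max()                          # normalize to [-1, 1]
wavfile.write("bird.wav", sr, (audio * 32767).astype(np.int16))
```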
Huge computer science result:
A Tsinghua professor just discovered the fastest shortest-path algorithm for graphs in 40 years.
It improves on the O(m + n log n) bound that Turing Award winner Tarjan (with Fredman) achieved for Dijkstra's algorithm via Fibonacci heaps, something every computer science student learns in college.
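For context, the baseline being beaten: Dijkstra's algorithm, which students usually implement with a binary heap in O((m + n) log n); the O(m + n log n) bound comes from swapping in a Fibonacci heap. A minimal heapq version of the classic algorithm (the baseline, not the new result):

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest paths; adj[u] = list of (v, weight) pairs.
    With a binary heap this runs in O((m + n) log n); the O(m + n log n)
    bound in the tweet comes from using a Fibonacci heap instead."""
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                       # stale queue entry, skip
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# dijkstra({'a': [('b', 1), ('c', 4)], 'b': [('c', 1)]}, 'a')
# -> {'a': 0.0, 'b': 1.0, 'c': 2.0}
```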
The Case for Muon
1) We can descend 'faster' in non-Euclidean spaces
2) Adam/Shampoo/SOAP/etc. dynamically learn the preconditioner and, equivalently, the norm & space to descend in
3) Muon saves a lot of compute by simply letting the norm vary within a fixed range
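A minimal sketch of the update those points describe: Muon keeps SGD momentum per weight matrix but orthogonalizes it with a Newton-Schulz iteration, so the update's singular values sit near 1 (the "fixed range"). The iteration coefficients follow the commonly published Muon implementation; lr and beta here are illustrative:

```python
import torch

def newton_schulz(G, steps=5):
    """Approximately orthogonalize G (push its singular values toward 1)
    via a quintic Newton-Schulz iteration."""
    a, b, c = 3.4445, -4.7750, 2.0315   # published Muon coefficients
    X = G / (G.norm() + 1e-7)
    tall = G.shape[0] > G.shape[1]
    if tall:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if tall else X

def muon_step(weight, grad, momentum, lr=0.02, beta=0.95):
    """One Muon update for a 2-D weight: momentum, orthogonalize, step.
    No per-coordinate learned preconditioner as in Adam -- the norm of
    the update is pinned by the orthogonalization."""
    momentum.mul_(beta).add_(grad)
    weight.add_(newton_schulz(momentum), alpha=-lr)
```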
State space models have struggled to learn tasks like copying and associative recall 🟢 -- things that self-attention learns easily 🟠...
But it turns out we just needed to change SSM initialization a bit 🔵. Our init helps a lot, and even makes state space layers *look*…
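The two probe tasks named here are simple to generate. A minimal sketch of synthetic copying and associative-recall data (vocabulary size, lengths, and the -100 ignore-index convention are arbitrary choices, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, COPY_TOKEN = 16, 16           # tokens 0..15, plus a delimiter token

def copy_task(seq_len=32):
    """Input: sequence + delimiter + padding; target: reproduce the sequence."""
    seq = rng.integers(0, VOCAB, seq_len)
    x = np.concatenate([seq, [COPY_TOKEN], np.zeros(seq_len, int)])
    y = np.concatenate([np.full(seq_len + 1, -100), seq])  # -100 = ignored
    return x, y

def assoc_recall(n_pairs=8):
    """Input: interleaved key/value pairs, then a query key; target: its value."""
    keys = rng.permutation(VOCAB)[:n_pairs]
    vals = rng.integers(0, VOCAB, n_pairs)
    q = rng.integers(0, n_pairs)
    x = np.concatenate([np.stack([keys, vals], 1).ravel(), [keys[q]]])
    return x, vals[q]
```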
chipwise.tech/our-portfolio/…
We took apart the iPhone 16 & 16 Pro. This yielded clear die shots of the A18 & A18 Pro SoCs. Have a look at the article.
SAM & SAM-2 are great but depend on costly annotations.
Can we 'segment anything' without supervision?🤔
Yes! Check out UnSAM @NeurIPS24—an unsupervised segmenter that achieves SAM-level results! 🎉
Even better: UnSAM+ beats SAM with +6.7% AR & +3.9% AP, using just 1% of the labels! 💪
I just read up on RAFT.
Retrieval-augmented finetuning.
Here is how it works:
• generate q&a dataset
• gather documents for each q&a pair
• specify "golden" vs. "distractor" docs
• finetune LLM to pick "golden" doc
In RAFT, "golden" documents contain the answer to a q&a…
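A minimal sketch of how one RAFT-style training example could be assembled from those pieces. Field names, the k-distractor count, and p_golden are illustrative assumptions; the RAFT paper trains a fraction of examples with the golden doc withheld so the model also learns to cope with bad retrievals:

```python
import random

def build_raft_example(question, answer, golden_doc, distractor_docs,
                       k=3, p_golden=0.8):
    """Assemble one fine-tuning example: the question plus a shuffled
    context of k distractors, with the golden doc included with
    probability p_golden. Requires len(distractor_docs) >= k."""
    docs = random.sample(distractor_docs, k)
    if random.random() < p_golden:
        docs.append(golden_doc)
    random.shuffle(docs)
    context = "\n\n".join(f"[doc {i}] {d}" for i, d in enumerate(docs))
    return {"prompt": f"{context}\n\nQuestion: {question}\nAnswer:",
            "completion": f" {answer}"}
```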
This is a major medical breakthrough, almost the holy grail of stem cell therapy! What makes it personally thrilling is that the lead scientist of the study, Hongkui Deng, is a close friend; we worked together at @nyulangone NYU Medical School as postdocs. nature.com/articles/d4158…
1/n Introducing SOAP (ShampoO with Adam in the Preconditioner's eigenbasis): A deep learning optimization algorithm that applies Adam in Shampoo's eigenbasis. SOAP outperforms both AdamW and Shampoo in language model pretraining.
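A minimal sketch of that core idea for a single 2-D weight, heavily simplified versus the paper (real SOAP amortizes the eigendecompositions across steps and handles the basis change of Adam's moments; hyperparameters and state layout here are illustrative):

```python
import torch

def soap_step(W, G, state, lr=3e-3, betas=(0.9, 0.95),
              shampoo_beta=0.95, eps=1e-8):
    # Shampoo's two preconditioner factors: EMAs of G G^T and G^T G.
    state["L"].mul_(shampoo_beta).add_(G @ G.T, alpha=1 - shampoo_beta)
    state["R"].mul_(shampoo_beta).add_(G.T @ G, alpha=1 - shampoo_beta)
    # Eigenbases of the factors (the paper recomputes these only periodically).
    QL = torch.linalg.eigh(state["L"]).eigenvectors
    QR = torch.linalg.eigh(state["R"]).eigenvectors
    Gr = QL.T @ G @ QR                                   # rotate the gradient
    # Plain Adam moments, maintained in the rotated space.
    state["m"].mul_(betas[0]).add_(Gr, alpha=1 - betas[0])
    state["v"].mul_(betas[1]).addcmul_(Gr, Gr, value=1 - betas[1])
    update = state["m"] / (state["v"].sqrt() + eps)
    W.sub_(QL @ update @ QR.T, alpha=lr)                 # rotate back and step

# For W of shape (m, n), initialize: state["L"] = torch.zeros(m, m),
# state["R"] = torch.zeros(n, n), state["m"] and state["v"] = zeros like W.
```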
Founder and CEO of Prisma. If you are building for a global audience, you should give @prisma a try. DMs open - please reach out.
Co-founder and CTO of @CoreViewHQ. GenAI/LLM addicted, Apple MLX, Microsoft 365, Azure, Kubernetes, investor in innovation and Mensa member.
AI/ML engineering leader. Learning, exploring and reasoning about AI and the use and development of technology every day. Broad interests. Views are personal.
Director, @PrincetonPLI and Professor @PrincetonCS. Seeks math/conceptual understanding of deep learning and large AI models.
Also on the "other" social network
Passionately in love with Science, mostly Altruistic, Engineer, Amateur Astronomer & Critical thinker. Current Research focus: ▫️Mechanistic Interpretability▫️
physics of language models @ Meta (FAIR, not GenAI, not TBD)
🎓:Tsinghua Physics — MIT CSAIL — Princeton/IAS
🏅:IOI x 2 — ACM-ICPC — USACO — Codejam — math MCM
research @MIT_CSAIL @thinkymachines. work on scalable and principled algorithms in #LLM and #MLSys. in open-sourcing I trust 🐳. she/her/hers
Maintains https://t.co/VoCwlJ9Eq0 / https://t.co/bMI9arVwcR / https://t.co/2agmCPOZ2t. Wrote iOS app Snapchat (2014-2020). Founded Facebook Videos with others (2013). Sometimes writes at https://t.co/Gyt4J9Z9Tv
Helping teams build AI-augmented systems that actually work | AI & ML explorer | tech optimist | industry-to-tech translator | ex-EPCM PM
ML, systems, and everything in between. Building @guardrails_ai. Previously founding eng @predibase, @Apple SPG, @driveai_, @IllinoisCS, @iitdelhi.
ML/AI researcher & former stats professor turned LLM research engineer. Author of "Build a Large Language Model From Scratch" (https://t.co/O8LAAMRzzW).
I like tokens! I lead the OLMo data team at @allen_ai w/ @kylelostat. Open source is fun 🤖☕️🍕🏳️🌈 Opinions are sampled from my own stochastic parrot