It's very difficult to improve the *exponent* in scaling laws for loss vs compute, especially by changing the optimizer!
Our new paper shows that scaling momentum correctly can *provably* improve the scaling exponent on a theoretical model. Empirically, it works on LSTMs too!