John Thickstun @jwthickstun
Postdoc at Stanford. @StanfordCRFM @StanfordNLP @StanfordAILab Previous @uwcse @uw_wail Controllable Generative Models. AI for Music. johnthickstun.com
Menlo Park, CA
Joined February 2020
Tweets: 231
Followers: 1K
Following: 532
Likes: 843
Final Update: One more magnitude of testing Sophia. We're talking model sizes in the B's, tokens in the T's. Sophia once again wins out. For me at least this is clear evidence that Sophia may be a replacement for Adam even in large scale runs. https://t.co/1l8XKBswaU
Next Wed., April 10 join us for the @MIT_ide lunch seminar with guest, @MinaLee__ on "Writing with Language Models" at 12pm ET. 💻Anyone can join online: bit.ly/ideseminarvirt… 📍@MIT + @MIT_ide members join in-person: bit.ly/ideseminar410
🧑🔬LLMs for complex Chemistry reasoning!🧪 Interestingly, we found LLMs (GPT-4) have already encoded lots of ⚗️Chemistry knowledge. 🤔What is really missing is a structured process to elicit the right knowledge, and use the knowledge to perform grounded reasoning. A very…
Thrilled to announce the 2024 recipients of #KempnerInstitute Research Fellowships: Thomas Fel, Mikail Khona, Bingbin Liu, Isabel Papadimitriou, Noor Sajid, & Aaron Walsman! bit.ly/4aBQ6MS @Napoolar @KhonaMikail @BingbinL @isabelpapad @nsajidt @aaronwalsman
We’ve been using levanter for a number of our research projects, including training music models: x.com/jwthickstun/st…
Calling motivated students interested in pursuing MS/PhD in ML/AI, specifically privacy & generative AI! The research group I'm starting at @iitmadras has openings! Apply by *Mar 31* directly to @DSAI_IITM or @iitmcse at research.iitm.ac.in!
@dlwh has been leading the effort at @StanfordCRFM on developing levanter, a production-grade framework for training foundation models that is legible, scalable, and reproducible. github.com/stanford-crfm/… Here’s why you should try it out for training your next model:
I like to talk about Levanter’s performance, reproducibility, and scalability, but it’s also portable! So portable you can even switch from TPU to GPU in the middle of a run, and then switch back again! github.com/stanford-crfm/…
David (@dlwh) is an incredible friend and mentor. Highly recommend following his work — he not only dives deep into understanding *all* the parts of the systems he works with, but also cares about sharing these insights in a way that’s accessible. Levanter is just one example!
A Design Space for Intelligent and Interactive Writing Assistants #CHI2024 👩🏻✏️🤖 What writing assistants do you use? What else are out there and how do they differ? What do we need to consider when designing new writing assistants? 🔗 arxiv.org/abs/2403.14117 (1/6)
At @StanfordCRFM, we’ve used Levanter to help scale new techniques like: * Sophia: x.com/tengyuma/statu… * Backpacks: x.com/johnhewtt/stat… * Anticipatory Music Transformers: x.com/jwthickstun/st… (co-released today!)
We’re honored to announce our seed and Series A funding, and thankful to our partners and investors! With a mission of orchestrating the world’s compute capacity, making it universally accessible and useful, exciting things are on the horizon. Learn more about what's ahead:…
AI Music Generation by Stanford University 🎶 Anticipatory Music Transformer 🔮 Next Note Prediction 🎹 Multi-Track (i.e., Multiple Instruments) 📄 MIDI Format ✏️ Easy Music Modification ➕ Extend Music Subscribe: youtube.com/@MervinPraison YT: youtube.com/watch?v=iJToz5… @reach_vb…
Anticipatory Music Transformer by @StanfordCRFM 🎶 > A foundation model for symbolic music. > Supports generating accompaniments (enrich music) and infill (fill in musical details). > 780 Million parameters, trained for 800 Thousand steps. > Trained on Lakh, MetaMIDI and…
Built with Levanter!
Best open MIDI model out there to date
@StanfordCRFM and the @nvidia JAX Team have worked together to integrate TransformerEngine into our foundation model training framework, Levanter. The result? Levanter is now significantly faster on GPUs, with up to 50% more tokens per second! github.com/stanford-crfm/… @itsvadams
AK @_akhaliq
308K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Gabriel Ilharco @gabriel_ilharco
4K Followers 1K Following Building cool things @xAI. Prev. PhD at UW, Google AI
Jim Fan @DrJimFan
228K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
rishi @RishiBommasani
4K Followers 2K Following Stanford CS PhD @StanfordCRFM @StanfordNLP @StanfordAILab @StanfordHAI Advisers: @percyliang @jurafsky Previous: @CornellCIS @clairecardie #FoundationModels
Tim Dettmers @Tim_Dettmers
28K Followers 819 Following PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Akari Asai @AkariAsai
11K Followers 650 Following Ph.D. student @uwcse & @uwnlp. NLP. IBM Ph.D. fellow (2022-2023). Meta student researcher (2023-) . ☕️ 🐕 🏃♀️🧗♀️🍳
Ofir Press @OfirPress
9K Followers 3K Following I build tough benchmarks for LMs and then I get the LMs to solve them. Postdoc @Princeton. PhD from @nlpnoah @UW. Ex-visiting researcher @MetaAI & @MosaicML.
Chris Donahue @chrisdonahuey
5K Followers 1K Following Generative models, musical expression for all. Assistant professor at CMU CSD. Part time research at Google Magenta (views my own)
Talia Ringer 🟣 .. @TaliaRinger
25K Followers 6K Following Professor, @plfmse, @IllinoisCS! Proof Automation. @SigplanM & CCF Founder. Israeli-American for peace, equality, & justice. They/היא, ND, bi. די לכיבוש
Allen School @uwcse
10K Followers 3K Following The Paul G. Allen School of Computer Science & Engineering educates tomorrow's innovators while developing solutions to humanity's greatest challenges.
Christian Steinmetz @csteinmetz1
5K Followers 2K Following AI for audio • PhD Student @c4dm MSc @mtg_upf • Previously Intern @Adobe @Meta @Dolby
Sander Dieleman @sedielem
50K Followers 2K Following Research Scientist at Google DeepMind. I tweet about deep learning (research + software), music, generative models (personal account).
Pedro Sarmento @umpedronosapato
2K Followers 2K Following PhD researcher in AI & Music @CDT_AI_Music @c4dm @QMUL a bit more at https://t.co/Ame4OSwQjq
Ananya Kumar @ananyaku
4K Followers 469 Following Researcher at @openai Previously PhD at Stanford University (@StanfordAILab) advised by Percy Liang and Tengyu Ma
Mina Lee @MinaLee__
3K Followers 452 Following Postdoc at @MSFTResearch | Assistant Professor at @UChicagoCS (2024) | PhD at @Stanford | Language models, AI-assisted writing, Human-AI interaction ✍️
Zachary Lipton @zacharylipton
59K Followers 2K Following Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Michi Yasunaga @michiyasunaga
3K Followers 843 Following CS PhD @Stanford working on language models and multimodal models. Previously @GoogleDeepMind @Meta @Yale
Stella_Martine @martine_st59086
3 Followers 951 Following
Rishabh Singh @anonymous_r007
262 Followers 2K Following Get Organized — Stay Productive. Notion Creator & Productivity Hacker. Tweets from my journey there! 🚀. Build @hilbertquantum
Emoji_Queen @EmojiQueen79927
2 Followers 517 Following I swear in the name of God, don't miss an opportunity to earn 500-5000usdc every day. https://t.co/Sz0SE4L3IO
13_Emma @13Emma199066
1 Followers 344 Following
Nicholas Lourie @NickLourie
115 Followers 178 Following I build things. 🤖 Doing a PhD at @nyuniversity (@CILVRatNYU) on better empirical methods for deep learning and data science. Advised by @kchonyc and @hhexiy.
us_Kayla_ @UsKayla20562
2 Followers 963 Following
Vika @Vika18108150884
1 Followers 912 Following
sayed_khan84 @Khan84Saye32976
9 Followers 908 Following
Thomas Tränkler @ttraenkler
497 Followers 3K Following Founder & CEO of @loopdive. Senior Software Engineer, Web, Cloud, ML & Cognitive Neuroscience
Thassacki @thassacki66144
1 Followers 353 Following
christopherbare @christopherbare
394 Followers 2K Following Software engineering | machine learning | NLP | data | health | life-science
subramanyam sahoo @iamwsubramanyam
46 Followers 820 Following If Elon is saying 1+1 = 11 then believe it. 1+1 = 2 might be wrong in some other simulations.
Jitendra Sharma @jkumarsharma998
763 Followers 6K Following Curious about Research in AI. NLP and Computer Vision Interest me. Curious about truth and existence. Views are personal.
Aurora @Aurora793266
7 Followers 1K Following
hein min oo @Hein55030Min
172 Followers 722 Following
uxatisb78spm @uxatisb78spm
27 Followers 274 Following
SANI HALADU @SANIHALADU24751
0 Followers 22 Following
Sparkle @Sparkle121681
2 Followers 914 Following
Zixun Nicolas Guo @nicolasguozixun
64 Followers 89 Following PhD Researcher in AI and Music @CDT_AI_Music @C4DM @QMUL in AI Music
FilmF_ataleFreya @FilmfA21144
9 Followers 917 Following
Maxime Peyrard @peyrardMax
213 Followers 279 Following Junior Professor @CNRS (previously @EPFL, @TUDarmstadt) -- AI Interpretability, causality, and interaction flows between LLM, humans, and tools
Bluebell @Bluebell1430360
1 Followers 877 Following
Yves Robert @YvesFendt5719
132 Followers 842 Following
us_Sydney_ @SydneyUs37356
2 Followers 858 Following
Burny — Effective O.. @burny_tech
14K Followers 5K Following Transhuman engineer in singularity! Lover of AI and omnidisciplionary metamathemagics! Hypercuriousia! Omniperspectivity! Freedom, growth, flourishing for all!
大九_LN (DA JIU/Dai.. @lnine_chiu
53 Followers 152 Following 写点歌词 | 胃口很好 🍰 中文 | English 📒 Music project - Treasure Traveller 秘宝旅人 ♪ https://t.co/UAboWpkdxt
Arbana Kadriu @arbanak
4 Followers 185 Following
Tran Bao Chi @TranBaoChi7
28 Followers 458 Following Undergrad #DSAI #HUST Research Intern #NLP #VinAI
Cu | 无水醋酸铜 .. @TeenagingCu
140 Followers 315 Following Research scientist in Auditory neuroscience and Hearing science 👂🧠 | 🎶 Music producer | 🎧 Game audio | 中文 English 日本語
Joe Munday @JoeM315
30 Followers 545 Following
YYF @drdfla
146 Followers 81 Following
Oli Larkin @olilarkin
2K Followers 1K Following Software engineer at Ableton, lead developer of iPlug2 framework. Maker of VirtualCZ and Endless Series plug-ins.
Tiziana Ligorio @tligorio
29 Followers 194 Following
Apisov 🇺🇦 @apisov_pavlo
202 Followers 1K Following Software Eng @WeHealthOrg MS in Sound and Music Computing @mtg_upf.
Ivan Rubachev @irubachev
83 Followers 332 Following ML Researcher @YandexResearch CS PhD student @CS_HSE I work on improving deep learning for tabular data
Giovanni Bindi @gvnbnd
30 Followers 307 Following PhD student in deep generative models for music - ACIDS team @ircam
V_59 @vigneshtamizh
119 Followers 3K Following LLMs, Model Parallelism, PipeLine Parallelism, AI Engineer, works on Generative AI. @ortist - Inspiration
Scott H. Hawley, drsc.. @drscotthawley
2K Followers 1K Following Physics prof & Senior Data Fellow @BelmontUniv, teacher of Audio Engineers. Mostly: ML for music producers
Yann LeCun @ylecun
708K Followers 716 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Natasha Jaques @natashajaques
25K Followers 1K Following Senior Research Scientist at @GoogleAI and Assistant Professor @uwcse. Social Reinforcement Learning in multi-agent and human-AI interactions. PhD from @MIT.
Soumith Chintala @soumithchintala
185K Followers 871 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Gabriel Ilharco @gabriel_ilharco
4K Followers 1K Following Building cool things @xAI. Prev. PhD at UW, Google AI
Kyunghyun Cho @kchonyc
60K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Jim Fan @DrJimFan
228K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
Sam Bowman @sleepinyourhat
35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
Sasha Rush @srush_nlp
51K Followers 464 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGz
rishi @RishiBommasani
4K Followers 2K Following Stanford CS PhD @StanfordCRFM @StanfordNLP @StanfordAILab @StanfordHAI Advisers: @percyliang @jurafsky Previous: @CornellCIS @clairecardie #FoundationModels
Yoav Artzi @yoavartzi
13K Followers 163 Following Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry7yHs / https://t.co/3VmRSyYm2d / asso. faculty director @arxiv / building https://t.co/f9QkzO5kaC
Tim Dettmers @Tim_Dettmers
28K Followers 819 Following PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Colin Raffel @colinraffel
30K Followers 655 Following nonbayesian parameterics, sweet lessons, and random birds. Friend of @srush_nlp
Akari Asai @AkariAsai
11K Followers 650 Following Ph.D. student @uwcse & @uwnlp. NLP. IBM Ph.D. fellow (2022-2023). Meta student researcher (2023-) . ☕️ 🐕 🏃♀️🧗♀️🍳
Christopher Manning @chrmanning
126K Followers 114 Following Director, @StanfordAILab. Assoc. Director, @StanfordHAI. Founder, @stanfordnlp. Prof. CS & Linguistics, @Stanford. IP @aixventureshq. 🇦🇺 Do #NLProc & #AI. 👋
Jacob Andreas @jacobandreas
13K Followers 955 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
Ofir Press @OfirPress
9K Followers 3K Following I build tough benchmarks for LMs and then I get the LMs to solve them. Postdoc @Princeton. PhD from @nlpnoah @UW. Ex-visiting researcher @MetaAI & @MosaicML.
Clément Canonne @ccanonne_
31K Followers 926 Following Senior Lecturer @Sydney_Uni. Postdocs @IBMResearch, @Stanford; PhD @Columbia. Converts ☕ into puns: sometimes theorems. He/him. @[email protected]
Kelvin Guu @kelvin_guu
3K Followers 333 Following Senior staff research scientist @ Google DeepMind leading cross-functional teams of 40+ (research/eng/PM/UI/UX), turning our SOTA research into new AI products.
Jared Quincy Davis @jaredq_
609 Followers 307 Following Founder and CEO, Foundry. @mlfoundry Orchestrating Compute. Fmr Research Scientist @DeepMind, Deep Learning Team. CS PhD @Stanford. ML, Distributed Systems
Vaibhav (VB) Srivasta.. @reach_vb
11K Followers 169 Following GPU poor @Huggingface | F1 fan | Here for @at_sofdog’s wisdom | *opinions my own
Tri Dao @tri_dao
18K Followers 360 Following Incoming Asst. Prof @PrincetonCS, Chief Scientist @togethercompute. Machine learning & systems.
Cornell Computer Scie.. @Cornell_CS
2K Followers 71 Following Founded in 1965, as one of the first of its kind – the department of computer science in @CornellCIS explores theory, programming, languages, AI, and more.
Clément Moulin-Frier @Clement_MF_
698 Followers 2K Following Researcher @FlowersINRIA. PhD, Engineering of Cognition, Interaction, Learning and Creation. Formerly @Cogitai @SPECS_lab @cdf1530 @GipsaLab.
Abhay Agarwal @Denizen_Kane
316 Followers 397 Following CEO of @with_poly, powering design with AI. Funded by YC (S22), Figma, Bloomberg Beta, Felicis. Lecturer @UTAustin. Prev @MSFTResearch, @stanforddschool
Katyanna Quach @katyanna_q
2K Followers 819 Following Tech reporter @semafor, interested in AI and science 🤖 | previously @theregister
Apache TVM @ApacheTVM
3K Followers 945 Following Open deep learning compiler stack for CPUs, GPUs and specialized accelerators. Join us for the TVM and Deep Learning Compilation Conference https://t.co/i6MTbWYt87
Thomas Steinke @shortstein
8K Followers 448 Following Computer scientist interested in (differential) privacy & related topics, e.g., generalization. @GoogleDeepMind Opinions are mine ©. 🇳🇿
Kartik Sreenivasan @KartikSreeni
714 Followers 528 Following Graduate Student @ UW-Madison. Research scientist intern at MosaicML/Databricks. Interested in LLMs, optimization, and the meaning of life.
AI Music Report @aimusicreport
103 Followers 208 Following Welcome to AI Music Report – your news hub for all things #AI in #music. Follow us for the talk, trends, tools, and talents reshaping music through #technology.
Ramya Vinayak @ramyavinayak
13 Followers 30 Following Assistant Professor at UW-Madison. Working on Machine Learning, Statistical Inference and Crowdsourcing.
Somesh Jha @jhasomesh
4K Followers 884 Following Professor of Computer Science and a music lover. Interested in formal methods, security, and Trustworthy ML. Oh yes, and classical music and jazz.
Terra Blevins @TerraBlvns
493 Followers 419 Following Grad student researching NLP at the University of Washington. she/her.
UCSD CSE @ucsd_cse
3K Followers 393 Following Official Twitter page of the UC San Diego Department of Computer Science and Engineering
Ryan Louie (@ryanclou.. @RyanCLouie
429 Followers 624 Following Postdoc @ Stanford CS. Previously @Northwestern @OlinCollege. 🛠 intelligent interactive systems + 🔍 their impact on how we play/create/socially-interact
Omar Khattab @lateinteraction
11K Followers 2K Following CS PhD candidate @StanfordNLP. 2022 Apple Scholar in AI/ML. Author of ColBERT (https://t.co/2ZtgXoa1np), DSPy (https://t.co/BH7WmMKDXR), & various retrieval & LM systems.
Almost Sure @Almost_Sure
5K Followers 194 Following George Lowther, Author of Almost Sure blog, on maths, probability and stochastic calculus. @[email protected] @almostsure.bsky.social
Ge Wang @gewang
6K Followers 2K Following Stanford professor (music, CS, design); co-founder of Smule; author of https://t.co/F9gAQlO6ZX. I make stuff with computers (for humans) to make music.
Michael Saxon @m2saxon
2K Followers 1K Following CS PhD cand @ucsbNLP 🌊🌴 @NSF GRFP 🧐analyzing semantics in generative lang/img AI models🤖 Big tech ex-intern. BS/MS @ASU 🌵🏜 🔜 @AMD opensrc GenAI RS intern
Edward Grefenstette @egrefen
36K Followers 773 Following FR/US/GB AI/ML Person, Director of Research at @GoogleDeepMind, Honorary Professor at @UCL_DARK, @ELLISforEurope Fellow. All posts are personal.
"nicole" @ninklefitz
1K Followers 517 Following master of decorum @alpacaml. prev: @MicrosoftResearch, @MosaicML, @Mila_Quebec
Zack Ankner @ZackAnkner
485 Followers 304 Following Junior @MIT. President of AI@MIT. Research Scientist Intern @MosaicML. A(CL)verage Embargo enjoyer.
Mona Diab @MonaDiab77
1K Followers 686 Following Director of LTI, CMU. ACL Fellow. I am passionate about language, mind, responsible technologies, technology/society, history, politics, nutrition!
Peter Henderson @PeterHndrsn
2K Followers 892 Following Assistant Professor @PrincetonCS @PrincetonSPIA and @PrincetonCITP 📚JD/PhD (Law+AI) @Stanford
Qosmo — AI Creativi.. @qosmo_inc
3K Followers 111 Following We are a Tokyo-based collective. Celebrating our 15th anniversary this year! Sharing our latest projects #news #art #consulting #products
Shital Shah @sytelus
10K Followers 8K Following Deep learning research and code. If universe is an optimizer, what is the loss function? All opinions are my own.
Rohith Kuditipudi @rckpudi
252 Followers 116 Following PhD student @StanfordAILab advised by John Duchi and @percyliang
Ethan Epperly @ethanepperly
1K Followers 468 Following PhD candidate in applied math @Caltech and @doecsgf fellow interested in computational linear algebra @[email protected] he/him
Roopsha Samanta @roopshasamanta
5K Followers 563 Following Ex-academic. Untenured, unapologetic, undaunted.
Brown Data Science In.. @Brown_DSI
2K Followers 359 Following The mission of the Data Science Institute at Brown is to stimulate innovation and support people aspiring to improve lives in our data-driven world 🐻
Jelani Nelson @minilek
22K Followers 184 Following Professor @Berkeley_EECS. Research Scientist (part-time) @GoogleAI. Founder @addiscoder. 🇻🇮🇺🇸🇪🇹
Adam Coates @adampaulcoates
32K Followers 292 Following Has AI made the world better yet? Let's get on that. Director at Apple. Fmr Stanford PhD, Director Baidu SVAIL, @khoslaventures. #deeplearning #HPC #AI
Stanford HAI @StanfordHAI
86K Followers 554 Following The official account of the @Stanford Institute for Human-Centered AI, advancing AI research, education, policy, and practice to improve the human condition.
Ian Covert @ianccovert
277 Followers 137 Following Postdoc @Stanford, previously @uwcse @GoogleAI and @Columbia. Interested in deep learning and explainable AI
Julian Lenz @JLenzyy
602 Followers 751 Following Audio AI research engineer w/ Lemonaide. prev. Neutone, Okio. MSc in Audio Computation at UPF. Opinions are a biased interpolation of my training dataset.
Huge moment for open source!
The upcoming Llama-3-400B+ will mark the watershed moment that the community gains open-weight access to a GPT-4-class model. It will change the calculus for many research efforts and grassroot startups. I pulled the numbers on Claude 3 Opus, GPT-4-2024-04-09, and Gemini.…
Today Meta released Llama 3! Congrats to the team. In their blog post they wrote that, "the curation of a large, high-quality training dataset is paramount", while providing almost no information about how it was made, how it was filtered, or its contents.
Update: As promised, one order of magnitude more compute testing AdamW vs. Sophia. This time applied to two different transformer architectures. Sophia is clearly the winner again. Will run one more ablation with another order of magnitude more compute to see if trend holds.
If you have Long Covid, and you are seeing what I am seeing—a world that does not seem to care—please know that I care, I know you are going through a lot, please be gentle with yourself ❤️
Feeling more motivated than ever to demonstrate how AI can be a creative tool for musicians that rewards skill, while providing an on-ramp for the next generation of musicians to innovate. Talk is cheap!
My first grant as a PI post-PhD (: . Excited to finally be able to talk about the paper publicly. Starting to feel like I can do this professor thing...
No replies here. Decided to try out on our own benchmarks, consisting of an auto-regressive, multi-modal pre-training at scale. Pretty complex setting. Yellow: Tuned (LR) AdamW Purple: Tuned (LR) Sophia Average Loss:
🚀Announcing StructChem: A simple yet effective prompting strategy, unlocking the power of LLMs for complex chemistry reasoning. This task requires: - Extensive domain knowledge - Precise scientific computing - Compositional step-by-step reasoning Paper: arxiv.org/abs/2311.09656…
This was a fun project! If you could train an LLM over text arithmetically compressed using a smaller LLM as a probabilistic model of text, it would be really good. Text would be represented with far fewer tokens, and inference would be way faster and cheaper. The hard part is…
Ever wonder why we don’t train LLMs over highly compressed text? Turns out it’s hard to make it work. Check out our paper for some progress that we’re hoping others can build on. arxiv.org/abs/2404.03626 With @blester125, @hoonkp, @alemi, Jeffrey Pennington, @ada_rob, @jaschasd
Google presents Training LLMs over Neurally Compressed Text - Outperforms byte-level baselines by a wide margin - Worse PPL than subword tokenizers but the benefit of shorter sequence lengths arxiv.org/abs/2404.03626
We’re releasing an updated version of the Anticipatory Music Transformer! A 780M parameter model, trained on a larger corpus of music: Lakh + MetaMIDI + transcripts of audio. It's the blue curve at the bottom of this plot! 📉 huggingface.co/stanford-crfm/… 🧵👇 (1/3)
I'm excited to share that the journal version of our paper, "An archival perspective on pretraining data", is now available (open access) from Patterns! This project was led by @MeeraDesai18, along with @IrenePasquetto, @az_jacobs, and myself 1/n
Perspective: "standard" mixture models are 'just' mixtures for which the mixing measure is finite and discrete. Even if you'll only ever use such mixtures, it can sometimes be conceptually useful to think in terms of this bigger picture.
Our MS/PhD Applications are now open! Apply before 🗓️31st March! #PhDposition #MS #Research - for more info on our research areas, see dsai.iitm.ac.in/research/resea… - We have a number of exciting research centres @WSAI_IITM @cerai_iitm @IBSE_IITM @ai4bharat
FP8 makes a huge difference! With this, Levanter has pretty much all the bells and whistles you need for training foundation models.
FP8 support has landed in Levanter (and Haliax)! On H100, you can now get a >40%(!) throughput improvement by flipping a flag. Just add `trainer.fp8: true` to your config and you’re good to go! H100s not included. github.com/stanford-crfm/…