Suchin Gururangan @ssgrn
he/him · Research scientist, 🦙 Llama team, @meta GenAI · PhD @uwcse + @uwnlp · suchin.io · SF x LA · Joined November 2011
Tweets: 939 · Followers: 4K · Following: 249 · Likes: 2K
Exciting update -- the full Llama-3 result is out, now reaching top-5 on the Arena leaderboard🔥 We've got stable enough CIs with over 12K votes. No question now: Llama-3 70B is the new king of open models. Its powerful 8B variant has also surpassed many larger-size models. What an…
Check out the generative vision related release too meta.ai/?icebreaker=im… Imagine Flash generates the image as you type You can also "Animate" your images! (technique based on Emu Video emu-video.metademolab.com) Kudos to the team for putting this out :)
Excited to share a preview of Llama3, including the release of an 8B and 70B (82 MMLU, should be the best open weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT4). Lots more still to come... ai.meta.com/blog/meta-llam…
A Mamba Primer (w/ Yair Schiff youtube.com/watch?v=dVH1dR… ) Mamba is a nice jumping off point to summarize foundational ideas in sequence modeling, parallel algorithms, continuous-time representations, and GPU aware algorithms. We try to put these together in the context of LMs.
Shoutout to @orevaahia et al who wrote a great paper that revealed this issue! arxiv.org/abs/2305.13707
I'm excited to share that the journal version of our paper, "An archival perspective on pretraining data", is now available (open access) from Patterns! This project was led by @MeeraDesai18, along with @IrenePasquetto, @az_jacobs, and myself 1/n
🚨 Hiring Alert🚨 The FAIR CodeGen team in Paris is looking for research engineers! Come join this super talented team, help release open models to the world, and push the frontiers of code generation research!
How do you know if a method is better, or just has better hyperparameters? @hhexiy, @kchonyc, and I give a new tool to answer this in our #NAACL2024 paper: "Show Your Work with Confidence" arxiv.org/abs/2311.09480. Use it in your own work with just a "pip install opda"! 🧵 1/8
sharing some highlights from our recent paper: language models scale reliably with over-training and on downstream tasks! arxiv: arxiv.org/abs/2403.08540 104 models, 11M to 7B parameters, varying numbers of tokens, 3 datasets, eval’d on 46 tasks: github.com/mlfoundations/… 1/11
Wanna know gpt-3.5-turbo's embed size? We find a way to extract info from LLM APIs and estimate gpt-3.5-turbo’s embed size to be 4096. With the same trick we also develop 25x faster logprob extraction, audits for LLM APIs, and more! 📄 arxiv.org/abs/2403.09539 Here’s how 1/🧵
Language models scale reliably with over-training and on downstream tasks Scaling laws are useful guides for developing language models, but there are still gaps between current scaling studies and how language models are ultimately trained and evaluated. For instance,
🚨 Introducing Branch-Train-miX (BTX) 🚨 BTX improves a generalist LLM on multiple fronts: - Train expert LLMs in parallel for new skills in domains such as math, code & world knowledge - Join (mix) them together & finetune as a Mixture-of-Experts arxiv.org/abs/2403.07816 🧵(1/4)
I'm hiring a PhD intern for the FAIR CodeGen (Code Llama) team. Do research on Code LLMs, execution feedback, evaluation, etc. Apply here: metacareers.com/jobs/170210647…
New paper on using gradient similarity search to select instruction tuning data! We have tricks to make the computation and search efficient, and show gradients from small models can identify useful instructions for larger models. Led by @xiamengzhou and @SadhikaMalladi!
Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Bluish colors correspond to hyperparameters for which training converges, reddish colors to hyperparameters for which training diverges.
We have open Ph.D. and post-doc positions for our multi-modal pre-training team at FAIR. We have lots of GPUs. DM me if you are interested.
Our team in FAIR labs (at Meta) is hiring researchers (RE, RS & PostDoc)! DM if interested. We work on the topics of Reasoning, Alignment and Memory/architectures (RAM). Recent work: Self-Rewarding LMs: arxiv.org/abs/2401.10020 Pairwise Cringe Loss: arxiv.org/abs/2312.16682…
(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K Following
Akari Asai @AkariAsai
11K Followers 650 Following Ph.D. student @uwcse & @uwnlp. NLP. IBM Ph.D. fellow (2022-2023). Meta student researcher (2023-). ☕️ 🐕 🏃♀️🧗♀️🍳
Delip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter || Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Ana Marasović @anmarasovic
4K Followers 604 Following Asst prof @UUtah · Ex @allen_ai @uwnlp postdoc @HD_NLP PhD · she/her 🇭🇷
Sam Bowman @sleepinyourhat
35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
Sasha Rush @srush_nlp
52K Followers 464 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGz
Kayo Yin @kayo_yin
8K Followers 556 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵
Kyunghyun Cho @kchonyc
61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Luca Soldaini 🎀 @soldni
6K Followers 1K Following I like tokens! Lead for OLMo data team at @allen_ai (makin Dolma 🍇), open source science fan, @QueerInAI organizer 🤖☕️🍕 they/them
Colin Raffel @colinraffel
30K Followers 654 Following nonbayesian parameterics, sweet lessons, and random birds. Friend of @srush_nlp
Tim Dettmers @Tim_Dettmers
29K Followers 820 Following PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Graham Neubig @gneubig
31K Followers 586 Following Associate professor at CMU, studying natural language processing and machine learning.
Danish Pruthi @danish037
7K Followers 628 Following Faculty at Indian Institute of Science, Bangalore. PhD from @LTIatCMU.
Allen Institute for A.. @allen_ai
54K Followers 361 Following AI for the Common Good. › Join us: https://t.co/DqTs1G4bGO › Get our newsletter: https://t.co/tvb1VpySfL
Shaily @shaily99
5K Followers 2K Following PhD @LTIatCMU Prev: @GoogleAI @MSFTResearch. Working on ethics and evaluation in #NLProc. Usually ranting, often about research & DEI. 📚 @readsndrants
Bill Yuchen Lin 🤖 @billyuchenlin
6K Followers 2K Following Research @allen_ai. I evaluate (multi-modal) LLMs, build agents, and study the science of LLMs. Previously: @GoogleAI & @MetaAI FAIR @nlp_usc
Marco Ciccone @mciccone_AI
703 Followers 341 Following @ELLISforEurope postdoc @PoliTOnews @ai_ucl Competitions co-chair @NeurIPSConf 2021, 2022, 2023 PhD @polimi • ex @NVIDIA @NNAISENSE
Daniel Hussey @dnahussey
342 Followers 827 Following I ❤️ engineering biology to improve health & environment + businesses that realize the impact of science. 🤘@UTDiscoveries 🦑@tandem_repeat 🌱@Valley_DAO
Crazy Universe @Crazy_Universe0
96 Followers 1K Following
Janhavee Shinde @SJanhavee
56 Followers 2K Following
Harshil Dadlani @harryshil926
14 Followers 29 Following
Ron Rocket @RonnyRumble
53 Followers 537 Following 🌍 Energy innovator | 🌱 Renewable energies | 🤖 AI & blockchain visionary | Shaper of the future #Sustainability #TechTrends
Anurag Mishra @anuragm75160136
112 Followers 801 Following Building Scalable AI Applications | Senior Data Scientist @ EY | CSE Btech @ NIT MN | Linkedin: https://t.co/pCmSV6FmOe
pengch fan @FanPengch
215 Followers 6K Following
Ge Gao @ggaonlp
55 Followers 27 Following
ดาษิณ @0k8jX198yZdo6
56 Followers 1K Following What kind of fate awaits us? Feel free to follow if you like; I will post contact details on my main page from time to time.
Zhaoyang Wang @wangwan83764204
302 Followers 4K Following CS PhD student at UoB in the United Kingdom. Research interests: Automated Machine Learning, Online Learning, and Reinforcement Learning 🏳️🌈
Harsh Pareek @harshhpareek
713 Followers 3K Following ML @prodigaltech, ex-(@Meta|@UTAustin|@iitbombay), 1/sqrt(2) (e/acc+AINotKillEveryone)
Pratyush Shukla @PratyushSh_
4 Followers 116 Following
Johan S. Obando 👍.. @johanobandoc
1K Followers 2K Following Graduate student @Mila_Quebec @UMontrealDIRO | RL/Deep Learning/AI | From Cali/Colombia to the world 🇨🇴 | #JuntosProsperamos⚡#TogetherWeThrive | 🌱🌎
Pranav Silimkhan @PranavSilimkhan
12 Followers 93 Following
liang @liang_cai
23 Followers 421 Following
Stefan Streichsbier @s_streichsbier
873 Followers 1K Following 🧑🏻🍳 fascinated by #AI's ability to bring ideas to life. Working on giving engineers security superpowers by combining #LLMs with #AppSec @Guardrailsio.
Akshat shaw @Akshatshaw47
3 Followers 102 Following Do better than your best. Interested in NLP and GenAI
Abdulrahman Tabaza @embed_dim
4 Followers 771 Following enjoyer of various vector spaces, encoders and modalities
Harsh Kohli @hkohli14
32 Followers 219 Following PhD student at The Ohio State University. Previously: Amazon, Microsoft, GeorgiaTech, BITS Pilani. Research interests: Language Models and all things NLP
Shamik Bose @BoseShamik
356 Followers 510 Following PhD, Senior Researcher XAI | Will talk at length about the harms and considerations for the current state of AI | Views my own | he/him
Sasha Doubov @sashadoubov
477 Followers 1K Following research scientist @mosaicml | ML research + skiing | studied @uoft & @uwaterloo
tongqg1 @tongqg1
1 Followers 14 Following
zirui @zirui3
40 Followers 942 Following
EO @EO84494235
63 Followers 1K Following
JHU CLSP @jhuclsp
5K Followers 662 Following Center for Language and Speech Processing at @JohnsHopkins #NLProc #MachineLearning #AI https://t.co/6IXR5OSiDY @[email protected]
Tianjun Zhang @tianjun_zhang
1K Followers 764 Following Project Lead of RAFT, Gorilla, Berkeley Function Calling Leaderboard, and member of LiveCodeBench, PhD student at Berkeley-AI-Research
maharshi @mrsiipa
188 Followers 222 Following learning deeply about life one gradient step at a time.
Krish Dasgupta @officialKrishD
874 Followers 4K Following Forever Learner | Building Reinforcement Learning Systems | Healthcare | Robots and Brains | Graph ML for Health
Felix Juefei Xu @felixudr
799 Followers 5K Following Research Scientist @AIatMeta (GenAI) | Passionate about Robust, Efficient, Multimodal, Generative AI | Adjunct Faculty @NYUniversity | PhD @CarnegieMellon
Vaidehi Patil @vaidehi_patil_
432 Followers 690 Following PhD student @unccs @uncnlp, advised by @mohitban47 | Undergrad @IITBombay | Prev: Intern @AdobeResearch @AmazonScience
Yifeng Ding @YifengDing_
220 Followers 572 Following Ph.D. student @IllinoisCS. Interested in Large Language Models for Code.
Chetan Dhembre @ichetandhembre
1K Followers 4K Following CTO, co-founder @getloconow, ex @unacademy, @crowdfire
alamgirqazi @alamgirqazi
144 Followers 807 Following AI Research at @uniofgalway. Previously senior engineer @digitalocean.
Taishi @Setuna7777_2
2K Followers 3K Following CS M1 at @tokyotech_jp advised by @rioyokota 未踏TG23 Research intern: @SakanaAILabs
Pete @epwalsh
51 Followers 88 Following Research Engineer at @allen_ai. Lead engineer for OLMo pretraining.
Xabi (@xezpeleta@mast.. @xezpeleta
607 Followers 2K Following Linux, Open Source, Hiking, Nature, Snow, Travel, Photography
Akari Asai @AkariAsai
11K Followers 650 Following Ph.D. student @uwcse & @uwnlp. NLP. IBM Ph.D. fellow (2022-2023). Meta student researcher (2023-). ☕️ 🐕 🏃♀️🧗♀️🍳
AI at Meta @AIatMeta
531K Followers 255 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Delip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter || Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Ana Marasović @anmarasovic
4K Followers 604 Following Asst prof @UUtah · Ex @allen_ai @uwnlp postdoc @HD_NLP PhD · she/her 🇭🇷
Sam Bowman @sleepinyourhat
35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
Sasha Rush @srush_nlp
52K Followers 464 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGz
Kayo Yin @kayo_yin
8K Followers 556 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵
Christopher Manning @chrmanning
126K Followers 115 Following Director, @StanfordAILab. Assoc. Director, @StanfordHAI. Founder, @stanfordnlp. Prof. CS & Linguistics, @Stanford. IP @aixventureshq. 🇦🇺 Do #NLProc & #AI. 👋
Kyunghyun Cho @kchonyc
61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Luca Soldaini 🎀 @soldni
6K Followers 1K Following I like tokens! Lead for OLMo data team at @allen_ai (makin Dolma 🍇), open source science fan, @QueerInAI organizer 🤖☕️🍕 they/them
Colin Raffel @colinraffel
30K Followers 654 Following nonbayesian parameterics, sweet lessons, and random birds. Friend of @srush_nlp
Tim Dettmers @Tim_Dettmers
29K Followers 820 Following PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Allen Institute for A.. @allen_ai
54K Followers 361 Following AI for the Common Good. › Join us: https://t.co/DqTs1G4bGO › Get our newsletter: https://t.co/tvb1VpySfL
MMitchell @mmitchell_ai
80K Followers 1K Following Interdisciplinary researcher focused on shaping AI towards long-term positive goals. ML & Ethics. Same content in the Sky, Threads, & the Prehistoric Elephant
Weijia Shi @WeijiaShi2
5K Followers 967 Following PhD student @uwcse @uwnlp | Visiting Researcher @MetaAI | Undergrad @CS_UCLA | https://t.co/eLBQmgkvym
Swabha Swayamdipta @swabhz
6K Followers 461 Following Assistant Prof. @CSatUSC | Researcher in #NLProc | Previously with @uwnlp @allenai | she/her
Lucy Li @lucy3_li
4K Followers 2K Following @UCBerkeley PhD student + @allen_ai. Human-centered #NLProc, computational social science, AI fairness. she/her. https://t.co/rtSSUhWQnL
Mike Lewis @ml_perception
6K Followers 227 Following Llama3 pre-training lead. Partially to blame for things like the Cicero Diplomacy bot, BART, RoBERTa, kNN-LM, top-k sampling & Deal Or No Deal.
Soumith Chintala @soumithchintala
186K Followers 877 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Sebastian Ruder @seb_ruder
80K Followers 1K Following Multilingual LLMs @cohere • Prev: @GoogleDeepMind • Newsletter: https://t.co/7JGh2qpG98
Sergey Edunov @edunov
933 Followers 102 Following Director of Engineering @ GenAI, Meta. I work on Llamas
Michal Valko @misovalko
5K Followers 2K Following Llama @AIatMeta Paris & Inria & MVA - Ex: Gemini and BYOL @GoogleDeepMind
Thomas Scialom @ThomasScialom
6K Followers 231 Following AGI Researcher @MetaAI -- Lead Llama 2 and Postraining Llama 3. Also CodeLlama, Galactica, Toolformer, Bloom, Nougat, GAIA, ..
Naman Goyal @NamanGoyal21
1K Followers 560 Following Research engineer, LLM scaling at GenAI Meta | Worked on: llama2, llama, OPT, blenderbot, XLMR, Bart, Roberta
Conference on Languag.. @COLM_conf
2K Followers 6 Following https://t.co/GhGCMEoa4A Abstract submission: March 22, 2024
Jascha Sohl-Dickstein @jaschasd
19K Followers 623 Following Member of the technical staff @ Anthropic. Most (in)famous for inventing diffusion models. AI + physics + neuroscience + dynamics.
Stella Biderman @BlancheMinerva
15K Followers 748 Following Open source LLMs and interpretability research at @BoozAllen and @AiEleuther. My employers disown my tweets. She/her
Xavier Bresson @xbresson
13K Followers 859 Following Prof @NUSingapore Distinguished Researcher @DiscoverElement #NRF Fellow, #GraphNNs #LLMs #DeepLearningTheory #MolecularMaterialScience #Teaching Opinions my own
Marques Brownlee @MKBHD
6.2M Followers 472 Following Web Video Producer | ⋈ | Pro Ultimate Frisbee Player | Host of @WVFRM @TheStudio
Vaishaal Shankar @Vaishaal
807 Followers 336 Following ML research @ apple. Trying to find artificial intelligence. Opinions are my own.
The Thesis Review Pod.. @thesisreview
3K Followers 2K Following Each episode of The Thesis Review is a conversation centered around a researcher's PhD thesis, and how their research has evolved since. Hosted by @wellecks.
Sharan Narang @sharan0909
2K Followers 254 Following LLMs and AI Research (Llama 2 & 3 lead) @Meta | ex @Google (PaLM lead, T5), ex @Baidu (Deep Speech 2, Sparse Neural Networks), ex @Nvidia
Ming-Wei Chang @mchang21
1K Followers 509 Following Research Scientist @GoogleDeepMind. BERT co-author. Gemini project.
Diyi Yang @Diyi_Yang
14K Followers 2K Following Assistant Professor @Stanford CS @StanfordNLP @StanfordAILab. Formerly @GeorgiaTech. Computational Social Science & NLP
MammothMountain @MammothMountain
54K Followers 1K Following Delivering the inside skinny on everything from weather conditions to secret stashes and line-less lift lines. #MammothStories
Big Bear Mountain Res.. @BigBearMtResort
25K Followers 2K Following Official Twitter Account of Big Bear Mountain Resort. Follow for real-time operational updates.
Andre Martins @andre_t_martins
2K Followers 397 Following NLP/ML researcher in Lisbon ([email protected])
Charles 🎉 Frye @charles_irl
9K Followers 2K Following ai engineer at @modal_labs. he/him. ex @full_stack_dl, @weights_biases, phd Berkeley @Redwood_Neuro.
weird medieval guys B.. @WeirdMedieval
686K Followers 159 Following by @olivia__ms // listen to my podcast with @aranptappers and ORDER MY BOOK OUT NOW // [email protected] for enquiries
Ari Morcos @arimorcos
6K Followers 2K Following CEO and Co-founder @datologyai working to make it easy for anyone to make the most of their data. Former: RS @AIatMeta (FAIR), RS @DeepMind, PhD @PiN_Harvard.
Sara Hooker @sarahookr
39K Followers 7K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.
Davis Blalock @davisblalock
12K Followers 165 Following Research scientist + first hire @MosaicML. @MIT PhD. I write + retweet technical machine learning content. If you write a thread about your paper, tag me for RT
Lauren Klein | @laure.. @laurenfklein
11K Followers 2K Following Digital humanities, data science, AI, eating, professor of Quantitative Theory & Methods & English at Emory. Co-author #DataFeminism. PI #AIAInetwork. She/her.
GPT-4/ChatGPT/GPT-3@R.. @realtimeqa
187 Followers 7 Following How well can GPT-3 answer your real-time questions? Examples from RealTime QA, a weekly-updated QA benchmark. Managed by @jungokasai and @KeisukeS_.
no context memes @weirddalle
1.7M Followers 460 Following memes and weird ai generations | dm for promo | @hardaipics | follow my IG
Sameer Singh @sameer_
7K Followers 2K Following Cofounder @SpiffyAI and Assoc Prof at @UCIrvine, working on reliable LLMs, explanations for AI+ML, adversaries for NLP, and debugging/evaluation.
Victor Zhong @hllo_wrld
4K Followers 450 Following ML+NLP assistant prof @UWCheritonCS. Formerly @MSFTResearch @MetaAI, @SFResearch via @MetamindIO, @uwnlp, @StanfordNLP, @eceuoft.
Abeba Birhane @Abebab
53K Followers 2K Following Senior Advisor, AI Accountability @Mozilla | Cognitive science PhD | Adjunct prof @tcddublinscss, @tcddublin | Ethiopian in Ireland | She/her @abeba.bsky.social
NWS Los Angeles @NWSLosAngeles
145K Followers 805 Following Official Twitter account for the National Weather Service Los Angeles. Details: https://t.co/XR2JbmS8pl
BigScience Large Mode.. @BigScienceLLM
9K Followers 1 Following Follow the training of "BLOOM 🌸", the @BigScienceW multilingual 176B parameter open-science open-access language model, a research tool for the AI community.
Jason Weston @jaseweston
9K Followers 568 Following Research @MetaAI+NYU. Pretrain+FT: NLP from Scratch (2011). Multilayer attention+position embed+LLM: MemNets (2015). Recent (2023+): Sys 2 Attn, Self-Rewarding..
Paul Poast @ProfPaulPoast
95K Followers 472 Following Was `Tweeting to teach', now `Posting for pedagogy'. International Relations and Foreign Policy. @UChicago Prof. @ChicagoCouncil Fellow. @WPReview Columnist.
Bonaventure F. P. Dos.. @bonadossou
3K Followers 662 Following PhD Student @mcgillu @McGill_NLP | 🪑@WiNLPWorkshop | Research @Mila_Quebec @MasakhaneNLP @lelapaai | Co-founder @lanfrica | ex @GoogleAI @RocheCanada
Sarah K. Dreier @SKDreier24
180 Followers 233 Following Assistant Prof @UNM, former @UW and @amprog. Running, biking, swimming, political science, mountain hiking, yoga, coffee, hatch chiles, snowy cactuses.
Emily Kalah Gade @ekgade
703 Followers 820 Following whitewater, backcountry time, rowing, running, the study of political violence, data science, sci-fi, all science really, 🏳️🌈, Senior Researcher @forsmarsh
Dr. Casey Fiesler @cfiesler
26K Followers 2K Following Professor who is on Twitter much less than she used to be. @[email protected] information science professor @cuboulder. PhD/JD.
Jacob Eisenstein @jacobeisenstein
8K Followers 2K Following @[email protected]. Computational linguistics⚡ machine learning⚡ computational social science. Tweeting my own bad opinions since 2010.
Snoqualmie Pass @SnoqualmiePass
117K Followers 134 Following Official WSDOT account for I-90/Snoqualmie Pass traffic. Questions? The answer might be here: https://t.co/FMsbewbq69…
Distributed AI Resear.. @DAIRInstitute
23K Followers 405 Following AI is not inevitable. We DAIR to imagine, build & use AI deliberately. Follow us on Mastodon at @[email protected]
Yejin Choi @YejinChoinka
19K Followers 330 Following professor at UW, director at AI2, adventurer at heart
ACL Rolling Review Pr.. @ARRPreprints
852 Followers 0 Following Bot that tweets anonymous monthly preprints from the ACL Rolling Review (@ReviewACL)
PhDone!!!! 👨🎓 08/2019-04/2024 What a journey 🥳🚞 I especially feel lucky to share this once-in-a-life-time moment with people I love ❤️ . And seeing my passion-driven research efforts being acknowledged by researchers I deeply admire 🌞!! Special thanks to my awesome committee…
Or just bad writers. You probably want an LLM to delve into my writing and make suggestions. I’m like 90% typos and run on sentences.
Llama3-70B has settled at #5. With 405B still to come next... I remember when GPT-4 released in March 2023, it looked like it was nearly-impossible to get to the same performance. Since then, I've seen @Ahmad_Al_Dahle and the rest of the GenAI org in a chaotic rise to focus,…
Feeling incredibly grateful for the entire team's dedication and hard work on the release of #Llama V3. It was a journey of long hours and immense effort, but we did it! Excited to finally put this in the hands of our amazing open source community.
Can LLMs acquire meaning/semantics from just text? Some think it is a priori not possible; I personally think it's a super interesting philosophical question that needs further investigation! Thoughts? arxiv.org/abs/2404.12145
📜 New preprint! Equipped with our multisense consistency method, we dive deep into an exploration of the semantic understanding of #LLMs. @eliabruni & @_dieuwke_ @metaai #NLProc [1/7]🧵 arxiv.org/abs/2404.12145
I am so excited with what we're cooking with Llama 3. Both 8B and 70B are really good, a lot better than anything else I worked with that's open-weights
People seem to over-index on the 15T number after Llama 3. While the number matters, what is even more important is the quality and diversity of those tokens. If there was a good way to measure those, that would have been an impressive result to report.
Llama3 was trained on 15 trillion tokens of public data. But where can you find such datasets and recipes?? Here comes the first release of 🍷Fineweb. A high quality large scale filtered web dataset out-performing all current datasets of its scale. We trained 200+ ablation…
Since everyone is piling on Chinchilla again, here’s a simple experiment you can run at home. Train any sized model you want with a token/param ratio of 20, then a double sized model for half as many steps, and a half sized model for double steps. Observe loss curves.
I am super excited about the release of our 8B & 70B LLaMA 3 models! Huge team effort, amazing learning experience, and we're not done - the 405B is still training! #Llama3
It’s here! Meet Llama 3, our latest generation of models that is setting a new standard for state-of-the-art performance and efficiency for openly available LLMs. Key highlights • 8B and 70B parameter openly available pre-trained and fine-tuned models. • Trained on more…
Thrilled to share that our Llama 3 8B and 70B models are here. Happy to be part of this incredible journey, and excited for what's coming up next 🤩
Wow, nearly 3K votes overnight -- A huge shoutout to our amazing community! Confidence intervals are narrowing, and Llama-3 remains strong! Big congrats to @AIatMeta for this incredible launch & contribution to open community. Full result coming out soon.
Early 1K votes are in and Llama-3 is on FIRE!🔥The New king of OSS model? Vote now and make your voice heard! Leaderboard update coming very soon.
The real king is still training 💪😝 But go go go 70B and 8B!