rohan anil @_arohan_
Principal Engineer, @GoogleDeepMind Gemini. prev PaLM-2. Tinkering with optimization and distributed systems. opinions are my own. Joined December 2017
Tweets 6K - Followers 12K - Following 2K - Likes 20K
my mom is visiting us, and meeting both her grandchildren. These are precious moments considering how much of a tax the immigrant journey is from distance and arcane rules. It’s actually a unique experience and kind of different from median exp where grandparents live close
MMLU is 79.5 after PT then improves over in IT. Hmm
Totally delightfully unpredictable genius
arxiv.org/abs/1711.00464 such a fun title, and a great intro! Some wild thoughts: What is a good representation? Will perplexity reduction lead to good representation? What is the inductive bias that leads to better representation? Why is there so much slack in our objective?
I must admit I am always uncertain what epistemic and aleatoric uncertainty is. I am not sure if my uncertainty is epistemic, endemic or aleatoric.
The interesting part of the Mistral plot that everyone keeps sharing with their own models is equating activated params to cost as someone mentioned to me. Picking my hill for today!
Compression Represents Intelligence Linearly LLMs' intelligence – reflected by average benchmark scores – almost linearly correlates with their ability to compress external text corpora repo: github.com/hkust-nlp/llm-… abs: arxiv.org/abs/2404.09937
Lucas Beyer (bl16) @giffmana
56K Followers 444 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]
Soumith Chintala @soumithchintala
185K Followers 871 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
typedfemale @typedfemale
23K Followers 480 Following a really exciting new account "have you ever though you might be like scott alexander? very smart, but can't do math" - anon
Delip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Horace He @cHHillee
23K Followers 447 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Rosanne Liu @savvyRL
32K Followers 965 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Jeremy Howard @jeremyphoward
221K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
Jeff Dean (@🏡) @JeffDean
296K Followers 6K Following Chief Scientist, Google DeepMind and Google Research. Co-designer/implementor of things like @TensorFlow, MapReduce, Bigtable, Spanner, Gemini .. (he/him)
Gautam Kamath @thegautamkamath
44K Followers 502 Following Assistant Prof of CS @UWaterloo, Faculty @VectorInst, Canada @CIFAR_News AI Chair. Co-EiC @TmlrOrg. I lead @TheSalonML. Privacy, robustness, machine learning.
Yi Tay @YiTayML
28K Followers 97 Following Chief scientist & Co-founder @RekaAILabs past: Research Scientist @Google Brain 🧠 currently learning to be a dad 🍼👶
Kyunghyun Cho @kchonyc
60K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Tom Goldstein @tomgoldsteincs
23K Followers 2K Following Professor at UMD. AI security & privacy, algorithmic bias, foundations of ML. Follow me for commentary on state-of-the-art AI.
Dan Roy @roydanroy
45K Followers 2K Following Research Director, @VectorInst. Canada CIFAR AI Chair. Associate Professor of Stats/CS @UofT. I study machine learning and AI, emphasis on theory.
Ross Wightman @wightmanr
18K Followers 1K Following Computer Vision @ 🤗. Ex head of Software, Firmware Engineering at a Canadian 🦄. Currently building ML, AI systems or investing in startups that do it better.
Sander Dieleman @sedielem
50K Followers 2K Following Research Scientist at Google DeepMind. I tweet about deep learning (research + software), music, generative models (personal account).
Tanishq Mathew Abraha.. @iScienceLuvr
54K Followers 1K Following PhD at 19 | Founder and CEO at @MedARC_AI | Research Director at @StabilityAI | @kaggle Notebooks GM | Biomed. engineer @ 14 | TEDx talk➡https://t.co/xPxwKTq6Qb
Shane Gu @shaneguML
28K Followers 1K Following Research Scientist & Manager @GoogleDeepMind Tokyo/MTV. ex: @GoogleAI Brain, @OpenAI. (JP: @shanegJP)
Julien Chaumond @julien_c
46K Followers 1K Following Co-founder and CTO at @huggingface 🤗. ML/AI for everyone, building products to propel communities fwd. @Stanford + @Polytechnique
Omkar Phatak @omkarphatak
630 Followers 4K Following Senior Program Manager, Amazon Alexa Skills. Helping build the voice economy. DMs welcome for any questions. Opinions = Personal. #AlexaDevs #VoiceFirst
Dheeraj Mekala @MekalaDheeraj
509 Followers 289 Following Ph.D. student at @UCSanDiego. Research Scientist Intern at FAIR @MetaAI Previously @msftresearch, @AmazonScience, @iitkanpur Data! Data! Data!
Jack Reacher @JackReach516
67 Followers 696 Following
Vasco Rodrigues @vvro
267 Followers 3K Following
Itxaso Baskero Dorrea.. @IDorreak
3 Followers 228 Following
CZ @MiaoMiaoBearZX
18 Followers 801 Following
padma reddy @padmareddy34726
142 Followers 3K Following
Agamdeep Singh @agammessi10
63 Followers 679 Following Trying to make a business out of RAG and training a foundational pose comparison model @ MOON lab, IISERB.
MesubsetofRunionC @mesubsetof
24 Followers 327 Following
Abdulrahman Tabaza @embed_dim
2 Followers 447 Following Enjoyer of various vector spaces and modalities
Alberto Villafuerte @AlbertoVill1506
18 Followers 419 Following Still putting up with the witch at home, because there is no one like her
Will @solidwillity
118 Followers 1K Following
Rehm @Rehmudy
51 Followers 1K Following
Ankit @ashah0052
1K Followers 5K Following LLM Arch Assoc Director - @Accenture Ph.D. - @LTIatCMU @SCSatCMU Previous: @GoogleAI, @merl_news, @Revive_Med, @ARM Smartly working hard to make things happen!
Armand Joulin @armandjoulin
4K Followers 344 Following principal researcher, @googledeepmind. ex director of emea at fair @metaai. mostly work on open projects: fasttext, dino, llama, gemma.
Tasour @TasourR
28 Followers 120 Following Error code: 0xF2024 (Lost in the virtual world). Backup failed. All data lost.
Shawn Jain @darkmatter08
84 Followers 108 Following
James Leu @skydetainer
127 Followers 3K Following When you can understand and explain the universe,you’re a smart man.
Alen Capalik @capalik
197 Followers 893 Following Founder of CounterTack (now GoSecure) & https://t.co/snWLnZolVI, Entrepreneur, Hacker, Computer Programmer, AI/ML, GPUs, Cybersecurity, Investing, Long Time Options Trader
Elon Musk @EMusk47849
22 Followers 82 Following
æTreasury @aeTreasury
98 Followers 379 Following æTreasury is the first and leading Money Service Provider that has created a global Self-Custody Account of Money Solutions.
Marc Demers @MarcDemers15
78 Followers 319 Following
Yatindra Indoria @yatindraindoria
25 Followers 100 Following
JOEL @HowAboutNO4YOU
181 Followers 218 Following Creative & Technical Director eXercising a 27B/6 in Synthetic Media @ intersection of AI, Healthcare, & AEC; Virtual & Physical Augmented Environments
J @jydv
143 Followers 113 Following
Abdullah Aziz @abdullahaziz03
5 Followers 59 Following
Kai-Fu Lee @kaiifulee
51 Followers 1K Following #AI Expert, CEO of @01ai_yi and Chairman of 创新工场 @sinovationvc , former President of Google China, Author of AI 2041 and NYT Bestseller AI Superpowers
Jim B @jamesberkery1
368 Followers 659 Following Big fan of X…citizen journalism is amazing. Also being able to say what’s on your mind and hear what is on others minds is addictive. maybe therapeutic.
dealonai @dealonai
2K Followers 189 Following Fueling marketing with rocket fuel AI. We build chatbots so smooth, they'll make Siri jealous & your leads swoon.
David 🇪🇺 @DavidAntill4
917 Followers 3K Following
AI Papers Podcast @aipaperspodcast
826 Followers 2K Following A digestible daily update on the latest AI Research Papers. Brought to you by @pocketpodapp
Alpay Ariyak @AlpayAriyak
1K Followers 2K Following 𝗔𝗜 @RunPod_io | 𝗟𝗲𝗮𝗱: @OpenChatDev (𝟲𝟬𝟬𝗸+ 𝗱𝗼𝘄𝗻𝗹𝗼𝗮𝗱𝘀 on HuggingFace🤗)
Amir hasanpour @Amirhasanpourfn
6 Followers 65 Following
Lucas Beyer (bl16) @giffmana
56K Followers 444 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]
Peyman Milanfar @docmilanfar
67K Followers 261 Following Distinguished Scientist at Google Research. Computational Imaging, Machine Learning, and Vision. Tweets = personal opinions. May change or disappear over time.
Soumith Chintala @soumithchintala
185K Followers 871 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
typedfemale @typedfemale
23K Followers 480 Following a really exciting new account "have you ever though you might be like scott alexander? very smart, but can't do math" - anon
Delip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Horace He @cHHillee
23K Followers 447 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Alfredo Canziani @alfcnz
86K Followers 269 Following Musician, math lover, cook, dancer, 🏳️🌈, and an ass prof of Computer Science at New York University
Rosanne Liu @savvyRL
32K Followers 965 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Yi Ma @YiMaTweets
71K Followers 120 Following Chair Professor in AI, Director of IDS, Head of CS, HKU; Professor of EECS, Berkeley; Author of Book: High-Dim Data Analysis, https://t.co/gwaqMJp8av.
Jeremy Howard @jeremyphoward
221K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K Following
Kevin Patrick Murphy @sirbayes
42K Followers 328 Following Research Scientist at Google Brain / Deepmind. Interested in Bayesian Machine Learning.
Clément Canonne @ccanonne_
31K Followers 926 Following Senior Lecturer @Sydney_Uni. Postdocs @IBMResearch, @Stanford; PhD @Columbia. Converts ☕ into puns: sometimes theorems. He/him. @[email protected]
Jeff Dean (@🏡) @JeffDean
296K Followers 6K Following Chief Scientist, Google DeepMind and Google Research. Co-designer/implementor of things like @TensorFlow, MapReduce, Bigtable, Spanner, Gemini .. (he/him)
Gautam Kamath @thegautamkamath
44K Followers 502 Following Assistant Prof of CS @UWaterloo, Faculty @VectorInst, Canada @CIFAR_News AI Chair. Co-EiC @TmlrOrg. I lead @TheSalonML. Privacy, robustness, machine learning.
Yi Tay @YiTayML
28K Followers 97 Following Chief scientist & Co-founder @RekaAILabs past: Research Scientist @Google Brain 🧠 currently learning to be a dad 🍼👶
François Fleuret @francoisfleuret
30K Followers 475 Following Prof. @Unige_en, Adjunct Prof. @EPFL_en, Research Fellow @idiap_ch, co-founder @nc_shape. AI and machine learning since 1994. I like reality.
Will @solidwillity
118 Followers 1K Following
Roshan Sumbaly @rsumbaly
1K Followers 677 Following Herding Llamas and Emus at @metaai. Prior life @coursera, @linkedIn, @stanford
Ahmad Al-Dahle @Ahmad_Al_Dahle
3K Followers 47 Following #Girldad of twins. Leading GenAI @ Meta (llama, imagine, meta ai and more)
Armand Joulin @armandjoulin
4K Followers 344 Following principal researcher, @googledeepmind. ex director of emea at fair @metaai. mostly work on open projects: fasttext, dino, llama, gemma.
Logan Kilpatrick @OfficialLoganK
92K Followers 2K Following Lead product for @Google AI Studio and working on the Gemini API, helping developers build with AI, my views!
Omead Pooladzandi @omead_p
275 Followers 772 Following I enjoy second order optimization Machine Learning PhD @UCLA. ex Research Scientist Intern @AIatMeta (ideas are my own)
Arnaud Autef @arnaud_autef
185 Followers 272 Following Machine Learning Engineer @apple, ex @sisudata | studied @Polytechnique @Stanford | Tweeting about Machine Learning, views my own
Dan Roberts @danintheory
4K Followers 570 Following I studied gravity. AI fellow @sequoia + researcher @mit physics. Co-founded @diffeo, acquired by @salesforce. Co-author "The Principles of Deep Learning Theory”
Revant Himatsingka.. @foodpharmer2
90K Followers 145 Following Trying to get 140 crore Indians become health literate | Wharton MBA | Fighting with major food brands to help improve India's health!
Yangsibo Huang @YangsiboHuang
1K Followers 726 Following PhD candidate @Princeton. Prev: @GoogleAI @AIatMeta.
Arthur Douillard @Ar_Douillard
3K Followers 2K Following Modular & Distributed Learning @ DeepMind, Continual Learning PhD @ Sorbonne
⟁ndrew V @AndrewVoirol
3K Followers 5K Following GenAI | AI Research | AI Engineer. Tinkering with tenacity. Bridging bytes with biology. When life throws curveballs, I code the comeback. Building from 0 to 1.
Jiquan Ngiam @JiquanNgiam
481 Followers 167 Following Building @Lutra_AI Previously: Google Brain, Coursera, Stanford ML Group
Hao Liu @haoliuhl
4K Followers 154 Following machine learning, neural networks. phd student @berkeley_ai. https://t.co/ZNJawlrerS
sandya mannarswamy @sandyasm
854 Followers 5K Following Natural Language Processing Researcher. https://t.co/oYoCTKS2Ho
Ali Hatamizadeh @ahatamiz1
878 Followers 291 Following Senior Research Scientist @NVIDIA PhD @UCLA Views are my own.
Lenka Zdeborova @zdeborova
13K Followers 420 Following Professor at EPFL. Une mathémaphysinformaticienne. Passionate mushroom hunter. Tamer of two little dragons.
Dmytro Dzhulgakov @dzhulgakov
3K Followers 572 Following Co-founder and CTO @FireworksAI_HQ. PyTorch core maintainer. Previously FB Ads. Ex-Pro Competitive Programmer
Kavya Manohar (കാ.. @kavya_manohar
1K Followers 835 Following PhD in Speech and Language Processing, #Malayalam Language Technologist, Teacher, Student, Free Knowledge Enthusiast, Opentype Font Engineer, Feminist
Divy Thakkar @divy93t
5K Followers 2K Following Strategy, Programs & Product @GoogleAI , HCI Researcher. Ph.D @CityUniLondon Alumni @iift1963 @daiictofficial. Personal views.
Edoardo Ponti @PontiEdoardo
2K Followers 389 Following Assistant Professor in #NLP at @EdinburghUni and affiliated lecturer @Cambridge_Uni | Modular deep learning
Keran R @KeranRong
60 Followers 122 Following Allseas | MIT | Google AI | Deepmind Gemini Multimodal
Diego de las Casas @diegolascasas
555 Followers 771 Following Working on LLMs at @MistralAI Past: @GoogleDeepMind 🇧🇷 in 🇬🇧
Rohan Paul @rohanpaul_ai
12K Followers 428 Following ML Engineer (e/acc) 📌 https://t.co/x0IIWfnOt8 🚀 https://t.co/QEO4CKRl1b Open Large Language Models is Happiness 💡 Ex Deutsche & HSBC
Abhi Venigalla @abhi_venigalla
5K Followers 1K Following Researcher @Databricks. Former @MosaicML, @CerebrasSystems. Addicted to all things compute.
Walter Isaacson @WalterIsaacson
307K Followers 864 Following Author of Elon Musk, The Code Breaker, Leonardo da Vinci, Benjamin Franklin, Einstein, Steve Jobs. Professor @Tulane. Formerly @Time, @CNN, & @Aspeninstitute
John Schulman @johnschulman2
39K Followers 608 Following Cofounder @openai, lead post-training for ChatGPT and the API. Interested in reinforcement learning, alignment, birds, jazz music
Naman Jain @StringChaos
905 Followers 890 Following CS PhD @UCBerkeley | Projects - R2E, LiveCodeBench, Chatbot-Arena Coding, RAFT, Data Quality | Past: @AWS @MSFTResearch @iitbombay
Shreya Singh @ssingh_shreya
123 Followers 410 Following LLMs @GoogleCloud, CS @Stanford. Passionate about Machine Learning, NLP and Women in Tech
Alexander Chen @alexanderchen
8K Followers 1K Following Creative Director at Google Creative Lab, working on AI. Opinions are my own. https://t.co/bSOAeDObmz
Ramona Comanescu @ramona_crg
125 Followers 388 Following Research Engineer at @GoogleDeepMind working on Gemini, Fairness, Safety ♊️ Previously @Meta ML grad @Cambridge_Uni & CompSci @EdinburghUni @InfAtEd
Luke Zettlemoyer @LukeZettlemoyer
8K Followers 2K Following
Graham Neubig @gneubig
30K Followers 582 Following Associate professor at CMU, studying natural language processing and machine learning.
Yifu Wang @foofoobuggy
79 Followers 114 Following
Yves Kalume @KalumeYves
1K Followers 870 Following Software Engineer • Android #GDE ⚠️ warning : may spontaneously discuss binary trees on a date
Craig Citro @craigcitro
1K Followers 237 Following i like math and puns | research engineer @anthropicai; previously: @GoogleColab, Google Bigquery, @sagemath, number theorist
Pengming Wang @PengmingWang
185 Followers 157 Following Founding team @poolsideai | prev @DeepMind, PhD @Cambridge_Uni, FunSearch co-author
Britney Muller @BritneyMuller
32K Followers 4K Following Making 'AI' accessible | Data Science | Machine Learning | SEO | Entrepreneur | Keynote Speaker | Prev @huggingface @Moz (she/her)
Amin Barekatain @BarekatainAmin
976 Followers 258 Following Quant Dev @QuadratureLDN | Prev: Research Engineer @GoogleDeepMind | FunSearch, AlphaTensor, AlphaDev, MuZero
@_arohan_ Too much work brother and then that’s a rabbit hole I’m avoiding at kids bedtime 😅
@Rainmaker1973 Researchers believe they are attempting to reshape the rocks into GPUs.
A rudimentary evaluation of subword fertility of llama3 and llama2 on the Indic subset of Flores dev set. First observations: 1. The fertilities of llama3 are in general better (lower), except in the case of Malayalam/Tamil. 2. Assamese and Bengali scores are comparable.
A fun backstory: we first demoed this breakthrough to @finkd, @Ahmad_Al_Dahle and others this February on a single node. There was so much excitement about this research that in a mere 2 months we scaled it up, and today it is in production, available to billions via Meta AI #movefast
In addition to Llama 3, today we’re also publishing a new paper: Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation ➡️ go.fb.me/g4r584 This work from GenAI researchers is enabling new image generation features in Meta AI on @WhatsApp & web.
Imagine being OpenAI and your competition is giving away a $100m product for free.
I'm seeing a lot of questions about the limit of how good you can make a small LLM. tldr; benchmarks saturate, models don't. LLMs will improve logarithmically forever with enough good data.
🔥llm.c update: Our single file of 2,000 ~clean lines of C/CUDA code now trains GPT-2 (124M) on GPU at speeds ~matching PyTorch (fp32, no flash attention) github.com/karpathy/llm.c… On my A100 I'm seeing 78ms/iter for llm.c and 80ms/iter for PyTorch. Keeping in mind this is fp32,…
@danielsateler1 YES. I'm so bothered by this always, it causes me suffering to wait for my program to start. Computers are FAST. They have dozens of fancy cores capable of billions of instructions per second and a perfected memory hierarchy. What is even happening? I categorically refuse to wait…
@deepwhitman @AIatMeta @lmsysorg no. people misunderstand chinchilla. chinchilla doesn't tell you the point of convergence. it tells you the point of compute optimality. if all you care about is perplexity, for every FLOPs compute budget, how big model on how many tokens should you train? for reasons not fully…
It’s here! Meet Llama 3, our latest generation of models that is setting a new standard for state-of-the-art performance and efficiency for openly available LLMs. Key highlights • 8B and 70B parameter openly available pre-trained and fine-tuned models. • Trained on more…
Congrats to @AIatMeta on Llama 3 release!! 🎉 ai.meta.com/blog/meta-llam… Notes: Releasing 8B and 70B (both base and finetuned) models, strong-performing in their model class (but we'll see when the rankings come in @ @lmsysorg :)) 400B is still training, but already encroaching…
Huge moment for open source!
The upcoming Llama-3-400B+ will mark the watershed moment that the community gains open-weight access to a GPT-4-class model. It will change the calculus for many research efforts and grassroot startups. I pulled the numbers on Claude 3 Opus, GPT-4-2024-04-09, and Gemini.…
…after the initial release rush, I've got to say: 1.3M H100-hours on an 8B is kind of a waste. Like, thanks Meta, but it's clearly saturating on the hardest benchmarks like DROP. Asymptotically, hopelessly clawing to where L2-70B was. We need a deeper model. ≈20B, ≈60 layers.
getting riled up thinking about Meta wasting all those beautiful H100s
@SamuelMLSmith @_arohan_ @vivek7ue That’s a lot of unstated assumptions. Also, 4) all architectures are roughly the same.
I'm personally super excited to share the progress on the 400B+ training. So proud of the entire team that worked tirelessly to make this model a reality. There's lots more to come including a full research paper soon!🚀🦙🦙
Big congrats to @AIatMeta on Llama 3 release🔥 A huge week for open-source AI! Both Llama-3 70B & 8B are now in the Arena thanks to @togethercompute fast support. Let's see how well it does in real-world tests by Arena power users, come challenge Llama-3🧩!
Llama 3 8B & 70B models are just the beginning of what we’re working to release for Llama 3. Our largest models currently in the works are 400B+ parameters and while they’re still in active development, we’re excited about how this work is trending.