Ethan Caballero is busy @ethanCaballero
ML PhD student @Mila_Quebec; previously @GoogleDeepMind. ethancaballero.github.io. Joined January 2015.
Tweets: 4K - Followers: 8K - Following: 2K - Likes: 24K
Why is llama 3 not mixture-of-experts?
it's unfortunate that the ideal time for good old fashioned engineering is 7am-11am, but the ideal time for tensor engineering is 8pm-1am
The Chinchilla scaling paper by Hoffmann et al. has been highly influential in the language modeling community. We tried to replicate a key part of their work and discovered discrepancies. Here's what we found. (1/9)
reasons to be optimistic about alignment:
- even “emergent” capabilities arise continuously/gradually
- current generation rlhf generalizes far better than anyone had guessed
- iterative deployment is ever more iterative as labs inch progress publicly
I’m super excited to release our 100+ page collaborative agenda - led by @usmananwar391 - on “Foundational Challenges In Assuring Alignment and Safety of LLMs” alongside 35+ co-authors from NLP, ML, and AI Safety communities! Some highlights below...
very readable: arxiv.org/abs/2404.08136 'Exponentially Weighted Moving Models' - Eric Luxenberg, Stephen Boyd
Does jax support "hogwild" gradient update methods such as A3C?
livestream of new Justice live show at coachella starting at 10:15 P.M. PST today: youtube.com/watch?v=F6A3iD…
This chart from @paul_scharre is an underrated point about AI proliferation: training at the frontier gets expensive (line going up), but at any fixed capability level gets cheap (lines going down) due to better software+hardware (assumes historical scaling rates hold)
Discouraging pre-schoolers from submitting to NeurIPS is a disservice because they never will get the opportunity to submit to NeurIPS if they have to wait until high school. By the time they are in high school, all NeurIPS submissions will be by AIs rather than humans.
If I could go back in time, I would have worked on NeurIPS papers in high school.
Why isn't every consumer using poe.com? It gives you access to all the best LLMs simultaneously for $20/month. Meanwhile, if you use any of the LLMs on the LLM creator's website, you have to pay $20/month for each LLM.
Mistral is not confused when we enable bidirectionality whereas LLaMA goes off the rails 🤠. We may have unlocked one secret ingredient of why Mistral is better than LLaMA. We believe it is 💥Prefix LM💥. This side finding is exciting in itself!
"Foundation models" companies that require up-front investment for GPUs are more like airline companies than software. Large capex, then compete against 50 other companies that all bought the same model of airplane as you.
Jim Fan @DrJimFan
229K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.Lucas Beyer (bl16) @giffmana
56K Followers 444 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]Irina Rish @irinarish
9K Followers 994 Following prof UdeM/Mila; Canada Excellence Research Chair; AAI Lab head https://t.co/UzlrC7ZrGF; INCITE project PI https://t.co/0rV7szd7rH; CSO https://t.co/XDhj6MEtUjRichard Ngo @RichardMCNgo
35K Followers 1K Following What would we need to understand in order to design an amazing future? Figuring that out @openaiEric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0pMiles Brundage @Miles_Brundage
43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.Rosanne Liu @savvyRL
33K Followers 966 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDRHorace He @cHHillee
23K Followers 449 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemaleShane Gu @shaneguML
28K Followers 1K Following Research Scientist & Manager @GoogleDeepMind Tokyo/MTV. ex: @GoogleAI Brain, @OpenAI. (JP: @shanegJP)Dan Roy @roydanroy
45K Followers 2K Following ML / AI researcher, emphasis on theory. Research Director and Canada CIFAR AI Chair, @VectorInst Professor, @UofT (Statistics/CS)Behnam Neyshabur @bneyshabur
18K Followers 690 Following Senior Staff Research Scientist @GoogleDeepMind, Interested in reasoning w. LLMs, traveling & backpackingrohan anil @_arohan_
12K Followers 2K Following Principal Engineer, @GoogleDeepMind Gemini. prev PaLM-2. Tinkering with optimization and distributed systems. opinions are my own.Kyunghyun Cho @kchonyc
61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).Sam Bowman @sleepinyourhat
35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.David Krueger @DavidSKrueger
13K Followers 4K Following Cambridge faculty - AI alignment, deep learning, and existential safety. Formerly Mila, FHI, DeepMind, ElementAI, AISI.Stella Biderman @BlancheMinerva
15K Followers 748 Following Open source LLMs and interpretability research at @BoozAllen and @AiEleuther. My employers disown my tweets. She/herTom Goldstein @tomgoldsteincs
23K Followers 2K Following Professor at UMD. AI security & privacy, algorithmic bias, foundations of ML. Follow me for commentary on state-of-the-art AI.Guillotine @Guillot21684613
134 Followers 298 FollowingVeronica Aragon @AragonVero25558
16 Followers 136 FollowingAlain @Alain53078872
202 Followers 1K FollowingPurple Lightning @PurpleL79525297
214 Followers 270 FollowingNikita @nikitavoloboev
4K Followers 6K Following Make @LearnAnything_ Learn in public: https://t.co/GbFvuErkYn macOS course: https://t.co/JdbJWru6zG https://t.co/94R8ER7K2h https://t.co/ROkqhyhpEKNicoloz Tbileli @TbNicoloz6757
3 Followers 34 FollowingC. M. Rubin (Cathy) @CMRubinWorld
29K Followers 20K Following #Futurist #Founder https://t.co/MGXt0K9l0z #PlanetClassroom #Filmmaker #Producer #ArtsEd #AI #VR #ML #Innovation #entrepreneurship #SDG's #Culture #Youth2030 #Climatekite @AprilH14141
16 Followers 115 Followingสุภาหวา.. @iS23lIHBO3N46
74 Followers 1K Following คุณต้องการนัดเดทกับสาวไหมคะ เพิ่ม https://t.co/XDdTDPfnKQPatrick Mesana @patrickmesana
107 Followers 581 Following Researcher interested in software engineering, decision science and data management. I also love sushis 🍣 and AI debates.boom loop @b00ml00p
174 Followers 1K Following former tech turned homesteader. here to farm engagement with gay replies and retarded retweetshunter @pseudokami
38 Followers 198 Following i like math. calisthenics. rl (both). and anime // currently ml @paypal // prev @nasajplANUBHAV CHATURVEDI @anubhavchaturvd
242 Followers 4K FollowingTarik @TarikOzket
586 Followers 3K Following ship it with quality. software person, engineering manager, ex-bain, @techstars class #50Sathish Kasilingam @sathishisak
161 Followers 2K Following Interests lie in manufacturing, software, quality, CNC Machine analytics, Data analytics, product management and startupsVaultic Ventures @vaultic
104 Followers 270 FollowingChristopher Bradford @cbgrasshopper
566 Followers 886 Following Dad, CTO at Storied, armchair philosopher & futurist, performer, voracious reader, Mormon transhumanist. Goals: continuous improvementAaditya ; @Aaditya26082004
528 Followers 7K Following CS'26 • Machine Learning • Open-Source • Web Dev. • Algorithms • Jai Shree Krishna 🦚🪈Harsh Maheshwari @HarshMheshwari
1K Followers 1K Following Enthusiastic about #GenerativeAI #DataScience 🤖 | Constantly curious learner 🌱 | Applied scientist 2 at @amazon | Writer at @medium | @IITKGP GraduateVenkat Krish @govenkat
261 Followers 2K Followingberkan @YapcBerkan
47 Followers 1K FollowingVictor Belostotsky-Wo.. @v_belostotsky
39 Followers 708 Following The most polar Jew in the world ❄️ Ai & blockchain researchers🚀Founder and Tech💎 Chairman of Russian Far EastAmy Mikkelsen @MikkelsenA17326
84 Followers 468 FollowingBogdan Beldiman @bogdanonymous
837 Followers 4K Following Digital Renaissance Man | AI Artist & Enthusiast | Self-Educated Multidisciplinary ExperimenterNic Peterson @_nicpeterson
65 Followers 572 FollowingBlaze (Balázs Galamb.. @gblazex
1K Followers 975 Following A Smooth Guy; Developer of SmoothScroll for macOS, Windows & Google Chrome.Putra Manggala @pmangg
502 Followers 4K Following researcher @amlabuva, previously @shopify, @guavus, @adgear, @mcgillu. Not fun at parties.Jaffray Woodriff @JaffrayW
7K Followers 5K Following Chief Scientist & CEO QIM. UVa School of Data Science (Founder/Funder). Embrace Simplicity, Beware Complexity. Coding at CODE. Honesty. Sleep. Squash.Subramani @Mani1a111
33 Followers 677 FollowingDanik @Danikmath
4 Followers 31 FollowingStephanie Palazzolo @steph_palazzolo
8K Followers 3K Following Writing AI Agenda @theinformation, texan, & horror movie aficionado // reach me at [email protected] or on Signal at 979-599-8091Vasco Rodrigues @vvro
224 Followers 3K FollowingJakob Nikolas Kather @jnkath
4K Followers 3K Following Professor of Medicine and Computer Science | Clinical AI at @medizin_TUD @tudresden_de @katherlab | Medical Oncology at @NCT_UCC_DD & @NCT_HD 💻🧬🇪🇺🌍syatreikos @systolic1432
233 Followers 5K FollowingArjun Krishna @TheOneAndArjun
504 Followers 3K Following Engineering @uwaterloo | Building @dubbahco with my friend 🇮🇳 🇨🇦 https://t.co/SXGYxUy1QHNick Sullivan @grittygrease
22K Followers 8K Following security/networking/cryptography research and development ⟡ co-chair of the Crypto Forum Research Group ⟡ always learning, always teachingpadma reddy @padmareddy34726
174 Followers 4K Following光与肥肥 @wangxinhahaha
1 Followers 397 FollowingYi Lu @dwin_lu
30 Followers 241 FollowingSparsh Jain @sparshjain21
61 Followers 778 Following Research Intern @AI4Bharat, IIT Madras || Ex- Data Science Intern @Culinda || Data Science || ML enthusiastJim Fan @DrJimFan
229K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.Lucas Beyer (bl16) @giffmana
56K Followers 444 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]Irina Rish @irinarish
9K Followers 994 Following prof UdeM/Mila; Canada Excellence Research Chair; AAI Lab head https://t.co/UzlrC7ZrGF; INCITE project PI https://t.co/0rV7szd7rH; CSO https://t.co/XDhj6MEtUjRichard Ngo @RichardMCNgo
35K Followers 1K Following What would we need to understand in order to design an amazing future? Figuring that out @openaiEric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0pAnthropic @AnthropicAI
261K Followers 26 Following We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant Claude at https://t.co/aRbQ97uk4d.Soumith Chintala @soumithchintala
186K Followers 877 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K FollowingDelip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈Miles Brundage @Miles_Brundage
43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.Jack Clark @jackclarkSF
67K Followers 5K Following @AnthropicAI, ONEAI OECD, co-chair @indexingai, writer @ https://t.co/3vmtHYkaTu Past: @openai, @business @theregister. Neural nets, distributed systems, weird futuresIlya Sutskever @ilyasut
370K Followers 2 Following towards a plurality of humanity loving AGIs @openaiPercy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | PianistRosanne Liu @savvyRL
33K Followers 966 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDRHorace He @cHHillee
23K Followers 449 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemaleNeel Nanda @NeelNanda5
13K Followers 89 Following Mechanistic Interpretability lead @DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!AI Safety Institute @AISafetyInst
530 Followers 29 Following We’re building a team of world leading talent to tackle some of the biggest challenges in AI safety - come and join us.Joshua Batson @thebasepoint
2K Followers 707 Following trying to understand evolved systems (🖥 and 🧬) interpretability research @anthropicai formerly @czbiohub, @mit mathCognition @cognition_labs
123K Followers 19 Following Makers of Devin, the first AI software engineer. We are an applied AI lab focused on reasoning, and code is just the beginning. Join us: https://t.co/tpfZwEwGiqPhysical Intelligence @physical_int
4K Followers 8 Following Physical Intelligence (Pi), bringing AI into the physical world.DatologyAI @datologyai
965 Followers 17 Following DatologyAI builds tools to automatically select and optimize the best data on which to train AI models, leading to better models which train faster.Sholto Douglas @_sholtodouglas
15K Followers 857 Following Scaling Gemini @Deepmind - working towards intelligence too cheap to metermanifest ai @manifest__ai
153 Followers 0 FollowingMagnus Vinding @MagnusVinding
978 Followers 1K Following Author of Suffering-Focused Ethics + Reasoned Politics Co-founder of the Center for Reducing Suffering Working to reduce extreme suffering for all beingsThe Variational Book @TheVariational
88 Followers 90 Following The topic of Variational Inference unites key concepts in machine learning. Follow us to learn more about all things AI. https://t.co/F56BECcy1NJesse Hoogland @jesse_hoogland
857 Followers 1K Following Researcher and decel working on developmental interpretability. Executive Director @ TimaeusChenlin Meng @chenlin_meng
8K Followers 833 Following Co-founder & CTO @pika_labs | ex @StanfordAILab @StanfordThe Information @theinformation
96K Followers 697 Following The leading publication high-powered tech executives and founders read daily.Arthur Mensch @arthurmensch
40K Followers 872 Following Co-founder and CEO @MistralAI. Apply https://t.co/yHGRZAtjcxFutureHouse @FutureHouseSF
2K Followers 3 Following Philanthropically-funded moonshot building semi-autonomous AI to accelerate the pace of scientific discovery in biology.Robin Rombach @robrombach
6K Followers 397 Following Generative enthusiast and long-term PhD Student @LMU_Muenchen. Author of VQGAN, Latent Diffusion, Stable Diffusion.Nikhil Vyas @vyasnikhil96
236 Followers 510 Following Postdoc at Harvard ML Foundations. Previously @MITEECS.Krueger AI Safety Lab @kasl_ai
227 Followers 47 Following We are a research group at the University of Cambridge focused on avoiding catastrophic risks from AI.Gal Kaplun @GalKaplun
60 Followers 13 FollowingMiles Cranmer @MilesCranmer
12K Followers 903 Following Assistant Prof @Cambridge_Uni, works on AI for the physical sciences. Previously: Flatiron, DeepMind, Princeton, McGill.Physics In History @PhysInHistory
577K Followers 0 Following Photos from the history of physics | © with mentioned Archives. Shared for educational purposes. Einstein portrait © Ullsteinbild. Subscribe for curated papers.Sharan Narang @sharan0909
2K Followers 254 Following LLMs and AI Research (Llama 2 & 3 lead) @Meta | ex @Google (PaLM lead, T5), ex @Baidu (Deep Speech 2, Sparse Neural Networks), ex @NvidiaDylan Patel @dylan522p
39K Followers 682 Following SemiAnalysis Boutique AI & Semiconductor Research and Consulting DMs are open for consulting, quotes, or to talk shopIdeogram @ideogram_ai
39K Followers 0 Following Helping people become more creative. It's pronounced eye-diogram. Join our lovely community at https://t.co/aKDNl4OOQf.Dwarkesh Patel @dwarkesh_sp
54K Followers 699 Following Being pretrained Host of Dwarkesh Podcast https://t.co/3SXlu7fy6N https://t.co/rEhnfYywXY https://t.co/hQfIWdM1UnL'Impératrice @Imperatrice__
8K Followers 8 Following « Tako Tsubo », new album out now !!! 🙃🐙💙 Order a vinyl or CD and win 2 lifetime tickets to our shows 👇 https://t.co/NfbBE6RykMMIDI Lab @lab_midi
77 Followers 15 Following The Machine Intelligence and Decision-making through Interaction Lab @UTAustin led by @yayitsamyzhangxAI @xai
997K Followers 36 FollowingReka @RekaAILabs
11K Followers 13 Following An AI research and product company 🫠. We are a team of scientists and engineers building state-of-the-art multimodal language models 😻Mistral AI @MistralAI
90K Followers 0 Following Fast, open-source and secure language models. Join us https://t.co/INALdNGvCPConner Vercellino @connerver
96 Followers 2K FollowingMarius Hobbhahn @MariusHobbhahn
2K Followers 994 Following Director/CEO at Apollo Research @apolloaisafety Ph.D. student of Machine Learning @PhilippHennig5; AI safety/alignmentSang Michael Xie @sangmichaelxie
3K Followers 709 Following PhD student @StanfordAILab @StanfordNLP @Stanford advised by Percy Liang and Tengyu Ma. Prev: visiting @GoogleAI Brain, BS, MS Stanford ‘17Planned Obsolescence @plannedobs
384 Followers 7 Following Thinking ahead to a future where AI decides everythingJan Stühmer @JanStuehmer
92 Followers 388 Following I tweet on Machine Learning. Assistant Professor at KIT. Group Leader at Heidelberg Institute for Theoretical Studies. Previously SamsungAI, MSR, MIT CSAIL, TUMStanford AI Alignment @SAIA_Alignment
108 Followers 21 Following Stanford AI Alignment is a community of students and researchers focused on technical and governance research to mitigate risks from advanced AI systems.Max More, Proactionar.. @MaxMore1964
329 Followers 83 Following Recognized for the Proactionary Principle, the Principles of Extropy, advocacy of life extension, founding modern transhumanism. blog: Extropic Thoughts.AGI House @agihouse_org
13K Followers 412 Following Accelerating humanity's transition to AGI & honoring the greatest AI founders and researchers of our time @ https://t.co/1lJUc58gZJFAR AI @farairesearch
1K Followers 19 Following Ensuring AI systems are trustworthy and beneficial to society by incubating new AI safety research agendas.Together AI @togethercompute
27K Followers 303 Following The future of AI is open-source. Let's build together.Ari Morcos @arimorcos
6K Followers 2K Following CEO and Co-founder @datologyai working to make it easy for anyone to make the most of their data. Former: RS @AIatMeta (FAIR), RS @DeepMind, PhD @PiN_Harvard.
It was lovely having @shivon in New York, with her twins, for the art opening of her Toronto childhood friend @RMacFarlaneArt at the @HollisTaggart gallery 😍😍 My grandchildren loved the paintings, and the cars and trucks outside 😂😂 #AWomanMakesAPlan 📖 Advice for a Lifetime…
Trying to find a good way to present how to deduce the ELBO in Variational Inference. Does this look straightforward?
According to Axios, DHS Secretary Mayorkas met with Mr. Altman at OpenAI headquarters and personally asked him to join the board. Mayorkas also said he intentionally excluded Elon and Zuck because they run 'social media companies'. axios.com/2024/04/26/alt…
This morning the Department of Homeland Security announced the establishment of the Artificial Intelligence Safety and Security Board. The 22 inaugural members include Sam Altman, Dario Amodei, Jensen Huang, Satya Nadella, Sundar Pichai and many others.
@ccanonne_ @NeurIPSConf NeurIPS became an AI Expo a long time ago! I go to NeurIPS mainly for workshops, which are still research focused.
I brought a friend from college to an SF party recently and he went through an entire conversation thinking EAs were executive assistants gone rogue. “What are the executive assistants doing now? And why?”
Last week I learned why people do cold exposure. I was jet lagged & sleep deprived. Did cryotherapy for 3 min at -112°C. Immediately had a clean coffee high; wiping out my sluggishness. Doing it before when well rested, I couldn't feel any difference.
The vector field of a diffusion model has some interesting properties. Early on samples are mostly noise, so the model's best guess is to predict the mean. Later on the model has more information and the vector field starts to reveal the local structure of the distribution.
once @ylecun told me (heavily paraphrased), it's not F=ma but \min (F-ma)^2. i didn't realize its importance, but it is perhaps the most enlightening perspective i've ever heard.
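One way to read the "\min (F-ma)^2" framing is as a least-squares fit: instead of treating F=ma as an exact identity, pick the mass that makes the squared residual over noisy measurements as small as possible. The sketch below is only an illustration of that reading, not anything stated in the tweet; the measurement values and the helper names `fit_mass` and `squared_residual` are made up for the example.

```python
# Hypothetical measurements (not from the tweet): accelerations in m/s^2
# and noisy force readings in N.
accelerations = [1.0, 2.0, 3.0, 4.0]
forces = [2.1, 3.9, 6.2, 7.8]


def squared_residual(m, a_values, f_values):
    """Sum of (F - m*a)^2 over all measurements."""
    return sum((f - m * a) ** 2 for f, a in zip(f_values, a_values))


def fit_mass(a_values, f_values):
    """Closed-form minimizer of sum((F - m*a)^2) with respect to m.

    Setting the derivative to zero gives m = sum(F*a) / sum(a^2).
    """
    numerator = sum(f * a for f, a in zip(f_values, a_values))
    denominator = sum(a * a for a in a_values)
    return numerator / denominator


mass = fit_mass(accelerations, forces)
print(f"estimated mass: {mass:.3f} kg")
print(f"residual at estimate: {squared_residual(mass, accelerations, forces):.4f}")
```

With exact data the residual would be zero and the fit would recover F=ma; with noisy data the minimizer is the best available estimate.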
🧵) We unexpectedly reach 🥇 on the leaderboard of #WebArena. While 25% is still far from human performance it is a large jump compared to the next best result. The performance gain is largely attributed to #BrowserGym github.com/ServiceNow/Bro… leaderboard: bit.ly/3QjOL5r
Reminder that Quintin predicted this (h/t x.com/1a3orn/status/…)
New Anthropic research: we find that probing, a simple interpretability technique, can detect when backdoored "sleeper agent" models are about to behave dangerously, after they pretend to be safe in training. Check out our first alignment blog post here: anthropic.com/research/probe…
Meta has 350,000 H100 equivalents, is using approx 35,000 of them to train Llama models, rest for running them (and other models but you get it). Open sourcing Llama means that if the community gets Llama 10% more efficient then the training is basically free. #AImath
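The arithmetic behind the "basically free" claim can be spelled out as a rough sketch. It uses only the figures quoted in the tweet and assumes (this is an interpretation, not something the tweet states) that the 10% efficiency gain applies to the GPUs used for running models.

```python
# Figures quoted in the tweet (H100 equivalents).
total_gpus = 350_000
training_gpus = 35_000

# Assumption: everything not used for training is used for running models.
inference_gpus = total_gpus - training_gpus

# Assumption: a 10% efficiency gain frees 10% of the inference fleet.
efficiency_gain = 0.10
freed_gpus = inference_gpus * efficiency_gain

# Fraction of the training fleet covered by the freed capacity.
coverage = freed_gpus / training_gpus

print(f"inference GPUs: {inference_gpus:,}")
print(f"GPUs freed by a 10% gain: {freed_gpus:,.0f}")
print(f"share of training fleet covered: {coverage:.0%}")
```

Under these assumptions the freed capacity is close to, though slightly below, the size of the training fleet, which is the sense in which the training cost is roughly offset.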
SO INCREDIBLY PROUD to share 2 HUGE updates: 1) The first baby was born using @OrchidInc technology, and he’s super cute 🥰 2) I tested my own embryos with Orchid: we got SO much information & I feel confident now 🚀 This is the future of how babies will be born!
@_arohan_ I was thinking some semi-related thoughts on how nowadays in ML research you can basically come up with any bullshit idea and it will *work* (because "neural nets want to work"), even if not better than sota, so you can always massage the results to fit into a publication
We have just released 🍷 FineWeb: 15 trillion tokens of high quality web data. We filtered and deduplicated all CommonCrawl between 2013 and 2024. Models trained on FineWeb outperform RefinedWeb, C4, DolmaV1.6, The Pile and SlimPajama!
Visualizing the expressive power of different MLP activation functions... interesting how SiLU seems to converge faster than GELU.
I love when people who started thinking about AI around 1.5 years ago come onto the scene and say stuff like yeah it’s all incremental progress from here we hit the top