Nikhil Anand @nikhil_anand91
Cambridge, MA · Joined December 2019
Tweets 5 · Followers 36 · Following 105 · Likes 37
Really cool work led by Devin Kwok (McGill/Mila) on making sense of example difficulty. Addresses some key questions, e.g.: How consistent is measured difficulty across inits and for different architectures? Can we fingerprint models using a few key sensitive/hard examples?
Happy to share our EMNLP paper w/ @jtan189 where we apply Variance of Gradients (VoG) – originally developed by @_cagarwal, @mrdanieldsouza, and @sarahookr – for selecting important data in language-based tasks. At EMNLP? Let's connect to discuss data quality and/or LLMs! #EMNLP
Ilya disappeared. Roon disappeared. Sama is courting the Saudis for trillions. We’re about to hit a post-timeskip arc I can feel it, everything is gonna get weird as fuck in approx. 3 months
📢Introducing CRUXEval, a benchmark to measure Python code execution! 🏠Homepage: crux-eval.github.io 📜Paper: crux-eval.github.io/paper/cruxeval… 🏆Leaderboard: crux-eval.github.io/leaderboard.ht… 🔎Sample Explorer: crux-eval.github.io/demo.html 📊HF Dataset: huggingface.co/datasets/cruxe…
In collaboration with our friends at @huggingface, Colab managed runtime images now include transformers installed by default. `import transformers` is all you need. We regularly update our image, but you can always force an upgrade with `!pip install transformers --upgrade`
Great to see a practical use case of VoG in language-based tasks!!
This work by @AmazonScience successfully combines our VoG method with data pruning. We proposed VoG a few years ago in a computer vision context, so it's fun to see it generalize to language-based tasks using pretrained models. amazon.science/publications/i…
VoG (Variance of Gradients) is such a simple idea with far-reaching implications! 🔥 Exciting to see this applied successfully to a whole different modality 🤗 Learn more here: varianceofgradients.github.io
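For readers new to the idea, here is a minimal sketch of a VoG-style score. It assumes you have already saved, for each example, the gradient with respect to the input at several training checkpoints. The names `vog_score` and `rank_by_vog` are hypothetical, and this is a simplified illustration of "variance of gradients across checkpoints", not the authors' implementation; the published method may include further steps (such as normalization) that are left out here.

```python
from statistics import fmean, pvariance


def vog_score(grads_per_checkpoint):
    """Simplified VoG-style score for one example.

    grads_per_checkpoint: list of K gradient vectors (one per checkpoint),
    each a list of floats of the same length. This input layout is an
    assumption made for illustration.
    """
    # Group values by input dimension: one tuple of K values per dimension.
    per_dimension = list(zip(*grads_per_checkpoint))
    # Variance across checkpoints for each dimension, averaged over dimensions.
    return fmean([pvariance(values) for values in per_dimension])


def rank_by_vog(examples):
    """Return example names sorted from highest to lowest score."""
    return sorted(examples, key=lambda name: vog_score(examples[name]), reverse=True)


if __name__ == "__main__":
    examples = {
        "stable": [[1.0, 2.0], [1.0, 2.0]],    # gradients do not change
        "unstable": [[0.0, 0.0], [2.0, 4.0]],  # gradients change a lot
    }
    print(rank_by_vog(examples))
```

Under this sketch, examples whose gradients change a lot during training get high scores; whether to keep or prune them depends on the task.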
A new online-to-batch conversion technique recovers linear decay schedules as the "right" thing to do! We also did a lot of experiments - you might want to consider linear decay over cosine as your default schedule in practice.
🚨 New Paper 🚨 A new approach to learning rate scheduling! Our refinement theory gives schedules that include warmup and annealing-to-zero automatically. arxiv.org/abs/2310.07831 It improves on strong baseline schedules across a majority of deep learning problems!
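To make "linear decay" concrete, here is a small sketch of a generic schedule with linear warmup followed by linear annealing to zero. The function name `linear_schedule` and its parameters are hypothetical; this is the plain textbook schedule, not the refined schedules derived in the paper.

```python
def linear_schedule(step, total_steps, base_lr, warmup_steps=0):
    """Learning rate at `step` (0-indexed) for linear warmup + linear decay.

    All parameter names are assumptions made for illustration.
    """
    if step < warmup_steps:
        # Ramp up linearly, reaching base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Decay linearly from base_lr down to zero at total_steps.
    decay_steps = max(1, total_steps - warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / decay_steps


if __name__ == "__main__":
    for step in range(0, 11):
        print(step, linear_schedule(step, total_steps=10, base_lr=1.0, warmup_steps=2))
```

A cosine schedule would replace the decay branch with a half-cosine; the warmup branch would stay the same.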
Great collaboration with Trygve, Nik, and Rebecca @AllenInstitute and @DExpositoAlonso (also in @ChrisAWalsh1 lab)! We compare scRNA-seq from human, chimp, gorilla, macaque, and marmoset brains to explore what makes us human. Check it out 👇 science.org/doi/full/10.11…
What makes us human? Today in @ScienceMagazine, a large research consortium funded by @NIH #studyBRAIN initiative shared new insights on how human brain cells differ from other humans and our closest primate relatives. 🔗alleninstitute.org/news/what-make… More on some of the papers. 🧵⬇️
Excited to share our new preprint from @ChrisAWalsh1 lab examining the contribution of noncoding variation to autism spectrum disorder (ASD)! Joint work with @OldManTae and with Ryan Doan’s lab 1/ medrxiv.org/content/10.110…
Super excited to bring some more optimization theory to practice! PyTorch code here: github.com/optimizedlearn…. Take any torch.optim.Optimizer and automatically learn the learning rate! JAX implementation (which was used for most of the experiments) coming soon!
Take your favorite optimizer (Adam, SGD, Lion) and feed it into Mechanic to get the learning rate. You give it the direction, it gives you the magnitude. The new optimizer has been tested on a wide range of deep learning problems: ViT, LSTM, ResNet, etc. arxiv.org/abs/2306.00144
@nikhil_anand91 The submission link for main conference papers is now available! openreview.net/group?id=EMNLP…
@emnlpmeeting I might have missed it but on the page 2023.emnlp.org/calls/main_con…, we can't find the link to submission site. Is it on openreview?
Interested in sparse neural networks? Generalization? Pruning algorithms? Come to our NeurIPS poster this afternoon where we present our empirical study on how pruning affects generalization.
I'll present our work about pruning's effect on generalization @NeurIPS this Tuesday at 4pm (located at Hall J #715)! Pruning removes unimportant weights in a neural network. Practitioners have long noticed that pruning improves generalization. How does this happen? 1/n
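As a quick illustration of "removing unimportant weights", here is a sketch of magnitude pruning, one common criterion that treats the smallest-magnitude weights as the least important. The function name `magnitude_prune` is hypothetical, and using magnitude as the importance measure is an assumption here, not a statement about the exact procedure in the paper.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    weights: flat list of floats; sparsity: float in [0, 1].
    Returns a new list and leaves the input unchanged.
    """
    num_to_drop = int(len(weights) * sparsity)
    if num_to_drop == 0:
        return list(weights)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:num_to_drop])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


if __name__ == "__main__":
    print(magnitude_prune([0.1, -2.0, 0.5, 3.0], sparsity=0.5))
```

In practice, pruning is usually interleaved with further training so the remaining weights can compensate for the ones removed.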
Today at 11am CT, Hall J #806 we are presenting our paper on infinite width neural network kernels! We have methods to compute NTK/NNGP for an extended set of activations + sketched embeddings for efficient approximation (100x) for compute-intensive conv kernels! See you there!
Most infinitely wide NTK and NNGP kernels are based on the ReLU activation. In arxiv.org/pdf/2209.04121…, we propose a method of computing neural kernels with *general* activations. For homogeneous activations, we approximate the kernel matrices by linear-time sketching algorithms.
LLMs are for everyone! Own a GPT-3 trained on your data rather than renting a GPT-3 trained on a web crawl of Reddit. The price is $450K. [email protected] to try it. This is just the start: this doesn't use MosaicML speedups. Our goal is to do this for $100K soon. 🧵
We have exciting news! In our latest and greatest LLM blog, we show how MosaicML Cloud can help you train LLMs from 1B - 70B parameters, and for the first time, publish transparent times + costs for doing so. It's a lot cheaper than you think! (1/9) mosaicml.com/blog/gpt-3-qua…
How much does it *really* cost to train GPT? There's speculation and (mis-)info out there that might make you think it's out of reach. It isn't. @MosaicML is laser focused on making it easy and accessible. This is Part 1 of a series introducing Mosaic GPT. mosaicml.com/blog/billion-p…
1/ Super excited to introduce #Minerva 🦉(goo.gle/3yGpTN7). Minerva was trained on math and science found on the web and can solve many multi-step quantitative reasoning problems.
Very excited to present Minerva🦉: a language model capable of solving mathematical questions using step-by-step natural language reasoning. Combining scale, data and others dramatically improves performance on the STEM benchmarks MATH and MMLU-STEM. goo.gle/3yGpTN7
Very excited to announce a significant milestone in expanding reasoning capabilities of language models! 🎉🎉 We introduce #Minerva🦉: a language model that can solve mathematical questions using step-by-step natural language reasoning: bit.ly/3OBj2d5 🧵 1/