Sasha Rush @srush_nlp
Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGz rush-nlp.com
New York, NY
Joined December 2015
Tweets: 6K
Followers: 51K
Following: 462
Likes: 3K
I'm thrilled to join Princeton's faculty as an assistant professor in the ECE department starting Fall 2025 🐯 Stay tuned for the launch of my lab. We will develop generally helpful robots that learn and plan 🤖
my version would have stuff like: "empathy", "the first law of robotics", "libertarian ideals" in the outer bubble - really show the reader what's at stake
@srush_nlp Somehow the disentangled arch also makes the gradients cleaner and 'interpretable'
today was an intense day of research. running commands, writing code, checking numbers, being confused, looking at notes, running more commands, taking a nap, literally going 🤔 then at the end of the day i figured out one (1) Very Important Thing. so today counts as a good day
✨Excited to finally drop our new paper: SSMs “look like” RNNs, but we show their statefulness is an illusion🪄🐇 Current SSMs cannot express basic state tracking, but a minimal change fixes this! 👀 w/ @jowenpetty, @Ashish_S_AI arxiv.org/abs/2404.08819
wow this must feel good when training (from Megalodon arxiv.org/abs/2404.08801 )
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length abs: arxiv.org/abs/2404.08801 repo: github.com/XuezheMax/mega…
(((ل()(ل() 'yoav))).. @yoavgo (46K Followers, 2K Following)
Delip Rao e/σ @deliprao (46K Followers, 5K Following): Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Soumith Chintala @soumithchintala (185K Followers, 869 Following): Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Percy Liang @percyliang (49K Followers, 408 Following): Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Kyunghyun Cho @kchonyc (60K Followers, 2K Following): a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Lucas Beyer (bl16) @giffmana (56K Followers, 442 Following): Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]
clem 🤗 @ClementDelangue (89K Followers, 5K Following): Co-founder & CEO @HuggingFace 🤗, the open and collaborative platform to build machine learning
Rosanne Liu @savvyRL (32K Followers, 965 Following): Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Sam Bowman @sleepinyourhat (35K Followers, 3K Following): AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
Horace He @cHHillee (23K Followers, 445 Following): Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale
Graham Neubig @gneubig (30K Followers, 582 Following): Associate professor at CMU, studying natural language processing and machine learning.
Julien Chaumond @julien_c (46K Followers, 1K Following): Co-founder and CTO at @huggingface 🤗. ML/AI for everyone, building products to propel communities fwd. @Stanford + @Polytechnique
Sara Hooker @sarahookr (39K Followers, 7K Following): I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.
Thomas Wolf @Thom_Wolf (67K Followers, 4K Following): Co-founder and CSO @HuggingFace - open-source and open-science
Zachary Lipton @zacharylipton (59K Followers, 2K Following): Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Jay Alammar @JayAlammar (35K Followers, 1K Following): Machine learning and language models R&D. Builder. Writer. Visualizing AI, ML, and LLMs one concept at a time. @Cohere. https://t.co/TquuQXlLOJ
Tim Dettmers @Tim_Dettmers (28K Followers, 819 Following): PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Jacob Andreas @jacobandreas (13K Followers, 955 Following): Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
Yoav Artzi @yoavartzi (13K Followers, 163 Following): Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry7yHs / https://t.co/3VmRSyYm2d / asso. faculty director @arxiv / building https://t.co/f9QkzO5kaC

btree @btree24 (32 Followers, 153 Following)
Rahul @iameahulshah (1 Follower, 33 Following)
David Orrego-Carmona @dorrego (3K Followers, 3K Following): #Translation @Warwick_Transla @Warwickuni • @UFSweb •🇨🇴• #Subtitles • Open Science • #MT • Climbing • @JostransT @TranslSpaces @[email protected]
Darius @radiuskia (40 Followers, 127 Following): Incoming @nvidia; Prev. @microsoft; AI/ML Research @umdcs; 🇰🇷🇮🇷
Karl Stratos @karlstratos (81 Followers, 19 Following)
Mitsuo Nishizawa @nishizawa (1K Followers, 5K Following): Medicine / diagnostic imaging / biomedical engineering / mathematics / physics / logic / programming languages / foreign languages / international affairs / economics / world history / foreign novels / iPhone jailbreaking, and more. My current obsession is statistical methods.
Gilbert Gómez @gagb94 (24 Followers, 193 Following): I study computer engineering. Lover of technology and football.
Michael Lohaus @LohausMichael (68 Followers, 178 Following)
Xya_cerX_!3 @3Cerx (0 Followers, 210 Following)
Serghei @Sergheibuhbuh (107 Followers, 552 Following)
Pete @epwalsh (7 Followers, 66 Following): Research Engineer at @allen_ai. Lead engineer for OLMo pretraining.
Yuliya @mjulcik (0 Followers, 60 Following)
simrat hanspal @simsimsandy (78 Followers, 417 Following): Data scientist with a curious engineering mind| Tech Evangelist @Hasurahq
Sreenija @nija7k (1 Follower, 70 Following)
WhisperAiML @whisperaiml (1 Follower, 88 Following)
Al Buterol @alanbuterol (7 Followers, 24 Following)
Itxaso Baskero Dorrea.. @IDorreak (2 Followers, 151 Following)
Jeffrey Asare @jeffreyasare23 (30 Followers, 1K Following): How to generate passive Income Online (5 Ways)
dzhwinter @idadong6 (2 Followers, 85 Following)
sivav @GenZSiv (47 Followers, 481 Following)
return10x @return10x (62 Followers, 2K Following)
upteronext @upteronext (50 Followers, 163 Following)
Rag @Beingsissyphus1 (0 Followers, 63 Following)
Haeji Jung @haejiness_ai (33 Followers, 94 Following): 👩🎓M.S. student. Looking for Ph.D. position/ 💻Studying AI / 🏫CSE, Korea University / ❤️Multimodal(Vision-Language), Multilingual LM, Representation Learning
Batman.jedi @Batmanjedi90 (85 Followers, 369 Following)
David @theawely (62 Followers, 855 Following): Interests: software dev., RISC, many cores & neuromorphic computing, lte-m, wikipedian, neuroregeneration, physicalism ⏤ @[email protected]
Priya Goyal @priy2201 (1K Followers, 494 Following): Founding member @datologyai, ex-Google Deepmind, ex-Facebook AI Research (FAIR).
Alberto Montero @alberto_1619 (2 Followers, 111 Following)
raj das @rajdas1947790 (13 Followers, 285 Following)
Daqwes @DaQWeS (52 Followers, 371 Following)
🆇 Joe @AI_Joe_ (3K Followers, 774 Following): 𝕏 | Conservative | Anti-Woke | Warlock | My account is a meme ☠....
Marcel Schöckel @marcelschoeckel (31 Followers, 153 Following)
songyq @songyq4 (26 Followers, 135 Following)
Yiyi Chen @YiyiChen (141 Followers, 392 Following)
Tasour @TasourR (23 Followers, 54 Following): Error code: 0xF2024 (Lost in the virtual world). Backup failed. All data lost.
sanjana prasad @sanjanpra2k01 (255 Followers, 607 Following): Grad @UTAustin | ML | Systems | Researcher | Lifelong Learner | Computational Scientist👩💻| Growth Mindset
James Leu @skydetainer (127 Followers, 3K Following): When you can understand and explain the universe, you’re a smart man.
Mushin @Mushin_J (86 Followers, 197 Following)
Evangeline @Evangeljy (1 Follower, 87 Following)
Anirudh Bharadwaj @abharadwaj01 (1 Follower, 20 Following)
Bbb @Bibb1234567 (28 Followers, 116 Following)
_Devil_ @Devil473114 (3 Followers, 296 Following)
Alen Capalik @capalik (182 Followers, 862 Following): Founder of CounterTack (now GoSecure) & https://t.co/snWLnZolVI, Entrepreneur, Hacker, Computer Programmer, AI/ML, GPUs, Cybersecurity, Investing, Long Time Options Trader

(((ل()(ل() 'yoav))).. @yoavgo (46K Followers, 2K Following)
Soumith Chintala @soumithchintala (185K Followers, 869 Following): Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Percy Liang @percyliang (49K Followers, 408 Following): Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Kyunghyun Cho @kchonyc (60K Followers, 2K Following): a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
clem 🤗 @ClementDelangue (89K Followers, 5K Following): Co-founder & CEO @HuggingFace 🤗, the open and collaborative platform to build machine learning
Rosanne Liu @savvyRL (32K Followers, 965 Following): Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Sam Bowman @sleepinyourhat (35K Followers, 3K Following): AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
Christopher Manning @chrmanning (126K Followers, 114 Following): Director, @StanfordAILab. Assoc. Director, @StanfordHAI. Founder, @stanfordnlp. Prof. CS & Linguistics, @Stanford. IP @aixventureshq. 🇦🇺 Do #NLProc & #AI. 👋
Horace He @cHHillee (23K Followers, 445 Following): Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale
Graham Neubig @gneubig (30K Followers, 582 Following): Associate professor at CMU, studying natural language processing and machine learning.
Julien Chaumond @julien_c (46K Followers, 1K Following): Co-founder and CTO at @huggingface 🤗. ML/AI for everyone, building products to propel communities fwd. @Stanford + @Polytechnique
Jia-Bin Huang @jbhuang0604 (51K Followers, 285 Following): Associate Professor @umdcs; Part-time Research Scientist @Meta. I like pixels.
Thomas Wolf @Thom_Wolf (67K Followers, 4K Following): Co-founder and CSO @HuggingFace - open-source and open-science
Tim Dettmers @Tim_Dettmers (28K Followers, 819 Following): PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Jacob Andreas @jacobandreas (13K Followers, 955 Following): Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
Yoav Artzi @yoavartzi (13K Followers, 163 Following): Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry7yHs / https://t.co/3VmRSyYm2d / asso. faculty director @arxiv / building https://t.co/f9QkzO5kaC
Tal Linzen @tallinzen (16K Followers, 893 Following): Professor @nyuling and @NYUDataScience, research scientist @GoogleAI
Mark Riedl @mark_riedl (32K Followers, 1K Following): AI for storytelling, games, explainability, safety, ethics. Professor @GeorgiaTech. Associate Director @MLatGT. Time travel expert. Geek. Dad. he/him
Naomi Saphra @nsaphra (7K Followers, 1K Following): Waiting on a robot body. ML/NLP. All opinions are universal and held by both employers and family. Same username on every lifeboat off this sinking ship.
Jason Lee @jasondeanlee (10K Followers, 3K Following): Associate Professor at Princeton and Research Scientist at Google DeepMind. ML/AI Researcher working on foundations of LLMs and deep learning
vLLM @vllm_project (668 Followers, 11 Following): A high-throughput and memory-efficient inference and serving engine for LLMs
Greg Leppert @leppert (2K Followers, 558 Following): Director at Harvard working on AI and access to knowledge. Affiliate @BKCHarvard. “Mildly humorous” —New-York Gazette. https://t.co/gMXJUgOgcT
typedfemale @typedfemale (23K Followers, 480 Following): a really exciting new account "have you ever though you might be like scott alexander? very smart, but can't do math" - anon
Songlin Yang @SonglinYang4 (1K Followers, 2K Following): PhD student @MIT_CSAIL. Prev. @ShanghaiTechUni @SUSTechSZ. Working on scalable and principled methods in #ML & #NLProc. INTP | 5w4 | sx/sp | she/her
Kweku Opoku-Agyemang,.. @KwekuOA (8K Followers, 6K Following): CEO @mlxdoing AI. @DevEconX The next generation. Affiliate @The_IGC. Ex-Prof @UCBerkeley, @cornell_tech. PhD @UWMadison 👉 https://t.co/ywmCg4QU5m
Georgi Gerganov @ggerganov (38K Followers, 243 Following): Not AI | 0x0e59 0x2550 24th at the Electrica puzzle challenge
Arthur Mensch @arthurmensch (40K Followers, 868 Following): Co-founder and CEO @MistralAI. Apply https://t.co/yHGRZAtjcx
PicoCreator (🇸🇬.. @picocreator (2K Followers, 160 Following): Builds Attention-Free Transformer (https://t.co/YL7CbNYKBs) from scratch - CEO @ https://t.co/kQHiGtzJWr Also built k8s tools, uilicious & GPU.js (https://t.co/OIfnI1EPU7)
Yuntian Deng @yuntiandeng (3K Followers, 3K Following): #NLProc Postdoc @ai2_mosaic | Assistant Professor @UWaterloo '24 | Faculty Affiliate @VectorInst '24 | PhD @Harvard
Conference on Languag.. @COLM_conf (1K Followers, 6 Following): https://t.co/GhGCMEoa4A Abstract submission: March 22, 2024
Niklas Muennighoff @Muennighoff (5K Followers, 319 Following): @ContextualAI | Interests: AI/LLM Research & Health ❤️ | Past: @huggingface @PKU1898
Yao Fu @Francis_YAO_ (13K Followers, 2K Following): PhD @EdinburghNLP on LLMs and Machine Reasoning. Ex. @Columbia @PKU1898 @MITIBMLab @allen_ai AGI has yet to come, so keep running
Collin Burns @CollinBurns4 (11K Followers, 274 Following): Superalignment @OpenAI. Formerly @berkeley_ai @Columbia. Former Rubik's Cube world record holder.
Darek Kłeczek @dk21 (3K Followers, 2K Following): Machine Learning, Kaggle and occasional pictures from Poland. Growth MLE at Weights & Biases.
Aman Sanger @amanrsanger (15K Followers, 640 Following): building @cursor_ai at @anysphere https://t.co/EdcQJ2dv0J | https://t.co/vJ5zNuT6WO
Hendrik Strobelt @hen_str (4K Followers, 462 Following): Visualization and Interactive Human Centered AI. Explainability lead @MITIBMLab, @VISxAI, OE Chair @NeurIPSConf, Chair @ieeevis -- own views. #NLProc #AI
Jan-Willem van de Mee.. @jwvdm (2K Followers, 1K Following): Associate Professor (UHD) at the University of Amsterdam; Probabilistic programming and its applications.
Nathan Lambert @natolambert (25K Followers, 684 Following): Figuring out AI @allen_ai, "rl boi" DM me papers. Writes @interconnectsai, talks @retortai Has phd and some credentials
Rémi Leblond @RemiLeblond (2K Followers, 155 Following): Research Scientist @GoogleDeepMind. #Gemini, #AlphaCode, #AlphaStar. Working on solving hard problems with machine learning.
Stability AI @StabilityAI (188K Followers, 31 Following): We are building the foundation to activate humanity's potential.
Sergey Levine @svlevine (79K Followers, 122 Following): Associate Professor at UC Berkeley Co-founder, Physical Intelligence
Chenlin Meng @chenlin_meng (8K Followers, 834 Following): Co-founder & CTO @pika_labs | ex @StanfordAILab @Stanford
Rafael Rafailov @rm_rafailov (3K Followers, 637 Following): Ph.D. Student at @StanfordAILab. I work on Foundation Models and Decision Making. Previously @GoogleDeepMind @UCBerkeley
Oleksii Kuchaiev @kuchaev (477 Followers, 604 Following): AI model alignment and customization @NVIDIA. I love riding motorcycles and all things ocean - surfing, sailing, diving.
Pika @pika_labs (116K Followers, 52 Following): Video on command. Website: https://t.co/G5bjmrMQsx Discord: https://t.co/bX68ThPTQH About: https://t.co/atvdcgbe9S
Jonathan Ho @hojonathanho (4K Followers, 151 Following)
David Pfau @pfau (22K Followers, 1K Following): Knowledge manifests itself in radiant dreams that shimmer like the wild sun Views are my own pfau at sigmoid dot social on 🦣 https://t.co/xqtVHHVI17 on 🦋
Wenting Zhao @wzhao_nlp (771 Followers, 343 Following): PhD student @cornell_tech Food for life, NLP for soul!
Princeton PLI @PrincetonPLI (1K Followers, 19 Following): Princeton University initiative enhancing fundamental understanding of AI, enabling its use in academic disciplines, and examining AI's societal implications.
Denny Zhou @denny_zhou (9K Followers, 416 Following): @GoogleDeepMind founder & lead of Reasoning Team. Build LLMs to reason. Opinions my own.
Luca Soldaini 🎀 @soldni (6K Followers, 1K Following): I like tokens! Lead for OLMo data team at @allen_ai, open source science fan, @QueerInAI organizer 🤖☕️🍕they/them
Tengyu Ma @tengyuma (25K Followers, 510 Following): Assistant professor at Stanford; Co-founder of Voyage AI (https://t.co/wpIITHLgF0) ; Working on ML, DL, RL, LLMs, and their theory.
Sanchit Gandhi @sanchitgandhi99 (4K Followers, 36 Following): Open-source speech @huggingface 🤗. Previously Masters' at @Cambridge_Uni.
Sean Welleck @wellecks (3K Followers, 222 Following): Assistant Professor at CMU. Marathoner, @thesisreview.
Suchin Gururangan @ssgrn (4K Followers, 247 Following): he/him Research Scientist @meta GenAI prev: PhD @uwcse + @uwnlp
Anil Ananthaswamy @anilananth (8K Followers, 3K Following): Sci journalist/TED speaker/MIT KSJ Fellow/Books: The Edge of Physics, The Man Who Wasn't There, Through Two Doors at Once Mastodon: @[email protected]
Miles Cranmer @MilesCranmer (12K Followers, 899 Following): Assistant Prof @Cambridge_Uni, works on AI for the physical sciences. Previously: Flatiron, DeepMind, Princeton, McGill.
Quentin Lhoest @qlhoest (3K Followers, 229 Following): Open Source ML Engineer @huggingface | Maintainer of 🤗Datasets
Patrick Lewis @PSH_Lewis (4K Followers, 655 Following): London-based AI/NLP Research Scientist. I co-lead the RAG & tool use team at Cohere w/ @s_hofstaetter. Previous Fundamental AI Research at Meta AI, FAIR, UCL AI
Elizabeth Salesky @esalesk (1K Followers, 656 Following): PhD student @jhuclsp more commonly known as Liz ☀️ Friend of @NLPwithFriends ☀️ I like bubbles, bicycles, and language variation
Boaz Barak @boazbaraktcs (17K Followers, 415 Following): Computer Scientist. See also https://t.co/EXWR5k634w, https://t.co/SEVX6it6z3 ( @[email protected] , boaz.barak in threads ). Opinions my own.
Mark Yatskar @yatskar (2K Followers, 473 Following): Assistant Professor at UPenn @PennEngineers. NLP/CV/Fairness. Phd @UWCSE, Formerly @allen_ai
Hanna Hajishirzi @HannaHajishirzi (6K Followers, 326 Following): Associate professor at @uw_cse; senior director at @allen_ai co-leading @allenNLP; AI/NLP researcher at @uw_nlp
Noah Snavely @Jimantha (7K Followers, 842 Following): 3D vision fanatic. Professor @cornell_tech & Researcher @GoogleAI. He or they.
Saulnier Lucile @LucileSaulnier (4K Followers, 431 Following): AI Specialist @ Mistral AI | Former ML @ Hugging Face | ENS Paris-Saclay (MVA) | Centrale Paris
Urvashi Khandelwal @ukhndlwl (2K Followers, 610 Following): Research Scientist @GoogleDeepMind, Stanford CS PhD @stanfordnlp
Maarten Sap (he/him) @MaartenSap (4K Followers, 642 Following): Working on #NLProc for social good. Currently at @LTIatCMU, previously at @UWNLP, @MSFTResearch, and @allen_ai. 🏳🌈

LLaMA 3 is testing the limits of @harmdevries77's Law (viz: huggingface.co/spaces/lvwerra… using 8B param & 15T tokens)
Excited to share a preview of Llama3, including the release of an 8B and 70B (82 MMLU, should be the best open weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT4). Lots more still to come... ai.meta.com/blog/meta-llam…
Llama 3 has arrived! Taaa-daaam! ai.meta.com/blog/meta-llam…
@typedfemale transformers and linear ssms require chain of thought to attain libertarian ideals
@lambdaviking @srush_nlp Reminds me of this arxiv.org/abs/2309.07412
@srush_nlp @carrigmat i really wanna learn about the decision making process that resulted in the continuation of the run when it’s 250b tokens in and it looked unclear if there would be a catching up 🫢
@srush_nlp I don't think AI can solve the "Your Notation Is Shit" problem.
@srush_nlp Somewhat tangential, but in hindsight it's amazing that so much pre-training work in the BERT-era focused on the objective/architecture while never changing the training data (almost always multiple epochs Wiki + Book Corpus) (including what I worked on in 1st year phd..)
@srush_nlp I think the correct answer is only partially technical and consists of 3 factors: 1. Decoder only has broader use cases 2. LLMs used to cost tens of millions 3. Sutskever was willing to make a bet on large decoder models
@srush_nlp Lot of great answers here. Not a direct answer but an answer why people prefer decoders (from a talk I gave recently). But it doesn't take a lot of training to transform a pretrained decoder to an encoder (just 500 steps), so it is unclear why one wants an encoder from scratch.
I remain unconvinced that encoders don’t scale, and it’s unfortunate that there isn’t more work on this given how much more efficient they are at a lot of things than decoders. The move away from encoders seems more sociological than scientific.
Lazy twitter: A common question in NLP class is "if xBERT worked well, why didn't people make it bigger?" but I realize I just don't know the answer. I assume people tried but that a lot of that is unpublished. Is the theory that denoising gets too easy for big models?
@srush_nlp Using pronoun resolution as a case study, we hypothesize a causal mechanism & show empirically, that denoising objs are generally less underspecified, less vulnerable to spurious correlations / hallucinations, w AR comps ranging up to GPT-4 turbo preview. ojs.aaai.org/index.php/AAAI…
@srush_nlp I've thought about this a lot, and have assumed it's more of a cultural/groupthink issue than anything else. BERT was eclipsed by GPT 2, and so the masses moved in that direction. I'd personally use an xBERT-flavored approach if I were interested in pushing SOTA.