Will Merrill @lambdaviking
Ph.D. student @ NYU🗽 Theoretical aspects of NLP and LMs /nætʃɹəl/🇮🇸 + formal🤵 languages + TCS🧮 lambdaviking.com New York, NY Joined October 2011
Tweets 1K · Followers 2K · Following 570 · Likes 4K
Was fun working on this! The cool takeaway imo is that we can characterize the type of reasoning that blank tokens can help with… it’s reduced compared to CoT but experiments show it’s likely more than with no extra tokens
Do models need to reason in words to benefit from chain-of-thought tokens? In our experiments, the answer is no! Models can perform on par with CoT using repeated '...' filler tokens. This raises alignment concerns: Using filler, LMs can do hidden reasoning not visible in CoT🧵
The cost of linear dynamics.
here are the slides I made about Understanding, some keywords: entailment, the appearance of understanding, systematicity, pragmatics, reference drive.google.com/file/d/1dfT9p-…
RNNs are not dead yet‼️ In fact, they are coming back with a vengeance recently. Very nice paper about “The Illusion of State in State-Space Models” (arxiv.org/abs/2404.08819) and thread 👇
Turns out SSMs like S4 and S6 don't quite get the best of both worlds -- sequential and parallel -- and struggle to track state just like Transformers. Excited to share the "Illusion of State" paper w/ @lambdaviking, @jowenpetty ! arxiv.org/abs/2404.08819
ICYMI @benbenbrubaker wrote an eloquent Quanta article✍️covering our ICLR-2024 paper (w/ @lambdaviking) on how the expressive power of transformers changes with the length of CoT! Recently updated paper📜at arxiv.org/abs/2310.07923
arxiv.org/abs/2404.08819 a nice study by Merrill, Petty & Sabharwal. it looks like i won't have to wait too much longer for the reinvention of LSTM/GRU by LLM bros.
How does Galois theory help show that the state-tracking capabilities of current (!) SSMs are illusions? What makes S5 & A5 “hard”? And why do we consider A5 & friends here instead of S5? A thread on the algebra behind our paper! x.com/lambdaviking/s…
@srush_nlp @aryaman2020 @tanshawn @lambdaviking @akyurekekin In our case, we don't argue the recall task is sufficient but (independent of the Olsen blog) show it is an empirically important slice of language modeling. We performed Pile / downstream error analysis and showed recall accounted for 80%+ of the overall ppl difference between…
@jowenpetty @sigfig @lambdaviking do you have any idea how much easier this would have made semantics
@sigfig @dickcheneyavi hmmm @lambdaviking okay maybe I’m coming around on the idea that we should actually require all semantics classes to be co-taught in haskell, if only to improve the job prospects of ling grads
In a recent talk, the speaker referred to human behavior as "agentic" which made me realize I have no idea what that word means.
costar and anduril both being haskell shops is one of the funniest PL horseshoe theory things i can imagine
@srush_nlp The first two sentences weren't in the training data
only country to ever win the super bowl 🫡🇺🇸🦅
@AlwaysLegitWhit Lol, give me one single metric by which the USA is the 'greatest country in the world'.
@DimitrisPapail @lambdaviking It's not obvious that such tokens should help. For example, in practice, we found mixed results if we were to simply finetune with such dummy tokens. (also see our argument in the intro here arxiv.org/abs/2310.02226 where we explain why it may help/hurt/do nothing)
Time is a rebranded flat circle moment: -it is 2017, I am writing about how text simulations are necessary for AI training -it is 2024, I am writing about how text s̵i̵m̵u̵l̵a̵t̵i̵o̵n̵ synthetic data is necessary for AI training
Why circuit complexity theory for LLMs might matter, why it might not
@BrianKitano good to remember that there's only so much magic that can come out of ~O(n*d+d^2) flops per generated token. well unless you generate a lot of them i guess.
A Catholic group launched an AI priest but quickly defrocked “Father Justin” after he claimed to be a real member of the clergy. The Padre advised one member of his flock that it was okay to baptize his child in Gatorade. catholicherald.co.uk/ai-priest-gets…
@lambdaviking @DimitrisPapail oh, yeah, totally agreed! we don't touch upon length generalization at all: we fix the task complexity to be something finite and argue how that's implementable iff proportional number of dummy tokens are introduced.
@lambdaviking @DimitrisPapail (ha, np, we added it during the ICLR review process in Nov but uploaded it to arxiv only in March and didn't post on twitter yet. I'd have missed it too!)
@DimitrisPapail @lambdaviking Wonder what you think of the initial theory in Sec J of the pause tokens paper (arxiv.org/abs/2310.02226). The argument is that if the attention layer is sufficiently parameterized, additional dummy tokens can help the model realize a larger number of parallel computations.
Talk: "OLMo: Findings of Training an Open LM" from Hanna Hajishirzi at AI2 from OSGAI. Extremely interesting overview of the 4 parts (Data, Training, Adaptation, Eval) of the OLMo open LLM project. Rare insight into how these processes work at scale. youtube.com/watch?v=qFZbu2…
Life update: today is my first day as a Member of Technical Staff at @cohere!
@lambdaviking but why not call it "Chain of Dots?"