Colin Raffel @colinraffel
nonbayesian parameterics, sweet lessons, and random birds. Friend of @srush_nlp
colinraffel.com
Joined March 2017
Tweets: 2K · Followers: 30K · Following: 655 · Likes: 3K
🚀 Introducing Pile-T5! 🔗 We (EleutherAI) are thrilled to open-source our latest T5 model trained on 2T tokens from the Pile using the Llama tokenizer. ✨ Featuring intermediate checkpoints and a significant boost in benchmark performance. Work done by @lintangsutawika, me…
I love music most when it’s live, in the moment, and expressing something personal. This is why I’m psyched about the new “DJ mode” we developed for MusicFX: aitestkitchen.withgoogle.com/tools/music-fx… It’s an infinite AI jam that you control 🎛️. Try mixing your unique 🌀 of instruments, genres,…
{UCSB|AI2|UW|Stanford|MIT|UofT|Vector|Contextual AI} present a survey on🔎Data Selection for LLMs🔍 Training data is a closely guarded secret in industry🤫with this work we narrow the knowledge gap, advocating for open, responsible, collaborative progress arxiv.org/abs/2402.16827
Crowd-sourcing human feedback for open-source LLMs? 💬🤖 Let's make it happen together! 💪 chromewebstore.google.com/detail/sharelm… W @LChoshen @AbendOmri
T5 Reunion! (@NoamShazeer was replaced by a sentinel token)
I'll be at #NeurIPS2023 supporting my collaborators who are presenting arxiv.org/abs/2306.01708, arxiv.org/abs/2305.16264, arxiv.org/abs/2302.00674, and neurips.cc/virtual/2023/p…. Find me to chat about decentralizing/democratizing/de-risking ML!
New preprint! Introducing MaTS - a new framework for merging individual task models into a multitask model by matching them in their task subspace Work done w/ @mohitban47 @colinraffel 📄 arxiv.org/abs/2312.04339 💾 github.com/r-three/mats 🧵 ⬇️
New blog post where I argue that "large language model development" can be considered a new subfield that grew out of deep learning, NLP, etc. and reflect on what to do when your field of study gives birth to a new one: craffel.github.io/blog/language-…
Also, I am 1000% hiring PhD students this round! If you want to work on - open models - collaborative/decentralized training - building models like OSS - coordinating model ecosystems - mitigating risks you should definitely apply! Deadline is Friday 😬 web.cs.toronto.edu/graduate/how-t…
Presenting ComPEFT 🗜! We compress parameter updates to facilitate efficient communication of expert models for compositional generalization. ComPEFT improves perf. 📈, while reducing storage/communication costs 📉 buff.ly/49Qaryo @LChoshen @colinraffel @mohitban47 🧵
Introducing RAD, a cheap and efficient method for using an auxiliary reward model for controlling text generation that can match the performance of methods that update the LM. 📝arxiv.org/abs/2310.09520 💾github.com/haikangdeng/RAD 🧵⬇️ 1/
Looking for a full-time role in AI / large language models (research/eng potentially focused on health). Know anyone to chat with? Please RT/forward my CV (jaan.io/cv)/DM/connect me. I've built large language models, have 1000+ citations (NeurIPS, ICML, AISTATS).
Our work on Data Augmentation for Learning from Limited Data has been accepted to #TACL! We are presenting it at #ACL2023 on Wed 11:00-12:30 in Session 7. Paper: transacl.org/index.php/tacl… Poster + Video: virtual2023.aclweb.org/paper_T4291.ht… @jiaao_chen @colinraffel @mohitban47 @Diyi_Yang
We just pushed a new update adding support for the (very impressive) safetensors library from our friends at @huggingface! Git-Theta's plug-in system meant that we spent more time waiting on CI/CD than actually adding support (I'll get off my soapbox now 🧼📦).
(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K Following
Soumith Chintala @soumithchintala
185K Followers 871 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Delip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Lucas Beyer (bl16) @giffmana
56K Followers 444 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Sasha Rush @srush_nlp
51K Followers 463 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGz
Kyunghyun Cho @kchonyc
60K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Rosanne Liu @savvyRL
32K Followers 965 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Kevin Patrick Murphy @sirbayes
42K Followers 328 Following Research Scientist at Google Brain / Deepmind. Interested in Bayesian Machine Learning.
Yi Tay @YiTayML
28K Followers 97 Following Chief scientist & Co-founder @RekaAILabs past: Research Scientist @Google Brain 🧠 currently learning to be a dad 🍼👶
Gautam Kamath @thegautamkamath
44K Followers 502 Following Assistant Prof of CS @UWaterloo, Faculty @VectorInst, Canada @CIFAR_News AI Chair. Co-EiC @TmlrOrg. I lead @TheSalonML. Privacy, robustness, machine learning.
Dan Roy @roydanroy
45K Followers 2K Following Research Director, @VectorInst. Canada CIFAR AI Chair. Associate Professor of Stats/CS @UofT. I study machine learning and AI, emphasis on theory.
Sander Dieleman @sedielem
50K Followers 2K Following Research Scientist at Google DeepMind. I tweet about deep learning (research + software), music, generative models (personal account).
Sam Bowman @sleepinyourhat
35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
Sara Hooker @sarahookr
39K Followers 7K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.
Graham Neubig @gneubig
30K Followers 582 Following Associate professor at CMU, studying natural language processing and machine learning.
Tim Dettmers @Tim_Dettmers
28K Followers 819 Following PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
DataQu @DataQuChile
131 Followers 484 Following We are a Machine Learning, AI, and Software Development company based in Santiago Chile and Toronto Canada. Building the tomorrow solutions today!
Josh Estelle @jestelle
1K Followers 997 Following Google Search. Search Labs. Engineer. Previously: Google Translate, Material Design. https://t.co/csni6mN2Se
padma reddy @padmareddy34726
141 Followers 3K Following
Abdulrahman Tabaza @embed_dim
1 Followers 447 Following Enjoyer of various vector spaces and modalities
Will @solidwillity
117 Followers 1K Following
Kiriaki Carmelita @KiriakiCar55362
102 Followers 2K Following Check out EA Gacha Drops – a collection of exclusive NFTs
A.I. Walker @andreww35798698
48 Followers 56 Following
Andrei Nicolicioiu @anNicolicioiu
416 Followers 602 Following PhD student @Mila_Quebec researching robustness to distribution shifts. Previously: @MPI_IS, ML Researcher at @Bitdefender. https://t.co/Cyy1hSjGlK
Vitamins DK @DonkeyVitamin
231 Followers 3K Following offer vitamins and supplements https://t.co/6lX91v1au9 https://t.co/zo3V69PMED https://t.co/sQmtttMy3w https://t.co/4AsuEubyqN https://t.co/wNZJnwTH4t https://t.co/QTd4LCNYKe https://t.co/tlG6l5uHd5 https://t.co/P7PbTJ3ntJ 🌱
Tomasz @ketomke
39 Followers 92 Following
Lorenzo Cesconetto @l_cesconetto
61 Followers 168 Following cesconetto.eth Web2 + Web3 Developer #blockchain #DeFi #solidity #EVM #smartContracts
eigenome @eigenome
38 Followers 54 Following
Jan Gray @jangray
5K Followers 2K Following computing with FPGAs; ex MS dev tools architect; https://t.co/raSpGdhYZ3; https://t.co/xpX0gKABG6; https://t.co/i8BjreGEmx; co-chair RVI SIG-SOFT-CPU
Rohan Dharurkar @RohanCodez
1 Followers 148 Following Software Engineer 🧑💻| Avid Learner 📚 | .Net Full Stack Developer
Fantastic_618 @Fantastic68190
6 Followers 798 Following
Bot man @botyetnotbot
28 Followers 274 Following
Christina Baek @_christinabaek
772 Followers 228 Following PhD student @mldcmu | Past: intern @GoogleAI
Abdullah Aziz @abdullahaziz03
5 Followers 59 Following
Ankur Sikarwar @sikarwar_ank
65 Followers 309 Following Researcher @astar_research. Prev RA at Kreiman Lab @Harvard | @MSFTResearch. Vision + Language | Cognitive Science.
Zhang Lei @zhangl
22 Followers 319 Following
Zhenting Wang @wang1999_zt
62 Followers 211 Following PhD Student @RutgersCS. Trustworthy and Responsible Generative Artificial Intelligence. Intern @SonyAI_global (current) @Meta GenAI (incoming)
daniel (e/acc) ⚡ @luckfvoursme
368 Followers 6K Following Creator / Mechatronics engineer 🌌🛰️🌱 / Value trader 🐱👤
Peter @Jolene45697641
132 Followers 3K Following
Daniel Varab @DanielVarab
252 Followers 482 Following NLP Researcher currently at @DFKI Berlin exploring the design side of LLMs. ex. @ITUkbh, @NLPNorth, @novonordisk, @EdinburghNLP
Mattlc @mattlecauchois
136 Followers 776 Following
Aitor Ormazabal @aormazabalo
231 Followers 149 Following Member of Technical Staff @RekaAILabs. Prev. PhD @ixaGroup, Research Intern @Aiatmeta FAIR
Ahmed Moubtahij @TheAyenem
186 Followers 1K Following ing., NLP scientist @CRIM_ca ''Focus is a matter of deciding what things you're not going to do'' - John Carmack. Opinions my own.
Thomas Schranz 🍄 @__tosh
9K Followers 9K Following building 🍓 Jam, an open source Clubhouse (@jam_systems)
Wei Liu @WeiLiu99
163 Followers 329 Following NLP ML Studying NLP now at @ShanghaiTechUni #NLProc | Incoming PhD @hkust | Prev. @AlibabaGroup
Mallika @Mallika014
0 Followers 164 Following
DailyHealthcareAI @aipulserx
19 Followers 204 Following 🚀 Daily AI healthcare updates compiled from 100+ sources (and growing)
Priya Goyal @priy2201
1K Followers 497 Following Founding member @datologyai, ex-Google Deepmind, ex-Facebook AI Research (FAIR).
Chelsea🤍♍️ @SkincareLashart
78 Followers 1K Following •Business Owner•Master Esthetician•Lash Artist•Cosmotologist -add me on Insta: @esthelashhair @chelsealieberman
Yutong Zhang @YutongZhan56829
3 Followers 76 Following
Jenna Russell @jennajrussell
2 Followers 77 Following Incoming Cs PhD Student @umass advised by @MohitIyyer, currently @BankofAmerica NLP, formerly @CornellCIS
TenthLine49 @TenthLine49
5 Followers 2K Following
Zachary Lipton @zacharylipton
59K Followers 2K Following Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Daniel Johnson @_ddjohnson
1K Followers 564 Following Researcher at @GoogleDeepMind. PhD student at @VectorInst / @UofT. Building tools to study neural nets and find out what they know. He/him.
OpenBMB @OpenBMB
650 Followers 101 Following OpenBMB (Open Lab for Big Model Base), founded by @TsinghuaNLP & ModelBest Inc (面壁智能), aims to build foundation models and systems towards AGI.
SynthLabs @synth_labs
12K Followers 43 Following AI Aligned with Your Vision. We’re doing cutting edge research for transparent, auditable AI alignment.
nisten @nisten
10K Followers 5K Following fullstack-dev democratizing intelligence @skunkworks_ai | 🦝.ai | prev https://t.co/68jAlAVBKR |
Arash Ahmadian @aahmadian_
934 Followers 532 Following Preference Training & RL @Cohere @CohereForAI, researcher @VectorInst ece @uoft
Joel Jang @jang_yoel
931 Followers 478 Following PhD student @uwcse. Research Intern at @nvidiaai robotics. Prev: @allen_ai
Yeganeh Kordi @yeganekordi
166 Followers 374 Following PhD Student @BrownCSDept | NLP/AI/ML | Exploring alignment, instruction-following & Generalization
Ross Wightman @wightmanr
18K Followers 1K Following Computer Vision @ 🤗. Ex head of Software, Firmware Engineering at a Canadian 🦄. Currently building ML, AI systems or investing in startups that do it better.
Luca Soldaini 🎀 @soldni
6K Followers 1K Following I like tokens! Lead for OLMo data team at @allen_ai, open source science fan, @QueerInAI organizer 🤖☕️🍕they/them
Volodymyr Kuleshov.. @volokuleshov
8K Followers 998 Following AI Researcher. Prof @Cornell & @Cornell_Tech. Co-Founder @afreshai. PhD @Stanford.
Nathan Lambert @natolambert
25K Followers 685 Following Figuring out AI @allen_ai, "rl boi" DM me papers. Writes @interconnectsai, talks @retortai Has phd and some credentials
Omar Sanseviero @osanseviero
31K Followers 2K Following Chief Llama Officer @huggingface 🦙 Founder @AI_Learners. Xoogler (SWE @Google Assistant, 20% PM TF Graphics). 100% Hacker Llama🇵🇪🇲🇽
Maxime Labonne @maximelabonne
12K Followers 433 Following Author of Hands-On Graph Neural Networks https://t.co/Q8victWUmR • Machine Learning Scientist
Naomi Saphra @nsaphra
7K Followers 1K Following Waiting on a robot body. ML/NLP. All opinions are universal and held by both employers and family. Same username on every lifeboat off this sinking ship.
Birchlabs @Birchlabs
4K Followers 172 Following ML Engineer at Anlatan (@novelaiofficial). co-author of HDiT (Hourglass Diffusion Transformers). works on diffusion models and LLMs. Studying Japanese.
Abhi Venigalla @abhi_venigalla
5K Followers 1K Following Researcher @Databricks. Former @MosaicML, @CerebrasSystems. Addicted to all things compute.
Jacob Steinhardt @JacobSteinhardt
7K Followers 67 Following Assistant Professor of Statistics, UC Berkeley
Dan Fu @realDanFu
4K Followers 176 Following CS PhD Candidate at Stanford, systems for machine learning. Sometimes YouTuber/podcaster. Academic Partner, @togethercompute.
Simran Arora @simran_s_arora
2K Followers 216 Following CS PhD student at @StanfordAILab @hazyresearch
main @main_horse
8K Followers 464 Following AGI Believer. Haven't applied @OpenAI. Likes are not always endorsement.
Jonathan Frankle @jefrankle
16K Followers 685 Following Chief Scientist, Neural Networks @Databricks via MosaicML. PhD @MIT_CSAIL. BS/MS @PrincetonCS. DC area native. Making AI efficient for everyone at @DbrxMosaicAI
Weijia Shi @WeijiaShi2
5K Followers 963 Following PhD student @uwcse @uwnlp | Visiting Researcher @MetaAI | Undergrad @CS_UCLA | https://t.co/eLBQmgkvym
Ola Piktus @olapiktus
1K Followers 395 Following
Demi Guo @demi_guo_
22K Followers 694 Following Co-founder & CEO @pika_labs | ex @StanfordAILab @Harvard
Mark Dredze @mdredze
4K Followers 786 Following John C Malone Professor at @JohnsHopkins @JHUCompSci @jhuclsp @jhumceh; Part time @techatbloomberg (tweets my own) Mastodon @[email protected]
Edoardo Ponti @PontiEdoardo
2K Followers 389 Following Assistant Professor in #NLP at @EdinburghUni and affiliated lecturer @Cambridge_Uni | Modular deep learning
Teknium (e/λ) @Teknium1
28K Followers 3K Following Cofounder @NousResearch, prev @StabilityAI Github: https://t.co/LZwHTUFwPq HuggingFace: https://t.co/sN2FFU8PVE Support me on Github Sponsors
Alignment Lab AI @alignment_lab
11K Followers 3K Following Devoted to addressing alignment. We develop state of the art open sourced AI. https://t.co/6aJDLUvuU5
Eric @ericmitchellai
4K Followers 487 Following I like AI & music. Working on making LLMs easier & safer to use. Final year PhD student at Stanford advised by Chelsea Finn & Chris Manning.
typedfemale @typedfemale
23K Followers 480 Following a really exciting new account "have you ever though you might be like scott alexander? very smart, but can't do math" - anon
Swaroop Mishra @Swarooprm7
5K Followers 894 Following Research Scientist @GoogleDeepMind (Gemini). Pioneering LLM Research 🔥. Instruction tuning, Factuality, Reasoning and next gen Product. Opinions my own.
Elizabeth Salesky @esalesk
1K Followers 656 Following PhD student @jhuclsp more commonly known as Liz ☀️ Friend of @NLPwithFriends ☀️ I like bubbles, bicycles, and language variation
Jessy Li @jessyjli
3K Followers 898 Following Associate Professor @UT_Linguistics, computational linguistics and #NLProc
Joan Puigcerver @joapuipe
865 Followers 375 Following Software Engineer in Research at Google DeepMind, Zürich.
Ana Marasović @anmarasovic
4K Followers 602 Following Asst prof @UUtah · Ex @allen_ai @uwnlp postdoc @HD_NLP PhD · she/her 🇭🇷
Robin Jia @robinomial
3K Followers 751 Following Assistant Professor @CSatUSC | Previously Visiting Researcher @facebookai | Stanford CS PhD @StanfordNLP
Lianhui Qin @Lianhuiq
4K Followers 393 Following Incoming Assistant Professor at UCSD CSE. Currently postdoc at AI2 Mosaic. NLP, ML, AI. I’m recruiting PhD students.
Polina Kirichenko @polkirichenko
3K Followers 1K Following PhD student at New York University, Visiting Researcher at @MetaAI FAIR Labs 🇺🇦
Bernhard Schölkopf @bschoelkopf
14K Followers 60 Following
Gintare Karolina Dziu.. @gkdziugaite
4K Followers 107 Following Sr Research Scientist at Google DeepMind, Toronto. Member, Mila. Adjunct, McGill CS. PhD Machine Learning & MASt Applied Math (Cambridge), BSc Math (Warwick).
Yao Fu @Francis_YAO_
13K Followers 2K Following PhD @EdinburghNLP on LLMs and Machine Reasoning. Ex. @Columbia @PKU1898 @MITIBMLab @allen_ai AGI has yet to come, so keep running
Vedant Misra @vedantmisra
2K Followers 292 Following AI researcher @DeepMind (Gemini, Minerva, PALM) | Alum @OpenAI (Codex, Grokking) | @HubSpot | Founder/CEO Kemvi (acq HUBS) | Physics @Columbia
Adams Wei Yu @AdamsYu
900 Followers 448 Following Work on large language models @GoogleDeepMind (prev. Brain) and lead the multimodality efforts in Bard (a.k.a. Multibard); PhD @mldcmu.
Excited to announce that I’ll be joining @LTIatCMU this fall to work with @gneubig. Huge thanks to my recommendation writers @BlancheMinerva, @colinraffel, @arankomatsuzaki and @rmahendrarm and to my collaborator @haileysch__ at @AiEleuther for her invaluable support.
*thinking real hard about what data structure i should use in pytorch* hmm... i think i'll go with a tensor this time
✨New #CHI2024 Paper How might we empower communities to curate evaluation datasets for AI that impacts them? We present Wikibench, a system that enables communities to collaboratively curate AI datasets, while navigating ambiguities and disagreements through discussion. (1/9)
🚨Which 𝘽𝙖𝙨𝙚 LLM is the best? Presenting 📊𝗨𝗥𝗜𝗔𝗟-𝗕𝗲𝗻𝗰𝗵! Alignment fine-tuning adds too many factors (e.g., lr, data, adapters,...) and it's hard to control all of them for fairly comparing various base LLMs. MMLU/BBH can indeed test base models but their examples…
High-quality human feedback for RLHF is expensive 💰. AI feedback is emerging as a scalable alternative, but are we using AI feedback effectively? Not yet; RLAIF improves perf *only* when LLMs are SFT'd on a weak teacher. Simple SFT on a strong teacher can outperform RLAIF! 🧵->
Finally, very grateful for feedback/discussions w/ @ShayneRedford @AkariAsai @jxmnop @colinraffel @srush_nlp @xiamengzhou @gaotianyu1350 @danqi_chen @Fluke_Ellington @lateinteraction @tengyuma @HongLiu9903 & others ♥️ 12/12
You likely missed it if you only follow ML Twitter, but there's a series of mind-blowing tech reports and open-source models coming from China (DeepSeek, MiniCPM, UltraFeedback...) with so many lessons learned and experiments openly shared, together with models, data, etc. This…
PHATGOOSE colinraffel.com/blog/phatgoose… Imagine as the AGI system is interacting with the environment, it is slowly training a new gated LoRA for the day. At certain intervals the gated LoRA is added to the pool to incorporate new knowledge. This seems much better than a vector db.
How can we recycle specialized PEFT modules to create a generalist MoE-style model? We introduce PHATGOOSE, which learns a post-hoc routing scheme and significantly improves zero-shot generalization. 📜 arxiv.org/abs/2402.05859 📝 colinraffel.com/blog/phatgoose… 💾 github.com/r-three/phatgo…
@colinraffel @Muqeeth10 @liu_haokun Wow very interesting direction - definitely going to try this out! :D
*Learning to Route Among Specialized Experts for Zero-Shot Generalization* by @Muqeeth10 @liu_haokun @colinraffel Cool paper! A simple strategy to combine PEFT updates (e.g., LoRA) to obtain potential zero-shot generalization. arxiv.org/abs/2402.05859
@colinraffel Congrats on the amazing work🎉! I was eagerly anticipating something like this when we were building LoraHub. So thrilled to see it finally here!
Smart idea, similar vision: “Our vision is to establish a platform for LoRA modules, empowering users to share their trained LoRA modules. This collaborative approach facilitates the seamless application of LoRA modules to novel tasks” (🔗LoraHub: arxiv.org/abs/2307.13269)
@andersonbcdefg @colinraffel I’m just here for the phat geese
Mmh could we use that for SD as well 🤔 Let’s make a single "Mixture of CivitAI" model!
Time for foie gras distillation. (Great paper btw)
huge news next step literally all of arxivs adapters