Artidoro Pagnoni @ArtidoroPagnoni
PhD student in NLP at UW with Luke Zettlemoyer. Seattle, WA. Joined September 2015.
Tweets: 219
Followers: 799
Following: 427
Likes: 1K
Easily Fine-tune @AIatMeta Llama 3 70B! 🦙 I am excited to share a new guide on how to fine-tune Llama 3 70B with @PyTorch FSDP, Q-Lora, and Flash Attention 2 (SDPA) using @huggingface build for consumer-size GPUs (4x 24GB). 🚀 Blog: philschmid.de/fsdp-qlora-lla… The blog covers: 👨💻…
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length abs: arxiv.org/abs/2404.08801 repo: github.com/XuezheMax/mega…
When augmented with retrieval, LMs sometimes overlook retrieved docs and hallucinate 🤖💭 To make LMs trust evidence more and hallucinate less, we introduce Context-Aware Decoding: a decoding algorithm improving LM's focus on input contexts 📖 arxiv.org/pdf/2305.14739… #NAACL2024
I'm excited that my PhD student, Brendon Boldt (not on X) will be presenting his paper “XferBench: a Data-Driven Benchmark for Emergent Language” in the main session at NAACL this year.
Happy to share REPLUG🔌 is accepted to #NAACL2024 We introduce a retrieval-augmented LM framework that combines a frozen LM with a frozen/tunable retriever. Improving GPT-3 in language modeling & downstream tasks by prepending retrieved docs to LM inputs. 📄:…
Excited to share a new preprint on the 🩴FlipFlop Effect. We prompt LLMs with a classification task, and challenge the model by following up with “Are you sure?”. The model can confirm or flip its answer. The results? More flips than a gymnastics competition! 🤸♂️ 1/N
Expert language models go multilingual! Introducing ✨X-ELM✨(Cross-lingual Expert Language Models), a multilingual generalization of the BTM paradigm to efficiently and fairly scale model capacity for many languages! Paper: arxiv.org/abs/2401.10440
Happy to share In-Context Pretraining 🖇️ is accepted as an #ICLR2024 spotlight. We study how to pretrain LLMs with improved context understanding ability paper📄: arxiv.org/pdf/2310.10638… code: github.com/swj0419/in-con…
We made a QLoRA promo video for @UWITNews. It is a very nice summary of the motivation behind QLoRA and what the environment was like to develop this research. @uwcse is a perfect place for doing such research! Article: itconnect.uw.edu/making-languag… Youtube: youtube.com/watch?v=G8mn5S…
Today, I will give a talk about "The making of QLoRA" at the LLM Efficiency Challenge at 2:30pm, Room 356. I will also talk a bit about how I go about doing research, running experiments and figuring out "what works". neurips.cc/virtual/2023/c…
Is Gemini as good at complex factual reasoning as GPT4? 🤔 We use our SummEdits benchmark recently presented at #EMNLP2023 to find out! Overall, not bad. ✍️ Gemini-pro gets 75.5%, beating Claude 2.1 (74.4) & Bison (69.0), but still underperforms GPT4 (82.4) and human perf (90.9)
#NeurIPS2023 Join us at the RegML Workshop (📅 Sat, Dec 16, 1:00-1:35 PM, Room 215-216). @YangsiboHuang and @xiamengzhou will present our work "Detecting Pretraining Data in Large Language Models". 🔗: swj0419.github.io/detect-pretrai…
Extremely honored and grateful to MIT Tech Review for including me in the 2023 cohort of Innovators Under 35 from the MENA region. Big thanks to my mentors @dkroy and Jackie Cheung, my collaborators, friends and family who have supported me along the way! technologyreview.ae/%D8%A7%D9%84%D…
Catch the QLoRA oral today Hall C2 3:55pm or find @ArtidoroPagnoni @universeinanegg and me at the QLoRA poster session Hall B1+B2 #524 at 5:15pm. Looking forward to seeing you there! x.com/tim_dettmers/s…
Ok I’m at NeurIPS to talk about our in-person competition workshop on LLM efficiency on Friday Dec 15 between 1:30 - 4:30 pm CST. Competitors had to fine-tune 1 LLM in 1 day on 1 GPU and the reception was incredible. This was one of the most popular ML competitions of the year.…
Now that Mixtral 8x7b is available in🤗Transformers, you might be wondering what the heck is a Mixture of Experts? We wrote an explainer on: ❓ How MoEs differ to conventional Transformers 🏋️♀️ How they're trained 🏎️ Subtleties with inference Enjoy! huggingface.co/blog/moe
Come to our QLoRA poster tomorrow at 5:15! If you're not already using QLoRA you're missing out—you can finetune huge models on a single GPU in a day. Details: neurips.cc/virtual/2023/p…
Shruti Rijhwani @shrutirij
4K Followers 499 Following * Research Scientist @GoogleDeepMind * #NLProc research * PhD from @LTIatCMU * Amateur woodworker, scuba diver, foosball player
Kayo Yin @kayo_yin
8K Followers 560 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵
Akari Asai @AkariAsai
11K Followers 650 Following Ph.D. student @uwcse & @uwnlp. NLP. IBM Ph.D. fellow (2022-2023). Meta student researcher (2023-). ☕️ 🐕 🏃♀️🧗♀️🍳
Danish Pruthi @danish037
7K Followers 628 Following Faculty at Indian Institute of Science, Bangalore. PhD from @LTIatCMU.
Siddharth Dalmia @siddalmia05
1K Followers 445 Following Research Scientist @GoogleDeepmind | #SpeechProc and #NLProc | PhD from @LTIatCMU @SCSatCMU | Ex-intern @GoogleAI, @AWSCloud, @FacebookAI
Graham Neubig @gneubig
31K Followers 588 Following Associate professor at CMU, studying natural language processing and machine learning.
Ofir Press @OfirPress
10K Followers 3K Following I build tough benchmarks for LMs and then I get the LMs to solve them. Postdoc @Princeton. PhD from @nlpnoah @UW. Ex-visiting researcher @MetaAI & @MosaicML.
Tim Dettmers @Tim_Dettmers
29K Followers 821 Following PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Language Technologies.. @LTIatCMU
9K Followers 233 Following The Language Technologies Institute in Carnegie Mellon University's @SCSatCMU
Zachary Lipton @zacharylipton
59K Followers 2K Following Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Luyu Gao @luyu_gao
1K Followers 241 Following PhD candidate @CarnegieMellon @LTIatCMU On the job market for full-time industry position.
Greg Durrett @gregd_nlp
6K Followers 752 Following CS professor at UT Austin. I do NLP most of the time. he/him
Shaily @shaily99
5K Followers 2K Following PhD @LTIatCMU Prev: @GoogleAI @MSFTResearch. Working on ethics and evaluation in #NLProc. Usually ranting, often about research & DEI. 📚 @readsndrants
Mike Lewis @ml_perception
6K Followers 227 Following Llama3 pre-training lead. Partially to blame for things like the Cicero Diplomacy bot, BART, RoBERTa, kNN-LM, top-k sampling & Deal Or No Deal.
Antonis Anastasopoulo.. @anas_ant
3K Followers 2K Following Assist. Prof at George Mason CS #nlproc MT, ASR, and documentation of endangered languages.
Pasquale Minervini.. @PMinervini
7K Followers 4K Following Researcher in ML/NLP at the University of Edinburgh (faculty @InfAtEd @EdinburghNLP), @ELLISforEurope, @UCL_NLP, PI for @Clarify2020, https://t.co/WydvfU8ugz he/they
Mengzhou Xia @xiamengzhou
3K Followers 621 Following PhD student @princeton_nlp, MS @CarnegieMellon, Undergrad at Fudan.
Melanie Sclar @melaniesclar
2K Followers 412 Following PhD student @uwnlp @uwcse | Visiting Researcher @MetaAI FAIR Labs | Prev. Lead ML Engineer @asapp, intern @LTIatCMU | 🇦🇷
Bearllairt @bearllairt29457
0 Followers 107 Following
Tearthew @TearthewptNJ
0 Followers 74 Following
LetitiaMaud @k2JjdHs8K0e9kQY
0 Followers 96 Following
Millicent @ishikos36911530
0 Followers 748 Following
Beatrice 🍌 @Beatrice7325
7 Followers 492 Following Promiscuous siren driven by an unstoppable lust for pleasure
Linz @lin72h
178 Followers 4K Following Someday I'm gonna make great machines that fly. And me and my friends are gonna go flying together, into the forever and beautiful sky.
Nathan Benaich @nathanbenaich
51K Followers 32K Following solo member of investment staff @airstreet, brewing ambition @airstreetcafe, next token predictor @airstreetpress
Pensé FFun @inftyCategory
99 Followers 6K Following
Steven Saito @ssaito
56 Followers 799 Following due to great firewall and laziness, i post more on weibo http://t.co/8cSSghiPD1
Rashmi🌱 @iamrashminagpal
388 Followers 5K Following (Machine Learning) Engineer & Research Affiliate at USF 👩💻📚💜 Founder @WomenWhoGoDelhi
Arif Ahmad @arif_ahmad_py
276 Followers 7K Following All things AI, Computer Science and Circuits! Prev. @GoogleAI
Yiting Qiang @qiangyt0526
1 Followers 21 Following
Joe Stacey @_joestacey_
569 Followers 1K Following PhD student at Imperial and Apple Scholar. I love running, NLP and travelling (in no particular order). Ex teacher and PwC Consultant. #NLProc
Y. Asmara @Y_asmara23
4 Followers 281 Following Passionate about embracing new experiences and pushing boundaries. Join me as I navigate life's adventures and share my insights along the way.
403-error @SiyuYeAndy
2 Followers 231 Following LLM Engineer @ByteDance; AI Gamer; Backend Developer. Currently a ByteDance engineer, scrolling Twitter in the gaps between overtime shifts
Vincenzo Incutti @enzo__inc
149 Followers 970 Following Home: https://t.co/8OR9H2kE9N - EF LD Grad 23 @join_ef - AI & SWE @EdinburghUni - FinTech @uclcbt - Previously @polyaivoice
Alexander Wan @alexwan55
475 Followers 944 Following CS at Berkeley; @BerkeleyML @BerkeleyNLP; NLP research
Emanuel Steger @em4nue1
91 Followers 337 Following Developing web and voice apps for startups and corporates. This includes Web/iOS/Android apps and Alexa Skills.
Aria-rose Prazeres @AriaPrazer95439
93 Followers 5K Following
Regena Cauthron @CauthrRegen
53 Followers 5K Following
Learning in Public - .. @motherofdata
162 Followers 3K Following #notetaking account. Retweets educational threads on Python, ML & Data analysis; post videos of my learnings & revisions. #learninginpublic #publicnotes
Shelan de Livera @shelandelivera
23 Followers 1K Following Cyber security and DevOps professional with a love for programming and machine learning: IT Architectures | Deployments | Linux | SIEM/SOC | Python
Nick @2187Nick
621 Followers 1K Following Builder, Trader, Always Learning, Mystic Live GEX: https://t.co/zbw0eBVzC6 https://t.co/EcOgCjEjQ1 Goal: Build and Ship a project every week.
daniel leo @doubletao
0 Followers 766 Following
Madeline Ka @madeline_k91630
81 Followers 5K Following
Seesmesm @seesmesm68908
64 Followers 185 Following
juan vásquez @juanmvsa
326 Followers 2K Following inside of me there are two (gay) wolves : phd student in computer science at @CUBoulder / letterboxd filmbro (https://t.co/GAOYaQWm4Z)
Matt Valoatto @mvaloatto
2K Followers 646 Following Entrepreneur, designer, investor @huggingface 🤗, @deforum_art, @talktomem1, wingmate / interested in AI, design, art, tech, science / happy dad of 2
Nirmal Senthilnathan @NirmalSenthil00
4 Followers 17 Following
love my life @xirideqinwen
8 Followers 220 Following
konichiwa @konichiwaai
9 Followers 2K Following
Roel Van de Paar @RoelVandePaar
710 Followers 302 Following
Soose @Soose1159612
0 Followers 25 Following
Trista @thysheth84717
1 Followers 257 Following You're the only person in the world who understands me
Magno Felipe {{ softw.. @magnokf
108 Followers 378 Following Military firefighter 🚒, PHP Developer 🐘 & VueJs | ReactJs, 👨🎓 software engineering student and surfer when there are waves 🌊🏄♂️
sidharthtalia @sidharthtalia
22 Followers 59 Following C.S. Ph.D. with a focus on robotics at the University of Washington, advised by Dr. Siddhartha Srinivasa. Interested in getting robots to really work.
Shitian Zhao @zst96687522
9 Followers 313 Following Senior Undergrad, ECNU @ECNUER Previous Intern @ CCVL @JohnsHopkins Intern @ Shanghai AI Lab
Jinghua Zhong @zhongjinghua
0 Followers 3K Following
Shruti Rijhwani @shrutirij
4K Followers 499 Following * Research Scientist @GoogleDeepMind * #NLProc research * PhD from @LTIatCMU * Amateur woodworker, scuba diver, foosball player
(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K Following
Kayo Yin @kayo_yin
8K Followers 560 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵
Akari Asai @AkariAsai
11K Followers 650 Following Ph.D. student @uwcse & @uwnlp. NLP. IBM Ph.D. fellow (2022-2023). Meta student researcher (2023-). ☕️ 🐕 🏃♀️🧗♀️🍳
Danish Pruthi @danish037
7K Followers 628 Following Faculty at Indian Institute of Science, Bangalore. PhD from @LTIatCMU.
Yoav Artzi @yoavartzi
13K Followers 162 Following Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry7yHs / https://t.co/3VmRSyYm2d / asso. faculty director @arxiv / building https://t.co/f9QkzO5kaC
Yann LeCun @ylecun
712K Followers 719 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
Siddharth Dalmia @siddalmia05
1K Followers 445 Following Research Scientist @GoogleDeepmind | #SpeechProc and #NLProc | PhD from @LTIatCMU @SCSatCMU | Ex-intern @GoogleAI, @AWSCloud, @FacebookAI
Graham Neubig @gneubig
31K Followers 588 Following Associate professor at CMU, studying natural language processing and machine learning.
Leo Boytsov @srchvrs
7K Followers 2K Following Sr. Research Scientist @AWS Labs (ph-D @LTIatCMU) working on unnatural language processing, speaking πtorch & C++. Opinions sampled from MY OWN 100T param LM.
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Sam Bowman @sleepinyourhat
35K Followers 3K Following AI alignment + LLMs at NYU & Anthropic. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
Divyansh Kaushik @dkaushik96
4K Followers 3K Following Emerging tech and national security. DC/PGH. “An imported Indian immigrant,” @BreitbartNews.
Ofir Press @OfirPress
10K Followers 3K Following I build tough benchmarks for LMs and then I get the LMs to solve them. Postdoc @Princeton. PhD from @nlpnoah @UW. Ex-visiting researcher @MetaAI & @MosaicML.
Luca Soldaini 🎀 @soldni
6K Followers 1K Following I like tokens! Lead for OLMo data team at @allen_ai (Dolma 🍇), OSS is fun, @QueerInAI organizer 🤖☕️🍕they/them (views mine, not my employer’s)
AK @_akhaliq
310K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Tim Dettmers @Tim_Dettmers
29K Followers 821 Following PhD Student at @UW. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Andrej Karpathy @karpathy
979K Followers 905 Following 🧑🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥
Matt Shumer @mattshumer_
51K Followers 1K Following CEO @HyperWriteAI, @OthersideAI - I make AIs do the impossible.
Liwei Jiang @liweijianglw
2K Followers 452 Following 姜力炜 • Ph.D. student @uwnlp 💻 student researcher @allen_ai 🧊 advance AI & understand humans 📖 lifetime adventurer
Matt Gardner @nlpmattg
9K Followers 121 Following Researcher at Scaled Cognition. Formerly at Semantic Machines, @allenai (@ai2_allennlp, #nlphighlights).
Jerry Wei @JerryWeiAI
5K Followers 262 Following 🧐 Improving and aligning large language models 🧠 Research Engineer @GoogleDeepMind ⏰ Past: @Stanford, @Google Brain
Julia Mendelsohn @jmendelsohn2
1K Followers 812 Following PhD candidate @UMSI & @Google PhD fellow. NLP, computational social science, linguistics, polcomm; past CS + Lx @Stanford
Kai-Wei Chang @kaiwei_chang
6K Followers 711 Following Associate Professor @UCLAengineering/@UCLA. Area: #NLProc/#ML/#AI https://t.co/zj1ssZj9ox
Machel Reid @machelreid
2K Followers 1K Following Research Scientist @GoogleDeepMind Working on LLMs on the Gemini Team; did gemini 1.5 pro
Cognition @cognition_labs
123K Followers 19 Following Makers of Devin, the first AI software engineer. We are an applied AI lab focused on reasoning, and code is just the beginning. Join us: https://t.co/tpfZwEwGiq
Northwest Avalanche C.. @nwacus
9K Followers 125 Following Northwest Avalanche Center: providing avalanche and mountain weather forecasts for the PNW. #NWAC Find us on Instagram: @nwacus
Tri Dao @tri_dao
19K Followers 365 Following Incoming Asst. Prof @PrincetonCS, Chief Scientist @togethercompute. Machine learning & systems.
william @wgussml
4K Followers 440 Following prev CMU PhD, OpenAI research scientist, helped build copilot & MineRL. working on a new startup. https://t.co/kz3WdDeyfy
Lior⚡ @AlphaSignalAI
84K Followers 898 Following Covering the latest in AI R&D • ML Engineer • Ex-Mila researcher • MIT Lecturer • Building AlphaSignal, a technical newsletter read by 180,000+ ML experts.
Wenhu Chen @WenhuChen
11K Followers 520 Following AI researcher @UWaterloo @GoogleAI @VectorInst. Interested in natural language processing, diffusion models. I direct TIGER-Lab at UWaterloo.
Mark Zuckerberg @finkd
760K Followers 748 Following
Emanuele Rodolà @EmanueleRodola
688 Followers 82 Following Head of GLADIA @SapienzaRoma and fellow of @SSAS_Sapienza, @ELLISforEurope and @yacadeuro. Lover of crazy ideas and anything passion-driven. #EUFunded
Rulin Shao @RulinShao
615 Followers 396 Following PhD @UWNLP | MS @SCSatCMU | ex-Applied Scientist @AWS
Jeremy Howard @jeremyphoward
222K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
@[email protected].. @smolix
21K Followers 121 Following AutoML with https://t.co/xqkK2q7L02 - learn ML with https://t.co/9W8dBWESkW - join us at https://t.co/uY2XbWTgaT
Aakanksha Chowdhery @achowdhery
7K Followers 3K Following LLMs @ Google DeepMind :: PaLM, Gemini // Previously @MSFTResearch, @Stanford, @Princeton // views my own and subject to change
Shayne Longpre @ShayneRedford
4K Followers 997 Following PhD @MIT. Prev: @Google Brain, @apple ML, @stanfordnlp. 🇨🇦 Interests: AI/ML/NLP, Data-centric AI, transparency & societal impact
Jacob Portes @JacobianNeuro
670 Followers 1K Following Research Scientist @MosaicMLxDatabricks. I like it when neuroscience inspires AI 🧠+🖥️
Galen @G413N
324 Followers 122 Following
UiPath @UiPath
105K Followers 5K Following We envision a world with a 🤖 for every person. Dedicated to accelerating human achievement via an #AI-powered end-to-end #automation platform.
Zhiting Hu @ZhitingHu
3K Followers 352 Following Assist. Prof. at UC San Diego; Artificial Intelligence, Machine Learning, Natural Language Processing
Mark Saroufim @marksaroufim
9K Followers 654 Following @pytorch dev broadly interested in performance https://t.co/6KJ328JUwv
Pasquale Minervini.. @PMinervini
7K Followers 4K Following Researcher in ML/NLP at the University of Edinburgh (faculty @InfAtEd @EdinburghNLP), @ELLISforEurope, @UCL_NLP, PI for @Clarify2020, https://t.co/WydvfU8ugz he/they
Andreas Vlachos @vlachos_nlp
5K Followers 1K Following Professor in NLP/ML at @Cambridge_CL, Fellow of @FitzwilliamColl, @ELLISforEurope member
Pika @pika_labs
116K Followers 53 Following Video on command. Website: https://t.co/G5bjmrMQsx Discord: https://t.co/bX68ThPTQH About: https://t.co/atvdcgbe9S
Benjamin Muller @ben_mlr
815 Followers 2K Following Research in AI. Focusing on scaling models to the largest number of languages. Postdoc at FAIR @metaai.
Madrona @MadronaVentures
13K Followers 1K Following Investing in seed, early, and acceleration stage technology entrepreneurs and companies in the Pacific Northwest and beyond since 1995.
Nina Beguš @ninabegus
3K Followers 2K Following Researcher @UCBerkeley Founder @Interpret_AI #ArtificialHumanities
Arsen Ostrovsky 🎗.. @Ostrov_A
279K Followers 6K Following International Human Rights Lawyer, CEO at @The_ILF, Proud Zionist, Father of girls! Opinions mine. #BringThemHome🎗️
Israel War Room @IsraelWarRoom
308K Followers 6K Following Israel’s enemies do not sleep. Neither do we.
Sneha Kudugunta @snehaark
2K Followers 747 Following addicted to tpus @GoogleDeepMind @uwcse | varying proportions of AI and mediocre jokes (not mutually exclusive) | she/her/hers
Seattle Police Depart.. @SeattlePD
578K Followers 1K Following Seattle PD news/events. Not Monitored. Call 911 to report emergencies. Privacy Policy: https://t.co/T5EaWoa7EZ * Preliminary Info Subject To Change
Teven Le Scao @Fluke_Ellington
2K Followers 551 Following Researcher @MistralAI, producer @ my bedroom, no BLOOM slander authorized on this account
Shizhe Diao @shizhediao
1K Followers 928 Following On job market actively seeking industry positions ML NLP PhD | Intern @BytedanceTalk @sinovationvc Finetune your own LLMs with LMFlow: https://t.co/UTykmQAYPT
Chip Huyen @chipro
92K Followers 444 Following Data processing on GPUs @VoltronData Designing ML Systems: https://t.co/G81hL2dWmr @designmlsys #AI x #GPU
Soumith Chintala @soumithchintala
186K Followers 883 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Databricks Mosaic Res.. @DbrxMosaicAI
30K Followers 115 Following We remove the barriers to state-of-the-art generative AI model development and make data + AI available to all.
Patrick Fernandes @psanfernandes
534 Followers 237 Following PhD Student @LTIatCMU & @istecnico Previously research @Google, @Microsoft & @Unbabel
Rowan Zellers @rown
6K Followers 877 Following Researcher at @OpenAI studying multimodality - vision&language&sound. he/him. website: https://t.co/5Er4j3qN91 , mastodon: @[email protected]
WA Fire News @WAFireNews
3K Followers 54 Following Washington State Fire News - Monitoring Fire & Severe weather related incidents in WA State. - Posting Aggregated Feeds - #Wa #FireNews
🏆Thrilled to share that VideoCon won the Best Paper Award at the Data Problems for Foundation Models #ICLR2024! I will present the work in 🇦🇹 Also, happy to share that I will be interning at @GoogleDeepMind w/ @kazemi_sm this summer! Happy to connect with folks in ICLR.
📢 📽✍️We introduce VideoCon, a video-text dataset for training a SOTA alignment model. It resolves a typical issue in video-text alignment models, which struggle with robustness. w/ @YonatanBitton, Idan Szpektor, @kaiwei_chang, @adityagrover_ video-con.github.io 🧵 1/
Will your paper catch the eye of @_akhaliq? I built a demo that predicts if AK will select a paper. It has 50% F1 using DeBERTa finetuned on data from past year. As a test, our upcoming WildChat arXiv has a 56% chance. Hopefully not a false positive🤞 🔗huggingface.co/spaces/yuntian…
I don't think it's productive or effective for a PhD student to ever lead more than 1 project simultaneously. If anything, I think leading 0.5 projects is even better (see SWE-bench & SWE-agent which Carlos and John co-led) Focusing is really important.
Out of curiosity, do AI PhDs normally work (lead) on several projects simultaneously? I have never managed to work on more than one project during my PhD and I tried to convince my students not to do so. The paradigm might have already changed, so I am asking here.
🆕I'm excited to share that I'll start my Ph.D. at @UChicago within @UChicagoCI under Prof. @MinaLee__'s guidance and Prof. Ari Holtzman (@universeinanegg)'s co-advising! I hope to bring my LLM generation and evaluation work to a more human-centered and interactive stage.
i really hope phi 3 proves us wrong about evaluation doping and it is actually an amazing model. But, being an outlier on log compute <-> MMLU plots is a little sus.
There is a really nice community of researchers developing transformer alternatives. Want to highlight these impressive folks. Simran Arora (@simran_s_arora), Chunting Zhou (@violet_zct), Dan Fu (@realDanFu), and Songlin Yang (@SonglinYang4)
Our team in FAIR (at Meta) is hiring researchers (RS & PostDoc) to work on the broad topics of text and multimodal LLMs. Location: NY, Seattle or Menlo Park for RS, and Seattle for PostDocs. PostDoc: metacareers.com/jobs/968496244… Research Scientist, AI (PhD): metacareers.com/jobs/752169417…
While wrapping up my FSDP + QLora Llama 3 blog, I noticed that ~80 samples (stacked to 3k sequence length) are enough for Llama 3 70B to "converge" to the instruction templates. That's impressive!🤯 If you are using fine-tuned GPT-3.5 models, you might want to look into Llama 3…
Great lineup of speakers at our second Disinformation Day at UT Austin! Registration open to all!
Join us on May 2 for Disinformation Day 2024! This virtual event brings together researchers and thought leaders from a variety of disciplines and sectors to discuss approaches to curbing the spread of digital disinformation. Learn more and register: disinfoday.github.io
Llama3-70B has settled at #5. With 405B still to come next... I remember when GPT-4 released in March 2023, it looked like it was nearly-impossible to get to the same performance. Since then, I've seen @Ahmad_Al_Dahle and the rest of the GenAI org in a chaotic rise to focus,…
Exciting update -- the full Llama-3 result is out, now reaching top-5 on the Arena leaderboard🔥 We've got stable enough CIs with over 12K votes. No question now that Llama-3 70B is the new king of open models. Its powerful 8B variant has also surpassed many larger-size models. What an…
Wow 15T tokens of open data! Imagine the amount of 💸💸 I’d need to burn to bring this to infini-gram … 🤣
Data is all we need! 👑 Not only since Llama 3 have we known that data is all we need. Excited to share 🍷 FineWeb, a 15T token open-source dataset! Fineweb is a deduplicated English web dataset derived from CommonCrawl created at @huggingface! 🌐 TL;DR: 🌐 15T tokens of cleaned…
Yes, both the 8B and 70B are trained way more than is Chinchilla optimal - but we can eat the training cost to save you inference cost! One of the most interesting things to me was how quickly the 8B was improving even at 15T tokens.
Excited to share a preview of Llama3, including the release of an 8B and 70B (82 MMLU, should be the best open weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT4). Lots more still to come... ai.meta.com/blog/meta-llam…
Check out @LiyanTang4's great work! Using very clever synthetic data generation schemes, he trained a very strong fact-checking model, which can get GPT4-level accuracies, while being 400x cheaper. The model which is on HF will be very useful in RAG/summarization settings.
🔎📄New model & benchmark to check LLMs’ output against docs (e.g., fact-check RAG) 🕵️ MiniCheck: a model w/GPT-4 accuracy @ 400x cheaper 📚LLM-AggreFact: collects 10 human-labeled datasets of errors in model outputs arxiv.org/abs/2404.10774 w/ @PhilippeLaban, @gregd_nlp 🧵
Llama3-8B and 70B have dropped!! Extremely grateful to have been part of this journey. More coming soon :) llama.meta.com/llama3/
✨Excited to finally drop our new paper: SSMs “look like” RNNs, but we show their statefulness is an illusion🪄🐇 Current SSMs cannot express basic state tracking, but a minimal change fixes this! 👀 w/ @jowenpetty, @Ashish_S_AI arxiv.org/abs/2404.08819