Nikos Pappas @nik0spapp
Senior Applied Scientist at @awscloud #NLProc #ML 🤖 Previously Postdoc @uwcse, @Idiap_ch, PhD @epfl_en.
nik0spapp.github.io · Seattle, Washington · Joined June 2009
Tweets: 583 · Followers: 731 · Following: 667 · Likes: 4K
Our 2020 paper "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" with @angeloskath @apoorv2904 and @nik0spapp reached 1000 citations! proceedings.mlr.press/v119/katharopo…
We're recruiting research interns to work on next-generation conversational modeling in AWS AI Labs @awscloud 🤖💬 DM me or apply directly if interested to join us! #NLProc #deeplearning #conversationalAI #internship
Today we are excited to announce a new partnership with @awscloud! 🔥 Together, we will accelerate the availability of open-source machine learning 🤝 Read the post 👉 huggingface.co/blog/aws-partn…
Updating ML models can introduce unseen errors, such as a virtual assistant 🤖 suddenly not understanding your often used command. How to avoid this? ✨ Backward Compatibility During Data Updates by Weight Interpolation ✨ 📜 arxiv.org/abs/2301.10546 ⌨️ github.com/amazon-science…
Linear-complexity models are cool, but shouldn't they work best on loooong documents? We tried RFA on doc-level translation, and got >2x speedup with memory savings, >7x when memory is controlled, and >19x on CPU. Similar/better BLEU; some consistency scores are slightly hurt 1/2
We also found that adding a gate to control information flow helps, which can be easily done with the RFA formulation. #emnlp2022 findings. Come check us out in the @SustaiNLP2022 workshop on Wednesday! With @haopeng_nlp @nik0spapp @nlpnoah arxiv.org/abs/2210.08431 2/2
I am looking for PhD students to join my lab @UMRobotics @UMich in Fall 2023! You'll already find my name on the #robotics dept/application website. Deadline is Dec 1, GRE not required, and there are application fee waivers!
Introducing 📑 The Stack - a 3TB dataset of permissively licensed code in 30 programming languages. hf.co/datasets/bigco… You want your code excluded from the model training? There is an opt-out form and data governance plan: bigcode-project.org/docs/about/the… Let's take a tour🧵
12X faster transformer model, possible? Yes, with @OpenAI Triton kernels! We release Kernl, a lib to speedup inference of transformer models. It's very fast (sometimes SOTA), 1 LoC to use, and hackable to match most transformer architectures. github.com/ELS-RD/kernl 🧵
It's a joke that all NLP talks must include this graph. But if you are a student it is a bit intimidating. How can you become an expert in where we are going if you can barely run BERT? I asked twitter for specific advice that you might focus on:
Paper accepted to @NeurIPSConf 🎉 Big credit goes to @deng_cai who led the work during his internship at @awscloud and special acknowledgment to my collaborators! You can check out the arXiv pre-print in the meantime arxiv.org/abs/2202.02976
We release the public beta for bnb-int8🟪 for all @huggingface 🤗models, which allows for Int8 inference without performance degradation up to scales of 176B params 📈. You can run OPT-175B/BLOOM-176B easily on a single machine 🖥️. You can try it here: docs.google.com/document/d/1Jx… 1/n
Can’t wait to attend #NAACL2022 with Lex AWS AI and meet colleagues in person next week. If you are into efficiency, out-of-domain generalization and calibration/robustness topics, let’s chat!
We are hiring and we are attending ACL 2022, please find me to chat if you are interested! #ACL2022
Does Vocabulary Selection cause human perceived translation quality degradations not visible in BLEU? Yes! Find out more in our #NAACL2022 paper: arxiv.org/abs/2205.06618 Code: github.com/awslabs/sockey… joint work with @EvaHasler @sonytrenous @ketran @hifelix84 @unattributed
.@ReviewAcl Can't we just do away with this confusing intermediary (3) score for meta-reviews? Instead just have a 4 scale AE rating: 1,2,4,5 What's the diff between "major" (3) and "significant" (2) revisions anyway? It only leads to inconsistency between AEs IMO
This was a really inspiring and IMO milestone paper in the space of efficient attention / RNNs! coming soon: a strong generalization of this from the view of SSMs ;)
Our 2020 paper "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" with @angeloskath @apoorv2904 and @nik0spapp reached 1000 citations! proceedings.mlr.press/v119/katharopo…
LIFE UPDATE This is my last day @amazonLab126 @amazon working on the Astro robot. I will be joining @UMich @UMRobotics as an Assistant Professor on Jan '24! My lab, Robot Studio, will blend design finesse with engineering precision in the pursuit of human-robot interactions 👇
What are the generative limits of LLMs? Can LLMs, biases and all, still help us understand our society? I'm recruiting PhD students to study these questions and many more at the DILL lab @nlp_usc! Check out our recent work and your research alignment: dill-lab.github.io/opportunities/
Just 7 months after its release, Pythia has passed 200 citations. I'm very proud of the team that did it, but Pythia is first and foremost about empowering research that would otherwise be impossible. So to celebrate, here's my favorite non-@AiEleuther work requiring Pythia 🧵
🍥 My very first keynote will be about DESIGN for human-robot interaction 💫 I'll be sharing about my work as UX Designer for the Astro robot 🙌 Come join us @ai_for_hri @TAHRIorg @amazonLab126 @amazon
Super excited to start my lab @fluentrobotics! Our goal is to build robots that work fluently with and around people in unstructured, dynamic environments! This week, we’ll be at #IROS2023 to present new work on social navigation & multirobot coordination! @UMRobotics @UMich
Language models show impressive performance on a wide variety of tasks, but are they overfitting to evaluation instances and specific task instantiations seen in their pretraining? How much of this performance represents general task/reasoning abilities? 1/4
FlashAttention 2 is now in HF transformers, to make it faster and more memory efficient to train these models🚀! Thanks to @younesbelkada and the HF team for this integration effort, and more generally for the awesome transformers library
New feature alert in the @huggingface ecosystem! Flash Attention 2 natively supported in huggingface transformers, supports training PEFT, and quantization (GPTQ, QLoRA, LLM.int8) First pip install flash attention and pass use_flash_attention_2=True when loading the model!
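A minimal sketch of the loading step described in the tweet above, not an official recipe. It assumes the `flash-attn` package (`pip install flash-attn`), a CUDA GPU, and a half-precision dtype; the helper names and the model id passed by the caller are hypothetical. Newer transformers releases prefer `attn_implementation="flash_attention_2"` over the `use_flash_attention_2` flag named in the tweet.

```python
def flash_attention_kwargs(dtype):
    """Build from_pretrained keyword arguments that request Flash Attention 2.

    `dtype` should be a half-precision torch dtype (e.g. torch.float16);
    that requirement is an assumption of this sketch.
    """
    return {"torch_dtype": dtype, "use_flash_attention_2": True}


def load_with_flash_attention(model_id):
    """Hypothetical helper: load a causal LM with Flash Attention 2 enabled."""
    # Imported lazily so the kwargs helper above works without these packages.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id, **flash_attention_kwargs(torch.float16)
    )
```

Calling `load_with_flash_attention(...)` with a model id of your choice downloads the weights and needs a supported GPU, so it is left out of the sketch.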
To iterate and improve LLM-based systems you just 1. find where your system is failing, 2. fix the problem. But easier said than done, right? We just added an interface to Zeno Build for browsing underperforming examples, making the “finding” step easier: github.com/zeno-ml/zeno-b…
Wow, run LLMs like BitTorrents! petals.dev github.com/bigscience-wor…
Length generalization has become increasingly important because of the instruction finetuning of LLMs: Training on all possible instruction lengths is impractical, and longer sequences have significantly fewer training examples, making a generalizable model very desirable. [2/n]
I'm really excited about our new work on modelling the latent space of Transformers as nonparametric mixture distributions, and defining a Nonparametric Variational AutoEncoder with a Transformer encoder-decoder. Paper to be presented at #ICLR: lnkd.in/eVCg89x8
"The Little Book of Deep Learning" Consider this as a beta version rough on the edges. Comments are welcome. fleuret.org/public/lbdl.pdf @unige_en @sciences_UNIGE
The reviews we got from @UncertaintyInAI reviewers are detailed, to the point, and constructive. What more can an author want? #uai2023 Mainstream ML conferences can take note.
PyTorch 2.0 includes Accelerated Transformers, making efficient training and deployment of state-of-the-art Transformer models practical. This implementation was optimized to provide speed and efficiency for emerging Generative AI models. Read more: hubs.la/Q01JwX-X0
The language of these pronouncements makes me uneasy. It’s not “unscientific” exactly, but it lacks a certain skepticism that I think we need with any very new technology
Now that everyone is fatigued by GPT-4 hot takes and blocked the keyword "LLM", here's the blog post with my current view on the topic, and how my views changed: inference.vc/we-may-be-surp…
- Support for Flash-Attention in github.com/EleutherAI/gpt… (by @VHellendoorn). Flash Attention is a really great innovation in optimizing LLMs that provides ~25% speed-up at absolutely no cost in our testing.