Tianyu Gao @gaotianyu1350
CS PhD student @Princeton @Princeton_nlp working on NLP. Previously: @Tsinghua_Uni @TsinghuaNLP
gaotianyu.xyz/about
Joined April 2013
Tweets: 146 · Followers: 3K · Following: 686 · Likes: 480
Dataset choice is crucial in today's ML training pipeline. We (@xiamengzhou and I) introduce desiderata for "good" data and explain how our recent algorithm, LESS, fits into the picture. Huge review of data selection algs for pre-training and fine-tuning! cs.princeton.edu/~smalladi/blog…
New preprint “The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models” w/ @danfriedman0 & @danqi_chen! We use structured pruning to find surprising phenomena and new insights on how a pretrained LM generalizes! arxiv.org/abs/2403.03942 1/8
Check out our new paper! We explore the representation gap between RNNs and Transformers. Theory: CoT improves RNNs but is insufficient to close the gap. Improving the capability of retrieving information from context is the key (e.g. +RAG / +1 attention). arxiv.org/abs/2402.18510
CEPE🍄: Amazing work by @HowardYen1 on how you can extend any frozen LMs (pre-trained, instruction)'s context window to >= 128K by (a) encoding ctx with an encoder in parallel and (b) attending to them via inserted cross-attn. Much faster/memory-efficient vs. using full attn.
Wondering why LLM safety mechanisms are fragile? 🤔 😯 We found safety-critical regions in aligned LLMs are sparse: ~3% of neurons/ranks ⚠️Sparsity makes safety easy to undo. Even freezing these regions during fine-tuning still leads to jailbreaks 🔗 boyiwei.com/alignment-attr… [1/n]
LESS is more!
Wanna know how to scale your learning rates (and other optimizer hyperparameters) according to batch sizes and why? Check out this great blogpost!
Announcing the 2nd Workshop on Mathematical and Empirical Understanding of Foundation Models (ME-FoMo) at ICLR 2024! Improving our understanding helps us advance capabilities and build safer, more aligned models. Paper deadline is Feb 3! Website: sites.google.com/view/me-fomo20…
Yangsibo is an amazing collaborator and also a great friend! If your school is hiring definitely consider her :)
We are at #Neurips2023! We will present this work as an oral on Wed at 4:15pm and as a poster on Wed at 5-7pm. Many of the authors are here -- stop by to chat with us!
Excited to share a preview of Llama3, including the release of an 8B and 70B (82 MMLU, should be the best open weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT4). Lots more still to come... ai.meta.com/blog/meta-llam…
🚀 Introducing Pile-T5! 🔗 We (EleutherAI) are thrilled to open-source our latest T5 model trained on 2T tokens from the Pile using the Llama tokenizer. ✨ Featuring intermediate checkpoints and a significant boost in benchmark performance. Work done by @lintangsutawika, me…
MTEB is the most common text embedding benchmark with 190K installs/mon & 120K leaderboard visits/mon. We're extending it to be massively multilingual. Anyone is invited to contribute & co-author an upcoming publication📜 Details: github.com/embeddings-ben…
PASS seminar on 4/09 2pm ET! Speaker: @DanHendrycks from @ai_risks Topic: An Overview of Catastrophic AI Risks Live: youtube.com/@PrincetonPLI/… Submit questions: tinyurl.com/pass-question Recordings later at: youtube.com/@PrincetonPLI
We introduce LLM2Vec, a simple approach to transform any decoder-only LLM into a text encoder. We achieve SOTA performance on MTEB in the unsupervised and supervised category (among the models trained only on publicly available data). 🧵1/N Paper: arxiv.org/abs/2404.05961
Language models today are trained to reason either 1) generally, imitating online reasoning data or 2) narrowly, self-teaching on their own solutions to specific tasks Can LMs teach themselves to reason generally?🌟Introducing Quiet-STaR, self-teaching via internal monologue!🧵
Glad to see Gecko featured as the text embedding model in the Gemini API! developers.googleblog.com/2024/04/gemini…
Thrilled to receive the Superalignment Fast Grant! 🚀 This is particularly meaningful as my first grant proposal since becoming an assistant professor 🤩🤩 Thank you @OpenAI!! Excited for the groundbreaking research ahead!
Introducing Gecko 🦎, a new text embedding model from Google DeepMind! Distilled from LLMs, Gecko offers powerful embeddings for various NLP tasks. Gecko is now available in Google Cloud API 👉bit.ly/google-gecko-a… Paper: bit.ly/google-gecko Colab: bit.ly/google-gecko-c…
Fine-tuning on benign data (e.g. Alpaca) can jailbreak models unexpectedly. We study this problem through a data-centric perspective and find that some seemingly benign data could be more harmful than explicitly malicious data! ⚠️🚨‼️ Paper: arxiv.org/pdf/2404.01099… [1/n]
SWE-Agent is an open-source software engineering agent with a 12.3% resolve rate on SWE-Bench! Check out SWE-agent in action at swe-agent.com Repo: github.com/princeton-nlp/…
SWE-agent is our new system for autonomously solving issues in GitHub repos. It gets similar accuracy to Devin on SWE-bench, takes 93 seconds on avg + it's open source! We designed a new agent-computer interface to make it easy for GPT-4 to edit+run code github.com/princeton-nlp/…
I think I'm allowed to say this? COLM abstracts are just awesome so far, and wildly multidisciplinary. I think this is going to be a special event.
RAG 2.0 is about making retrieval-augmented generation more end-to-end & learned, e.g. Self-RAG, RA-DIT, GRIT - High-impact research direction imo! 😊
Today, we’re excited to announce RAG 2.0, our end-to-end system for developing production-grade AI. Using RAG 2.0, we’ve created Contextual Language Models (CLMs), which achieve state-of-the-art performance on a variety of industry benchmarks. CLMs outperform strong RAG…
The GPUs have landed! (300 H100s on tap for AI research @PrincetonPLI ) ai.princeton.edu/news/2024/prin…
Happy to share REPLUG🔌 is accepted to #NAACL2024 We introduce a retrieval-augmented LM framework that combines a frozen LM with a frozen/tunable retriever. Improving GPT-3 in language modeling & downstream tasks by prepending retrieved docs to LM inputs. 📄:…
Announcing Princeton AI Alignment and Safety Seminar (PASS): A virtual & collaborative space for diverse researchers to learn & discuss aligning increasingly capable AI models for safe behavior. tinyurl.com/pass-seminar Join our mailing list for updates: tinyurl.com/pass-mailing
Thank you AK for sharing our Design2Code paper! Here’s my version of the story: To assess whether multimodal LLMs are ready to automate front-end engineering, we: - formalize the task of converting visual designs into code implementations - manually curate the Design2Code…
Design2Code How Far Are We From Automating Front-End Engineering? Generative AI has made rapid advancements in recent years, achieving unprecedented capabilities in multimodal understanding and code generation. This can enable a new paradigm of front-end development, in