Dylan Sam @dylanjsam
phd student @mldcmu | past: intern @AmazonScience, BS @BrownCSDept
dsam99.github.io
Pittsburgh, PA
Joined October 2017
Tweets: 97
Followers: 428
Following: 351
Likes: 2K
1/ What does it mean for an LLM to “memorize” a doc? Exactly regurgitating a NYT article? Of course. Just training on NYT? Harder to say. We take big strides in this discourse w/ *Adversarial Compression* w/ @A_v_i__S @zhilifeng @zacharylipton @zicokolter 🌐: locuslab.github.io/acr-memorizati… 🧵
1/ 🥁Scaling Laws for Data Filtering 🥁 TLDR: Data Curation *cannot* be compute agnostic! In our #CVPR2024 paper, we develop the first scaling laws for heterogeneous & limited web data. w/@goyalsachin007 @zacharylipton @AdtRaghunathan @zicokolter 📝:arxiv.org/abs/2404.07177
Models with different randomness make different predictions at test time even if they are trained on the same data. In our latest ICLR paper (oral), we investigate how models learn different features, and the effect this has on agreement and (potentially) calibration. 1/
🚀Our latest blog post unveils the power of Consistency Models and introduces Easy Consistency Tuning (ECT), a new way to fine-tune pretrained diffusion models to consistency models. SoTA fast generative models using 1/32 training cost! 🔽 Get ready to speed up your generative…
SOTA AI for games like poker & Hanabi rely on search methods that don’t scale to games w/ large amounts of hidden information. In our ICLR paper, we introduce simple search methods that scale to large games & get SOTA for Hanabi w/ 100x less compute. 1/N arxiv.org/abs/2304.13138
1/5 Unleash the full power of RAG systems! 🔥 Introducing RAGGED, a framework for finding the optimal RAG configurations and bypassing common pitfalls. Dive deep into our findings: arxiv.org/pdf/2403.09040…
In this work we construct the first nonvacuous generalization bounds for LLMs, helping to explain why these models generalize. w/ @LotfiSanae, @KuangYilun, @timrudner @micahgoldblum, @andrewgwils arxiv.org/abs/2312.17173 A 🧵on how we make these bounds 1/9
🧵: How do you design a network that can optimize (edit, transform, ...) the weights of another neural network? Our latest answer to that question: *Universal* Neural Functionals (UNFs) that can process the weights of *any* deep architecture.
Unlabeled data is crucial for modern ML. It provides info about data distribution P, but how to exploit such info? Given a kernel K, our #ICLR2024 spotlight gives a general & principled way: Spectrally Transformed Kernel Regression (STKR). Camera-ready 👇 arxiv.org/abs/2402.00645
1/7 Super excited about my Apple Internship work finally coming out: Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling TLDR: You can train 3x faster and with up to 10x less data with just synthetic rephrases of the web! 📝 arxiv.org/abs/2401.16380
1/4 The Right to be Forgotten is knocking on the door. Yet, unlearning in LLMs has no clear task definition, no evaluation metrics or baselines. Introducing TOFU: Task of Fictitious Unlearning for LLMs 🌐 locuslab.github.io/tofu w/@A_v_i__S @zhilifeng @zacharylipton @zicokolter🧵
Computer interfaces are inherently visual. To build general autonomous agents, we will need strong vision language models. To assess the performance of multimodal agents, we introduce VisualWebArena (VWA): a benchmark for evaluating multimodal web agents on realistic visually…
New work ("Low-Resource Languages Jailbreak GPT-4") from @BrownCSDept PhD student @yong_zhengxin, postdoctoral researcher @CriMenghini of @Brown_DSI, and @BrownCSDept faculty member @stevebach was chosen from 121 submissions for the SoLaR Best Paper Award: go.brown.edu/SoLaRBestPaper
Stable Diffusion is an effective data augmentation. Website: btrabuc.co/da-fusion Watch Here: youtu.be/IKDWOOWzwns I'm excited to share my NeurIPS talk about DA-Fusion from the Synthetic Data workshop, where we build an augmentation that semantically modifies images, and…
Robotic intelligence requires dexterous tool use, but generalizing across tools is hard. Our CoRL23 paper combines semantics (affordances) with low-level control (sim2real) to show functional grasping that generalizes to hammers, drills and more! dexfunc.github.io 1/n
Zachary Lipton @zacharylipton
59K Followers 2K Following Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Pratyush Maini @pratyushmaini
1K Followers 340 Following Trustworthy ML | PhD student @mldcmu | Founding Member @datologyai | Prev. Comp Sc @iitdelhi
Jeremy Cohen @deepcohen
4K Followers 868 Following PhD student in machine learning at Carnegie Mellon. The goal of my research is to turn deep learning into a real engineering discipline.
Valerie Chen @valeriechen_
978 Followers 418 Following phd student @mldcmu @SCSatCMU | previously @MSFTResearch @yale @CMU_Robotics @IBMResearch
Kayo Yin @kayo_yin
8K Followers 556 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵
Nicholas Roberts @nick11roberts
592 Followers 1K Following Ph.D. student @WisconsinCS. Working on data-centric automated machine learning. Previously at CMU @mldcmu, UCSD @ucsd_cse, FCC @fresnocity.
Yiding Jiang @yidingjiang
1K Followers 468 Following PhD student @mldcmu @SCSatCMU. Formerly intern @MetaAI, AI resident @GoogleAI. BS from @Berkeley_EECS. Trying to understand stuff.
Christina Baek @_christinabaek
781 Followers 230 Following PhD student @mldcmu | Past: intern @GoogleAI
Zico Kolter @zicokolter
15K Followers 499 Following Associate professor at Carnegie Mellon, VP and Chief Scientist at Bosch Center for AI. Researching (deep) machine learning, robustness, implicit layers.
Lucio Dery Jnr Mwinm @derylucio
462 Followers 956 Following
Saurabh Garg @saurabh_garg67
864 Followers 579 Following Building next-gen AI at @MistralAI | prev/ PhD @mldcmu; CS @iitbombay (undergrad); Collab @GoogleAI @awscloud @apple
Paul Liang @pliang279
4K Followers 910 Following PhD student @mldcmu @SCSatCMU. Foundations of multimodal learning & applications in social AI, NLP, and healthcare with @lpmorency and @rsalakhu.
Aidan Yang @AidanZHYang
597 Followers 565 Following PhD student @CarnegieMellon || Previously @AWS, @MSFTResearch, @AMD and @Queensu 🇨🇦 || Program synthesis + ML
Sachin Goyal @goyalsachin007
764 Followers 715 Following PhD student @ CMU MLD || Microsoft Research || UG @ IIT Bombay
Brihi Joshi @BrihiJ
2K Followers 3K Following PhD-ing @nlp_usc🏝 + @NLPWithFriends, @AmazonScience @Apple Fellow, ex- @WWCode_Delhi @Snap @GoldmanSachs @IIITDelhi. Sky pics, #NLProc, @5sos and cat content
Stephanie Milani @steph_milani
1K Followers 224 Following PhD Student at @mldcmu. Previously @UMBC @CMU_Robotics @MFSTResearch. Interested in human-centered reinforcement learning.
Gokul Swamy @g_k_swamy
2K Followers 1K Following phd candidate @CMU_Robotics. ms @berkeley_ai. summers @GoogleAI, @msftresearch, @aurora_inno, @nvidia, @spacex. no model is an island.
Stephen Bach @stevebach
2K Followers 422 Following Asst. prof. @BrownCSDept. Working on improving how humans teach computers. Weak supervision, zero-shot learning, few-shot learning, and high-level knowledge.
Clara Na @claranahhh
669 Followers 504 Following PhD student at @LTIatCMU / @SCSatCMU she/her, prev. @UVA and intern @ai2_allennlp @/clara on https://t.co/GHxXbrRa33 and @/clarana on https://t.co/47UIhMFD1F
Hazel Pennebaker @Pennebaker97101
61 Followers 5K Following
Nate Gruver @gruver_nate
527 Followers 256 Following Machine learning PhD student at NYU BS & MS @StanfordAILab Industry @AIatMeta @Waymo
MohMahS @MohSamiii
3 Followers 113 Following
Beatriz Kathleen @KathleBeatr
24 Followers 5K Following
Young @younqchan
168 Followers 3K Following Final year Ph.D. student working on Out-of-Distribution Generalization and Causality of Large Pre-trained Models, and Graph Neural Networks.
Trevor Loy @trevorloy
17K Followers 2K Following VC investor emerging ecosystems @FlywheelVC. Lecturer entrepreneurship & VC @Stanford. Prev: BoD @NVCA; Mentor @KauffmanFellows; 3x founder; Chip design @Intel.
Tonisha Shilo @ShiTonis
58 Followers 5K Following
Manoj Acharya @manoja328
583 Followers 5K Following Mostly Interested in safe and aligned (neural inspired) Machine Intelligence ; PhD from Rochester Institute of Technology
Yining Lu @Yining__Lu
89 Followers 254 Following Incoming CS PhD student @NotreDame. Now master student @JHUCLSP | #NLProc
Milin Bhade @MilinBhade
56 Followers 1K Following Post Grad Student at IISc, Bangalore Masters in Computer Science & Automation
Claudia Bouillion @BouillioClaud
79 Followers 5K Following
Opeyemi Osakuade @o_feranmi_
548 Followers 1K Following Machine Learning Engineering | Speech Processing Research @InfAtEd
Letisha Hernan @hern_letis
38 Followers 5K Following
Luxi (Lucy) He @LuxiHeLucy
323 Followers 135 Following Princeton CS PhD @PrincetonPLI. Previously @Harvard ‘23 CS & Math.
Dandan Shan @DandanShan_
895 Followers 1K Following PhD Candidate @UMichCSE; Visiting Scholar @NYU_Courant; Working on Computer Vision
Zhiyong Wang @Zhiyong16403503
394 Followers 2K Following Visiting Ph.D. student at Cornell University. Ph.D. candidate at CUHK. Working on bandits and reinforcement learning theory.
daiwei @davy62486780
10 Followers 58 Following UW-Madison ECE PhD student, Representation Learning, Machine Learning Theory
Niki Hasrati @niki_hasrati
82 Followers 106 Following ML PhD student @CarnegieMellon’s School of CS | Previously CS master’s student @UWaterloo | Researching the intersection of theoretical CS and ML theory
Ananya Joshi @AnanyaAJoshi
7 Followers 21 Following Ph.D. Student at Carnegie Mellon University in the Computer Science Department
Micah Goldblum @micahgoldblum
5K Followers 690 Following 🤖Postdoc at NYU with @ylecun / @andrewgwils. All things machine learning🤖 🚨On the faculty job market this year!🚨
Majeed Kazemi @MajeedKazemi
1K Followers 2K Following PhD student in CS @UofT with @ToviGrossman HCI + Computing Education + Coding / Creativity Support Tools Prev: @MSFTResearch + MSc @HCIL_UMD with @JonFroehlich
Yosef Skolnick, M.S. @yosefskolnick
195 Followers 2K Following Software Engineer | Aspiring AI Researcher | Inventor
Sergey Litvinenko @sergeyltvn
263 Followers 3K Following @AUBGedu '20 Mathematics and Economics @WU_vienna '23 Quantitative Finance RT ≠ endorsement https://t.co/UP4UK7Y7FX
MeadowRays @MeadowR68448
3 Followers 440 Following
Iftekhar Chowdhury @iftekhar_hc
8 Followers 297 Following
Alex Robey @AlexRobey23
613 Followers 849 Following Ph.D. student at @Penn studying robust machine learning. Formerly @GoogleAI, @Livermore_Lab | B.S. & B.A. from @swarthmore
Aakash Lahoti @aakash_lahoti
4 Followers 110 Following ML Ph.D. Student @mldcmu @SCSatCMU | Prev. CS @IITKanpur
Yan Scholten @YanScholten
95 Followers 120 Following ML PhD Student @TU_Muenchen. Working towards more reliable AI @zuseschoolrelAI.
Prajwal 🛠️ @prajpawar23
548 Followers 2K Following 20 // incoming summer intern @qualcomm // curious human of earth // rvce'25
nick nassuphis @NNassuphis
120 Followers 5K Following
Avi Schwarzschild @A_v_i__S
267 Followers 183 Following Postdoc at CMU. Trying to learn about deep learning faster than deep learning can learn about me.
Burny — Effective O.. @burny_tech
14K Followers 6K Following Transhuman engineer in singularity! Lover of AI & omnidisciplionary metamathemagics! Hypercuriousia! Omniperspectivity! Shapeshifting metafluid! Freedom 4 all!
Rajan @SpeckofDUST16
104 Followers 1K Following exploring MultiModal LLMs @wadhwaniai | Math+CS undergrad @bitspilaniindia. | Previous @TCSResearch, @NIAS_India, @ICMEStanford
Zachary Lipton @zacharylipton
59K Followers 2K Following Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Pratyush Maini @pratyushmaini
1K Followers 340 Following Trustworthy ML | PhD student @mldcmu | Founding Member @datologyai | Prev. Comp Sc @iitdelhi
Divyansh Kaushik @dkaushik96
4K Followers 3K Following Emerging tech and national security. DC/PGH. “An imported Indian immigrant,” @BreitbartNews.
Yann LeCun @ylecun
711K Followers 718 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
Jeremy Cohen @deepcohen
4K Followers 868 Following PhD student in machine learning at Carnegie Mellon. The goal of my research is to turn deep learning into a real engineering discipline.
Valerie Chen @valeriechen_
978 Followers 418 Following phd student @mldcmu @SCSatCMU | previously @MSFTResearch @yale @CMU_Robotics @IBMResearch
Kayo Yin @kayo_yin
8K Followers 556 Following PhD student @berkeley_ai @berkeleynlp working on interpretability and signed languages. Former @msftresearch @deepmind @carnegiemellon @polytechnique. 🇫🇷🇯🇵
Alex Ratner @ajratner
5K Followers 548 Following @SnorkelAI @uwcse / prev @StanfordAILab – Interested in data management systems for machine learning, weak supervision, and impactful applications.
Graham Neubig @gneubig
31K Followers 586 Following Associate professor at CMU, studying natural language processing and machine learning.
Nicholas Roberts @nick11roberts
592 Followers 1K Following Ph.D. student @WisconsinCS. Working on data-centric automated machine learning. Previously at CMU @mldcmu, UCSD @ucsd_cse, FCC @fresnocity.
Yiding Jiang @yidingjiang
1K Followers 468 Following PhD student @mldcmu @SCSatCMU. Formerly intern @MetaAI, AI resident @GoogleAI. BS from @Berkeley_EECS. Trying to understand stuff.
Christina Baek @_christinabaek
781 Followers 230 Following PhD student @mldcmu | Past: intern @GoogleAI
Dan Roy @roydanroy
45K Followers 2K Following ML / AI researcher, emphasis on theory. Research Director and Canada CIFAR AI Chair, @VectorInst Professor, @UofT (Statistics/CS)
Behnam Neyshabur @bneyshabur
18K Followers 690 Following Senior Staff Research Scientist @GoogleDeepMind, Interested in reasoning w. LLMs, traveling & backpacking
Zico Kolter @zicokolter
15K Followers 499 Following Associate professor at Carnegie Mellon, VP and Chief Scientist at Bosch Center for AI. Researching (deep) machine learning, robustness, implicit layers.
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Ananya Kumar @ananyaku
4K Followers 470 Following Researcher at @openai Previously PhD at Stanford University (@StanfordAILab) advised by Percy Liang and Tengyu Ma
AK @_akhaliq
310K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Rosanne Liu @savvyRL
33K Followers 966 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Nate Gruver @gruver_nate
527 Followers 256 Following Machine learning PhD student at NYU BS & MS @StanfordAILab Industry @AIatMeta @Waymo
Aaron Roth @Aaroth
10K Followers 639 Following CS professor at Penn. Amazon Scholar at AWS. Author of The Ethical Algorithm (w/ Michael Kearns). I study machine learning, privacy, game theory, and fairness.
Luxi (Lucy) He @LuxiHeLucy
323 Followers 135 Following Princeton CS PhD @PrincetonPLI. Previously @Harvard ‘23 CS & Math.
Sasha Rush @srush_nlp
52K Followers 464 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGz
Ananya Joshi @AnanyaAJoshi
7 Followers 21 Following Ph.D. Student at Carnegie Mellon University in the Computer Science Department
Shannon Shen @shannonzshen
1K Followers 2K Following PhD Student @MIT_CSAIL | previously @allen_ai @semanticscholar @harvard @brownuniversity
noahdgoodman @noahdgoodman
2K Followers 109 Following Professor of natural and artificial intelligence @Stanford. Research Scientist at @GoogleDeepMind. (@StanfordNLP @StanfordAILab etc)
Niki Hasrati @niki_hasrati
82 Followers 106 Following ML PhD student @CarnegieMellon’s School of CS | Previously CS master’s student @UWaterloo | Researching the intersection of theoretical CS and ML theory
Gabriel Ilharco @gabriel_ilharco
4K Followers 1K Following Building cool things @xAI. Prev. PhD at UW, Google AI
Micah Goldblum @micahgoldblum
5K Followers 690 Following 🤖Postdoc at NYU with @ylecun / @andrewgwils. All things machine learning🤖 🚨On the faculty job market this year!🚨
DatologyAI @datologyai
966 Followers 17 Following DatologyAI builds tools to automatically select and optimize the best data on which to train AI models, leading to better models which train faster.
Alex Robey @AlexRobey23
613 Followers 849 Following Ph.D. student at @Penn studying robust machine learning. Formerly @GoogleAI, @Livermore_Lab | B.S. & B.A. from @swarthmore
Victor Sanh @SanhEstPasMoi
9K Followers 2K Following Dog sitter by day, Scientist at @huggingface 🤗 by night
Collin Burns @CollinBurns4
11K Followers 276 Following Superalignment @OpenAI. Formerly @berkeley_ai @Columbia. Former Rubik's Cube world record holder.
Noah Hollmann @ Neuri.. @noahholl
165 Followers 127 Following Medical and Comp Sci Student working on digital health and genomics @ Charité Berlin, BIH & U of Freiburg 🐍⚙️💊 / Ex-Google and -BCG Intern / XPrize Finalist
Michal Lukasik @miclukasik
130 Followers 194 Following Research Scientist at Google Research, New York.
Jürgen Schmidhuber @SchmidhuberAI
107K Followers 0 Following Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Explainable AI @XAI_Research
2K Followers 764 Following Explainable/Interpretable AI researchers and enthusiasts - DM to join the XAI Slack! Twitter and Slack maintained by @NickKroeger1
Simon Kornblith @skornblith
3K Followers 999 Following researcher/engineer @AnthropicAI | former @GoogleDeepMind @mitbrainandcog @zotero | @[email protected]
Bosch Center for Arti.. @Bosch_AI
13K Followers 109 Following Shaping the future of #IndustrialAI | developing innovative #AItechnologies + solutions for Bosch Imprint/Data Protection Policy: https://t.co/e27jQvM9i4
Calvin Luo @calvinyluo
734 Followers 176 Following PhD Student @BrownUniversity. Former @GoogleAI Resident. @UofT Alum.
Yangsibo Huang @YangsiboHuang
1K Followers 726 Following PhD candidate @Princeton. Prev: @GoogleAI @AIatMeta.
Sang Choe @sangkeun_choe
134 Followers 136 Following cs phd @carnegiemellon. i like dynamical systems, linear algebra, convex optimization, taylor series, and bernedoodle.
AI Breakfast @AiBreakfast
167K Followers 209 Following The latest rumors and developments in the world of artificial intelligence. DM to include your AI project in the newsletter.
Microsoft Research @MSFTResearch
553K Followers 2K Following We advance science and technology to benefit humanity. https://t.co/kz0nARXbwT Register for Microsoft Research Forum on June 4 ⬇️ Get our newsletter
Stella Biderman @BlancheMinerva
15K Followers 748 Following Open source LLMs and interpretability research at @BoozAllen and @AiEleuther. My employers disown my tweets. She/her
Jiaming Song @baaadas
5K Followers 992 Following Chief Scientist @LumaLabsAI. Working on visual generative AI. Were @NVIDIA @Stanford @OpenAI @MetaAI
Maksym Andriushchenko.. @maksym_andr
3K Followers 930 Following phd student at @EPFL🇨🇭 // google & open phil phd ai fellow // past @adoberesearch @uni_tue // best way to support 🇺🇦 https://t.co/fxomgJ7NU9
Michael Oberst @MichaelOberst
2K Followers 937 Following Incoming Assistant Professor of Computer Science at @JohnsHopkins, postdoc at @CarnegieMellon. PhD from @MIT_CSAIL. Reliable ML & Causality for Healthcare.
Dr. Karen Ullrich @karen_ullrich
4K Followers 562 Following Research scientist at FAIR NY + collab w/ Vector Institute. ❤️ Machine Learning + Information Theory. Previously, PhD at UoAmsterdam, intern at DeepMind + MSRC.
Together AI @togethercompute
27K Followers 303 Following The future of AI is open-source. Let's build together.
Hussein Mozannar @HsseinMzannar
834 Followers 912 Following PhD candidate @mitidss @mit_csail working on Human-AI Interaction 🇱🇧
Theophile Gervet @theo_gervet
1K Followers 482 Following Accelerating open-source AI @MistralAI. Past: @Meta AI, PhD @SCSatCMU
Hattie Zhou @oh_that_hat
5K Followers 765 Following Finding \hat{y} Give me anonymous feedback: https://t.co/7aBNrpbad8
Fahim Tajwar @FahimTajwar10
165 Followers 242 Following PhD Student @mldcmu @SCSatCMU BS/MS from @Stanford
Sanae Lotfi @LotfiSanae
2K Followers 251 Following PhD student @NYUDataScience, prev. Visiting Researcher @MetaAI (FAIR), @DeepMind and @MSFTResearch Fellow
Avi Schwarzschild @A_v_i__S
267 Followers 183 Following Postdoc at CMU. Trying to learn about deep learning faster than deep learning can learn about me.
Ashwini Pokle @ashwini1024
271 Followers 434 Following PhD student at CMU (@mldcmu). Prev. @Stanford, @bitspilaniindia | interested in generative models and deep equilibrium models
New Anthropic research: we find that probing, a simple interpretability technique, can detect when backdoored "sleeper agent" models are about to behave dangerously, after they pretend to be safe in training. Check out our first alignment blog post here: anthropic.com/research/probe…
lots of talk about “ethical AI”, not so much about what it actually means to be an ethical agent... long story short, if what we care about is “bad stuff that happens due to AI,” I don’t think (ethical) agency is a particularly useful or even logically sound starting point!!!
LLMs are often said to "hallucinate", "confabulate", or produce untruthful responses, which led to much work trying to mitigate such behavior. But what does it mean for an LM to hallucinate? And how can we effectively intervene in model internals to combat hallucinations?
Super excited to share that I successfully defended my PhD thesis "Understanding Generalization and Robustness in Modern Deep Learning" today 👨‍🎓 A huge thanks to the thesis examiners @SebastienBubeck, @zicokolter, and @KrzakalaF, jury president Rachid Guerraoui, and, of course,…
Happy to share our work on preference learning methods for LLMs. Key insights: 1. Use more on-policy samples > off-policy samples 2. Contrastive DPO > Pref-FT. Also we provide insights on DPO's training mechanism. 3. Theoretical unification under mode-covering/seeking KL
Many LLM fine-tuning methods. Unclear what you should use & why? In our new paper, we did an extensive study of on-policy RL, supervised & offline contrastive methods (DPO, IPO) to answer this... 🧵⬇️ On-policy > offline, mode-seeking > mode-covering understanding-rlhf.github.io
I passed my thesis proposal! 🎊Thanks to my amazing committee @fangf07, @hongshenus, Geoff Gordon, @katjahofmann, & @OriolVinyalsML for their feedback & support. & thank you to my friends and collaborators for waking up early today to attend 🖤
How do model components (conv filters, attn heads) collectively transform examples into predictions? Is it possible to somehow dissect how *every* model component contributes to a prediction? w/ @harshays_ @andrewilyas, we introduce a framework for tackling this question!…
I'm super excited to have been selected for this year's Heidelberg Laureate Forum, and I'm looking forward to meeting everyone! 🎉 #HLF24
Congratulations to the 200 exceptional young researchers who have been selected to attend the 11th HLF! #HLF24 Sarah, Martina, Yasmin, Anna and the whole HLFF team look forward to seeing you all in Heidelberg this September for a week full of networking, exchange and inspiration!
Excited to share a preview of Llama3, including the release of an 8B and 70B (82 MMLU, should be the best open weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT4). Lots more still to come... ai.meta.com/blog/meta-llam…
I'm excited to share that I'll be joining @Apple as a research intern this summer! 🤗 working on multimodal learning. Looking forward to a summer full of learning and growth! 🥳. Exciting months ahead. 🎉🎊 I will be at Apple Park 🏞️ let me know if you want to meet up!
Super excited to introduce 🌳Acadia (@AcadiaAI) Playground, an interpretable data exploration tool to understand your evaluation data’s quality and help unlock insights into model performance using AI! 🧵
Honored to receive the 2024 Jane Street Graduate Research Fellowship! Thank you @JaneStreetGroup for the award and for organizing an amazing workshop! The best part of this was getting to meet PhD students working on algebraic geometry, cosmology, quantum algorithms, and more!
I am pleased to share a significant milestone in my academic journey. Yesterday, surrounded by brilliant minds and supportive hearts, I successfully defended my PhD dissertation at Carnegie Mellon University.
📬 Life update: I'm happy to share that I’ll be joining @DesignLabUCSD at @UCSanDiego as a PhD student this Fall, working with @stevendow! Excited to continue my research in human-computer/AI interaction to enhance how we discover, understand, and utilize information.
Yes Paul (@pliang279), that bubbly is yours! 🥳🎉Congratulations on your very successful dissertation defense! (on the "Foundations of Multisensory Artificial Intelligence" in the @mldcmu, @SCSatCMU, @CarnegieMellon).
I’m excited to share that this Fall I will start a PhD at @MITEECS, supported by the NSF Graduate Research Fellowship Program!
🎉 Excited to share that I'll be joining @MSFTResearch as a Research Intern this summer! I'll be working on aligning large language models to better understand and harness their capabilities. Looking forward to contributing to this groundbreaking field!