Banghua Zhu @BanghuaZ
PhD @Berkeley_EECS, statistics, info theory, LLM, RL, Human-AI Interactions. people.eecs.berkeley.edu/~banghua/ Berkeley, CA Joined August 2018
Tweets: 212 · Followers: 2K · Following: 804 · Likes: 2K
Ongoing lawsuits against GenAI firms over possible use of #copyrighted data for training raise vital questions for our society. 🤖⚖️ How can we address the copyright challenges? New research proposes a solution: "An Economic Solution to Copyright Challenges of Generative AI"
Excited to announce #SnowflakeArctic, our new OSS LLM. Play with it at arctic.streamlit.app Read our cookbook at snowflake.com/en/data-cloud/… Read our blog at snowflake.com/blog/arctic-op… We are just getting started ...
phi-3 is here, and it's ... good :-). I made a quick short demo to give you a feel of what phi-3-mini (3.8B) can do. Stay tuned for the open weights release and more announcements tomorrow morning! (And ofc this wouldn't be complete without the usual table of benchmarks!)
Chatbot Arena usually captures the combination of two aspects: Basic capability + human preference alignment. In terms of basic capability, it seems still not yet at GPT-4 level from all benchmark metrics. But Llama3 did a really great job on human preference alignment, likely…
Many LLM fine-tuning methods. Unclear what you should use & why? In our new paper, we did an extensive study of on-policy RL, supervised & offline contrastive methods (DPO, IPO) to answer this... 🧵⬇️ On-policy > offline, mode-seeking > mode-covering understanding-rlhf.github.io
Very excited about the release of arena hard, the main benchmark we looked at when selecting the checkpoints for Starling model. It focuses on a subset of very hard prompts from chatbot arena.
Llama3 reminds everyone of the misconception about scaling laws again: it's not that a larger model is always better, but that a larger model is cheaper to train if you want to reach the same performance. Yes, this might be somewhat counter-intuitive, but this is one of the key…
Congrats to @AIatMeta on Llama 3 release!! 🎉 ai.meta.com/blog/meta-llam… Notes: Releasing 8B and 70B (both base and finetuned) models, strong-performing in their model class (but we'll see when the rankings come in @ @lmsysorg :)) 400B is still training, but already encroaching…
Introducing our best OLMo yet. OLMo 1.7-7B outperforms LLaMa2-7B, approaching LLaMa2-13B at MMLU and GSM8k. High-quality data and staged training are key. I am so proud of our team making such significant improvement in a short period after our first release. https://t.co/9NNwCxAwj6
📢We're thrilled to announce that Kurt Keutzer will give the keynote speech for MLSys 2024 Young Professionals Symposium. Welcome to join us for exciting invited talks by @Azaliamirh, Xupeng Miao, @jiawzhao , @ying11231 , @tri_dao on cutting-edge MLSys research! The full…
Welcome to our AI tea talks Singapore series. The very first talk will be given by Prof. Natasha Jaques from UW/Google Deepmind about Reinforcement learning with human feedback. Zoom link: nus-sg.zoom.us/j/84608066438Z… meeting ID: 846 0806 6438 All are welcome to join.…
I will give a keynote at the Theoretical Foundations of Foundation Models (TF2M) workshop at ICML'24 and be a panelist to discuss interesting topics.
Check out the ICML workshop on Theoretical Foundations of Foundation Models!
My group at Berkeley Stats and EECS has a postdoc opening in the theoretical (e.g., scaling laws, watermark) and empirical aspects (e.g., efficiency, safety, alignment) of LLMs or diffusion models. Send me an email with your CV if interested!
Language models struggle to search, not due to an architecture problem, but a data one! They rarely see how to search or backtrack. We show how LLMs can be taught to search by representing the process of search in language as a flattened string, a stream of search (SoS)!
In @myhakureimu's recent work, we observed something very similar! Consider this prompt: 3+5=9 5+10=16 3+4=8 1+1=? LLMs will answer 2! What if we provide hundreds of examples? LLMs will give up the original definition of "addition", and will start predicting 3! https://t.co/23Da42B8VO
🆕 Check out the recent update of WildBench! We have included a few more models including DBRX-Instruct @databricks and StarlingLM-beta (7B) @NexusflowX which are both super powerful! DBRX-Instruct is indeed the best open LLM; Starling-LM 7B outperforms a lot of even… https://t.co/imWcH5BGtq
Huge congrats to the amazing folks at lmsys! Vicuna and chatbot arena are really important milestones in the field of open source and LLMs!
Just wrote a new article on "Tips for LLM Pretraining and Evaluating Reward Models" (magazine.sebastianraschka.com/p/tips-for-llm…). Here, I am reviewing a paper that discusses strategies for continuing LLM pretraining. Then, I discuss reward modeling used in reinforcement learning with human…
Pauthare @PautharepxNK
0 Followers 19 Following
Hui Xu @HuiXu43118541
3 Followers 44 Following
Akash @pocuseverything
2K Followers 5K Following
Thutoez @thutoez71050
0 Followers 181 Following
Shanita Sachar @SacharSac
17 Followers 3K Following
Dmitry Lyalin @LyalinDotCom
9K Followers 6K Following Product @ Google | Firebase serverless lead (web, compute, storage & AI & ML). Previously product @MSFT | 24+ years in tech .. dev, PMM, PM Opinions are my own
Scoanitosm @scoanitosm43603
0 Followers 72 Following
Siloughf @siloughf4685
0 Followers 179 Following
EarthaEva @W1RF1BW3E7nFka4
0 Followers 89 Following
Dorothy @dorothy_austin5
132 Followers 3K Following
Henry John @HenryJohn125977
0 Followers 6 Following
Jungwon Choi @JungwonChoi11
16 Followers 33 Following Assistant professor @ UWECE / Power Electronics/HF Power Converter/WPT/Renewable Energy System
Guannan Qu @guannanqu
115 Followers 80 Following Assistant Professor at CMU Machine learning, control, reinforcement learning, multi-agent systems
Mickel Liu @mickel_liu
100 Followers 235 Following research visiting @uwnlp, Prev: @PKU1898, @uoftengineering RL + LLM
DawnBruce @i9VUCN077txLrk
0 Followers 169 Following
วิวรรณา @3Vu2TQSIem5X6
47 Followers 1K Following What kind of fate will we meet? If you like, feel free to follow first; I will post contact information on the front page periodically.
Purring Lynx @Purring_Lynx
35 Followers 121 Following ° autonomous systems engineer ¶ ° wired in via RJ45 ¶ ° running serial experiments on AI ¶ ° effectively accelerating ¶ ° pet lynx of @_Mira___Mira_
neo @neobyd
49 Followers 392 Following
Yifang Chen @cloudwaysX
455 Followers 641 Following Ph.D. student @uwcse. Previously @usc undergrad. Online Learning, reinforcement learning, bandits, and active learning.
Jiaxin Huang @jiaxinhuang0229
296 Followers 54 Following Incoming assistant professor @WUSTL CSE. PhD Candidate @IllinoisCS. Currently visiting @uwnlp. NLP, ML, Data Mining.
Florine Colletta @CollettaFl70045
83 Followers 5K Following
Zhenting Wang @wang1999_zt
79 Followers 234 Following PhD Student @RutgersCS. Trustworthy and Responsible Generative Artificial Intelligence. Intern @SonyAI_global (current) @Meta GenAI (incoming)
Shu @Rainb0ish
688 Followers 6K Following The girl with broken tooth. I fancy neurosynaptic chips more than potato chips. Love reading scientific papers & procrastination. Violin.
hypocrite. fan igjh w @jhw990164844563
0 Followers 10 Following
【𝕐o𝕦𝕤𝕖.. @YosGPT
10K Followers 5K Following Programming Engineer & Linux+ | IT & Net+ | CCIE & CISSP | Azure Developer & Multi-Clouds Architect+ | Quantum AI Builder+ | #الحمدلله_على_نعمة_الامارات 🇦🇪 ❤️
Arnav Das @arnaved
75 Followers 325 Following
郝博阳 @ekaths
15 Followers 345 Following
Michael M. Pieler @MichaelMPieler
332 Followers 1K Following
Muhammad Abdullah @Abdullah_kwl
42 Followers 501 Following Life is better when you're laughing...... "your time is limited,So don't waste it living someone else's life❤
Jonathan Wang @givemettt5600
23 Followers 179 Following
Tianle (Tim) Li @LiTianleli
13 Followers 10 Following EECS Undergraduate at UC Berkeley. ML Researcher at @BerkeleySky and @lmsysorg
Gantavya Bhatt @BhattGantavya
548 Followers 1K Following Ph.D. Student @UW, MELODI Lab and @uw_wail at @uwcse Formerly @amazonscience, EE undergrad @iitdelhi. An active photographer and Alpinist!
Giulia Fanti @giuliacfanti
2K Followers 675 Following Assistant prof @ CMU ECE studying privacy, data sharing, and generative models
Dimitris Papailiopoul.. @DimitrisPapail
11K Followers 976 Following prof @ wisconsin; thinking about transformers; learning in context; babas of Inez Lily
Delsie Specter @specter60156
86 Followers 5K Following
Woosuk Kwon @woosuk_k
2K Followers 351 Following PhD student at @Berkeley_EECS building @vllm_project
Elachqar Oussama @Oussama_e
60 Followers 2K Following
Evan @evan_a_frick
6 Followers 12 Following CS at Berkeley. ML Research @berkeley_ai ML Engineer @NexusflowX
MesubsetofRunionC @mesubsetof
33 Followers 471 Following
Eternal Max @k9TWEQPCC5d2dUG
66 Followers 459 Following
Shishir Patil @shishirpatil_
3K Followers 850 Following CS PhD @ UC Berkeley. Creator of Gorilla, GoEx, RAFT, OpenFunctions and Berkeley Function Calling Leaderboard. Previously researcher @GoogleAI @MSFTResearch
Irene Chen @irenetrampoline
8K Followers 817 Following ML for equitable healthcare. Assistant Professor @UCBerkeley and @UCSF. Prev @Harvard, @MIT, @MSFTResearch
UW News @uwnews
23K Followers 2K Following Experts, research and administration news from the University of Washington. Media assistance: [email protected]. See also: @UW @UWAthletics @UWMedicine
University of Washing.. @UW
186K Followers 2K Following University of Washington students, faculty and staff believe in boundless opportunities. Do you dare to Be Boundless? At the UW, you can.
Mechanical Engineerin.. @ME_at_UW
2K Followers 418 Following Our faculty and students create a healthier, cleaner and more prosperous world. @UW @UWEngineering
UW Student Life @uwstudentlife
5K Followers 466 Following Follow us and learn more about student life at the University of Washington!
UW Population Health .. @UW_PHI
2K Followers 187 Following The @UW Population Health Initiative seeks to create a world where all people can live healthier and more fulfilling lives.
UW iSchool @uw_ischool
6K Followers 2K Following Official account of the University of Washington (@UW) Information School, one of the world's top schools in information science. We make information work.
UW Alumni @UWalum
9K Followers 742 Following The UW Alumni Association is the foundation of the University of Washington alumni community. We connect alumni and friends around the world to the UW!
UW Medicine Newsroom @uwmnewsroom
7K Followers 1K Following Newsroom reports news from UW Medicine and the University of Washington School of Medicine. We cover clinical care, research, education and issues.
Ana Mari Cauce @amcauce
9K Followers 792 Following @UW president, loves teaching, learning & dawgs of all kinds, advocate for access & excellence, sings Bow Down to WA w/a rumba beat #GoHuskies
UW College of Educati.. @UWCollegeOfEd
5K Followers 880 Following From changemakers to educators, we create the leaders of tomorrow. Share your UW College of Education experience with #EduDawgs.
Jungwon Choi @JungwonChoi11
16 Followers 33 Following Assistant professor @ UWECE / Power Electronics/HF Power Converter/WPT/Renewable Energy System
UW Engineering @uwengineering
11K Followers 2K Following Research and administration news from the University of Washington’s College of Engineering. See also: @UWNews and @UW.
UW ECE @uw_ece
2K Followers 606 Following Electrical & Computer Engineering at the University of Washington is a top-ranked, vibrant department, leading in cutting-edge science, technology & innovation.
Mickel Liu @mickel_liu
100 Followers 235 Following research visiting @uwnlp, Prev: @PKU1898, @uoftengineering RL + LLM
Guannan Qu @guannanqu
115 Followers 80 Following Assistant Professor at CMU Machine learning, control, reinforcement learning, multi-agent systems
Volkan Cevher @CevherLIONS
3K Followers 579 Following Associate Professor of Electrical Engineering, EPFL. Amazon Scholar. ELLIS Fellow.
Yifang Chen @cloudwaysX
455 Followers 641 Following Ph.D. student @uwcse. Previously @usc undergrad. Online Learning, reinforcement learning, bandits, and active learning.
Jiaxin Huang @jiaxinhuang0229
296 Followers 54 Following Incoming assistant professor @WUSTL CSE. PhD Candidate @IllinoisCS. Currently visiting @uwnlp. NLP, ML, Data Mining.
Giulia Fanti @giuliacfanti
2K Followers 675 Following Assistant prof @ CMU ECE studying privacy, data sharing, and generative models
Gantavya Bhatt @BhattGantavya
548 Followers 1K Following Ph.D. Student @UW, MELODI Lab and @uw_wail at @uwcse Formerly @amazonscience, EE undergrad @iitdelhi. An active photographer and Alpinist!
Woosuk Kwon @woosuk_k
2K Followers 351 Following PhD student at @Berkeley_EECS building @vllm_project
Tianle (Tim) Li @LiTianleli
13 Followers 10 Following EECS Undergraduate at UC Berkeley. ML Researcher at @BerkeleySky and @lmsysorg
Ahmad Al-Dahle @Ahmad_Al_Dahle
4K Followers 53 Following #Girldad of twins. Leading GenAI @ Meta (llama, imagine, meta ai and more)
Sheng Shen @shengs1123
1K Followers 540 Following Ph.D. student @berkeley_ai; Building 🦙@MetaAi; Former @MSFTResearch, @allen_ai, @GoogleDeepMind
Sergey Edunov @edunov
948 Followers 103 Following Director of Engineering @ GenAI, Meta. I work on Llamas
Simon Willison @simonw
71K Followers 5K Following Creator @datasetteproj, co-creator Django. PSF board. @nichemuseums. Hangs out with @natbat + @cleopaws. He/Him. Mastodon: https://t.co/t0MrmnJW0K
Güçlü Gökozan @GucluGokozan
17K Followers 2K Following • CEO & Entrepreneur ⛵️🏀 https://t.co/4HjOvw5eyT
Susan Murphy lab @SusanMurphylab1
3K Followers 86 Following Designing trial and developing data analytic methods for informing intervention optimization in digital health
Zhuang Liu @liuzhuang1234
3K Followers 933 Following Research Scientist @MetaAI (FAIR, at NYC). machine learning, computer vision, neural networks. PhD from @Berkeley_EECS
Natasha Jaques @natashajaques
25K Followers 1K Following Senior Research Scientist at @GoogleAI and Assistant Professor @uwcse. Social Reinforcement Learning in multi-agent and human-AI interactions. PhD from @MIT.
Ananya Kumar @ananyaku
4K Followers 472 Following Researcher at @openai Previously PhD at Stanford University (@StanfordAILab) advised by Percy Liang and Tengyu Ma
Shashank Sonkar @shashank_nlp
51 Followers 396 Following NLP+Education | Grad Student @rbaraniuk group | @RiceECE @rice_dsp @OpenStax @IITKanpur
Nezihe Merve Gürel @nmervegurel
1K Followers 481 Following Faculty @TUDelft, Prev. @ETH_en @Stanford @EPFL_en Interested in ML robustness, reliability and reasoning Exec. Editor of https://t.co/qddizl9xTb @DMLRJournal
Ryan David Cotterell @ryandcotterell
9K Followers 1K Following
Hanna Hajishirzi @HannaHajishirzi
6K Followers 328 Following Associate professor at @uw_cse; senior director at @allen_ai co-leading @allenNLP; AI/NLP researcher at @uw_nlp
Ananda Theertha Sures.. @th33rtha
521 Followers 125 Following Researcher in machine learning and information theory.
Lisa Dunlap @lisabdunlap
502 Followers 154 Following PhD student & vibe curator @berkeley_ai and Sky Computing Lab -- for the love of god look at your data
@Toong @TianDatong
293 Followers 2K Following Intelligent Symbiosis - All Things Connected, A Light in the Rift. 智能共生——万物互联,裂隙有光。 https://t.co/jffHiZGZ7u
Seen on the @WSJ ! Join us at snowflake.com/summit/
It’s the time of the year that new faculty members are about to choose their offer and start a faculty job. 🤩🤩🤩I have some advice that I wish I knew when I first started: 1/6
Nice collection of finetuning datasets
💾 LLM Datasets LLM development is increasingly moving towards curating high-quality datasets, as shown by Llama 3. I've compiled a collection of fine-tuning datasets along with advice and tools for creating your own. 💻 GitHub: github.com/mlabonne/llm-d…
Phi 3 and Arctic: Outlier LMs are hints Models that seem totally out of scope from recent open LLMs give us a sneak peek of where the industry will be in 6 to 18 months. interconnects.ai/p/phi-3-and-ar…
Tell me that you're a language model from X corporation without telling me you're a language model from X corporation.
Memory is available to all ChatGPT Plus users. We hope that you will find the answers become more personalized and relevant over time with use.
Memory is now available to all ChatGPT Plus users. Using Memory is easy: just start a new chat and tell ChatGPT anything you’d like it to remember. Memory can be turned on or off in settings and is not currently available in Europe or Korea. Team, Enterprise, and GPTs to come.
I'll be at #AISTATS2024 later this week! With Madhow, we will co-present @BhagyashreePu13's poster on TEXP to improve robustness with a tweak to the first layer of the network. Looking forward to meeting old and new friends!
📢📢📢 Late post, but here we go...! I am thrilled to announce that our work on enhancing out-of-distribution robustness of deep neural networks has been accepted to AISTATS 2024!
I'm honored to receive the Amazon Research Award🎉 My group will be exploring how to use LLMs better, guided by principles of information and coding theory. Special thanks to @myhakureimu @yzeng58 and @yingfan_bot, who are already actively engaged in this exciting research 😊
The recipients, representing 51 universities in 15 countries, will have access to Amazon public datasets, AWS AI/ML services and tools, and more. Congrats to the 99 awardees! #AmazonResearchAwards amazon.science/research-award…
After years eclipsed by its big brothers, gpt-2 resurgent? 🤔
The hype for finding out what is "gpt2-chatbot" on lmsys chatbot arena is real 😅
(perhaps) the most important topic in LLMs -- the data recipe!
We’re excited to share insights and lessons learned collecting the data needed for Arctic as part of our #SnowflakeArctic Cookbook Series. 📖 Our third edition covers the filtering, processing, and composition techniques we used, including what worked and what didn't.
Professor life is off to a great start! Honored to receive a grant from Apple ML Research and to be named a Google Research Scholar. Looking forward to more work developing ML methods for healthcare and equity Pictured: an apple, Google, and me
A suit jacket and a backpack is the universal uniform of the academic job interview in CS
@srush_nlp Hey Sasha, I think it makes sense. Phi-3 is fundamentally different from other models, so its behavior can be unexpected in some cases, both in a good and bad way (hopefully though much more in a good way ;-)).
What a week since we released Llama 3! I couldn’t be more proud of the response. 🏆 Llama 3 70B is now the highest ranking open model on @lmsysorg leaderboard. 📈 1.2M+ downloads. 🤗 600+ derivative models on @huggingface. I'm excited for much more to come.
Excited to partner w/ @vipulved @percyliang @tri_dao and team on this!
Together AI and Snowflake partner to bring their state-of-the-art Arctic LLM to enterprise customers. Experience Arctic on Together Inference with best in class performance. api.together.xyz/playground/cha…
Ppl ask: Why not simply add gradient to the backward sampling process of a diffusion model? Big NO! 🚩Naive gradients don't work as guidance!🚩 Naive gradient jeopardizes the data manifold learnt from pre-training. We show in theory and experiment that it takes samples far away…
🎉🎉🎉We're thrilled to announce the kickoff of our Foundations of AI Seminar (FAIS) series, featuring an impressive lineup of speakers, starting tomorrow. Our first seminar is a special one, as we are honoured to welcome Prof. Volkan Cevher @CevherLIONS from @EPFL_en.