Xinyang (Young) Geng @younggeng

Research scientist at Google DeepMind young-geng.xyz Joined February 2014

Tweets: 32 | Followers: 799 | Following: 461 | Likes: 142

Rulin Shao @RulinShao

7 months ago

Introduce LightSeq for long-context LLM training: - Highly optimized for decoder models - smarter checkpointing - better support for fewer heads models up to 2x faster, 2-8x longer sequences vs Megatron-LM. arxiv.org/abs/2310.03294

7 93 378 104K 175

Hao Liu @haoliuhl

7 months ago

New paper w/ @matei_zaharia @pabbeel on transformers with large context size. We propose RingAttention, which allows training sequences that are device count times longer than those of prior state-of-the-arts, without attention approximations or incurring additional overhead.

10 179 854 319K 505

Philipp Wu @philippswu

7 months ago

🎉Excited to share a fun little hardware project we’ve been working on. GELLO is an intuitive and low cost teleoperation device for robot arms that costs less than $300. We've seen the importance of data quality in imitation learning. Our goal is to make this more accessible 1/n

26 108 685 145K 222

Igor Babuschkin @ibab

7 months ago

μP allows you to keep the same hyperparameters as you scale up your transformer model. No more hyperparameter tuning at large size! 🪄 It saves millions of $ for very large models. It’s easier to implement than it seems: You have to 1. Keep the initialization and learning rate…

Shital Shah @sytelus

7 months ago

10 27 241 132K 318

10 60 421 106K 378

Boris Dayma 🖍️ @borisdayma

8 months ago

Seeing people struggling with FSDP… That's exactly where JAX shines, I can use pretty much any parallelism strategy with these few lines 💪

4 16 117 47K 48

Szymon Tworkowski @s_tworkowski

9 months ago

🎇Introducing LongLLaMA-Instruct 32K!🎇 Inspired by @p_nawrot #nanoT5, we fine-tune LongLLaMA- on a *single GPU* for ~48h to improve upon OpenLLaMA: 55% on lm-eval (vs. 53%), better perf on long context and code! We open-source our optimized fine-tuning code in PyTorch/HF!🧵

9 78 308 69K 173

Xinyang (Young) Geng @younggeng

10 months ago

Aviral is one of the best collaborators I've worked with. For prospective students interested in RL and decision making, I'd strongly recommend him as an advisor.

Aviral Kumar @aviral_kumar2

10 months ago

66 30 678 111K 35

0 0 18 4K 1

Charlie Snell @sea_snell

11 months ago

Check out our recent work! We find that recent models that imitate ChatGPT — like Alpaca, Vicuna, Koala — largely learn ChatGPT’s style and less so its capabilities/factuality. And that base model quality can be a highly effective lever for improving on factuality.

Arnav Gudibande @arnavg_

11 months ago

24 85 515 186K 243

9 12 90 24K 16

lmsys.org @lmsysorg

a year ago

Evaluating LLMs is notoriously difficult, and academic benchmarks may fail. Inspired by chess and MOBA games, we are taking a new approach by calculating Elo ratings of models with crowdsourced battle data. - Blog: lmsys.org/blog/2023-05-0… - Leaderboard: leaderboard.lmsys.org

31 276 1K 335K 592

Hao Liu @haoliuhl

a year ago

As part of our effort to replicate LLaMA in an open-source manner, we are pleased to announce the release of a preview of the 7B OpenLLaMA model, which has been trained on 200 billion tokens of the RedPajama dataset. github.com/openlm-researc…

32 402 2K 346K 745

Lior⚡ @AlphaSignalAI

a year ago

Berkeley just released Koala-13B! The open-source chatbot was trained by fine-tuning LLaMA on web dialogue! 50% of responses are similar to ChatGPT. The paper also suggests that training with high-quality datasets can compensate for smaller models, possibly matching larger ones

11 179 844 199K 366

Berkeley AI Research @berkeley_ai

a year ago

Check out Koala 🐨: a new chatbot from BAIR researchers fine-tuned on dialogue that approaches ChatGPT quality! Work led by @haoliuhl @Eric_Wallace_ @ArnavGudibande and Xinyang Geng Blog: bair.berkeley.edu/blog/2023/04/0… Demo: koala.lmsys.org

15 160 691 278K 362

Prime Intellect @PrimeIntellect

9K Followers 424 Following Find compute. Train models. Co-own intelligence. https://t.co/3NC0duKF4a.

Amirmmi @Amirmmi1998

1 Followers 81 Following

MoonRide @moonride303

89 Followers 4K Following Friend of AIs

Jin Schofield @jinschofield

88 Followers 546 Following CS @princeton

Phil Greene @vtguy65

787 Followers 3K Following 🚀 Journeying towards financial freedom & sharing the map as I chart it. ✨ Free digital gems weekly. 🎨 Web & graphic designer, Print on Demand.

Stefan Juang @StefanJuang

169 Followers 2K Following I am an AI and Machine Learning Engineer specializing in Game Theory and Reinforcement Learning, holding an MPhil in Computer Science from HKUST.

Norbert Biedermann @ .. @NJBiedermann

602 Followers 3K Following Visionary - Expert (@LinkedIn) - Online Research Professional

MSS @sajwan_mellow

22 Followers 385 Following

Mikkel @Mikkel86881951

425 Followers 2K Following

Álvaro @hilvanado

6 Followers 84 Following

Aaditya ; @Aaditya26082004

547 Followers 7K Following CS'26 • Machine Learning • Open-Source • Web Dev. • Algorithms • Jai Shree Krishna 🦚🪈

KB @katiebowles_

642 Followers 5K Following Advancing AI for Healthcare at Scale at @AbridgeHQ | $150M Series C 🚀 | We're Hiring!

aVerity @AVerityjane

4 Followers 144 Following

Eva Louise Marie Gabr.. @e681554349

9 Followers 3K Following

HinePo @Hine__Po

193 Followers 440 Following Head of AI & Data. Data science tech lead. Chemical engineer. Kaggle Competitions Expert (top 1%).

Founder @DigitalApplied. Digital Marketing & Transformation | AI | SEO | PPC | Social Media | Web Development | eCommerce | Automation | CRM & Analytics.

Richard Gibbons @RichardGibbonsX

Peter Morales @PeterMoralesX

223 Followers 2K Following Founder of funded Stealth AI Startup. Interested in AI development at the edge? DM.

زِرِنگ @premature79

402 Followers 919 Following

Redie @rediejarvis

14 Followers 166 Following

Ethan (Yixing) Jiang @ethanjyx

121 Followers 302 Following Early engineer at https://t.co/n7q2e0bpY4, prev @CovariantAI @facebook

Jay Whang @jaywhang_

270 Followers 95 Following Research Scientist at @GoogleDeepMind

ma @ma52987379

0 Followers 120 Following

MIke @MIke71530700

1 Followers 100 Following

Joe Fredrick @fredric11642

5 Followers 54 Following

Moein Heidari @MoeinHeidarii

175 Followers 1K Following PhD Student @UBC Vancouver

metasyntactic @metasyntactik

123 Followers 1K Following medtech

Syed Amaan @syedamaann

362 Followers 4K Following exited founder, cs undergrad. I oscillate between ai research and real-world ai

Ahad Jawaid @ahadj0

14 Followers 103 Following CS undergrad at UTD. Interested in Generative Models and Autonomous Decision Making. Interned at @alexa99

Ray Lillywhite @LillywhiteRay

18 Followers 147 Following 🇹🇼

TommyTang @Tommy_Tang_930

23 Followers 223 Following

Dᵇ(E/(ℤ/6)) @filmtransistor

152 Followers 2K Following Mathematicians:

Ole Jonas @friendly_tweedy

54 Followers 500 Following

Horozkentli @Horozke64041910

654 Followers 5K Following Freedom

Jishuai MIAO @JishuaiM88686

23 Followers 583 Following

Michel Teivel @michelteivel

1K Followers 5K Following There is no fear in love.

shubham maheshwari @here_for_papers

10 Followers 103 Following

Ömer Faruk UZKAL @OmerUzkal

69 Followers 837 Following Artificial Intelligence

Eddieeeee @zhangshiyu8023

11 Followers 47 Following Working at @Meta

Arif Ahmad @arif_ahmad_py

307 Followers 7K Following All things AI, Computer Science and Circuits! Prev. @GoogleAI

zhenyuan.ai @zhenyuanMIN

453 Followers 4K Following 24-25: stormy seas

Jack Reacher @JackReach516

77 Followers 1K Following

Allan @dbsynergy

193 Followers 1K Following

AVINASH ANAND @avin_anand

20 Followers 420 Following

GrownBreeze @bray_R

97 Followers 542 Following Product Manager@Daimler || People Observer || Mind Traveler｜World Citizen

Vinay Ahuja @vinayah

175 Followers 2K Following Passionate about Gen AI, Search, Advertising, Mobile, Creator economy

Elliot Luchansky @ElliotLuchansky

1K Followers 557 Following Elliot Luchansky, CFA: executive leader, expert in talent attraction, board advisory, and business optimization. Currently with CyberNova, MSP-focused PE

Pingchuan Ma @pika7ma

766 Followers 1K Following @mit_csail

Nazarsky @nzrsky

85 Followers 528 Following 📱 15+ years iOS ninja | 🤖 AI & ML enthusiast | ✨ Crafting digital magic for 500M+ users

Chaos Song @song_chaos2243

201 Followers 4K Following

AICurrent.ai @AIcurrent_ai

24 Followers 279 Following

Jay Whang @jaywhang_

270 Followers 95 Following Research Scientist at @GoogleDeepMind

シェイン・グウ @shanegJP

53K Followers 351 Following https://t.co/yYd252xC4w Gemini 1.5 Pro @GoogleDeepMind Tokyo / SF. Formerly @GoogleAI Brain, formerly @OpenAI. English: @shaneguML. All opinions are my own.

Delip Rao e/σ @deliprao

46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈

Homer Walke @HomerWalke

172 Followers 94 Following PhD Student at UC Berkeley

Kefan XIAO @KevinKiao

194 Followers 234 Following Olympic weightlift AI - Pretraining&data of Palm2, Gemini and more.

Trevor Gale @Tgale96

1K Followers 250 Following Research Scientist @ Google DeepMind | PhD Candidate @ Stanford CS

yi 🦛 @agihippo

3K Followers 81 Following secondary account, hardcore fans only. friend of @agikoala the great researcher, main account: @yitayml warning: hot takes.

Sholto Douglas @_sholtodouglas

15K Followers 861 Following Scaling Gemini @Deepmind - working towards intelligence too cheap to meter

Arthur Zucker @art_zucker

3K Followers 356 Following Core Open Source Maintainer @huggingface 🤗

alewkowycz @alewkowycz

3K Followers 174 Following Member of Technical Staff at @inflectionAI. Former Research Scientist @Google. In a previous life, I did String Theory. Language models and Conversational AI.

Sharad Vikram @sharadvikram

1K Followers 510 Following Researcher @ Google Deepmind. I work on JAX + Pallas (https://t.co/lPMsq3yzgL) and Gemini. In the past I worked on Oryx and TFP. I like learning.

Hao AI Lab @haoailab

366 Followers 137 Following Hao AI Lab at UCSD. Our mission is to democratize large machine learning models, algorithms, and their underlying systems.

Brian Ichter @brian_ichter

2K Followers 178 Following Research Scientist at Google Brain, interested in robotics and AI

Quan Vuong @QuanVng

2K Followers 234 Following Robotics Research at @Physical_int, ex-@GoogleDeepMind Perpetually trying to find a quiet place to read.

Physical Intelligence @physical_int

4K Followers 8 Following Physical Intelligence (Pi), bringing AI into the physical world.

Rosanne Liu @savvyRL

33K Followers 969 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR

Enrique Piqueras @epiqueras1

2K Followers 234 Following Organizing the world's information and making it universally accessible and useful using JAX @Google @Deepmind.

Jessy Lin @realJessyLin

2K Followers 728 Following PhD @Berkeley_AI | interactive language agents 🤖 💬

Noam Shazeer @NoamShazeer

5K Followers 12 Following Engineer

Archit Sharma @archit_sharma97

4K Followers 340 Following Final-year CS PhD student @Stanford. Previously, AI Resident @Google Brain, undergraduate @IITKanpur, research intern @MILAMontreal.

Yejin Choi @YejinChoinka

19K Followers 330 Following professor at UW, director at AI2, adventurer at heart

Thang Luong @lmthang

20K Followers 100 Following Senior staff scientist @GoogleDeepMind. PhD @StanfordNLP. PI #AlphaGeometry. Co-lead #Bard Multimodality, now #Gemini. Co-founder #MeenaBot (later LaMDA).

rapha gontijo lopes @rapha_gl

5K Followers 2K Following research @ openai

Stella Biderman @BlancheMinerva

15K Followers 749 Following Open source LLMs and interpretability research at @BoozAllen and @AiEleuther. My employers disown my tweets. She/her

Jiahui Yu @jhyuxm

2K Followers 777 Following Member of Technical Staff @OpenAI; previously Research Scientist at Google Brain/DeepMind.

Manan Tomar @manan_tomar

299 Followers 512 Following PhD student @rlai_lab UAlberta, and @MSFTResearch. Currently visitor @berkeley_ai. Previously @MetaAI, @iitmadras. Opinions, if you find any, are my dog’s.

Yao Fu @Francis_YAO_

14K Followers 2K Following PhD @EdinburghNLP on LLMs and Machine Reasoning. Ex. @Columbia @PKU1898 @MITIBMLab @allen_ai AGI has yet to come, so keep running

Amil Dravid @_AmilDravid

318 Followers 300 Following PhD @Berkeley_AI

the tiny corp @tinygrad

33K Followers 61 Following We make tinygrad. Our mission is to commoditize the petaflop.

Jerry Tworek @MillionInt

7K Followers 284 Following I teach programs how to program @ OpenAI | putting the ball in the damn hoop - @jacobmenick

Jelani Nelson @minilek

22K Followers 184 Following Professor @Berkeley_EECS. Research Scientist (part-time) @GoogleAI. Founder @addiscoder. 🇻🇮🇺🇸🇪🇹

Hongyu Ren @ren_hongyu

3K Followers 595 Following Research Scientist @openai. CS PhD @stanford. Previously @apple, @googleai and @nvidiaai. I train language models.

Brydon Eastman @brhydon

878 Followers 729 Following Mathematician (Heavy on the ish) Research Scientist @OpenAI, Previously Ph.D. @WaterlooMath. ☕ //🚴//🧗‍♂️ // 🤔➡️💻

Joanne Jang @joannejang

15K Followers 741 Following product @openai

bogo @giertler

3K Followers 434 Following purveyor of fine things. voice @openai.

Satya Nadella @satyanadella

3.3M Followers 286 Following Chairman and CEO at Microsoft

Pranav Shyam @recurseparadox

1K Followers 450 Following Research Scientist @DeepMind; Kannadiga. Past: @OpenAI, @SchmidhuberAI

Tao Xu @txhf

6K Followers 890 Following Learning Machine at OpenAI, previously Airbnb, Quora, Facebook and Microsoft.

Mark Chen @markchen90

10K Followers 246 Following Head of Frontiers Research at OpenAI. Coach for the USA IOI Team.

Jakub Pachocki @merettm

21K Followers 0 Following OpenAI

Bob McGrew @bobmcgrewai

7K Followers 252 Following VP of Research at OpenAI

Mohit Shridhar @mohito1905

1K Followers 1K Following Research Scientist at @Dyson. @uwcse PhD in Robotics.

Clémentine Fourrier .. @clefourrier

3K Followers 307 Following Leaderboards & evals research @HuggingFace 🐍✨ "The future is already here, it’s just not very evenly distributed" (Gibson)

Sasha Rush @srush_nlp

52K Followers 465 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGz

Mahmoud Soliman @mjsMLP

443 Followers 1K Following NaN. JAX @NVIDIA, opinions are my own.

Modular @Modular

18K Followers 2 Following The future of AI development starts here. Sign up to our 📪 Newsletter → https://t.co/gpuHGRyHTs. We are hiring → https://t.co/cPTAes0HMt 🚀

rohan anil @_arohan_

12K Followers 2K Following Principal Engineer, @GoogleDeepMind Gemini. prev PaLM-2. Tinkering with optimization and distributed systems. opinions are my own.

Youngwoon Lee @YoungwoonLee

387 Followers 82 Following Assistant Professor at Yonsei | Postdoc @UCBerkeley with @pabbeel | PhD @USC with @JosephLim_AI | Reinforcement Learning and Robot Learning

Emma Brunskill @EmmaBrunskill

7K Followers 91 Following Associate professor, Computer Science. Stanford. Stanford's Human Centered AI (HAI) Institute. Opinions expressed are my own.

Alex Spiridonov @alexknowsai

282 Followers 88 Following Group Product Manager, @Google Cloud TPU | Prev. GPM @Google Brain (now @GoogleDeepMind) | Building planet-scale AI/ML supercomputers | Investor @SeaChangeVC

yi 🦛 @agihippo

7 days ago

> phi-3 claims: better than mixtral 8x7B on benchmarks > phi-3 reality: worse than mistral 7b on lmsys you cannot cheat the scaling gods. very exciting 49 place. 🥲

2 8 121 8K 14

yi 🦛 @agihippo

7 days ago

Sorry, but these are actually the top 3 benchmarks to *not* use.

Quanquan Gu @QuanquanGu

a week ago

Agree. Here are the top three LLM benchmarks I would recommend: 1. Open LLM leaderboard 2. MT-Bench 3. AlpacaEval

4 14 69 22K 35

6 1 49 7K 9

Logan Kilpatrick @OfficialLoganK

a week ago

Be skeptical, think from first principles, avoid the hype, keep building.

7 11 200 17K 18

Thomas Wolf @Thom_Wolf

2 weeks ago

This take on the FineWeb release is one of the most interesting pieces of feedback, and also a reason FineWeb is very different from even larger datasets like RedPajama-V2 (which is double its size!). Surprisingly, the size of the dataset of 15T tokens is not very important, what is much…

Sergey Edunov @edunov

2 weeks ago

People seem to over-index on the 15T number after Llama 3. While the number matters, what is even more important is the quality and diversity of those tokens. If there was a good way to measure those, that would have been an impressive result to report.

1 9 116 202K 26

17 127 824 316K 617

Delip Rao e/σ @deliprao

2 weeks ago

I know you all are tired of me shilling open source and open weights, but read this thread from a computer scientist who has worked on antitrust. It's not just that closed model orgs are closed, but they are perniciously peddling misinformation.

Laura Edelson @LauraEdelson2

2 weeks ago

I'm so tired of being in rooms where people whisper about the absolute ARMY of Big Tech-funded people (most, but not all, ex-Googlers) that have popped up in nearly every corridor in DC where people are working on literally anything to do with AI. So let's talk about it! 1/12

24 238 1K 358K 616

1 10 43 7K 12

Laura Edelson @LauraEdelson2

2 weeks ago

24 238 1K 358K 616

Philipp Schmid @_philschmid

2 weeks ago

Data is all we need! 👑 Not only since Llama 3 have we known that data is all we need. Excited to share 🍷 FineWeb, a 15T token open-source dataset! Fineweb is a deduplicated English web dataset derived from CommonCrawl created at @huggingface! 🌐 TL;DR: 🌐 15T tokens of cleaned…

14 86 393 107K 167

Mostafa Dehghani @m__dehghani

3 weeks ago

@YiTayML Incredible achievement for a team of 20! Congrats to the amazing team! 🚀

1 0 8 1K 0

Boris Dayma 🖍️ @borisdayma

3 weeks ago

@YiTayML Nice!!! And thanks for the report 😎

1 0 3 2K 0

Brydon Eastman @brhydon

3 weeks ago

sitting in a taqueria listening to a group of guys excitedly talk about how good the new gpt-4 model is on lmsys while i'm re-reading my H-1B rejection email in one tab and paying US taxes in the other

2 2 54 9K 7

シェイン・グウ @shanegJP

3 weeks ago

Also, going through Reviewer 2 while you are still young will probably be good life experience (ruthless)

1 2 22 9K 4

William Fedus @LiamFedus

3 weeks ago

Our improved model is in the arena at lmsys, and we’ve rolled it out to ChatGPT users today. Stay tuned for better versions to come

lmsys.org @lmsysorg

3 weeks ago

🔥Exciting news -- GPT-4-Turbo has just reclaimed the No. 1 spot on the Arena leaderboard again! Woah! We collect over 8K user votes from diverse domains and observe its strong coding & reasoning capability over others. Hats off to @OpenAI for this incredible launch! To offer…

54 210 1K 621K 280

11 14 157 44K 25

William Fedus @LiamFedus

3 weeks ago

@ren_hongyu Big contribution!

0 0 5 1K 0

Hongyu Ren @ren_hongyu

3 weeks ago

Made a tiny contribution 🀄️

OpenAI @OpenAI

3 weeks ago

Our new GPT-4 Turbo is now available to paid ChatGPT users. We’ve improved capabilities in writing, math, logical reasoning, and coding. Source: github.com/openai/simple-…

607 1K 7K 6.2M 1K

12 3 176 47K 15

Mistral AI @MistralAI

3 weeks ago

magnet:?xt=urn:btih:9238b09245d0d8cd915be09927769d5f7584c1c9&dn=mixtral-8x22b&tr=udp%3A%2F%2Fopen.demonii.com%3A1337%2Fannounce&tr=http%3A%2F%https://t.co/OdtBUsbeV5%3A1337%2Fannounce

277 843 6K 1.7M 1K

Andrej Karpathy @karpathy

4 weeks ago

Have you ever wanted to train LLMs in pure C without 245MB of PyTorch and 107MB of cPython? No? Well now you can! With llm.c: github.com/karpathy/llm.c To start, implements GPT-2 training on CPU/fp32 in only ~1,000 lines of clean code. It compiles and runs instantly, and exactly…

306 2K 13K 1.6M 7K

Jiasen Lu @jiasenlu

4 weeks ago

(1/2) 📢 Introducing LL3M: Large Language, Multimodal, and Moe Model Open Research Plan 👉github.com/jiasenlu/LL3M With the following goals: - Build an open-sourced codebase in Jax / Flax that supports large-scale training in LLM, LMM, and MoE models. - Record and share the…

6 32 164 25K 92

lmsys.org @lmsysorg

a month ago

One year ago was Vicuna's birthday🎂! We were so excited and built a demo for it at chat .lmsys .org. We never imagined it could get this far. Millions of people downloaded our models, visited our demo, and played with our fine-tuning recipe in FastChat project. We then…

lmsys.org @lmsysorg

a year ago

Introducing Vicuna, an open-source chatbot impressing GPT-4! 🚀 Vicuna reaches 90%* quality of ChatGPT/Bard while significantly outperforming other baselines, according to GPT-4's assessment. Blog: vicuna.lmsys.org Demo: chat.lmsys.org

58 549 2K 764K 1K

7 21 198 39K 22

Lianmin Zheng @lm_zheng

a month ago

What a year it was!

lmsys.org @lmsysorg

a month ago

7 21 198 39K 22

1 1 26 4K 1

Trevor Gale @Tgale96

a month ago

I’m not done with MegaBlocks 😁 @apaszke @epiqueras1 @sharadvikram and I just dropped something we’ve been working on for a bit yesterday. MegaBlocks + JAX + TPU = MegaBlox 🔥 github.com/google/jax/pul…