Yiding Jiang @yidingjiang
PhD student @mldcmu @SCSatCMU. Formerly intern @MetaAI, AI resident @GoogleAI. BS from @Berkeley_EECS. Trying to understand stuff. yidingjiang.github.io Joined December 2015
Tweets 227 · Followers 1K · Following 468 · Likes 1K
1/ What does it mean for an LLM to “memorize” a doc? Exactly regurgitating a NYT article? Of course. Just training on NYT? Harder to say. We take big strides in this discourse w/ *Adversarial Compression* w/ @A_v_i__S @zhilifeng @zacharylipton @zicokolter 🌐: locuslab.github.io/acr-memorizati… 🧵
1/ 🥁Scaling Laws for Data Filtering 🥁 TLDR: Data Curation *cannot* be compute agnostic! In our #CVPR2024 paper, we develop the first scaling laws for heterogeneous & limited web data. w/@goyalsachin007 @zacharylipton @AdtRaghunathan @zicokolter 📝:arxiv.org/abs/2404.07177
Check out this really cool work w/@yidingjiang ! We built a simple system (PCA + Clustering) for quantifying how "features" are distributed across models and data. Using this tool, we can mathematically understand the Generalization Disagreement Equality. 🤝
🚀Our latest blog post unveils the power of Consistency Models and introduces Easy Consistency Tuning (ECT), a new way to fine-tune pretrained diffusion models to consistency models. SoTA fast generative models using 1/32 training cost! 🔽 Get ready to speed up your generative…
Jim Fan @DrJimFan
229K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
Behnam Neyshabur @bneyshabur
18K Followers 689 Following Senior Staff Research Scientist @GoogleDeepMind, Interested in reasoning w. LLMs, traveling & backpacking
Dan Roy @roydanroy
45K Followers 2K Following ML / AI researcher, emphasis on theory. Research Director and Canada CIFAR AI Chair, @VectorInst Professor, @UofT (Statistics/CS)
Kyunghyun Cho @kchonyc
61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
Christina Baek @_christinabaek
778 Followers 230 Following PhD student @mldcmu | Past: intern @GoogleAI
Zachary Lipton @zacharylipton
59K Followers 2K Following Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Jeremy Cohen @deepcohen
4K Followers 868 Following PhD student in machine learning at Carnegie Mellon. The goal of my research is to turn deep learning into a real engineering discipline.
Shane Gu @shaneguML
28K Followers 1K Following Research Scientist & Manager @GoogleDeepMind Tokyo/MTV. ex: @GoogleAI Brain, @OpenAI. (JP: @shanegJP)
yobibyte @y0b1byte
15K Followers 2K Following Kurin ViTaly, senior research scientist @IsomorphicLabs, ML PhD from @UniofOxford on RL, Multitask learning & Graphs
Pratyush Maini @pratyushmaini
1K Followers 339 Following Trustworthy ML | PhD student @mldcmu | Founding Member @datologyai | Prev. Comp Sc @iitdelhi
Roberta Raileanu @robertarail
4K Followers 1K Following Research Scientist @Meta & Honorary Lecturer @UCL. ex @DeepMind | @MSFTResearch | @NYU | @Princeton. Llama-3, Toolformer, Rainbow Teaming.
Ethan Caballero is bu.. @ethanCaballero
8K Followers 2K Following ML PhD student @Mila_Quebec ; previously @GoogleDeepMind
Jason Lee @jasondeanlee
10K Followers 3K Following Associate Professor at Princeton and Research Scientist at Google DeepMind. ML/AI Researcher working on foundations of LLMs and deep learning
Andreas Kirsch .. @BlackHC
9K Followers 5K Following Past: 🧑🎓 DPhil @AIMS_oxford @ExeterCollegeOx @UniofOxford (4.5yr) 🧙♂️ RE @DeepMind (1yr) 📺 SWE @Google (3yrs) 🎓 @TU_Muenchen 👤 Fellow @nwspk
Ted Xiao @xiao_ted
11K Followers 680 Following I teach robots to be smarter @GoogleDeepMind. Tweets about robot learning, scaling, and large models. Opinions my own.
Dimitris Papailiopoul.. @DimitrisPapail
11K Followers 970 Following prof @ wisconsin; thinking about transformers; learning in context; babas of Inez Lily
Yiping Lu @2prime_PKU
3K Followers 2K Following Kernel, ML for PDE, Robust learning,non-parametric stats/🌈/PKU👉Stanford👉NYU Courant👉Northwestern IEMS/ Previous Intern @RIKEN_AIP
Nicholas Roberts @nick11roberts
591 Followers 1K Following Ph.D. student @WisconsinCS. Working on data-centric automated machine learning. Previously at CMU @mldcmu, UCSD @ucsd_cse, FCC @fresnocity.
Yin-Hong Cao @caoyinhong
109 Followers 1K Following Postdoc in Jiayang Li Lab, Institute of Genetics and Developmental Biology, CAS. Focus on the Multi-omics of dandelions & rice🌱🌾Recruiting Top-Tier Talents👇
GBA Insight @GBA_Foshan
928 Followers 2K Following Recreation 🎡 Food 🍲 Attractions🗼Business 📈 We have them all. What to expect in GBA? 📍Follow us to explore the marvelous region!
Wanru Zhao (Looking f.. @Renee42581826
513 Followers 2K Following Postgraduate Student @CaMLSys @Cambridge_CL | Ex-Intern @DGLGraph @AWS and @CambridgeJBS | Do not go gentle into that good night 🧗
PANDA FRANK @PANDAFRANK6
2 Followers 200 Following
Brandon Amos @brandondamos
14K Followers 2K Following research scientist @MetaAI (FAIR) | optimization, machine learning, control, and reinforcement learning | PhD from @SCSatCMU
Aisha Perow @pero_ais
72 Followers 5K Following
aishwarya sathish @ApiAish16813
3 Followers 72 Following
Pensé FFun @inftyCategory
113 Followers 6K Following
Giovanna Winesberry @giovanna23262
77 Followers 5K Following
Ragnar Herron @RagnarHerron
13 Followers 76 Following
Manoj Acharya @manoja328
582 Followers 5K Following Mostly Interested in safe and aligned (neural inspired) Machine Intelligence ; PhD from Rochester Institute of Technology
N Sreeram @NSreeram5
53 Followers 499 Following
Tia Mannix @TiaMannix10538
82 Followers 5K Following
Nguyen Thong @NguyenThong4
185 Followers 2K Following
Arif Ahmad @arif_ahmad_py
248 Followers 7K Following All things AI, Computer Science and Circuits! Prev. @GoogleAI
Alyssa, Yi CHENG @YiCheng77783310
80 Followers 205 Following Ph.D. student, working on NLP for social good and conversational AI.
Lyla Bonnin @bonn_lyl
38 Followers 5K Following
Eliot Xing @etaoxing
130 Followers 206 Following phd student @cmu_robotics @scsatcmu • prev. math & cmpe @georgiatech
nrRNjkitRHmMP @RNjkit72037
0 Followers 592 Following
Milin Bhade @MilinBhade
56 Followers 1K Following Post Grad Student at IISc, Bangalore Masters in Computer Science & Automation
Zeynep Özdemir @zynbzdmr
144 Followers 585 Following #ResearchAssistant #AnkaraUniversity #ComputerEngineer #DeepLearning #Linux
Lia Palazzola @LiaPalazzo52890
54 Followers 5K Following
Kathleen Greenbaum @GreenbaKathlee
73 Followers 5K Following
murphy law @MurphyLaw50778
1 Followers 53 Following
Hengxu Yu @hengxu_yu
19 Followers 160 Following Ph.D. student @CUHKSZ; Passionate about creating things in Opt&ML
liuyong @forrestbing
243 Followers 5K Following I am a researcher in AIGC, Multi-modality and VitrualHuman tech direction
isbn009 @weissleke
3 Followers 159 Following
Rosanna Guinnip @RGuinnip3782
52 Followers 5K Following
Zhouxing Shi @zhouxingshi
243 Followers 291 Following PhD candidate @UCLAComSci. Trustworthy machine learning | Robustness | NN verfication. Alumnus @Tsinghua_Uni
wanlin zhu @neuromanifold
32 Followers 3K Following
Zhan Su @zhansu9
13 Followers 58 Following Ph.D. student at the Department of computer science. University of Copenhagen.
Claudio Borile @cldbrl
13 Followers 129 Following Researcher @CentaiInstitute. Working on Graph Machine Learning, eXplainable Artificial Intelligence, Complex Systems.
Malvina Nikandrou @MNikandrou
67 Followers 410 Following PhD student @EDINrobotics working on Vision and Language
Shivam Duggal @ShivamDuggal4
389 Followers 379 Following PhD Student @MIT | Prev: Carnegie Mellon University @SCSatCMU | Research Scientist @UberATG
Ajay Jain @ajayj_
6K Followers 3K Following Co-founder @genmoai. Co-created denoising diffusion (DDPM), DreamFusion, Dream Fields. Ex Ph.D. @berkeley_ai, @googleai, @facebookai, @nvidiaai, @mit
Cooper Leong @cooperleong22
97 Followers 1K Following
JJ McCammon @jjmccammon
340 Followers 4K Following Formerly @Microsoft. Host of the #1 AI podcast in Harlem, NY according to my mom. Liked tweets ≠ endorsement. Newsletter: https://t.co/rwxxpe4HZO.
Suvansh Sanjeev @SuvanshSanjeev
486 Followers 483 Following having a blast w AI consulting @BrilliantlyAI. ex-phd student @CMU_Robotics. @Berkeley_EECS alum. teamwork makes the dream work, and dreamworks made shrek
Yann LeCun @ylecun
710K Followers 718 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
AK @_akhaliq
309K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Jim Fan @DrJimFan
229K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
Andrej Karpathy @karpathy
978K Followers 904 Following 🧑🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥
Google DeepMind @GoogleDeepMind
943K Followers 275 Following We’re a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.
Behnam Neyshabur @bneyshabur
18K Followers 689 Following Senior Staff Research Scientist @GoogleDeepMind, Interested in reasoning w. LLMs, traveling & backpacking
Sergey Levine @svlevine
79K Followers 122 Following Associate Professor at UC Berkeley Co-founder, Physical Intelligence
Karol Hausman @hausman_k
22K Followers 141 Following @Physical_int ex: researcher @GoogleAI/@DeepMind, adj. Prof. @Stanford. Into robots, AI, NBA, philosophy, soccer and almond croissants. 🇵🇱🇺🇸
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Dan Roy @roydanroy
45K Followers 2K Following ML / AI researcher, emphasis on theory. Research Director and Canada CIFAR AI Chair, @VectorInst Professor, @UofT (Statistics/CS)
Yi Ma @YiMaTweets
71K Followers 123 Following Chair Professor in AI, Director of IDS, Head of CS, HKU; Professor of EECS, Berkeley; Author of Book: High-Dim Data Analysis, https://t.co/gwaqMJp8av.
Natasha Jaques @natashajaques
25K Followers 1K Following Senior Research Scientist at @GoogleAI and Assistant Professor @uwcse. Social Reinforcement Learning in multi-agent and human-AI interactions. PhD from @MIT.
Kyunghyun Cho @kchonyc
61K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
François Chollet @fchollet
469K Followers 770 Following Deep learning @google. Creator of Keras. Author of 'Deep Learning with Python'. Opinions are my own.
Christina Baek @_christinabaek
778 Followers 230 Following PhD student @mldcmu | Past: intern @GoogleAI
Grant Sanderson @3blue1brown
365K Followers 362 Following Pi creature caretaker. Contact/faq: https://t.co/brZwdQfdif
NeurIPS Conference @NeurIPSConf
111K Followers 35 Following New Orleans, Dec 10-16, 23. https://t.co/ga8aOw615g Tweets to this account are not monitored. Please send feedback to [email protected].
Brandon Amos @brandondamos
14K Followers 2K Following research scientist @MetaAI (FAIR) | optimization, machine learning, control, and reinforcement learning | PhD from @SCSatCMU
Ajay Jain @ajayj_
6K Followers 3K Following Co-founder @genmoai. Co-created denoising diffusion (DDPM), DreamFusion, Dream Fields. Ex Ph.D. @berkeley_ai, @googleai, @facebookai, @nvidiaai, @mit
Suvansh Sanjeev @SuvanshSanjeev
486 Followers 483 Following having a blast w AI consulting @BrilliantlyAI. ex-phd student @CMU_Robotics. @Berkeley_EECS alum. teamwork makes the dream work, and dreamworks made shrek
Christian Szegedy @ChrSzegedy
32K Followers 2K Following #deeplearning, #ai research scientist. Opinions are mine.
Katherine Tian @kattian_
713 Followers 494 Following cs/stat @harvard, working on calibration & factuality of LLMs, prev @GoogleAI tensorflow, golden state @warriors fan
Keller Jordan @kellerjordan0
1K Followers 197 Following Independent research Prev MLE @ Hive AI, math @ UCSD
depths of wikipedia! @depthsofwiki
880K Followers 4K Following Hello I am @anniierau Please take away my blue check! I did not ask for it!
Chuang Gan @gan_chuang
4K Followers 443 Following Faculty Member at UMass Amherst; Principal researcher at MIT-IBM Watson AI Lab; Homepage: https://t.co/oXP6pqXCpo
Nancy Pelosi Stock Tr.. @PelosiTracker_
560K Followers 223 Following Highlighting Politicians' trades so we can invest alongside Goal: get them banned from trading Powered by @joinautopilot_
kache (dingboard.com) @yacineMTB
53K Followers 3K Following i'm a swe. go to https://t.co/pWRBfY8kn2 - AI image editing IN YOUR BROWSER! follow to watch a self funded founder beat VC backed AI startups with @dingboard_
Niki Hasrati @niki_hasrati
82 Followers 106 Following ML PhD student @CarnegieMellon’s School of CS | Previously CS master’s student @UWaterloo | Researching the intersection of theoretical CS and ML theory
Marcus Hutter @mhutter42
2K Followers 42 Following I 👨🔬 a mathematical definition&theory of Artificial General Intelligence 🎥&🎤@ https://t.co/OZsooP92mn 🍀 I now work @GoogleDeepMind 🧠 History:🇩🇪🇨🇭🇦🇺🇬🇧
Olivia Simin Fan @Olivia61368522
580 Followers 848 Following 🎓Ph.D.@EPFL_en-MLO|| https://t.co/QGwaUTkuyY.@UMich. || https://t.co/QGwaUTkuyY.@sjtu1896. ML&LLM research🧐 Interested in being an interesting girl ;)
Dimitris Papailiopoul.. @DimitrisPapail
11K Followers 970 Following prof @ wisconsin; thinking about transformers; learning in context; babas of Inez Lily
Analysis Fact @AnalysisFact
126K Followers 19 Following Daily tweets about real and complex analysis and related topics. From @JohnDCook.
Joseph Suarez (e/🐡.. @jsuarez5341
2K Followers 63 Following MIT PhD candidate, creator of Neural MMO (https://t.co/NaaDv6UQlN), PufferLib (https://t.co/43D0orh0lJ). Open-source RL
Theophile Gervet @theo_gervet
1K Followers 482 Following Accelerating open-source AI @MistralAI. Past: @Meta AI, PhD @SCSatCMU
François Fleuret @francoisfleuret
31K Followers 456 Following Prof. @Unige_en, Adjunct Prof. @EPFL_en, Research Fellow @idiap_ch, co-founder @nc_shape. AI and machine learning since 1994. I like reality.
Aldo Pacchiano @aldopacchiano
1K Followers 414 Following AI research at Broad Institute and Boston University 🇲🇽
Zilai Zeng @zilaizeng
25 Followers 119 Following Master Student @BrownCSDept. Fall 2024 CS PhD applicant. Opinions are my own.
Olivier Hénaff @olivierhenaff
2K Followers 229 Following Staff Research Scientist @GoogleDeepMind, interested in active, multimodal, and memory-augmented learning. Formerly @NYU_CNS and @Polytechnique
Samarth Thopaiah @SamarthThopaiah
4 Followers 162 Following
Satya Nutella @satyanutella_
5K Followers 1K Following Memer of Technical Staff // Head of Nutella AI Research Team
Yefan Zhou @LiamZhou98
17 Followers 46 Following Ph.D. @DartmouthCS, ML Researcher @ICSIatBerkeley ex-Master in EECS @UCBerkeley
Wen Sun @WenSun1
246 Followers 33 Following Assistant Professor at Cornell CS. Machine Learning and Reinforcement Learning; check out the RL Algorithm and theory book here https://t.co/HROGwaflCn
Blaze (Balázs Galamb.. @gblazex
1K Followers 976 Following A Smooth Guy; Developer of SmoothScroll for macOS, Windows & Google Chrome.
Tianyu Gao @gaotianyu1350
3K Followers 686 Following CS PhD student @Princeton @Princeton_nlp working on NLP. Previously: @Tsinghua_Uni @TsinghuaNLP
Jacob Tyo @jaketyo
215 Followers 209 Following Motorcycle Enthusiast, GNCC Racer, Ph.D. in Machine Learning from CMU @mldcmu in the @acmi_lab. Principal ML Engineer @Raft_Tech.
Amil Merchant @amilmerchant
157 Followers 462 Following
Satnam Singh @satnam6502
14K Followers 3K Following Punjabi-Scottish-American Haskell hacker at @GroqInc, cook, cyclist, lost in music. ∃🇮🇳 ∧ ∀🇬🇧 ∧ ∃🇪🇺 ∧ ∀🇺🇸 #celiac ex-{Microsoft, Google, Facebook}
Colin Raffel @colinraffel
30K Followers 654 Following nonbayesian parameterics, sweet lessons, and random birds. Friend of @srush_nlp
Peter Whidden @computerender
557 Followers 22 Following
Patrick Chao @patrickrchao
357 Followers 148 Following PhD Student @Penn, interested in making language models smarter and safer
Avi Schwarzschild @A_v_i__S
264 Followers 181 Following Postdoc at CMU. Trying to learn about deep learning faster than deep learning can learn about me.
Ellis Brown @_ellisbrown
357 Followers 940 Following CS PhD Student @NYU_Courant | Prev: @CarnegieMellon, https://t.co/EkTzprGc4i, @VanderbiltU
Ankit @ashah0052
1K Followers 5K Following LLM Arch Assoc Director - @Accenture Ph.D. - @LTIatCMU @SCSatCMU Previous: @GoogleAI, @merl_news, @Revive_Med, @ARM Smartly working hard to make things happen!
In the words of Billy Idol, give a "rebel yell" for REBEL: a strikingly simple RL algorithm (it's just regression! no clipping / critics!) that scales to generative models (both LLMs and Diffusion Models!) and has deep theoretical interpretations: arxiv.org/abs/2404.16767. [1/8]
There's been a lot of discussion on LLMs "memorizing" training data, but we argue for more nuance in the definition of "memorize". This work advocates for adversarial prompts (and whether they can be shorter than the output) as a metric for assessing memorization.
1/ What does it mean for an LLM to “memorize” a doc? Exactly regurgitating a NYT article? Of course. Just training on NYT? Harder to say. We take big strides in this discourse w/ *Adversarial Compression* w/ @A_v_i__S @zhilifeng @zacharylipton @zicokolter 🌐: locuslab.github.io/acr-memorizati… 🧵
you were in my dreams last night brother. took me a moment to recognize it was you. happy birthday
To my wonderful brother-in-law Mat, you were a perfect father, husband, son, brother, and human. You will live on in our memories and through the four children to whom you gave the best possible start. I love and miss you.
Super excited to share that I successfully defended my PhD thesis "Understanding Generalization and Robustness in Modern Deep Learning" today 👨🎓 A huge thanks to the thesis examiners @SebastienBubeck, @zicokolter, and @KrzakalaF, jury president Rachid Guerraoui, and, of course,…
🌎 Excited to share a major update of the DreamerV3 agent! A couple of smaller changes, more benchmarks, and substantially improved performance. 👇 Main differences from our earlier preprint:
Excited to announce DreamerV3 🌍, a scalable and general RL algorithm that masters a wide range of applications with fixed hyperparameters! Applied out of the box, it solves the Minecraft Diamond challenge without human data. 💎 👇 Thread x.com/deepmind/statu…
Happy to share our work on preference learning methods for LLMs. Key insights: 1. Use more on-policy samples > off-policy samples 2. Contrastive DPO > Pref-FT. Also we provide insights on DPO's training mechanism. 3. Theoretical unification under mode-covering/seeking KL
Many LLM fine-tuning methods. Unclear what you should use & why? In our new paper, we did an extensive study of on-policy RL, supervised & offline contrastive methods (DPO, IPO) to answer this... 🧵⬇️ On-policy > offline, mode-seeking > mode-covering understanding-rlhf.github.io
Our work on @minerl_official was featured in this nice article about AI safety! Thanks @khulick for speaking with us.
It’s an ordinary day in Minecraft… until a bot starts destroying a house. How can we stop bad bot behavior? AI safety researchers are on the case. snexplores.org/article/artifi…
Reviewer’s evaluation of your argument
@Dahoas1 shares some cool insights on improving LLM reasoning with RL, highlighting exploration as one of the key challenges. Thanks @twimlai for discussing our work on your podcast!
Today we're joined by @Dahoas1 from @GeorgiaTech to discuss the reasoning capability of language models and the potential to improve it with traditional RL methods 🎧 / 🎥 Listen to the episode at: twimlai.com/go/680. 📖 CHAPTERS 00:00 - Introduction 02:19 - RL vs RLHF…
Llama-3 is absolutely impressive, but is it more resilient to adaptive jailbreak attacks compared to Llama-2? 🤔 Not much. The same approach as in our recent work arxiv.org/abs/2404.02151 leads to 100% attack success rate. The code and logs of the attack are now available:…
In machine learning, we often distinguish aleatoric and epistemic uncertainty. Aleatoric is called irreducible error because it's caused by "true" randomness, and epistemic is reducible caused by lack of information, like model misspecification. But it's not always clear
Can’t wait to see what you all build with Llama 3, the best openly available LLM! Enjoy and stay tuned for more🦙🦙🦙
Excited to share a preview of Llama3, including the release of an 8B and 70B (82 MMLU, should be the best open weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT4). Lots more still to come... ai.meta.com/blog/meta-llam…
I passed my thesis proposal! 🎊Thanks to my amazing committee @fangf07, @hongshenus, Geoff Gordon, @katjahofmann, & @OriolVinyalsML for their feedback & support. & thank you to my friends and collaborators for waking up early today to attend 🖤
I am super excited to share our Llama3 preview models (8B and 70B). I am proud to have been a part of this amazing effort over the past 8 months. We still have some super cool stuff coming up in the coming months... until then, enjoy playing with these preview models…
My team and I are moving from Google Research to Google DeepMind. We'll keep working on SOTA LLM and multimodal models. Very excited for what's to come!
The Llama 3 models look fantastic! And we're proud to support them on Day 0 throughout NVIDIA's product stack: blogs.nvidia.com/blog/meta-llam…
We just released Meta Llama 3: the most capable openly available LLM available to date! The 8B & 70B models are out now, and we expect to release models with larger context windows, additional model sizes and more capabilities in the coming months.
Here is a grid of different 1D optimal transport solutions for different regularizations (columns) and unbalanced marginal penalizations (rows). The example is coming to POT shortly and is done with for loops because we now implement all those solvers in one function (ot.solve)
Super excited to see this work out! Happy to have contributed a small part to thinking about the optimization dynamics at this scale :)
How to enjoy the best of both worlds of efficient training (less communication and computation) and inference (constant KV-cache)? We introduce a new efficient architecture for long-context modeling – Megalodon that supports unlimited context length. In a controlled head-to-head…