Amanpreet Singh @apsdehal
CTO @ContextualAI. Past: @huggingface and @MetaAI. apsdehal.in Menlo Park, CA Joined January 2010
Tweets: 554 · Followers: 3K · Following: 621 · Likes: 986
Aligning Diffusion Models by Optimizing Human Utility We present Diffusion-KTO, a novel approach for aligning text-to-image diffusion models by formulating the alignment objective as the maximization of expected human utility. Since this objective applies to each
💼 I’ve joined @ContextualAI as a research intern. I’ll be working on topics in AI alignment and retrieval-augmented generation. Let’s fry some GPUs!
We have the new way to build #enterpriseai with RAG 2.0. Our CTO @apsdehal will be sharing how we accelerate #AI training workloads with @GoogleCloudNext tech. Join the discussion on April 10 → g.co/cloudnext
Join us on April 10 as we take part in #GoogleCloudNext. Our CEO @douwekiela will dive into retrieval-augmented generation (RAG), which he pioneered at Facebook, and share how Contextual AI's RAG 2.0 approach is key to #generativeAI deployment in the #enterprise. Register to join…
Our CEO @douwekiela spoke at @saastr's AI Day yesterday - want to know what it takes to build AI products for the enterprise? Watch the recording here: youtube.com/live/NOAcuI7qa…
Excited to share something that we've needed since the early open RLHF days: RewardBench, the first benchmark for reward models. 1. We evaluated 30+ of the currently available RMs (w/ DPO too). 2. We created new datasets covering chat, safety, code, math, etc. We learned a lot.…
With RAG 2.0, the generator and retriever are always working together. Whether you're building a house or enterprise-grade AI, teamwork makes the dream work.
AI News: @xai, @Apple, @nvidia, @cohere, @ContextualAI Links: Apple's MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training - arxiv.org/pdf/2403.09611… Cohere int8 & binary Embeddings - Scale Your Vector Database to Large Datasets: txt.cohere.com/int8-binary-em…
Looking for production-grade AI? Look no further than our new RAG 2.0. Thrilled to share what we have been working on.
Excited to speak at GTC! Come chat with me on how you're using AI in production if you're also there.
Looks like I picked the correct place to do an internship 😅. Great results by @winniethexu
Microsoft's new Orca-Math model based on Mistral uses multiple passes with @ContextualAI's KTO approach to achieve superior performance on math word problems.
Selecting the right data is critical for LLM performance across all stages of training. This recent paper surveys data selection, and shows that there is much more exciting research to be done.
Google releases Gemma, a family of lightweight, state-of-the-art open models for their class built from the same research & tech used to create the Gemini models. Gemma is available worldwide starting today in two sizes (2B and 7B), supports a wide range of tools and systems,…
Introducing GRIT🦾to unify text embedding 🔢& generation 📝. GritLM is open SoTA on embedding (MTEB) & generative tasks (BBH etc.) – Both in 1 model. See 🧵for how GRIT🦾 makes RAG >60% faster & more 📜arxiv.org/abs/2402.09906 💻github.com/ContextualAI/g… 1/12
Today we're launching PRISM, a new resource to diversify the voices contributing to alignment. We asked 1500 people around the world for their stated preferences over LLM behaviours, then we observed their contextual preferences in 8000 convos with 21 LLMs arxiv.org/abs/2404.16019
Another useful resource for multimodal Llama 3: OBELICS. OBELICS is an open, massive, and high-quality collection of interleaved image-text web documents, containing 141M English documents, 115B text tokens, and 353M images, extracted from Common Crawl dumps between February 2020…
Can't wait to see multimodal Llama 3! We released a resource that might come in handy: The Cauldron🍯 The Cauldron is a massive manually-curated collection of 50 vision-language sets for instruction fine-tuning. 3.6M images, 30.3M query/answer pairs. It covers a large…
Announcing surya reading order! It predicts the order that a human would read a document in. It's useful for RAG, accessibility, and text extraction. It works on a variety of documents, layouts, and languages.
In addition to data and model optimization, stability, efficiency and fault tolerance of the entire infra stack are crucial when we scale LLM training to tens of thousands of GPUs. Really glad to be working with @reducescatter and the brilliant AI Infra team @AIatMeta,…
Introducing Arena-Hard – a pipeline to build our next generation benchmarks with live Arena data. Highlights: - Significantly better separability than MT-bench (22.6% -> 87.4%) - Highest agreement to Chatbot Arena ranking (89.1%) - Fast & cheap to run ($25) - Frequent update…
At this point I feel like we understand pretty well what's going on with LLMs: - Outputs are roughly equivalent to kernel smoothing over positional embeddings (arxiv.org/pdf/1908.11775…) - The learned computation model is *probably* bounded by RASP-L (arxiv.org/pdf/2310.16028…) -…
Llama3 was trained on 15 trillion tokens of public data. But where can you find such datasets and recipes?? Here comes the first release of 🍷Fineweb. A high quality large scale filtered web dataset out-performing all current datasets of its scale. We trained 200+ ablation…
We have just released 🍷 FineWeb: 15 trillion tokens of high quality web data. We filtered and deduplicated all CommonCrawl between 2013 and 2024. Models trained on FineWeb outperform RefinedWeb, C4, DolmaV1.6, The Pile and SlimPajama!
I'm going to release my reading order model next week. I had to change the architecture to perform better with complex layouts. It seems to be working, though (see the image). There are mistakes, but it's only 20% trained, and still improving.
Happy to be part of this incredible journey of Llama3 and to share the best open weight 8B and 70B models! Our largest 400B+ model is still cooking but we are providing a sneak peek into how it is trending! Check more details here ai.meta.com/blog/meta-llam…
We've added some experiments on GRIT + KTO in the paper to improve generative performance (arxiv.org/abs/2402.09906). Also, I'll give a talk on GRIT in 6 hours (below) if you want to discuss/learn more🙂
In the 42nd session of #MultimodalWeekly, we dive into novel research in instruction tuning and video agency.
Llama3 8B & 70B text is here, it's been a fun ride being part of the team building this. Newer capabilities and models are still cooking @Meta Website: llama.meta.com/llama3/
Anybody can now train a multimodal model on their own dataset in just a few lines of code with TRL 🚀! The SFTTrainer now has support for vision LLMs like LLaVa, so you can fine-tune your models to both see and follow your instructions 👀 TRL: github.com/huggingface/trl Full…
Handling long context in LLMs is expensive, but can we cut the cost by learning them offline for a specific set/genre of documents? Introducing LLoCO, our new technique that learns documents offline through context compression and in-domain finetuning using LoRA, which archives…
I have been working on vision+language models (VLMs) for a decade. And every few years, this community re-discovers the same lesson -- that on difficult tasks, VLMs regress to being nearly blind! Visual content provides minor improvement to a VLM over an LLM, even when these…
Today we’re releasing OpenEQA — the Open-Vocabulary Embodied Question Answering Benchmark. It measures an AI agent’s understanding of physical environments by probing it with open vocabulary questions like “Where did I leave my badge?” More details ➡️ go.fb.me/7vq6hm…
MTEB is the most common text embedding benchmark with 190K installs/mon & 120K leaderboard visits/mon. We're extending it to be massively multilingual. Anyone is invited to contribute & co-author an upcoming publication📜 Details: github.com/embeddings-ben…
This week was one of the most fun ever 😍 - Met @jefrankle (DBRX), @sophiamyang (Mistral) and @ylecun (🐐) in person - Met super interesting people from the Llama team (@misovalko @ThomasScialom) and collaborators (@christiankeller, Code Llama team, and more) - Saw @sarahookr…