Niklas Muennighoff @Muennighoff

@ContextualAI | Interests: AI/LLM Research & Health ❤️ | Past: @huggingface @PKU1898 muennighoff.github.io Joined May 2020

Tweets: 93
Followers: 5K
Following: 319
Likes: 510

Niklas Muennighoff @Muennighoff

2 weeks ago

We've added some experiments on GRIT + KTO in the paper to improve generative performance (arxiv.org/abs/2402.09906). Also, I'll give a talk on GRIT in 6 hours (below) if you want to discuss/learn more🙂

Twelve Labs (twelvelabs.io) @twelve_labs

2 weeks ago

1 2 4 4K 3

2 3 27 3K 5

Niklas Muennighoff @Muennighoff

3 weeks ago

MTEB is the most common text embedding benchmark with 190K installs/mon & 120K leaderboard visits/mon. We're extending it to be massively multilingual. Anyone is invited to contribute & co-author an upcoming publication📜 Details: github.com/embeddings-ben…

Kenneth Enevoldsen @KCEnevoldsen

3 weeks ago

2 8 43 17K 18

4 8 81 12K 31

Niklas Muennighoff @Muennighoff

a month ago

RAG 2.0 is about making retrieval-augmented generation more end-to-end & learned, e.g. Self-RAG, RA-DIT, GRIT - High-impact research direction imo! 😊

Contextual AI @ContextualAI

a month ago

35 140 1K 187K 354

3 7 58 8K 18

Niklas Muennighoff @Muennighoff

a month ago

The best LLMs now train way beyond Chinchilla compute-optimality ("over-training") -- but how predictable is scaling in this regime?🎢 Work by the amazing @sy_gadre shows that it's very predictable🔎

samir gadre @sy_gadre

a month ago

5 34 165 22K 66

3 3 56 6K 20

Niklas Muennighoff @Muennighoff

2 months ago

The best open model on Korean MMLU (KMMLU) is the primarily Chinese & English Qwen model. Surprising to me & hints at cool research - maybe @huybery has thoughts🤔 Great work by the talented @gson_AI & team❤️

arlo_son @gson_AI

2 months ago

1 5 21 8K 6

1 2 22 5K 3

Niklas Muennighoff @Muennighoff

2 months ago

StarCoder2 15B is trained on 4.3 trillion total tokens via 4.5 epochs!💫 Great work by @BigCodeProject ❤️

BigCode @BigCodeProject

2 months ago

15 192 675 204K 256

1 4 39 5K 4

Niklas Muennighoff @Muennighoff

2 months ago

What’s the most impactful LLM📚data research rn? Find out in this paper by the talented @AlbalakAlon arxiv.org/abs/2402.16827 Good directions imo🙂:
▶️Curriculum training
▶️Sample-level weights (extending DoReMi)
▶️Quality-filter+repeating (extending Scaling Data-Constrained LMs)

Alon Albalak @AlbalakAlon

2 months ago

10 77 305 100K 269

2 9 40 5K 17

Niklas Muennighoff @Muennighoff

3 months ago

For BLOOMZ/mT0 we had to rely on the finding that instruction tuning generalizes to unseen langs to use them beyond their 46. By tuning on 101 langs via Aya data/xP3x, the Aya models have much better coverage leading to better performance🌍🌎🌏Very impressed by the Aya team💙

Cohere For AI @CohereForAI

3 months ago

77 383 1K 674K 539

1 6 47 4K 6

Abhilasha Ravichander @lasha_nlp

3K Followers 2K Following Postdoc @allen_ai, working on Natural Language Processing (#NLProc) | PhD @SCSatCMU @LTIatCMU | Friend of @NLPWithFriends | @lasha_nlp@sigmoid.social

Nat Friedman @natfriedman

183K Followers 287 Following https://t.co/Lhh178sIjq

Passionate Java developer | Code enthusiast | Problem solver | | Sharing insights and tips on Java development | | Lifelong learner |
#JavaDeveloper.

nikesh patil @NikeshPatil1998

Alex Dimakis @AlexGDimakis

13K Followers 2K Following UT Austin Professor. Researcher in Machine Learning and Information Theory. National AI Institute on the Foundations of Machine Learning (IFML) Co-director.

ADS @langxuxing

25 Followers 221 Following XOps*AI

Kshitij @yaarusername

91 Followers 445 Following Full Stack Developer || NCR

Ashish Arora @AshArr

0 Followers 54 Following

Pavl @tru_pablo

217 Followers 822 Following e/acc vs decel - play Spock

Vanshit Mehta @vanshitkmehta

2 Followers 55 Following Software Dev @reliancejio , Building Cool Things.

Shubhanshu Arya @thisisshubh21

0 Followers 54 Following Aspiring Software Developer 👨‍💻. Always try to learn something. Loves 🍎

Dhruv Charne @CharneDhruv

1 Followers 133 Following

Dhrumil Bhut @BhutDhrumil

11 Followers 145 Following

Aravind Ram @arvindiram

90 Followers 2K Following call me Arvi🙋🏻‍♂️! building the web 🚀

Mohammed Saqib Patel @patel_saqib26

41 Followers 648 Following

Rom1 @KoroSao_

32 Followers 414 Following Deep Learning @GeorgiaTech 22 yo

Rohit kumar barada @Rohit_ku_1

16 Followers 144 Following travel enthusiast & web developer

Ahmed Hisham @AhmedHi08078280

0 Followers 50 Following

Abhijeet singh @Abhijeet_S1

72 Followers 2K Following .

Aditya Jha @codeinvoid

2 Followers 76 Following NSUT'27 Open source I Web Development

Jacek (Jomsborg.eth) @timelessdev

1K Followers 5K Following The DAO investor. Early @Aleph__zero inv. Decentralization. Born on Vikings island called Jomsborg. Applied math. My posts are not financial advise.

yahia battach @by_ai_tech

2 Followers 191 Following Science.

Charles Vaske @CharlesVaske

761 Followers 943 Following I work on genomics, but love all of biology and any means to investigate it with math, probability, and computation. He/him/his/they/their.

Tech Enthusiast @Im_techie

2 Followers 52 Following

Rupam Ash @rupam_ash

28 Followers 124 Following Computer Science Student . Tech Enthusiast . Learning MERN Stack .

Afroz Mohiuddin @afrozenator

1K Followers 5K Following Research Engineer at Google Brain. Interested in Science, Psychology, Investing, Design and generally almost everything. Good Thoughts, Good Words, Good Deeds.

Hassan arzoo @hssnarzoo

16 Followers 220 Following

Gabriel Lespérance @GabLesperance

339 Followers 926 Following COO / CTO @TrampolineAI

Sumit Roy @SumitRoy_twt

3 Followers 79 Following Chasing the Sun 🌞

Arif Ahmad @arif_ahmad_py

282 Followers 7K Following All things AI, Computer Science and Circuits! Prev. @GoogleAI

!(hardyNeverCodes) @solanki_haard

39 Followers 131 Following check my ongoing endeavors https://t.co/5U9sMlWI1Q Making robust Backends for Webapps|| NodeJs||Express Love To REACT⚛️ Currently Diving Into World Of WEB3

Sahil Verma @Sahil1V

460 Followers 1K Following PhD student @uwcse. Robustness and Interpretability in ML. Former intern at @amazon, @itsArthurAI, @ETH_en, @MIT, @NUSingapore. Undergrad @IITKanpur

Zhengping JIANG @zhengping_jiang

51 Followers 397 Following PhD Student in Natural Language Processing at JHU-CLSP

Uttam Sutariya @uttam_sutariya_

27 Followers 156 Following 🧠🧘🏻 Asli engineer 💻

Harish R @HarishR93882470

9 Followers 63 Following

Jalil Umer @JalilDev

1 Followers 53 Following

Monish @Monish__Sharma

10 Followers 209 Following HI I'M A SOFTWARE DEVELOPER

Anshdeep Singh @singh09_a

17 Followers 179 Following CSE Undergrad at NIT J

Dhruv Rajput @dhruuuvv__

2 Followers 62 Following

Ross @ma1547372858

15 Followers 1K Following

Vigneshwaran N @Vigneshwaran__N

47 Followers 670 Following ML/NLP engineer. Curious about people and minds.

John @fiveseveny

718 Followers 2K Following @milliondotjs

Wenting Zhao @wzhao_nlp

812 Followers 356 Following PhD student @cornell_tech Food for life, NLP for soul!

Dipan Mondal @dipanmondal22

18 Followers 62 Following Nothing is life but there are something in math.

LLM360 @llm360

1K Followers 50 Following A framework for open-source LLMs to foster transparency, trust, and collaborative research.

Ranveer🐼 @ranvir__rana

42 Followers 621 Following moshi moshi

Steve Li @steveshenli

137 Followers 160 Following CS + Stat @ Harvard. Previously AI Research at BAIR

Aryan Panchal @naughtypaanda

2 Followers 139 Following

Lakshay Bansal @LBansal_123

6 Followers 69 Following Just an average guy

Vansh Chitransh @Vansh_Twts

4 Followers 57 Following

Karthikeya B @7808_kk

8 Followers 102 Following Tech Seeker | Web dev

Jeff Rasley @jeffra45

678 Followers 928 Following @SnowflakeDB AI Research Team. @MSFTDeepSpeed co-founder, @BrownCSDept PhD, @uwcse alum

Qinan Yu @qinan_yu

102 Followers 177 Following CS-Math@Brown

sarah guo // convicti.. @saranormous

91K Followers 3K Following startup investor and builder, founder @w_conviction. accelerating AI adoption, interested in progress. tech podcast: @nopriorspod

Shuang Li @ShuangL13799063

5K Followers 755 Following Incoming Assistant Professor at the University of Toronto and Vector Institute. Generative AI (Vision/Language), Embodied AI, Robotics.

merve @mervenoyann

56K Followers 4K Following open-sourceress at @huggingface 🧙🏻‍♀️ proud mediterrenean 🍋 I do TL;DR on ML papers

Anne Ouyang @anneouyang

3K Followers 582 Following Incoming CS PhD student @Stanford, currently cuDNN @Nvidia | M.Eng, B.S. in CS @MIT | self-improving ML systems + performance engineering

Yann Dubois @yanndubs

4K Followers 1K Following PhD student @stanfordAILab | Prev: AI resident @metaai, @vectorinst, @CambridgeMLG

Jinhyuk Lee @leejnhk

781 Followers 336 Following Research Scientist at Google DeepMind

Imene Kerboua @imenelker

12 Followers 65 Following PhD Student @ Esker & LIRIS - INSA Lyon

Kenneth Enevoldsen @KCEnevoldsen

326 Followers 696 Following interdisciplinary Ph.D. Student working on representation learning in Clinical NLP and Genetics at @AarhusUni and @interact_minds

noahdgoodman @noahdgoodman

2K Followers 109 Following Professor of natural and artificial intelligence @Stanford. Research Scientist at @GoogleDeepMind. (@StanfordNLP @StanfordAILab etc)

Eric Zelikman @ericzelikman

5K Followers 1K Following studying why @xAI // was phd-ing @stanford

Katherine Tian @kattian_

716 Followers 494 Following cs/stat @harvard, working on calibration & factuality of LLMs, prev @GoogleAI tensorflow, golden state @warriors fan

Orion Weller @orionweller

863 Followers 745 Following PhD student @jhuclsp. Previously: @apple, @allen_ai, @byu. #NLProc and #IR research

Sijia Liu @letti_liu

47 Followers 163 Following Research Scientist @Amazon AGI. | Interests: AI/LLMs/Conversations. | Previously: @CarnegieMellon @pku1898

Akash Mahajan @akashmjn

595 Followers 393 Following MTS @ContextualAI | prev in awe of PNW beauty 🏔 @Azure Speech; @Stanford @atherenergy @iitmadras

Aditya Bindal @adbindal

134 Followers 681 Following Mostly AI, Cricket, Reading. VP Product @ContextualAI

Shikib Mehri @shikibmehri

339 Followers 808 Following MTS @ContextualAI | Previously @AmazonScience; PhD @LTIatCMU

Carlos @_carlosejimenez

695 Followers 478 Following PhD Student @princeton_nlp Ex Mormon

John Yang @jyangballin

2K Followers 450 Following CS/NLP MS student @princeton_nlp Previously @Berkeley_EECS

Reinhard Heckel @HeckelReinhard

409 Followers 286 Following Associate Professor at Technical University of Munich and Adjunct Faculty at Rice University

Mitchell Wortsman @Mitchnw

2K Followers 956 Following @AnthropicAI | prev @uwcse

Achal Dave @achalddave

170 Followers 228 Following vision and language @toyotaresearch

Xindi Wu @cindy_x_wu

940 Followers 808 Following PhD student @PrincetonCS | Data-centric multimodal ml | prev @RealityLabs @roboVisionCMU @CMU_Robotics @Snapchat

Sungdong Kim @SungdongKim4

370 Followers 174 Following Research Scientist @ NAVER Cloud; MS&PhD student @ KAIST #NLP #LLM #Alignment

Seungone Kim @seungonekim

929 Followers 832 Following Incoming Ph.D. student @LTIatCMU, M.S. student @kaist_ai working on LLM Evaluation & Systems that Improve with (Human) Feedback | Prev: @yonsei_u @NAVER_AI_Lab

arlo_son @gson_AI

84 Followers 158 Following Undergraduate @ Yonsei. UIC Economics.

Holy Lovenia @HolyLovenia

70 Followers 19 Following

Jiawei Liu @JiaweiLiu_

2K Followers 957 Following Simplifying the making of great software. PhD Student @plfmse @IllinoisCS.

Yuxiang Wei @YuxiangWei9

290 Followers 216 Following PhD student @IllinoisCS. Incoming AI/ML Intern @SnowflakeDB

Federico Cassano @ellev3n11

126 Followers 67 Following Undergraduate Researcher @neu_prl Upcoming @scale_AI Previous industry research @cursor_ai, @Roblox, @trailofbits Papers here: https://t.co/PgUSaxXs1B

Carolyn Anderson @linguistcarolyn

629 Followers 772 Following @Wellesley CS professor and computational linguist. Studies meaning with computational and experimental tools. https://t.co/0k477lFlwd

Lingming Zhang @LingmingZhang

1K Followers 308 Following Associate Professor @plfmse @IllinoisCS. Enjoy breaking, fixing, and synthesizing software. SE | PL | FM | LLM4Code

Miltos Allamanis 🇪.. @miltos1

1K Followers 338 Following Researching deep learning for generating and understanding programs. Research Scientist @GoogleAI Also at @miltos@sigmoid.social (Opinions are my own.)

Jacob Springer @jacspringer

324 Followers 169 Following PhD student @mldcmu

Liangming Pan (on job.. @PanLiangming

1K Followers 717 Following Postdoc at @ucsantabarbara @ucsbNLP | Ph.D. from @NUSingapore @wing_nus | Researcher in #NLProc | Interests: Reasoning, QA, Generation, Fact Checking

Matt Valoatto @mvaloatto

2K Followers 646 Following Entrepreneur, designer, investor @huggingface 🤗, @deforum_art, @talktomem1, wingmate / interested in AI, design, art, tech, science / happy dad of 2

Haewon Jeong @HaewonJeong00

240 Followers 183 Following Assistant Prof @UCSB ECE. Previously, Ph.D student @CMU_ECE & Post-doc @Harvard @hseas. She/her/hers. https://t.co/eukRWcPU9i

Shiyu Chang @CodeTerminator

686 Followers 400 Following Assistant Professor at UC Santa Barbara. Tweets reflect my views alone.

William Wang @WilliamWangNLP

14K Followers 719 Following UCSB NLP Lab + ML Center. https://t.co/6TOnqbk6YT https://t.co/KJYhnav3Et Mellichamp Chair Prof. at UCSB CS. PhD @ CMU SCS. Areas: #NLProc, Machine Learning, AI.

Yanai Elazar @yanaiela

3K Followers 1K Following Postdoc @ AI2 & UW | NLP

Xinyi Wang @XinyiWang98

795 Followers 299 Following UC Santa Barbara CS PhD student working on ML/NLP

Pengcheng Yin @pengchengyin

577 Followers 123 Following @GoogleDeepMind. Formerly a Neulab member @LTIatCMU. Interested in machine learning for NLP and code, dog training and aviation.

Aakanksha Chowdhery @achowdhery

7K Followers 3K Following LLMs @ Google DeepMind :: PaLM, Gemini // Previously @MSFTResearch, @Stanford, @Princeton // views my own and subject to change

Anton Lozhkov @anton_lozhkov

2K Followers 283 Following Open-sourcing Language Models @huggingface ✨

Marc Marone @ruyimarone

421 Followers 586 Following PhD student at Johns Hopkins @jhuclsp. Previously @microsoft Semantic Machines, @mstranslator, @GeorgiaTech

Ellen Wu @zeqiuwu1

593 Followers 430 Following PhD student at UWNLP

Tianbao Xie @TianbaoX

1K Followers 1K Following Ph.D. student of @XLangNLP lab and @HKUNLP group 2022. Advised by @taoyds and @ikekong . e/ia

XLang NLP Lab @XLangNLP

509 Followers 27 Following a group of nlpers at @HKUniversity working on language model agents, executable language grounding, code generation, semantic parsing, and interactive systems.

Tu Vu @tuvllms

3K Followers 894 Following Research Scientist @GoogleDeepMind & Assistant Professor @VT_CS. PhD from @UMass_NLP. #NLProc

Govind Gnanakumar — h/ai @sandkoan

7 hours ago

@natfriedman arxiv.org/abs/2305.16264

2 0 3 688 3

Zhangir Azerbayev @zhangir_azerbay

7 hours ago

@natfriedman Section 7 of "scaling data constrained language models" has an experiment supporting this claim.

2 0 9 975 3

Nat Friedman @natfriedman

4 hours ago

@zhangir_azerbay This part is interesting.

1 0 3 223 0

Nat Friedman @natfriedman

4 hours ago

@zhangir_azerbay This is the best demonstration I've seen so far! Thank you. But it doesn't totally settle things for me. At 20% code the performance is on average the same as with 0% code. At 30% code it's only modestly better than 4 epochs without code. Is that right?

1 0 7 819 5

Federico Cassano @ellev3n11

7 hours ago

@natfriedman I think @Muennighoff's paper showed this! arxiv.org/abs/2305.16264 > training LLMs on a mix of NL data and Python data at 10 different mixing rates and find that mixing in code is able to provide a 2× increase in effective tokens even when evaluating only NL tasks.

0 0 2 98 0

Twelve Labs (twelvelabs.io) @twelve_labs

2 weeks ago

@Muennighoff Thanks for your scientifically rigorous talk, @Muennighoff!

0 0 1 67 0

Tengyu Ma @tengyuma

2 weeks ago

link to the MTEB legal benchmark huggingface.co/spaces/mteb/le…

0 0 4 2K 1

Tengyu Ma @tengyuma

2 weeks ago

@Voyage_AI_ @Voyage_AI_ is dedicated to building better generalist, domain-specific, or fine-tuned embedding models and rerankers. Plz check out our recent products: voyage-code-2: x.com/tengyuma/statu… rerank-lite-1: x.com/Voyage_AI_/sta… 📄 API references: docs.voyageai.com/docs/introduct…

Voyage AI @Voyage_AI_

a month ago

Rerankers refine the retrieval in RAG. 🆕📢 Excited to announce our first reranker, rerank-lite-1: state-of-the-art in retrieval accuracy on 27 datasets across domains (law, finance, tech, long docs, etc.), enhancing various search methods, vector-based or lexical. 🧵

4 10 60 25K 42

0 0 6 1K 1

Tengyu Ma @tengyuma

2 weeks ago

@Voyage_AI_ Below: long-context retrieval results. More in blog post 📖: blog.voyageai.com/2024/04/15/dom… Please check it out! The first 50M tokens are on us. We’d also love to support academic retrieval research and benchmarking. Please write to us at [email protected] for more free tokens.

1 0 3 1K 1

Tengyu Ma @tengyuma

2 weeks ago

🆕📢 @Voyage_AI_'s new embedding model for legal and long-context retrieval and RAG: voyage-law-2!
1.🥇 #1 on MTEB legal retrieval benchmark with a large margin
2.📜 Best quality for long-context (16K)
3.✨ Improved quality across domains
4.🛒 On AWS Marketplace
#RAG #LLMs

3 24 85 21K 37

Kenneth Enevoldsen @KCEnevoldsen

2 weeks ago

A big thanks to existing contributors and an especially large thanks to the team of reviewers: @imenelker, @isaacchung1217, @Muennighoff, and @m_bernstorff 🎉

0 0 2 49 0

Kenneth Enevoldsen @KCEnevoldsen

2 weeks ago

If you want to join this open project you can find good first issues to start with here: github.com/embeddings-ben…

1 0 3 54 0

Kenneth Enevoldsen @KCEnevoldsen

2 weeks ago

- We now cover 247 languages, including code! 😎
- We include the longEmbed benchmark 📃
- A big thanks to PR and paper author Dawei Zhu
- We include multiple code retrieval tasks 👩‍💻

However, we are still missing many important languages like: Urdu, Greek, Icelandic, Punjabi...

2 0 4 77 0

Kenneth Enevoldsen @KCEnevoldsen

2 weeks ago

It rocks indeed! And actually the development of one of the most comprehensive benchmarks to date is going great 🌐

Imene Kerboua @imenelker

2 weeks ago

I would like to thank every person that is contributing to MMTEB, this community rocks!🚀 Thank you everyone and keep going!

0 0 5 379 0

1 0 4 280 0

Imene Kerboua @imenelker

2 weeks ago

I would like to thank every person that is contributing to MMTEB, this community rocks!🚀 Thank you everyone and keep going!

0 0 5 379 0

Luca Soldaini 🎀 @soldni

2 weeks ago

more details in this announcement! fixed data link: huggingface.co/datasets/allen…

Allen Institute for AI @allen_ai

2 weeks ago

Announcing our latest addition to the OLMo family, OLMo 1.7!🎉Our team's efforts to improve data quality, training procedures and model architecture have led to a leap in performance. See how OLMo 1.7 stacks up against its peers and peek into the technical details on the blog:…

13 44 170 66K 41

1 0 8 2K 1

Kyle Lo @kylelostat

2 weeks ago

notable stuff:
🦉ton of perf boost from mixing instruct data at end (e.g., flan)
🐋anneal learning rate (Fig 9b in arxiv.org/abs/2403.08763)
🐞changing data mix boosts MMLU at some cost to other evals
🍇huggingface.co/allenai/dolma
🧀huggingface.co/allenai/OLMo-1…

Allen Institute for AI @allen_ai

2 weeks ago

13 44 170 66K 41

2 9 51 8K 9

Allen Institute for AI @allen_ai

2 weeks ago

13 44 170 66K 41

Hanna Hajishirzi @HannaHajishirzi

2 weeks ago

Introducing our best OLMo yet. OLMo 1.7-7B outperforms LLaMa2-7B, approaching LLaMa2-13B at MMLU and GSM8k. High-quality data and staged training are key. I am so proud of our team making such significant improvement in a short period after our first release.