Clémentine Fourrier 🍊 @clefourrier
Leaderboards & evals research @HuggingFace 🐍✨ "The future is already here, it’s just not very evenly distributed" (Gibson) huggingface.co/spaces/Hugging… Joined October 2019
Tweets: 2K
Followers: 3K
Following: 301
Likes: 6K
What if you could make model evaluation less prompt sensitive? With our friends @dottxtai , we wrote a blog on how structured generation seems to reduce model score variance considerably. Tell us what you think! huggingface.co/blog/evaluatio…
🌎 Better AI is better data, and for better data we need expertise! As part of the 'Data is Better Together' project in collaboration with Hugging Face, we bring the Domain Specific Datasets. You can read more in this post: huggingface.co/blog/burtensha… 🤗
We are happy to integrate "text-to-video" into GenAI arena huggingface.co/spaces/TIGER-L…. Currently, we support six open-source video generation models. Please help us vote to create the video leaderboard! For "text-to-image" arena, Playground V2 and V2.5 @playground_ai are leading…
Adding a long prompt can help you fight LLM hallucinations. However, if you know exactly how you want your LLM output to be constrained, there are much better strategies! 💪 Did you know you can force your LLM to ALWAYS generate a valid JSON file? Or to follow a well-defined…
All tools I've developed related to Open LLM Leaderboard: If you have any other cool project ideas related to the leaderboard, please let me know :) Here we go first: New model notifier on Twitter: @OpenLLMLeaders 🧵
Adding leaderboard results to model cards (thanks for the help @clefourrier and @Wauplin) :) huggingface.co/spaces/Weyaxi/…
What is the next feature you want to have in the Open RL Leaderboard? Open a discussion 💬
Startup vs Big Company Dynamics Startup • Observation: Users want a new feature. • Designer (60 min later): Here are some figma prototypes. • Engineer: We can ship this by the end of the week. Big Company: • Observation: Let’s discuss our observations when Suzy’s back;…
oh no not this again
imgsys (the chatbot arena of image generation) by @isidentical @FAL is now on @huggingface spaces. @playground_ai & Pixart are leading the leaderboard but still early in the votes! huggingface.co/spaces/fal-ai/…
Hey journalists, which AI tools could help you in your work? We are building a community on 🤗 to explore AI applications in journalism. A space to share ideas, free tools, and resources. Is there a specific AI tool or concept you're curious about? Something that could enhance…
This take on the FineWeb release is one of the most interesting pieces of feedback and also a reason FineWeb is very different from even larger datasets like RedPajama-V2 (which is double its size!) Surprisingly, the size of the dataset of 15T tokens is not very important, what is much…
Most exciting paper of the week? Clearly this one 👇 Finally a successor to the super impressive phi-1.5/2 models – so much looking forward to playing with the weights, come help me encourage the authors to share them in the comments 😅 huggingface.co/papers/2404.14…
Very cool community article by @WolframRvnwlf! huggingface.co/blog/wolfram/l…
🆕 Introducing JAT, the first open-source multi-modal, multi-task multi-domain agent! 🤖 A step toward open generalist agents! 🚀 📰 Blog: huggingface.co/blog/jat
AK @_akhaliq
310K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
merve @mervenoyann
56K Followers 4K Following open-sourceress at @huggingface 🧙🏻♀️ proud mediterranean 🍋 I do TL;DR on ML papers
Omar Sanseviero @osanseviero
32K Followers 2K Following Chief Llama Officer @huggingface 🦙 Founder @AI_Learners. Xoogler (SWE @Google Assistant, 20% PM TF Graphics). 100% Hacker Llama🇵🇪🇲🇽
Hugging Face @huggingface
344K Followers 189 Following The AI community building the future. https://t.co/VkRPD0VKaZ #BlackLivesMatter #stopasianhate
clem 🤗 @ClementDelangue
91K Followers 5K Following Co-founder & CEO @HuggingFace 🤗, the open and collaborative platform for AI builders
Sasha Luccioni, PhD .. @SashaMTL
19K Followers 4K Following AI & Climate @HuggingFace, Board Member of @WiMLworkshop and @ClimateChangeAI. @techreview 35 Innovators under 35, @TEDTalks speaker. She/her/Dr/ 🦋
Julien Chaumond @julien_c
47K Followers 1K Following Co-founder and CTO at @huggingface 🤗. ML/AI for everyone, building products to propel communities fwd. @Stanford + @Polytechnique
MMitchell @mmitchell_ai
80K Followers 1K Following Interdisciplinary researcher focused on shaping AI towards long-term positive goals. ML & Ethics. Same content in the Sky, Threads, & the Prehistoric Elephant
Nate Raw @_nateraw
7K Followers 1K Following machine learning hacker. previously @huggingface @lightningai
Lewis Tunstall @_lewtun
9K Followers 425 Following 🤗 LLM engineering & research @huggingface 📖 Co-author of "NLP with Transformers" book 💥 Ex-particle physicist 🤘 Occasional guitarist 🇦🇺 in 🇨🇭
Thomas Wolf @Thom_Wolf
68K Followers 4K Following Co-founder and CSO @HuggingFace - open-source and open-science
Jeremy Howard @jeremyphoward
222K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
Thomas Simonini ᯅ @ThomasSimonini
6K Followers 1K Following Game Developer making games with AI 🪄 @huggingface 🤗 Writing ML for Games course ➡️ https://t.co/bvW8PMeARO Wrote Deep RL Course ➡️ https://t.co/5Pk3rwOjjq
Delip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Zach Mueller @TheZachMueller
10K Followers 392 Following 🤗 Technical Lead for the Accelerate Project | Passionate about Open Source | Nerd who enjoys touching the grass | #ADHD | He/Him
Nathan Lambert @natolambert
25K Followers 690 Following Figuring out AI @allen_ai, "rl boi" DM me papers. Writes @interconnectsai, talks @retortai Has phd and some credentials
vicki @vboykis
52K Followers 1K Following Born: USSR. Raised: USA. ML Eng @mozillaai Ex: @duosec @Tumblr, @automattic Nights: 👦 & 👧 working on some ✨ new vectors ✨
Abubakar Abid @abidlabs
12K Followers 1K Following Hind Rajab. 5 yrs old. She + 14,000 children killed by Israeli forces. PLEASE don't be silent. Take 5 min to call your reps and urge peace (link in bio)
J Bubenzer @_2txt
0 Followers 5 Following
emi learns @ml_emiii
7 Followers 104 Following learning llm engineering and advanced/concurrent typescript/js from ground up before @elicitorg internship
Max @MaximeLhst
79 Followers 163 Following
Pyry Takala @pyrytakala
107 Followers 328 Following
Jacob 'kurtextrem' Gr.. @kurtextrem
3K Followers 2K Following Performance Engineer @Framer · webperf/UX/security · @[email protected]
Shilpa Suresh @shilpa15397
129 Followers 882 Following NLP/ML Research at Norstella |Ex: Harvard Med (BCH), @UMass CS. NLP for Health but not exclusively. History buff and coffee nut.
Zhaoyang Chu @zhaoyang_c68411
8 Followers 365 Following CS Master@HUST. Interested in SE+ML, specifically focusing on building trustworthy and reliable AI-based software systems. Seeking PhD starting in 2025 Fall.
omar boukherys @OmarBoukherys
102 Followers 1K Following Statistician, Data Scientist. I make computers program themselves.
camhowe @camhowe1729
4 Followers 218 Following full-time techbro, part-time anon/undergrad. love explaining tech stuff.
nedned @nletcher
1K Followers 5K Following data (science | analytics | visualisation | engineering), @thoughtworks, #Python, #nlproc, ML, & assorted whimsical miscellania
Lawliet @aslawliet
15 Followers 77 Following Deep Learning Researcher, Techno Optimistic, actively building JARVIS like AI systems with large deep neural nets
Giwon Hong @GiwonHong413849
45 Followers 61 Following PhD student in ILCC (NLP) program at the University of Edinburgh
Ahmed Salem Elhady @ahsalem511
28 Followers 501 Following PhD student @Hitz_zentroa || ex. @Microsoft in Egypt, and @agolo . Building niche NLU/NLP solutions.
brainmatics @brainmatics
304 Followers 5K Following
Sodium Fluoride @NaFluoride
1 Followers 17 Following
yianan @yianan
40 Followers 1K Following
Grapinet Tom @Tgrpt1
0 Followers 130 Following
crystalWen @JackLiuWen
28 Followers 644 Following
. tanguy @TanguyPGT
1 Followers 65 Following
Terry Yue Zhuo @terryyuezhuo
213 Followers 663 Following No HumanEval. We have a better answer @BigCodeProject @sgSMU @seaAIL @Data61news @Monashinfotech
Bastien Duclaux @bduclaux
1K Followers 2K Following Entrepreneur. AdTech, Machine learning, AI, eCommerce. ENST 98. Roland Berger, co-Founder & ex CEO of Twenga. Auvergnat living in Brussels. $ARDX, $TMDX, $SGMO
Logikon @LogikonAI
1 Followers 8 Following Github: https://t.co/iqieHEaUN4 Hugging Face: https://t.co/0xifPlz4YO
Name @anomium
325 Followers 3K Following
? @hezi1024
19 Followers 294 Following
Prakarsh Kaushik @KaushikPrakarsh
20 Followers 697 Following
Frank Kumli @frankkumli
13K Followers 11K Following Shaping the Future: Future-Thinking, Strategy and Innovation!
Lea_liu @Lealiu32431828
8 Followers 167 Following
Pedro Azevedo @Eppie_vux
152 Followers 1K Following
Julien Tannou @JTannou
21 Followers 141 Following #Geek #ParisI #PanthéonSorbonne #LawPoliticalScience #Technophile #JV
Tee Kumthorncharoen @tee_iam78
1 Followers 31 Following
Praveen Kumar @praveenkumaryo
108 Followers 2K Following Engineering Humanoids (Embodied AGI) https://t.co/H4BJkzVbQP
Guru @guru_tweets_0
416 Followers 950 Following
konichiwa @konichiwaai
9 Followers 2K Following
Mayank Vora @MadAdMan01
13 Followers 167 Following
DJ McCloskey @djmccl
292 Followers 2K Following Tweets are my opinion and may overlap with fact on occasion. Cares about humanity, loves Logic, Math and Science, does AI, NLP, Comp Sci., Guitar.
Strawman✝️ .. @databreachez
607 Followers 1K Following God above ALL | crypto class of '17 | but rly its'12| but rly still broke | Data Scientist| options degen | python
Nick @nicolaslargueze
131 Followers 2K Following
AK @_akhaliq
310K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
merve @mervenoyann
56K Followers 4K Following open-sourceress at @huggingface 🧙🏻♀️ proud mediterranean 🍋 I do TL;DR on ML papers
Omar Sanseviero @osanseviero
32K Followers 2K Following Chief Llama Officer @huggingface 🦙 Founder @AI_Learners. Xoogler (SWE @Google Assistant, 20% PM TF Graphics). 100% Hacker Llama🇵🇪🇲🇽
clem 🤗 @ClementDelangue
91K Followers 5K Following Co-founder & CEO @HuggingFace 🤗, the open and collaborative platform for AI builders
Sasha Luccioni, PhD .. @SashaMTL
19K Followers 4K Following AI & Climate @HuggingFace, Board Member of @WiMLworkshop and @ClimateChangeAI. @techreview 35 Innovators under 35, @TEDTalks speaker. She/her/Dr/ 🦋
Julien Chaumond @julien_c
47K Followers 1K Following Co-founder and CTO at @huggingface 🤗. ML/AI for everyone, building products to propel communities fwd. @Stanford + @Polytechnique
MMitchell @mmitchell_ai
80K Followers 1K Following Interdisciplinary researcher focused on shaping AI towards long-term positive goals. ML & Ethics. Same content in the Sky, Threads, & the Prehistoric Elephant
Jim Fan @DrJimFan
229K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
AI at Meta @AIatMeta
532K Followers 255 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Nate Raw @_nateraw
7K Followers 1K Following machine learning hacker. previously @huggingface @lightningai
Lewis Tunstall @_lewtun
9K Followers 425 Following 🤗 LLM engineering & research @huggingface 📖 Co-author of "NLP with Transformers" book 💥 Ex-particle physicist 🤘 Occasional guitarist 🇦🇺 in 🇨🇭
PyTorch @PyTorch
379K Followers 77 Following Tensors and neural networks in Python with strong hardware acceleration. PyTorch is an open source project at the Linux Foundation. #PyTorchFoundation
Thomas Wolf @Thom_Wolf
68K Followers 4K Following Co-founder and CSO @HuggingFace - open-source and open-science
Leandro von Werra @lvwerra
6K Followers 310 Following Machine learning @huggingface: co-lead of @bigcodeproject and maintainer of TRL.
Thomas Simonini ᯅ @ThomasSimonini
6K Followers 1K Following Game Developer making games with AI 🪄 @huggingface 🤗 Writing ML for Games course ➡️ https://t.co/bvW8PMeARO Wrote Deep RL Course ➡️ https://t.co/5Pk3rwOjjq
Zach Mueller @TheZachMueller
10K Followers 392 Following 🤗 Technical Lead for the Accelerate Project | Passionate about Open Source | Nerd who enjoys touching the grass | #ADHD | He/Him
Anaïs Urlichs @urlichsanais
23K Followers 1K Following 🕸️Newsletter https://t.co/kuJYGTTiYv 🚀she/her Opinions are mine. I am not responsible for anyone not tagged/directly addressed in my tweets feeling addressed.
Maxime Voisin @maximevoisin_ai
743 Followers 669 Following Product manager RAG/Tools/Code @cohere. Previously @labelbox, @stanford computer vision labs
Florent Daudens @fdaudens
11K Followers 6K Following Press Lead @HuggingFace / Passionate about AI & news / Previously @radiocanadainfo @ledevoir & co
Alessio Fanelli @FanaHOVA
5K Followers 992 Following Cohost @latentspacepod | Partner & CTO @decibelvc | OSS: https://t.co/u4J6NVksoL | Writing: https://t.co/H7iEpzgxWQ
Quentin Gallouédec @QGallouedec
325 Followers 417 Following Research engineer @huggingface 🤗 PhD in RL Member of Stable-Baselines team: https://t.co/eX7JDWqc9F
Ilyas Moutawwakil @IlysMoutawwakil
553 Followers 189 Following All benchmarks are wrong, some will cost you less than the others. MLE @HuggingFace 🤗 MEng @CentraleSupelec 🧑🎓
Konrad Szafer @KonradSzafer
73 Followers 214 Following LLM Eval intern research @ Hugging Face | research assistant intern @ CMU AutonLab
Taelin @VictorTaelin
17K Followers 902 Following Founder of @HigherOrderComp Building the massively parallel future of computing Reaching AGI to cure all diseases and suffering is all that matters
OpenLLMLeaders @OpenLLMLeaders
198 Followers 1 Following Track 🤗 Open LLM Leaderboard. Created by https://t.co/ywEwEb4O1G
Misha Laskin @MishaLaskin
8K Followers 175 Following Staff Research Scientist @DeepMind. Previously @berkeley_ai. YC alum.
Costa Huang @vwxyzjn
3K Followers 1K Following RLHF @huggingface 🤗; main dev of @cleanrl_lib; CS PhD @DrexelUniv; Ex @CuraiHQ @weights_biases @NVIDIAAI @riotgames.
Lucie-Aimée Kaffee @frimelle
1K Followers 2K Following Computer Scientist, PhD. Applied Policy Researcher @huggingface 🤗 ML & Society; Wikipedia & languages are my ♡
Clément ROMAC @ClementRomac
462 Followers 252 Following Research Scientist at 🤗 @huggingface, PhD. student at @FlowersINRIA. Studying how autonomous Deep RL agents 🤖 can leverage LLMs 📖 Also playing bass 🎸
Alina Lozovskaya @ailozovskaya
112 Followers 52 Following ML Engineer Intern at @huggingface 🤗 | Linguist who codes 👩🏻💻 | Enjoy taking photos and playing music
Suraj Patil @psuraj28
5K Followers 61 Following Research Engineer @pika_labs prev Open Source AI @huggingface nlp, diffusion, text2image
Michal Valko @misovalko
5K Followers 2K Following Llama @AIatMeta Paris & Inria & MVA - Ex: Gemini and BYOL @GoogleDeepMind
Curiosity Rover @MarsCuriosity
4.2M Followers 52 Following Your friendly neighborhood NASA Mars rover. Exploring the Red Planet since 2012. Team headquartered at @NASAJPL 🚀 (Verification: https://t.co/T3V89CljZ2)
Jingna Zhang @zemotion
78K Followers 485 Following Gundam pilot wannabe. Photographer, AD. Building new social platform for art 👉 @cara_hq | https://t.co/iM0FwRT0Qz ✨ in Tokyo!🗼
Andrea Sommese, Ph.D. @andrea_sommese
1K Followers 713 Following Domestic pets cognition researcher and baker 🐕🧁 Postdoc @VetmeduniVienna 🐈 Featured in media for the Gifted Dog Project 📰🎥
Animal Computer Inter.. @animal_computer
346 Followers 86 Following International Conference on Animal-Computer Interaction (ACI) - 11th ACI'24 in Glasgow, Uk 2-5th December 🏴
Ilyena Hirskyj-Dougla.. @Ilyena
1K Followers 923 Following Lecturer at @UofGlasgow, Explorer of interactive systems for animals | Animal-Internet, Interface Design and Animal-Controlled Systems
Sean White @seanwhite
3K Followers 782 Following CEO @Inflection AI | Co-founder Braingels; Prev: Chief R&D @Mozilla; Co-founder @BrightSkyLabs, Teaching @Stanford, @GreylockVC,
Olivia Sullivan @zeb_ko
19K Followers 673 Following Illustration | Comics | Design https://t.co/B3FFAwd4rD Insta: zeb.ko Bluesky: https://t.co/RmMmHPmy9M
Dan Hendrycks @DanHendrycks
17K Followers 81 Following • Director of the Center for AI Safety (https://t.co/ahs3LYCpqv) • GELU/ImageNet-C/MMLU/safety groundwork • PhD in AI from UC Berkeley https://t.co/rgXHAnYAsQ https://t.co/YtGtDh1aAV
Bill Yuchen Lin 🤖 @billyuchenlin
6K Followers 2K Following Research @allen_ai. I evaluate (multi-modal) LLMs, build agents, and study the science of LLMs. Previously: @GoogleAI & @MetaAI FAIR @nlp_usc
Eoghan Flanagan @KateandPie
143 Followers 544 Following
Hailey Schoelkopf @haileysch__
3K Followers 813 Following she/her | research scientist @aiEleuther | LLM training/infra, eval, data | LM Evaluation Harness maintainer
Lintang Sutawika @lintangsutawika
381 Followers 565 Following Incoming Ph.D. student @LTIatCMU. Researcher at @AIEleuther. Maintainer of LM-Eval Harness. Here for machine learning papers and discussion.
SomosNLP @SomosNLP_
2K Followers 226 Following Join the largest open-source #PLN hackathon in Spanish! 🚀 https://t.co/RCLawhwaMf Hackathon #Somos600M
Aparna Dhinakaran @aparnadhinak
4K Followers 1K Following Building @arizeai & @arizephoenix 💙 I post about MLOps, Generative AI, and occasionally Amazing Race
Matthieu Lambda @MatthieuLambda
3K Followers 4K Following ✍️ Author @micode & #Epicurieux 🎙️ Editor-in-chief & columnist @UnderscoreTalk 🗣 Public-speaking facilitator @EloquentiaFr ☕️🎙️🍻🔁
María Grandury @mariagrandury
980 Followers 629 Following ML Research Engineer | Founder @SomosNLP_ 🚀 | @HuggingFace Fellow 🤗 | Responsible AI, Open-Source, AI in Spanish #Somos600M ➡️ https://t.co/D6YTS64aN4
Weyaxi @Weyaxi
2K Followers 2K Following
efxmarty @efxmarty
343 Followers 134 Following ML Engineer at @huggingface Optimization team. efxmarty/fxmarty elsewhere
Martin Signoux @MartinSignoux
2K Followers 1K Following Public policy @Meta - Interested in #AI #XR #competition #IP / Tweeting in my own capacity
Alex Cabrera @a_a_cabrera
1K Followers 491 Following PhD candidate @cmuhcii @scsatcmu. Humans + AI = ???
Tom Jobbins @TheBlokeAI
15K Followers 237 Following My Hugging Face repos: https://t.co/yh7J4DFGTc Discord server: https://t.co/5h6rGsGfBx Patreon: https://t.co/yfQwFggGtx
Pedro Cuenca @pcuenq
5K Followers 768 Following ML Engineer at 🤗 Hugging Face | Co-founder at LateNiteSoft (Camera+). I love AI and photography.
Emma Bostian @EmmaBostian
203K Followers 1K Following Engineering Manager @spotify 🇸🇪 American in Stockholm
Prof. Anima Anandkuma.. @AnimaAnandkumar
25K Followers 2K Following Bren Professor @caltech, Fmr Sr Director of #AI research @nvidia, Fmr Principal Scientist @awscloud, AI+Science, PDE, Neural operators. Views my own.
vince @vincelwt
2K Followers 200 Following Building an open-source SaaS: 🧑💻 @lunary_hq Prev: 📈 @astrolytics_io (sold) & https://t.co/UlUXpxrVtu
@ImMr_Wise sure but you don't need an LLM to do this (retrieve training data), if all you want is retrieval then there are much more efficient methods
Prompting is so much more dynamic and flexible, I find that FT really degrades other domains - something that prompting does not suffer from. And of course you can just have hundreds of prompts for hundreds of tasks
We've been blessed again with new LLaVA-like models based on LLaMA 3 & Phi-3 🤩 Also passes the baklava benchmark 🤝✅
LLM evaluation 🤜🤛 outlines (structured generation)
What if you could make model evaluation less prompt sensitive? With our friends @dottxtai , we wrote a blog on how structured generation seems to reduce model score variance considerably. Tell us what you think! huggingface.co/blog/evaluatio…
@BramVanroy @clefourrier @dottxtai @E_Rijgersberg That's true, but in this case we used simple regexes to structure the output, not JSON.
best takes about AI news/new models are often on @huggingface internal slack channels and I just know that if people couldn't crack a case there no one can
I'm just blessed and happy I get to be a part of the cultural capital and the neverending quality memes lol
I'd argue @huggingface Izmir office is the best one
I guess the answer is just that the summarization metrics always were really bad.
@clefourrier @MaziyarPanahi @winglian Clinical trials and physician manual evaluations, I don't think there is a proper way to evaluate models with benchmarks. I understand the appeal but I think these benchmarks do more harm than good...
@SashaMTL It's definitely the cows, not the private jets. 🙄
I'm a big fan of climate-positive AI solutions...but surely we can do better than this? 🐮
Bill Gates: Artificial intelligence, combined with "our ability to edit genes", will enable us to dramatically reduce methane emissions by "making the cows better".
@aryopg @clefourrier @aadityaura @PMinervini @AMotzfeldt @OpenlifesciAI Maybe LLM + PGP participant conversations could inform leaderboard somehow? ♥️ PGP participants share data publicly under rigorous Institutional Review Board oversight. I'm a co-founder of Harvard Personal Genome Project and worked on global PGP from inception. (Cf. @PGorg)
@wait_sasha @clefourrier @aadityaura @PMinervini @AMotzfeldt @OpenlifesciAI Thanks @wait_sasha ! This leaderboard, by all means, is not a static and absolute truth. We hope that people can help improve it over time! We realise there may be a disconnect between NLP development and real clinical settings. We hope this will spark a constructive discussion!
@maximegmd @winglian Thanks @maximegmd - you are correct, it's new and it has less people to report. Maybe we can get some help from our leaderboards master @clefourrier huggingface.co/blog/leaderboa…
Introducing The Hallucinations Leaderboard! 🚀 An open effort to measure the LLMs’ tendency to generate hallucinations across various tasks, like open-domain QA, instruction following, and summarisation! 🧵1/N 📄 Paper: arxiv.org/abs/2404.05904 🤗 Leaderboard: huggingface.co/spaces/halluci…