Nathan Godey @nthngdy
3rd year PhD student @InriaParisNLP. Working on the representations of language models, architectures, and pretraining methods.
Paris. Joined November 2021.
113 Tweets, 514 Followers, 839 Following, 161 Likes
Nathan Godey (@nthngdy), Éric de La Clergerie (@DeVillemonte), Benoît Sagot (@bensagot) On the Scaling Laws of Geographical Representation in Language Models arxiv.org/abs/2402.19406
🤩📄 We are delighted that 8 papers from the team have been accepted at @LrecColing 2024! Have a read through the titles and camera-ready versions here (in no particular order):
Currently at #EACL2024 to present an improved version of this paper! Feel free to reach out if you want to meet and discuss efficient pre-training, contrastive learning or representation degeneration (PS: thanks @songdng for the perfectly timed picture)
🤔 Ever wondered how to summarize an entire book 𝑎𝑙𝑙 𝑎𝑡 𝑜𝑛𝑐𝑒? We delve into this question in our recent paper “𝐋𝐎𝐂𝐎𝐒𝐓: 𝐒𝐭𝐚𝐭𝐞-𝐒𝐩𝐚𝐜𝐞 𝐌𝐨𝐝𝐞𝐥𝐬 𝐟𝐨𝐫 𝐋𝐨𝐧𝐠 𝐃𝐨𝐜𝐮𝐦𝐞𝐧𝐭 𝐀𝐛𝐬𝐭𝐫𝐚𝐜𝐭𝐢𝐯𝐞 𝐒𝐮𝐦𝐦𝐚𝐫𝐢𝐳𝐚𝐭𝐢𝐨𝐧” (#EACL2024). 🧵 (1/8)
The memory in Transformers grows linearly with the sequence length at inference time. In SSMs it is constant, but often at the expense of performance. We introduce Dynamic Memory Compression (DMC) where we retrofit LLMs to compress their KV cache while preserving performance…
Want more high-quality ML content in your X feed? 👉 Follow this X list of top @huggingface users: x.com/i/lists/176210… This list was compiled from these leaderboards, ranking Hugging Face's top contributors 🤗 huggingface.co/spaces/mvaloat… ⚙️ For those who don't use the X list…
Cristina Reyes @CristinaRe16409
105 Followers 3K Following Trader | Investor | Entrepreneur 📈 Bitcoin Mining ,📊 NFT / Market Analysis📉 Crypto Currencies Investment 🪙 DM for more info. +18605101558
OlgaWill @0G1VM6W1840hbe
0 Followers 164 Following
Robert Scoble @Scobleizer
504K Followers 68K Following Follow me on my new podcast with AI startups, Unaligned. Tech industry color commentator since 1993. Author/Blogger. Former strategist @Microsoft.
RaeAntoinette @y1oi3LRkI6VrM8
0 Followers 111 Following
NicholasLotton @lotton30521
41 Followers 787 Following
Agastya Seth @agastya_seth
29 Followers 137 Following Techie | Innovator | Musician - Senior Software Engineer (R&D) at Cadence Design Systems
Professor Usama Amjid @professor_usama
63 Followers 479 Following JavaScript (Node.js) Associate Back-end Engineer
Yifu Qiu @yifuqiu98
106 Followers 174 Following @ELLISforEurope PhD Student in NLP @EdinburghUni /@Cambridge_Uni | formerly intern @Apple @Baidu_inc | 2023 Apple AI/ML Scholar
Yongyi (Colin) Zang @yongyi_zang
51 Followers 115 Following Audio Machine Learning Engineer/Researcher. MLE@Neosensory.
Irina Proskurina @irproskurina
32 Followers 91 Following NLP, PhD student at @UniversiteLyon Université Lumière Lyon 2
Jeetendra K Sharma @Jeetkarsh
153 Followers 5K Following Data Science Manager @United Airlines | Career Coach @Almabetter | Top Rated Plus @Upwork | ODSC '19 Speaker | Ex @TCS A&I |
Niel Wagensommer @niwasox
64 Followers 139 Following
merve @mervenoyann
56K Followers 4K Following open-sourceress at @huggingface 🧙🏻♀️ proud mediterrenean 🍋 I do TL;DR on ML papers
Prince Osei Aboagye @kp_aboagye
26 Followers 198 Following Staff Research Scientist @Visa Research || Formerly: Ph.D. Student @UUtah || Research Interest: Natural Language Processing, Ethical and Responsible AI.
gylns @glovepm
3 Followers 437 Following
Ekaterina Lobacheva @KateLobacheva
300 Followers 204 Following Independent Researcher, PhD in CS, Collaboration with https://t.co/tB7QL7Sw3a and https://t.co/JN95AWiNhB Like to explain unexpected behavior of neural nets 🤯
Simo Ryu @cloneofsimo
3K Followers 385 Following #KAIST RAI Lab (ML engineering #Naver) Interested in robotics, RL, math (but you might know me for t2i diffusion) [email protected]
arges @arges8770465409
6 Followers 47 Following
Maureen de Seyssel @Maureendss
513 Followers 638 Following PhD from @CoML_ENS in speech, ml and cognition. Ex research intern @MetaAI. @CoML_ENS. unsupervised (multilingual) speech representations
Alicem @Alicem94718195
77 Followers 2K Following
Andrew Carr (e/🤸) @andrew_n_carr
15K Followers 3K Following science @getcartwheel AI writer @tldrnewsletter advisor @arcade_ai Past - Codegen @OpenAI, Brain @GoogleAI, world ranked Tetris player
Coco des Rosiers @PauseZenCEO
1K Followers 2K Following Creating Solutions not excuses | Progress and success comes from Being Authentic | Navigating Life with Love & a Smile | Team Human ❤️
David Marx || digthat.. @DigThatData
4K Followers 2K Following Generative AI MLE, FOSS toolmaker, innovation catalyst @CoreWeave + @AiEleuther. AI enhanced creativity, philosophy of mind/science/probability
Amartya Banerjee @eigenamartya
28 Followers 641 Following PhD student @unccs | Undergrad @UofMaryland '20 Math + CS
Lukas Galke @LukasGalke
656 Followers 1K Following How machines learn to communicate, postdoc @MPI_NL | Natural Language Processing, Lifelong Machine Learning | @[email protected]
Abdel Jie @TheObserverJi
16 Followers 251 Following
Neal Oliver @ncoliver
6 Followers 343 Following
Mark Rofin @broccolitwit
11 Followers 109 Following
Lopez @Lopez9857474490
654 Followers 5K Following
Guy Dar @guy_dar1
362 Followers 221 Following #NLProc Researcher | #AI #NLProc #interpretability | opinions my own sadly | off-topic tweets erased periodically | he/him
🦅 @ChelseaMCMV
4K Followers 2K Following shitposting by day, serving AGI by night ⏩ highly caffeinated engineer & AI delveper ☕
Brace(Hanyang) Zhao @OptionsGod_lgd
13 Followers 63 Following PhD student at Columbia IEOR | Machine Learning Researcher on LLM, RL, Diffusion Models
Elian Carsenat @ElianCarsenat
3K Followers 3K Following Founder @NamSor_com, applied onomastics for serious studies : gender / migration / discrimination / urban seggregation. Also building an AI biais estimator API.
coprophet @coprophet_com
15 Followers 349 Following AI-boosted sales forecasting for local retail and tourism (Olympic Games, weather, calendar, holidays, events, etc.) #antigaspi #marges
生 @shng1461041
0 Followers 30 Following
Arkadiy Saakyan @rkdsaakyan
138 Followers 388 Following PhD student @ColumbiaCompSci @columbianlp working on natural language processing. prev. intern @AmazonScience
Seagull @Seagull_Thief
34 Followers 280 Following
Dilermando Queiroz @DilermandoQ
1 Followers 42 Following
guoguoguo @iron_guo
4 Followers 107 Following
Youval Vanlaer @_youval
3 Followers 38 Following
Yifu Qiu @yifuqiu98
106 Followers 174 Following @ELLISforEurope PhD Student in NLP @EdinburghUni /@Cambridge_Uni | formerly intern @Apple @Baidu_inc | 2023 Apple AI/ML Scholar
Simo Ryu @cloneofsimo
3K Followers 385 Following #KAIST RAI Lab (ML engineering #Naver) Interested in robotics, RL, math (but you might know me for t2i diffusion) [email protected]
Youval Vanlaer @_youval
3 Followers 38 Following
Matthew Finlayson @mattf1n
798 Followers 868 Following First year PhD at @nlp_usc | Former predoc at @allen_ai on @ai2_aristo | Harvard 2021 CS & Linguistics
Andrei Mircea @mirandrom
53 Followers 301 Following PhD student @Mila_Quebec ⊗ mechanistic interpretability + systematic generalization + LLMs for science ⊗ https://t.co/xg8aE8CoWv
Emile van Krieken @EmilevanKrieken
2K Followers 1K Following Postdoc @ University of Edinburgh | Neurosymbolic Machine Learning
Hailey Schoelkopf @haileysch__
3K Followers 814 Following she/her | research scientist @aiEleuther | LLM training/infra, eval, data | LM Evaluation Harness maintainer
Thibaud LETENO @ThibaudLeteno
16 Followers 83 Following
Sarah Bénière @sarahbeniere
34 Followers 97 Following R&D Engineer at @Inria (ALMAnaCH team) | former TNAH (prom. 2023, @Ecoledeschartes) and ECMA (2021, @LermaAmu) | interested in DH, stylistics and open science
Théo Gigant @gigant_theo
48 Followers 215 Following PhD Student @ Université Paris-Saclay / Centrale Supélec, Working on multimodal summarization of videoconference records
Ona de Gibert @OnadeGibert
354 Followers 486 Following PhD Student @HelsinkiNLP / Open-source, NLP, Low-resource, Machine Translation / KPIs @YoungITGirls
Mathias Vast @MathiasVast1
21 Followers 124 Following
AmsterdamNLP @AmsterdamNLP
4K Followers 348 Following Tweeting about NLP research, events and opportunities in Amsterdam -- run by @wzuidema and others.
Manuel Faysse @ManuelFaysse
199 Followers 233 Following NLP (LLMs) & ML Privacy - PhD Candidate @CentraleSupelec Prev: @imperialcollege, @epfl, @La_UPM
Konstantin Dobler @konstantdobler
74 Followers 113 Following PhD student @ELLISforEurope @hpi_de in NLP, prev @sap
Aaron Defazio @aaron_defazio
6K Followers 365 Following Research Scientist at Meta working on optimization. Fundamental AI Research (FAIR) team
Jaap Jumelet @JumeletJ
493 Followers 315 Following PhD candidate at UvA with @wzuidema NLP ∩ Interpretability ∩ Linguistics PhD in 5 words: Finding structure in language models
MLIA @mlia_isir
2K Followers 173 Following #MachineLearning & #DeepLearning for Information Access. Academic research lab within @ISIR_labo, @ScienceSorbonne, @Sorbonne_Univ_
Simone Scardapane @s_scardapane
8K Followers 673 Following I fall in love with a new #machinelearning topic every month 🙄 Tenure-track Ass. Prof. @SapienzaRoma | Previously @iaml_it @SmarterPodcast | @GoogleDevExpert
Frederick Riemenschne.. @bowpis
43 Followers 178 Following
Fei Wang @fwang_nlp
920 Followers 2K Following PhD candidate @USC. PhD Fellow @Amazon. Responsible LLM.
Albert Villanova @avillanovamoral
2K Followers 5K Following ML Engineer @huggingface. Data Scientist, PhD Theoretical Particle Physics, BSc Computer Science. Always learning. he/him
Debora Nozza @debora_nozza
4K Followers 4K Following Assistant Professor at @Unibocconi in @MilaNLProc group • Working in #NLP, #HateSpeech and #FairnessML • She/her • #ERCStG PERSONAE
Ajitesh Shukla @ajitesh_shukla7
711 Followers 5K Following Student,Love to solve hardest math problem. LLM's, Mathematical Research(Geometric Topology,Differential Geometry),Quantum Computing.Lord Krishna is God Of Math
Ana-Maria Bucur @bucuram
1K Followers 738 Following PhD Student at Interdisciplinary School of Doctoral Studies, University of Bucharest. Researcher at @PRHLT. Working on NLP for Mental Health
eaclmeeting @eaclmeeting
4K Followers 24 Following The European Chapter of the Association for Computational Linguistics An annual Top-tier *ACL conference. #EACL2024 #NLProc 17-22 March 2024
Chao-Wei Huang @cwhuang_wh
61 Followers 428 Following PhD student at National Taiwan University. Former intern @AmazonScience and @MetaAI. NLP, Retrieval, and Dialogue Systems.
Lihu Chen @LihuChen
16 Followers 73 Following A postdoc at @Inria_Saclay, working on NLP and Language Models. | Prev @AlibabaGroup @IP_Paris_
Eleftherios Avramidis @lefterav
297 Followers 382 Following Senior Research on Machine Translation/sign language, Natural Language Processing, Computational Linguistics, Fan of open source/free software and radio
Mustafa Jarrar @mjarrar
1K Followers 912 Following Professor of #NLProc | #Ontology | #KnowledgeGraphs. Birzeit University
Matt Valoatto @mvaloatto
2K Followers 646 Following Entrepreneur, designer, investor @huggingface 🤗, @deforum_art, @talktomem1, wingmate / interested in AI, design, art, tech, science / happy dad of 2
Marvin Lavechin @LavechinMarvin
376 Followers 337 Following Postdoc at @GipsaLab in machine learning, speech processing, and cognition. Formerly at @ENS_ULM and @FacebookAI Machine learning and language acquisition
Tiwa Eisape @tiwa_eisape
1K Followers 1K Following PhD student at @MIT working on NLP and cognitive science - @NSF grfp fellow. Previously with @GoogleAI and @Meta FAIR
Thomas Fel @Napoolar
678 Followers 488 Following PhD student, Explainability, @tserre Lab, Brown University and @ANITI_toulouse. Prev. intern @Google, @GoPro
Tri Dao @tri_dao
19K Followers 365 Following Incoming Asst. Prof @PrincetonCS, Chief Scientist @togethercompute. Machine learning & systems.
Juraj Vladika @JurajVladika
328 Followers 500 Following 🧑🏻🎓PhD student in #NLProc at @TU_Muenchen 🥨🇩🇪 // working on scientific fact verification & LLM factuality // 🇭🇷
Raj Dabre @prajdabre1
3K Followers 758 Following NLP/Machine Translation/NLG/Deep Learning. Researcher-@NICT_Publicity. Adjunct Faculty-@iitmadras. Visiting Professor-@iitbombay. Ex-@KyotoU_News. #nlproc
Adil D. Ztn 👒 @AdilZtn
244 Followers 1K Following A boring guy who does things. Currently, I'm trying to make reinforcement learning boring. PhD Student/ Research Engineer in RL @irtSaintEx & @ISAE_officiel
Biswesh Mohapatra @bis1602
111 Followers 200 Following Multimodal Dialog Systems | PhD Student @Inria | Previously student of @IIITB_official | Interned @IBMResearch, @Siemens | GSOC 2018
Conference on Languag.. @COLM_conf
2K Followers 6 Following https://t.co/GhGCMEoa4A Abstract submission: March 22, 2024
Carina Kauf @KaufCarina
589 Followers 489 Following Ph.D. Candidate at MIT | Brain and Cognitive Sciences https://t.co/D6vnrfKbbm
Fanny Jourdan @Fannyjrd_
1K Followers 222 Following French mathematician doing XAI for #NLProc at @irtSaintEx. Soon: PhD on CS at @ANITI_Toulouse. Prev: MSc on maths at @polytechnique.
Sasha Luccioni, PhD .. @SashaMTL
19K Followers 4K Following AI & Climate @HuggingFace, Board Member of @WiMLworkshop and @ClimateChangeAI. @techreview 35 Innovators under 35, @TEDTalks speaker. She/her/Dr/ 🦋
Excited to introduce MAD Speech: a new set of metrics to measure acoustic diversity in speech. Work done @GoogleDeepMind w/ @_andrea_agos, @MTagliasacchi, @neilzegh and @n0mad_0 Paper link: arxiv.org/abs/2404.10419 1/5
@nthngdy Cool work! I haven't read it fully, but my current understanding is that a smaller model might benefit from a smaller tokenizer (so it doesn't saturate). Is that correct?
Model Collection: huggingface.co/collections/li…
@andreasgrv @LeopolisDream Also, another cool paper just came out suggesting that the SM bottleneck causes small LMs to stop learning
🤏 Why do small Language Models underperform? We prove empirically and theoretically that the LM head on top of language models can limit performance through the softmax bottleneck phenomenon, especially when the hidden dimension <1000. 📄Paper: arxiv.org/pdf/2404.07647… (1/10)
The softmax bottleneck is an interesting problem; it has many side effects which we do not yet fully understand! If you want to build an intuition for the problem, here is an interactive visualisation I made grv.unargmaxable.ai/static/files/s… (best viewed on desktop).
(6/10) This problem is in fact closely related to the softmax bottleneck issue (arxiv.org/abs/1711.03953). Basically, we try to map "low" dimensional contextual representations to potentially high-dimensional contextual probability manifolds, using a simple linear layer:
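As a rough illustration of the linear-layer point above (not code from the paper): if each context is represented by a vector of size d and the LM head is a single d × V matrix, the matrix of logits over any set of contexts has rank at most d, however large the vocabulary size V is. The sketch below checks this with exact rational arithmetic on small random integer matrices. The sizes, the seed, and the helper names `matmul` and `rank` are all hypothetical choices made for this example.

```python
from fractions import Fraction
import random


def matmul(a, b):
    """Multiply matrix a (n x d) by matrix b (d x V), both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0  # number of pivot rows found so far
    for c in range(len(rows[0])):
        # Find a row at or below position r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate column c from every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            if factor != 0:
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# Hypothetical toy sizes: 12 contexts, hidden dimension 3, vocabulary of 10 tokens.
random.seed(0)
n_contexts, hidden_dim, vocab_size = 12, 3, 10

# H holds one hidden vector per context; W plays the role of the LM head.
H = [[random.randint(-3, 3) for _ in range(hidden_dim)] for _ in range(n_contexts)]
W = [[random.randint(-3, 3) for _ in range(vocab_size)] for _ in range(hidden_dim)]

logits = matmul(H, W)  # shape: n_contexts x vocab_size
logit_rank = rank(logits)

# The rank can never exceed hidden_dim, even though logits has 10 columns.
print(logit_rank, "<=", hidden_dim)
```

After the softmax, the log-probabilities differ from the logits only by a per-context normalisation constant, so their rank is at most d + 1. That is the low-rank constraint the softmax bottleneck refers to.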
Check out this amazing new paper from @nthngdy on analyzing saturation of small LMs. Honored to have read through the initial drafts. Congrats my friend !! 🥳💪
Something that I noticed is that factorized embeddings seem to be more detrimental if the output layer is factorized than the input layer
very cool work analyzing the properties of small LMs using Pythia!
🥳Very pleased to announce that our paper on the work carried out on the EHRI Online Editions (@EHRIproject) has been accepted for the Workshop on "Holocaust Testimonies as Language Resources" at @LrecColing!
[#Parution] Benoît Sagot, “Apprendre les #langues aux machines” (“Teaching languages to machines”), @EditionsCdF, “Leçons inaugurales” series, in bookstores from today college-de-france.fr/fr/editions/le… @lcdpu @cdf1530 @bensagot #apprentissage #IA #chatGPT #informatique
Excited to share our latest work on improving LLM pre-training! 🚀 The amazing @yuzhaouoe et al. found that focusing on how pre-training sequences are composed and attended over can significantly improve the generalisation properties of LLMs on a wide array of downstream tasks,…
Nathan Godey (@nthngdy), Éric de La Clergerie (@DeVillemonte), Benoît Sagot (@bensagot) On the Scaling Laws of Geographical Representation in Language Models arxiv.org/abs/2402.19406
Rian Touchent (@riantouchent), Laurent Romary (@laurentromary), Éric de La Clergerie (@DeVillemonte). CamemBERT-bio: Leveraging Continual Pre-training for Cost-Effective Models on French Biomedical Data hal.science/hal-04528508
Wissam Antoun (@wissam_antoun), Benoît Sagot (@bensagot), Djamé Seddah (@zehavoc). From Text to Source: Results in Detecting Large Language Model-Generated Content arxiv.org/abs/2309.13322
Lauriane Aufrant, Lucie Chasseur. UkraiNER: A New Corpus and Annotation Scheme Towards Comprehensive Entity Recognition
Niyati Bafna (@BafnaNiyati), Cristina España-Bonet, Josef van Genabith, Benoît Sagot (@bensagot) and Rachel Bawden (@RABawden). When your Rich Cousin Has the Right Connections: Unsupervised Bilingual Lexicon Induction for Related Data-Imbalanced Languages hal.science/hal-04523029