Michael E. Sander @m_e_sander
Ph.D. student @ENS_ulm, with @gabrielpeyre and @mblondel_ml. michaelsdr.github.io · Paris · Joined February 2021
Tweets: 183 · Followers: 722 · Following: 196 · Likes: 356
We will talk about work done with / by A. Guilloux, A.S. Jannot, @PierreMari0n, @m_e_sander, @gerardbiau, @gabrielpeyre, @PierreAblin, @jeanphi_vert and many more… slides will be out soon 🌞
Kicking off our one-day lecture at Greifswald on neural ODEs w. @AdelineFermani1! 🚀 More info ⬇️ raph-ai.github.io/sig-workshop-g…
"Second-order derivatives are impossible to use in deep learning..." Wrong !! Here's a blog post with all you need to know about Hessian-vector products and autodiff iclr-blogposts.github.io/2024/blog/benc… Includes a benchmark of the different techniques in PyTorch and JAX.
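As a rough illustration of what a Hessian-vector product is, here is a minimal sketch in plain Python. It is not taken from the blog post and uses neither PyTorch nor JAX: it pushes dual numbers (forward mode) through a hand-written gradient, which mimics the forward-over-reverse idea on a toy scale. The `Dual` class, the helpers `_lift` and `hvp`, and the test function behind `grad_f` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Number val + tan * eps with eps**2 == 0; `tan` carries a directional derivative."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule on the tangent part.
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)

    __rmul__ = __mul__


def _lift(x):
    """Wrap plain numbers as constants (zero tangent)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def grad_f(x):
    """Hand-written gradient of the hypothetical f(x0, x1) = x0**3 / 3 + x0 * x1**2."""
    x0, x1 = x
    return [x0 * x0 + x1 * x1, 2.0 * x0 * x1]


def hvp(grad, x, v):
    """Hessian-vector product H(x) v, obtained as the derivative of grad along v."""
    duals = [Dual(xi, vi) for xi, vi in zip(x, v)]
    return [_lift(g).tan for g in grad(duals)]


# Hessian at (1, 2) is [[2, 4], [4, 2]], so H v = [22, 20] for v = (3, 4).
print(hvp(grad_f, [1.0, 2.0], [3.0, 4.0]))  # [22.0, 20.0]
```

The point of the sketch is that the full Hessian is never formed: one extra derivative pass along the direction `v` is enough.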
Very proud to share the project I have been working on these last few months with @gabrielpeyre and F-X Vialard. We show convergence of gradient flow for the training of infinitely deep and arbitrarily wide ResNets (i.e. mean-field NODEs) arxiv.org/abs/2403.12887
If there are "enough" neurons at initialization, the training loss satisfies a Polyak-Łojasiewicz (P-L) inequality (so no local minimum, and convergence if the loss is small enough). This is an infinite-dimensional extension of the results of @PierreMari0n @m_e_sander arxiv.org/abs/2309.01213
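For readers unfamiliar with the term, the textbook finite-dimensional form of a P-L inequality, and the convergence rate it implies under gradient flow, can be written as below. The symbols \theta, \mu and L^* are notation assumed here for illustration; the paper's infinite-dimensional statement holds under its own conditions and may look different.

```latex
% P-L inequality with constant \mu > 0, where L^* = \inf_\theta L(\theta):
\|\nabla L(\theta)\|^2 \;\ge\; 2\mu\,\bigl(L(\theta) - L^*\bigr).

% Along the gradient flow \dot\theta_t = -\nabla L(\theta_t):
\frac{d}{dt}\bigl(L(\theta_t) - L^*\bigr)
  = -\|\nabla L(\theta_t)\|^2
  \;\le\; -2\mu\,\bigl(L(\theta_t) - L^*\bigr)
\;\Longrightarrow\;
L(\theta_t) - L^* \;\le\; e^{-2\mu t}\,\bigl(L(\theta_0) - L^*\bigr).
```

In particular, any critical point must be a global minimizer wherever the inequality holds, which is the "no local minimum" remark in the tweet.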
If you like ResNets and enjoy Optimal Transport, you might enjoy this paper with Raphaël Barboni and F-X Vialard. We show that infinite width/depth ResNets are ("conditional") Wasserstein flows. arxiv.org/abs/2403.12887
This looks cool: Mathieu Blondel (@mblondel_ml) and Vincent Roulet have posted a first draft of their book on arXiv: arxiv.org/abs/2403.14606
The cats in Okinawa are beautiful
How can we reduce the hypergradient error in bilevel optimization when only an inexact inner solution is available? We study two strategies, preconditioning and reparameterization, in our AISTATS 2024 paper with @gabrielpeyre, Daniel Cremers and @PierreAblin: arxiv.org/abs/2402.16748
@docmilanfar It is basically what we used for sinkformers proceedings.mlr.press/v151/sander22a…
TL;DR: From the continuous-time perspective, the parameter "\gamma / (1 -\beta)^2", which intertwines the step size and the momentum parameter, governs the training trajectory (for any architecture!). Small values of this parameter help in the recovery of sparse solutions in DiagLNs.
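To make the quantity in the TL;DR concrete, here is a small sketch that evaluates \gamma / (1 - \beta)^2 and inverts it for the step size. The function names `momentum_scale` and `step_size_for` and the numerical values are assumptions for illustration; nothing here reproduces the paper's experiments.

```python
def momentum_scale(step_size: float, beta: float) -> float:
    """Return gamma / (1 - beta)**2 for step size gamma and momentum beta in [0, 1)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return step_size / (1.0 - beta) ** 2


def step_size_for(scale: float, beta: float) -> float:
    """Step size gamma that yields the requested gamma / (1 - beta)**2 at a given beta."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return scale * (1.0 - beta) ** 2


# Two different (gamma, beta) pairs sharing (up to rounding) the same value of the parameter.
print(momentum_scale(0.01, 0.9))
print(momentum_scale(step_size_for(1.0, 0.5), 0.5))
```

Raising the momentum at a fixed step size increases the parameter quickly, since the denominator shrinks quadratically as \beta approaches 1.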
"Differentially Private Representation Learning via Image Captioning". We've trained a DP image captioner on 200M+ image-text pairs, with SOTA DP image representations that beat non-private MAE on certain tasks arxiv.org/pdf/2403.02506… @AIatMeta @berkeley_ai @Polytechnique
🚀🚀Check out the final version of our spotlight paper at ICLR 2024 on the convergence of the hidden states of Residual Networks to the solution of a Neural ODE! 🚀🚀 Paper: arxiv.org/abs/2309.01213 Code: github.com/michaelsdr/imp… with @PierreMari0n, Yuhan Wu and @gerardbiau
Implicit regularization of deep residual networks towards neural ODEs ift.tt/r7BLTWb
🥳 I’m very happy to announce our preprint biorxiv.org/content/10.110…! scConfluence combines uncoupled autoencoders with Inverse Optimal Transport to integrate unpaired multimodal single-cell data in a shared low-dimensional latent space. @LauCan88 @gabrielpeyre
Looking for a postdoc to join my team @institutpasteur @CNRS @InstitutPrairie. The candidate will develop machine learning methods for single-cell omics data. Project funded by the #ERCStG MULTI-viewCELL
Meta presents "Watermarking Makes Language Models Radioactive". The paper investigates the radioactivity of LLM-generated texts, i.e. whether it is possible to detect that such input was used as training data. Conventional methods like membership inference can carry out this detection…
OpenAI may secretly know that you trained on GPT outputs! In our work "Watermarking Makes Language Models Radioactive", we show that training on watermarked text can be easily spotted ☢️ Paper: arxiv.org/abs/2402.14904 @pierrefdz @AIatMeta @Polytechnique @Inria
The optimal computation of gradients for the composition of functions is an optimal parenthesis problem. Forward and backward (backpropagation) are two extreme cases. Backward is optimal for scalar-valued functions. link.springer.com/article/10.100… en.wikipedia.org/wiki/Matrix_ch…
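The parenthesization remark above can be checked on a toy chain of Jacobians. The sketch below counts scalar multiplications for the two extreme orders and for the optimal order found by the classic matrix-chain dynamic program. The layer sizes, the function names, and the convention of listing dimensions from output to input are assumptions made for this example, which only covers a simple chain-structured composition.

```python
def left_to_right_cost(p):
    """Cost of ((A0 A1) A2) ...; with the output first, this is backward mode."""
    # The running product always has p[0] rows.
    return sum(p[0] * p[k] * p[k + 1] for k in range(1, len(p) - 1))


def right_to_left_cost(p):
    """Cost of A0 (A1 (... A_{n-1})); with the output first, this is forward mode."""
    # The running product always has p[-1] columns.
    return sum(p[k - 1] * p[k] * p[-1] for k in range(1, len(p) - 1))


def optimal_cost(p):
    """Minimum multiplication count over all parenthesizations (matrix-chain DP)."""
    n = len(p) - 1  # number of matrices; matrix i has shape p[i] x p[i + 1]
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best[i][j] = min(
                best[i][k] + best[k + 1][j] + p[i] * p[k + 1] * p[j + 1]
                for k in range(i, j)
            )
    return best[0][n - 1]


# Scalar output (1), two hidden layers of width 100, input of size 100,
# listed from output to input: Jacobians are 1x100, 100x100, 100x100.
dims = [1, 100, 100, 100]
print(left_to_right_cost(dims))   # 20000   (backward)
print(right_to_left_cost(dims))   # 1010000 (forward)
print(optimal_cost(dims))         # 20000
```

On this scalar-output chain the backward order attains the optimum, while the forward order is about fifty times more expensive; with a wide output and a scalar input the roles are reversed.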
For those interested in Optimal Transport, I will give a tutorial next Tuesday morning @ENS_ULM and in the afternoon there will be scientific talks.
I’m excited to share that I joined Meta this week as a research scientist. I’ll be working again on AI for neural decoding with the amazing @agramfort and @ZaccharieRamzi ! I’m looking forward to contributing to the future of neural interfaces !
For those in NYC I am giving a seminar on conservation laws for gradient flows in the Center for Data Sciences tomorrow at 2PM cds.nyu.edu
Hello Twitter! A few weeks ago, I defended my PhD thesis (title below). I want to thank everybody who joined or helped along the way, especially my supervisors, jury members and colleagues. I have since joined the Google DeepMind team here in Paris. Good things ahead (I hope 🤞!)
I am thrilled to announce that I have been promoted to the position of full Professor at the University of Tokyo. I am immensely grateful to everyone who has supported me along the way!
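As some of you have already pointed out, as of April 1 I have been promoted to Professor in the Department of Mathematical Informatics (Department of Mathematical Engineering and Information Physics), Graduate School of Information Science and Technology, the University of Tokyo. I hope to keep energizing machine learning, artificial intelligence and mathematical statistics. Thank you all for your continued support!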
Don't read enough = you may re-invent the wheel Read too much = you may feel paralyzed The art of learning by reading more vs. creating new knowledge is a delicate balance -especially these days- because short term success is conflated w/ lasting impact and real progress.
@gabrielpeyre - Most SOTA diffusion models use self-attention and/or vision Transformers, not U-Nets. An exception is arxiv.org/abs/2311.18257. (2/3)
@gabrielpeyre - Regarding theory: in-context learning is definitely important, but I think there are many other challenges regarding LLMs. For instance: why is it possible to scale the context size to apparently very large numbers (millions of tokens in Gemini) without degrading accuracy? (3/3)
@PierreMari0n I have no idea why trained transformers work well for a large number of tokens, but it was the motivation for @m_e_sander to directly do theory over the space of measures, i.e. without specifying the number of tokens in advance.
Yes it does ☢️☢️☢️ Check out our paper: arxiv.org/abs/2402.14904
Watermarking makes language models radioactive ☢️ From FAIR & Inria