Hossein Mobahi @TheGradient
Senior Research Scientist @GoogleAI. I ∈ Optimization ∩ Machine Learning. Fan of @IronMaiden 🤘. Here to discuss research 🤓
people.csail.mit.edu/hmobahi/
Mountain View, CA
Joined December 2010
Tweets: 1K
Followers: 6K
Following: 691
Likes: 5K
Applications for the @GoogleAI PhD Fellowship Program are now open. The program will be accepting student applications through May 8. It supports graduate students doing innovative research in computer science and related fields as they pursue their PhD and also connects them to…
We are looking for Summer/Fall grad interns as part of the USC Center on AI Foundations for the Sciences (AIF4S) (sites.usc.edu/aif4s/). The deadline is March 12; details and the form are here: sites.usc.edu/aif4s/openings/
With increasing interest in the white-box deep representation learning via rate reduction, to help students and colleagues get started, I have organized all key and related papers at my website: people.eecs.berkeley.edu/~yima/Publicat…, from its original roots to its most recent developments.
[LG] On student-teacher deviations in distillation: does it pay to disobey? arxiv.org/abs/2301.12923 The main point of the article is about the research on knowledge distillation. It has been shown that despite being trained to fit the teacher's probabilities, the student…
At 4:15pm today @TheGradient will be at the #NeurIPS2023 Google booth to talk about the differences between Sharpness-Aware Minimization (that improves generalization) and similar methods (that don't), which can be explained by the structure of the Hessian of the loss function.
Very excited to share what we have been working on in the last several months: Gemini 1.0! Google Blogpost: blog.google/technology/ai/… DeepMind Blogpost: deepmind.google/technologies/g… Technical Report: storage.googleapis.com/deepmind-media…
IMO, messages to the AC are taken seriously, because if the reported complaint remains unanswered or unresolved, the negligence arrow will move from pointing to reviewers to the AC :-) Use this opportunity to fight for any lost justice about your submission.
tl;dr submit a training algorithm* that is faster** than Adam*** and win $10,000 💸🚀 *a set of hparams, self-tuning algorithm, and/or update rule **see rules for how we measure speed ***beat all submissions, currently the best is NAdamW in wallclock and DistShampoo in steps
If you want to fully understand deep networks, Transformers in particular, please check out our latest paper: arxiv.org/abs/2311.13110 I hope this comprehensive study will convince you that there is absolutely nothing to worry about current AI systems being a threat to humanity.
Peyman Milanfar @docmilanfar
67K Followers 261 Following Distinguished Scientist at Google Research. Computational Imaging, Machine Learning, and Vision. Tweets = personal opinions. May change or disappear over time.
Behnam Neyshabur @bneyshabur
18K Followers 689 Following Senior Staff Research Scientist @GoogleDeepMind, Interested in reasoning w. LLMs, traveling & backpacking
Yi Ma @YiMaTweets
71K Followers 120 Following Chair Professor in AI, Director of IDS, Head of CS, HKU; Professor of EECS, Berkeley; Author of Book: High-Dim Data Analysis, https://t.co/gwaqMJp8av.
Dan Roy @roydanroy
45K Followers 2K Following Research Director, @VectorInst. Canada CIFAR AI Chair. Associate Professor of Stats/CS @UofT. I study machine learning and AI, emphasis on theory.
Rosanne Liu @savvyRL
32K Followers 965 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Amin Karbasi @aminkarbasi
7K Followers 2K Following Associate Professor at Yale University, staff research scientist at Google.
Michael Bronstein @mmbronstein
43K Followers 4K Following #DeepMind Professor of #AI @UniofOxford / Fellow @ExeterCollegeOx / ML Lead @ProjectCETI / https://t.co/kZpGpDzYeV
Tom Goldstein @tomgoldsteincs
23K Followers 2K Following Professor at UMD. AI security & privacy, algorithmic bias, foundations of ML. Follow me for commentary on state-of-the-art AI.
Csaba Szepesvari @CsabaSzepesvari
8K Followers 699 Following "If there is not folly in the world, then the world itself is folly. You must understand that mistakes are not always regrets." - Paul Tobin, Bandette🤠
Peter Richtarik @peter_richtarik
6K Followers 591 Following Federated Learning Guru. Tweeting since 20.5.2020. Lived in 🇸🇰🇺🇸🇧🇪🇬🇧🇸🇦
rohan anil @_arohan_
12K Followers 2K Following Principal Engineer, @GoogleDeepMind Gemini. prev PaLM-2. Tinkering with optimization and distributed systems. opinions are my own.
Dmytro Mishkin.. @ducha_aiki
18K Followers 590 Following Marrying classical CV and Deep Learning. I do things, which work, rather than being novel, but not working.
Shane Gu @shaneguML
28K Followers 1K Following Research Scientist & Manager @GoogleDeepMind Tokyo/MTV. ex: @GoogleAI Brain, @OpenAI. (JP: @shanegJP)
Animesh Garg @animesh_garg
21K Followers 1K Following Foundation Models for Generalizable Autonomy. Assistant Professor in AI Robotics @GeorgiaTech + @NvidiaAI. prev @Stanford @berkeley_ai @UofTCompSci
Alex Dimakis @AlexGDimakis
13K Followers 2K Following UT Austin Professor. Researcher in Machine Learning and Information Theory. National AI Institute on the Foundations of Machine Learning (IFML) Co-director.
yobibyte @y0b1byte
15K Followers 2K Following Kurin ViTaly, senior research scientist @IsomorphicLabs, ML PhD from @UniofOxford on RL, Multitask learning & Graphs
hossein esmi @hosseinesmi1
2 Followers 24 Following
Tim Lawson @tslwn
41 Followers 134 Following PhD student in AI @BristolUni. Previously physics @Cambridge_Uni and software @graphcoreai. Language, cognition, etc.
Lê Hoàng Nam @Le_Hoang_Nam_24
9 Followers 42 Following
Umpire_Utility_ @utility65586
13 Followers 1K Following
Amid Gholampour @AmidGholampour
4 Followers 86 Following M.Sc. Student of Applied Mechanics at Ferdowsi University of Mashhad, Mashhad, Iran | Research Assistant at FUM Center of Advanced Rehabilitation and Robotics
Jiannan Xiang @szxiangjn
257 Followers 417 Following PhD @UCSD | Previously MSML @mldcmu @SCSatCMU, BS in EE @ustc | ML, NLP, NLG
Qingfeng Lan @LanceLan3
321 Followers 601 Following PhD student @rlai_lab, University of Alberta. Working on Reinforcement Learning and Continual Learning. * I'm looking for internship in 2024.
Debjani Mazumder @deb_mazu
84 Followers 2K Following Doctoral Research Fellow in Natural Language Processing. Studies data mining, information retrieval, machine learning
Tyne宇 @Tyne03720826082
109 Followers 3K Following
Khurram Azeem Hashmi @khurramhashmi3
48 Followers 323 Following Researcher @DFKI | PhD candidate @uni_kl | Primarily, interested in Computer Vision, SSL, future of deep learning
Francesco Corti @Fra__Corti
28 Followers 196 Following PhD student at TU Graz, part of the Embedded Learning and Sensing Systems group. Passionate about efficient deep learning, with a focus on edge computing.
Xuanming Zhang @XuanmingZhang07
281 Followers 571 Following PhD student at @ColumbiaCompSci @columbianlp @ColumbiaHCI | Prev at @ETSresearch @TsinghuaNLP | #NLProc #HCI
sajad soltani @sajadsoltani_st
181 Followers 2K Following
Kashif Imteyaz @kashif_imteyaz
721 Followers 3K Following Comp Sci PhD @KhouryCollege / @NortheasternAI Studying Social Computing, Human-AI Interaction, FutureOfWork
Rakesh Dey @RakeshD24137483
36 Followers 493 Following ML theory, Optimization, Statistics, Computer Vision
Jeet @Jeet02982232
14 Followers 38 Following
Krish Dasgupta @officialKrishD
872 Followers 4K Following Forever Learner | Building Reinforcement Learning Systems | Healthcare | Robots and Brains | Graph ML for Health
Mahan @MrMursielago
264 Followers 1K Following 'Here I lie, the wretched useless remnant of a dead man.'
Kun (Kevin) SUN @Sharp_K_Sun
227 Followers 2K Following Scientist Researcher @ Tübingen University and Professorial Research Fellow @ Fudan University, and interested in LLMs, NLP, and computational cognition.
Sima @Sima1623487
1 Followers 21 Following
DataScienceWeekly @DataSciNews
33K Followers 207 Following An in-depth look at the world of Data Science. Sign up for our free weekly newsletter featuring curated news, articles and jobs related to Data Science...
Hooman Mohajeri @LifeOfHooman
74 Followers 1K Following I get excited by new gimmicks! So bring me your toys! :D @Princeton PhD, former @PrincetonCITP Occupant! Interested in Privacy, Security and Cryptography
b @_l__ll__l_
190 Followers 5K Following
Nik @NIkJain1510
198 Followers 1K Following Entrepreneur (3 Tech Startups), Technologist, UVA Darden (Batten Scholar), IIT-Bombay
Awsaf @awsaf49
802 Followers 667 Following Contractor @google || Kaggle Grandmaster @kaggle || Dev Expert @weights_biases || Research Assistant in IRAB (Institute of Robotics & Automation, BUET)
Sayak Saha Roy @SayakSahaRoy
116 Followers 269 Following PhD Student at @cseuta - Online Fraud Prevention and Usable Security
Radoslav Krivak @rdkbio
335 Followers 5K Following Structural Bioinformatics / AI for Drug Discovery / Geometric DL (@IOCBPrague, prev. PhD @cusbg)
Shubham @Shubham89202903
0 Followers 86 Following
Ahrimaan @ZartushtSpinoza
20 Followers 274 Following Back then you carried your ashes to the mountain; would you now carry your fire into the alley?
Jiawei Liu @JiaweiLiu_
2K Followers 940 Following Simplifying the making of great software. PhD Student @plfmse @IllinoisCS.
Tonmoy Hossain @TonmoyHDihan
48 Followers 354 Following
Arif Ahmad @ArifAhm92263086
152 Followers 5K Following All things AI, Computer Science and Circuits!
Mohammad Mohammadi @MoMohammadi99
12 Followers 55 Following PhD Student in Computer Science at @UofT
Samy @samymmmr
136 Followers 2K Following
Weijia Shi @WeijiaShi2
5K Followers 963 Following PhD student @uwcse @uwnlp | Visiting Researcher @MetaAI | Undergrad @CS_UCLA | https://t.co/eLBQmgkvym
Akash Gupta @aksgupta97
85 Followers 635 Following MS CS Courant @nyuniversity | Applied Scientist Intern @amazon | Past Researcher @val_iisc @IIIT_Hyderabad @NUSComputing @CERN | @alumni_thapar
Abdelrahman Saeed @EdAbdelrahman
86 Followers 2K Following Machine learning guru ex: Machine learning engineer intern in Both (Meta, Apple)🌐
Peyman Milanfar @docmilanfar
67K Followers 261 Following Distinguished Scientist at Google Research. Computational Imaging, Machine Learning, and Vision. Tweets = personal opinions. May change or disappear over time.
Behnam Neyshabur @bneyshabur
18K Followers 689 Following Senior Staff Research Scientist @GoogleDeepMind, Interested in reasoning w. LLMs, traveling & backpacking
Yi Ma @YiMaTweets
71K Followers 120 Following Chair Professor in AI, Director of IDS, Head of CS, HKU; Professor of EECS, Berkeley; Author of Book: High-Dim Data Analysis, https://t.co/gwaqMJp8av.
Gautam Kamath @thegautamkamath
44K Followers 501 Following Assistant Prof of CS @UWaterloo, Faculty @VectorInst, Canada @CIFAR_News AI Chair. Co-EiC @TmlrOrg. I lead @TheSalonML. Privacy, robustness, machine learning.
Clément Canonne @ccanonne_
31K Followers 925 Following Senior Lecturer @Sydney_Uni. Postdocs @IBMResearch, @Stanford; PhD @Columbia. Converts ☕ into puns: sometimes theorems. He/him. @[email protected]
Dan Roy @roydanroy
45K Followers 2K Following Research Director, @VectorInst. Canada CIFAR AI Chair. Associate Professor of Stats/CS @UofT. I study machine learning and AI, emphasis on theory.
Lucas Beyer (bl16) @giffmana
56K Followers 443 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]
Kevin Patrick Murphy @sirbayes
42K Followers 328 Following Research Scientist at Google Brain / Deepmind. Interested in Bayesian Machine Learning.
Rosanne Liu @savvyRL
32K Followers 965 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Kyunghyun Cho @kchonyc
60K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre PC at @nyuniversity (@CILVRatNYU) & @genentech (@PrescientDesign).
NeurIPS Conference @NeurIPSConf
111K Followers 35 Following New Orleans, Dec 10-16, 23. https://t.co/ga8aOw615g Tweets to this account are not monitored. Please send feedback to [email protected].
Amin Karbasi @aminkarbasi
7K Followers 2K Following Associate Professor at Yale University, staff research scientist at Google.
Michael Bronstein @mmbronstein
43K Followers 4K Following #DeepMind Professor of #AI @UniofOxford / Fellow @ExeterCollegeOx / ML Lead @ProjectCETI / https://t.co/kZpGpDzYeV
Jia-Bin Huang @jbhuang0604
51K Followers 285 Following Associate Professor @umdcs; Part-time Research Scientist @Meta. I like pixels.
Jonathan Frankle @jefrankle
16K Followers 685 Following Chief Scientist, Neural Networks @Databricks via MosaicML. PhD @MIT_CSAIL. BS/MS @PrincetonCS. DC area native. Making AI efficient for everyone at @DbrxMosaicAI
Ferenc Huszár @fhuszar
40K Followers 1K Following Secular Bayesian. Associate Professor in Machine Learning @Cambridge_CL. Talent aficionado at https://t.co/RbJkoLguey Alum of @Twitter, Magic Pony and @Balderton
DataScienceWeekly @DataSciNews
33K Followers 207 Following An in-depth look at the world of Data Science. Sign up for our free weekly newsletter featuring curated news, articles and jobs related to Data Science...
Aaron Defazio @aaron_defazio
6K Followers 352 Following Research Scientist at Meta working on optimization. Fundamental AI Research (FAIR) team
Mengdi Wang @MengdiWang10
791 Followers 240 Following Princeton professor in AIML, optimization and data science. Program Chair @ICLR2023. Formerly @MIT @GoogleDeepmind @Tsinghua
Dmitry Krotov @DimaKrotov
3K Followers 730 Following I am a physicist working on neural networks and machine learning, @MITIBMLab @IBMResearch. Formerly: @the_IAS, @Princeton
Omar Khattab @lateinteraction
11K Followers 2K Following CS PhD candidate @StanfordNLP. 2022 Apple Scholar in AI/ML. Author of ColBERT (https://t.co/2ZtgXoa1np), DSPy (https://t.co/BH7WmMKDXR), & various retrieval & LM systems.
Ehsan Shafiee @Ehsan_Shafiee
250 Followers 525 Following
Mehran Kazemi @kazemi_sm
1K Followers 496 Following Senior Research Scientist @GoogleAI. Research areas: machine/deep learning, large language models, artificial general intelligence. Views my own.
Kording —-& Lab.. @KordingLab
41K Followers 4K Following Konrad kording, @Penn Prof, deep learning, brains, #causality, rigor, https://t.co/tTJW05RjpC, https://t.co/qf7ZHxiCUt, Transdisciplinary optimist, Dad, Loves outdoors, 🦖
Bahare Fatemi @BahareFatemi
2K Followers 689 Following Research Scientist @GoogleAI, PhD from @UBC_CS, Before: @MetaAI, @element_ai, and @borealisai.
Vinay Ramasesh @vinayramasesh
546 Followers 725 Following Research scientist @DeepMind working towards a better understanding of deep learning. Physics PhD @UCBerkeley
Jiantao Jiao @JiantaoJ
160 Followers 58 Following Assistant Professor at UC Berkeley EECS and Statistics. Co-founder and CEO of Nexusflow @NexusflowX I do research on ML, RL, and LLMs.
Judea Pearl @yudapearl
76K Followers 186 Following Student of causal inference, human reasoning, and history of ideas, all viewed through the sharp lens of artificial intelligence.
Dongyeop Lee @edong6768
22 Followers 128 Following Master's student at the Graduate School of Artificial Intelligence in @postech2020.
Sungbin Shin @sungbin_shin
65 Followers 135 Following PhD Student in Computer Science and Engineering at @postech2020. Also working as a student researcher at @google.
Nima Anari @nimaanari
323 Followers 129 Following
Ehsan Amid @esiamid
724 Followers 226 Following Senior Research Scientist @GoogleDeepMind 🌱🧠🌼🐝 (opinions my own just in case)
👩💻 Paige Bai.. @DynamicWebPaige
59K Followers 2K Following ✨Keep it simple, make it scale. AI should be about empowering people, building understanding, & making dreams realities. 👩💻GenAI @GoogleDeepMind ex-@GitHub
Ryan R. Rosario @DataJunkie
20K Followers 996 Following Software Engineer @Google, AI/ML on Google Kubernetes Engine. Lecturer @UCLA Computer Science. Data systems, ML & NLP. Statistics Ph.D. Opinions my own.
Jason Weston @jaseweston
9K Followers 568 Following Research @MetaAI+NYU. Pretrain+FT: NLP from Scratch (2011). Multilayer attention+position embed+LLM: MemNets (2015). Recent (2023+):Sys 2 Attn, Self-Rewarding..
Matt Streeter @MattStreeter5
5 Followers 0 Following
Song Mei @Song__Mei
1K Followers 547 Following Assistant Professor at UC Berkeley, Department of Statistics and EECS. Researcher working on foundations of generative AI.
Katie Everett @_katieeverett
228 Followers 458 Following Machine learning researcher at @GoogleDeepMind (via Brain) + PhD student @MIT. Previously @chorus cofounder, @twitter, @MIT.
Stephen Bates @stats_stephen
3K Followers 292 Following Assistant Professor, MIT EECS. Developing rigorous stats & ML methods for data-driven science and reliable AI systems.
Tan Minh Nguyen @TanNguyen689
854 Followers 606 Following I received my Ph.D., MSEE, and BSEE from Rice University working with Prof. Richard Baraniuk. I am currently doing my PostDoc at UCLA with Prof. Stan Osher.
Druv Pai @druv_pai
85 Followers 146 Following PhD @berkeley_ai | using theory to improve practice for deep learning
Aditi Raghunathan @AdtRaghunathan
1K Followers 18 Following Assistant professor at CMU @SCSatCMU @CSDatCMU | Machine learning
Michael C. Mozer @mc_mozer
633 Followers 56 Following Research Scientist, Google Brain where cognitive science and machine learning meet
Simon Zhai @simon_zhai
140 Followers 124 Following PhD @ Berkeley EECS, sometimes I click likes for trolls
Sebastian Raschka @rasbt
265K Followers 901 Following Machine learning & AI researcher writing at https://t.co/A0tXWzG1p5. LLM research engineer @LightningAI. Previously stats professor at UW-Madison.
yobibyte @y0b1byte
15K Followers 2K Following Kurin ViTaly, senior research scientist @IsomorphicLabs, ML PhD from @UniofOxford on RL, Multitask learning & Graphs
SHAGGYTHEIMMORTAL @TheJesko
2 Followers 44 Following No one can beat shaggy the most immortal being in the universe.
Irena Cronin @IrenaCronin
14K Followers 5K Following AI, AR & Data | Co-Founder & SVP Product @DadosTechnology | CEO @InfiniteRetina | 4x Author — Wiley, Apress, PacktPub | Repped by Jon Malysiak for Literary
Shuran Song @SongShuran
7K Followers 421 Following Assistant Professor @Stanford University working on #Robotics #AI #ComputerVision
TML Lab (EPFL) @tml_lab
240 Followers 68 Following Theory of Machine Learning Lab at @EPFL led by Nicolas Flammarion. We develop algorithmic & theoretical tools to better understand ML & make it more robust.
Regina Barzilay @BarzilayRegina
3K Followers 30 Following
Elizabeth Yang @eytyang
768 Followers 677 Following technical staff @openai, previously theory @berkeleyeecs | fan of graphs, crosswords, turtles, bad puns, running, and Survivor, among other things
Conference on Parsimo.. @CPALconf
699 Followers 1K Following CPAL is a new annual research conference focused on the parsimonious, low dimensional structures that prevail in ML, signal processing, optimization, and beyond
Mitchell Gordon @mitchellgordon
1K Followers 392 Following Incoming assistant prof @MITEECS (fall 2024), postdoc @uwcse. PhD @StanfordHCI. Former intern @Apple @Google @cmuhcii. HCI, human-centered AI, social computing.
Arvind Satyanarayan @arvindsatya1
6K Followers 2K Following Assistant Professor @MIT_CSAIL @mitvis. Data visualization @vega_vis, ML interpretability, cognitively convivial interaction. He/him. @[email protected].
Aakanksha Chowdhery @achowdhery
7K Followers 3K Following LLMs @ Google DeepMind :: PaLM, Gemini // Previously @MSFTResearch, @Stanford, @Princeton // views my own and subject to change
Baharan Mirzasoleiman @baharanm
646 Followers 200 Following Assistant professor @UCLAComSci. Better ML via better data, Machine learning, Optimization
Daphne Ippolito @daphneipp
1K Followers 72 Following I am a senior research scientist at Google. I research topics in natural language generation.
Tina Torkaman @TinaTorkaman
28 Followers 5 Following
Talking to many junior faculty members and students in AI lately. Many seem to be somewhat lost with all the seemingly rapid progress made by industry. My suggestion to them is: It is industry's job to find out how to do better, but academia's job is to find out how to do it right.
Thanks to @IEEEsps for this great recognition, and congratulations to my co-author @hossTale who deserves all the credit. Paper: ieeexplore.ieee.org/document/83528…
Congratulations to Hossein Talebi & Peyman Milanfar for winning the IEEE Signal Processing Society's Best Paper Award for their paper titled "NIMA: Neural Image Assessment"! #ICASSP2024 signalprocessingsociety.org/community-invo…
Check out this really cool work w/@yidingjiang ! We built a simple system (PCA + Clustering) for quantifying how "features" are distributed across models and data. Using this tool, we can mathematically understand the Generalization Disagreement Equality. 🤝
Models with different randomness make different predictions at test time even if they are trained on the same data. In our latest ICLR paper (oral), we investigate how models learn different features, and the effect this has on agreement and (potentially) calibration. 1/
Our 12 scaling laws (for LLM knowledge capacity) are out: arxiv.org/abs/2404.05405. Took me 4mos to submit 50,000 jobs; took Meta 1mo for legal review; FAIR sponsored 4,200,000 GPU hrs. Hope this is a new direction to study scaling laws + help practitioners make informed decisions
Congratulations to Yaodong Yu @yaodong_yu on receiving the Leon O. Chua Award for outstanding achievement in the area of nonlinear science from the EECS department of UC Berkeley, for his work on white-box deep networks!
@rao2z I have an idea for how to fix it...
How does symmetry in #NeuralNetworks parameters impact learning? Check out our #ICLR2024 🔥spotlight🔥 paper on "Improving Convergence and Generalization using Parameter Symmetries" Paper: openreview.net/pdf?id=L0r0Gph… (1/3)
Are you interested in training large models in JAX but are set back by the complicated partition specs and sharding configurations required to scale up? I've recently created scalax, a small library to help developers easily scale up JAX models. github.com/young-geng/sca…
Our paper proves that optimizing an attention layer with next token objective discovers Strongly-Connected Components (SCC) of graphs induced by the training data (throwback to Tarjan's seminal algo).
I’m really excited to be starting a new adventure with multiple amazing friends & colleagues. Our company is called Physical Intelligence (Pi or π, like the policy). A short thread 🧵
How do we capture local features across multiple resolutions? While standard convolutional layers work only on a fixed input-resolution, we design local neural operators that learn integral and differential kernels, and are principled ways to extend standard convolutions to…
Causal self-attention encodes causal structure between tokens (e.g., induction heads, learning a function class in-context, n-grams). But how do transformers learn this causal structure via gradient descent? New paper with @alex_damian_ @jasondeanlee! arxiv.org/abs/2402.14735 (1/10)
Extremely happy with this result! Mechanistic Understanding of how Transformers Learn Causal Structure!
Turns out even linear transformers can be versatile in-context learners! In a new paper, we reverse-engineer a novel, high-performance optimization algorithm for noisy linear regression discovered by models trained with just linear attention. 📜arxiv.org/abs/2402.14180 🧵 below:
A few themes and exciting directions in RL+Robotics+Control+GenAI+LLM Thank you, @samcharrington and the whole @twimlai team, for hosting me. It was fun talking together.
Today we’re joined by Kamyar (@Azizzadenesheli), a staff researcher at @nvidia. We talk about the rise of AI agents powered by LLMs, and how RL is being used today to solve real-world problems. 🎧/📷: twimlai.com/go/670 Here are some key takeaways: (1/6)
What are eigenvalues? For non-normal matrices (i.e., essentially all matrices), the geometric interpretation no longer applies. Instead, you can think of eigenvalues as a unique signature of the matrix, one that stays the same regardless of coordinate system. (1/..)
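A minimal sketch of the "stays the same regardless of coordinate system" point, assuming a 2x2 example: the eigenvalues of a non-normal matrix A match those of P⁻¹AP for an invertible change of basis P. The matrices A and P and the helper names (eigenvalues_2x2, matmul, inverse_2x2) are hypothetical, chosen only for illustration; they are not from the thread.

```python
import cmath


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix, from its trace and determinant."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    # Roots of the characteristic polynomial: lambda^2 - trace*lambda + det = 0
    root = cmath.sqrt(trace * trace - 4 * det)
    pair = [(trace + root) / 2, (trace - root) / 2]
    return sorted(pair, key=lambda z: (z.real, z.imag))


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse_2x2(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Illustrative non-normal matrix: A times its transpose differs from
# its transpose times A.
A = [[2.0, 1.0], [0.0, 3.0]]
# Arbitrary invertible change of coordinates (assumed for the example).
P = [[1.0, 2.0], [3.0, 4.0]]

# The same linear map written in the new coordinates.
B = matmul(matmul(inverse_2x2(P), A), P)

print(eigenvalues_2x2(A))
print(eigenvalues_2x2(B))
```

The entries of B look nothing like those of A, yet both calls print the same pair of eigenvalues, which is the sense in which they act as a coordinate-free signature.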
It's such an honor to be considered in the company of the many brilliant people that have been promoted to the rank of associate professor. #humblebrag
What shall it be Twitter: platitude, hot take, or humble brag? #itakerequests
@gabrielpeyre @GuillaumeG_ Interesting. Related -- SGD of quadratic objective fails to converge for any learning rate when noise is Cauchy
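One way to see why the step size cannot help, sketched for the simplest setting, which the tweet does not spell out and is assumed here: the 1-D objective f(x) = x²/2, a constant learning rate η in (0, 1], and i.i.d. standard Cauchy gradient noise ξ_t. The symbols x_t, η, ξ_t, and s_t are introduced for illustration only. The sketch uses the stability of the Cauchy family: for independent Cauchy variables X and Y, aX + bY is Cauchy with scale |a|·scale(X) + |b|·scale(Y). Writing s_t for the scale of x_t:

```latex
\begin{align*}
x_{t+1} &= x_t - \eta\,(x_t + \xi_t) = (1-\eta)\,x_t - \eta\,\xi_t, \qquad x_0 = 0,\\
s_{t+1} &= (1-\eta)\,s_t + \eta, \qquad s_0 = 0,\\
s_t &= 1 - (1-\eta)^t \;\longrightarrow\; 1 \quad (t \to \infty).
\end{align*}
```

Under these assumptions the iterate's scale settles at 1 however small η is. With unit-variance Gaussian noise instead, the analogous recursion on variances, v_{t+1} = (1-η)² v_t + η², has fixed point η/(2-η), which does shrink as η → 0.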