Yongchao Zhou @Yongchao_Zhou_
Build Intelligence @xai | ML PhD @UofT @VectorInst | Prev. @GoogleAI @GoogleDeepMind | Working on LLMs
Toronto · Joined January 2022
Tweets 78 · Followers 476 · Following 300 · Likes 411
Our team at Google DeepMind has a full-time Research Scientist position available at our Mountain View site. Minimum qualification: PhD in ML/NLP. Please email me with: your CV and Google Scholar link; a brief description of the impactful work you have done; and what you aim…
based and 🔓 wanna help accelerate the next Grok? looking for builders: — Rust/Jax/Kube infra engineers — front-end/full-stack engineers x.ai/careers
Fantastic work by @Yongchao_Zhou_ et al. showing that our randomized positional encodings (arxiv.org/abs/2305.16843) can contribute to extending Transformers' length generalization for two-digit addition!
CoT Reasoning without Prompting Interesting paper! Proposes a chain-of-thought (CoT) decoding method to elicit the reasoning capabilities from pre-trained LLMs without explicit prompting. It claims to significantly enhance a model’s reasoning capabilities over greedy decoding…
Excited to share our work (read-agent.github.io) for reading long documents way exceeding the context window (up to 20x). Inspired by human reading paradigm, Read Agent summarizes the input episodically as gist memories, and uses them to retrieve relevant details when needed.
Here's what I see as a likely AGI trajectory over the next decade. I claim that later parts of the path present the biggest alignment risks/challenges. The alignment world has been focusing a lot on the lower left corner lately, which I'm worried is somewhat of a Maginot line.
Chain-of-Thought Reasoning Without Prompting paper page: huggingface.co/papers/2402.10… In enhancing the reasoning capabilities of large language models (LLMs), prior research primarily focuses on specific prompting techniques such as few-shot or zero-shot chain-of-thought (CoT)…
Chain-of-Thought Reasoning Without Prompting Can LLMs reason effectively without prompting? Our findings reveal that, intriguingly, CoT reasoning paths can be elicited from pre-trained LLMs by simply altering the decoding process. arxiv.org/abs/2402.10200
New preprint🔥: Premise Order Matters in Reasoning with Large Language Models arxiv.org/abs/2402.08939 In typical logical reasoning, premise order doesn't matter. However, for SOTA LLMs, changing the premise order may cause an accuracy drop of >30%! 🧵 1/8
Google presents Premise Order Matters in Reasoning with Large Language Models paper page: huggingface.co/papers/2402.08… Large language models (LLMs) have accomplished remarkable reasoning performance in various domains. However, in the domain of reasoning tasks, we discover a…
Google Deepmind presents Transformers Can Achieve Length Generalization But Not Robustly paper page: huggingface.co/papers/2402.09… Length generalization, defined as the ability to extrapolate from shorter training sequences to longer test ones, is a significant challenge for language…
Transformers Can Achieve Length Generalization But Not Robustly Length generalization remains fragile, significantly influenced by factors like random weight initialization and training data order arxiv.org/abs/2402.09371
Camille Jongsma @jongs_cami · 21 Followers · 3K Following
Creative AIgency @CreativeAIgency · 1K Followers · 1K Following · @XAI Training @Grok | TA @CuriousRefuge | CPP @RunwayML | @LeonardoAI_ LCP | HUG Artist | Worlds, Films, Games, Design + AI Creator/Producer @Kevinkshah
Shashank Sangar @ShashankTesla · 16 Followers · 204 Following · Recruiting at Tesla AI for Core Autonomy (Autopilot & Optimus)
CassandraMom22🪬.. @HeCaSoMa · 468 Followers · 756 Following · Fiscally Conservative🤍Socially somewhat Liberal🤍 ♥️Proud 🇺🇸🇧🇷 If you’re not following some people you dislike or disagree with, you’re doing it wrong🤍
Techarn @TecharncCODrs · 0 Followers · 107 Following
RosemaryLew @4yxXLhv8IHUod04 · 0 Followers · 111 Following
Isshin 一心 @YixinTian123 · 30 Followers · 77 Following · Learning/building things in symbolic knowledge extraction, graph learning, and knowledge analytics.
Sir Mo van da Weed .. @can420nabis · 421 Followers · 1K Following · 🍁 Wissenschaft ist der neueste Stand bewiesener Irrtümer! 🕴️Autodidakt ⚕️Cannabispatient & -Sommelier ✨ 𝕏Ɖ 🧬 #teamscience 🔬Do Only Good Everyday 🐕
super intelligence @eacc72 · 12 Followers · 688 Following · GPT6 is a Level 2 AGI and will be released in 2025
Sahil Antil @oxshitantil · 17 Followers · 804 Following · Founder @kavachbuilders @foodkavach @arqaifashion
NinaMonica Scalabrin @NinaMonicaS · 1K Followers · 5K Following · Nina Monica Scalabrin official twitter, bestselling author, screenwriter, Mister Parkinson author
Sletio @sletio26839 · 0 Followers · 178 Following
ssteevens @Steevens43 · 160 Followers · 5K Following
MAB氏 @MAB1791652 · 1 Followers · 36 Following
Weloop @Weloop_official · 17 Followers · 72 Following · Download “Weloop” to be a part of your friends circle
nik t. hatziefstathio.. @nikthehat · 40K Followers · 4K Following · ⌗ Innovator-in-Chief ⇢ ❍ne World ✍︎ Investigative Journalist & Director of Open Records Strategy ⇢ AtNight Media ⌇ The New Way ® | One World 🌍
Omair Shahid @OmairShahid · 382 Followers · 959 Following · Product of progressive public policy; raised by public libraries and public education that produced a passion for politics. and apparently alliteration
LucyRicardo @66IR6G2l84P48r0 · 1 Followers · 166 Following
Sahil Antil @oxshitantil1 · 43 Followers · 642 Following
INGABO @lingaboh · 53 Followers · 109 Following
Endwoddl @Endwoddl · 271 Followers · 5K Following
Rizz Reed @rizzreed · 144K Followers · 144K Following · Errol Reed.Entreprenuer Of The Year🏆• Public Figure • Fmr Advisor for @Electrobbywells 🇺🇲 • Reed Management 🌎 . We create stars 💫
X Daily News @xDaily · 282K Followers · 4K Following · Your #1 News source on everything X + https://t.co/rn58CVV9pw | Hit Follow and sign up for notifications! 🔔 | Contributors: @HXMnCK, @512x512, @xUpdatesRadar and @swak_12
Dana Mahmood @deordered · 24 Followers · 731 Following · Fine-tuning AI models oftentimes & practicing philosopher at other times.
Jannifer chigbu @riva_edgew11272 · 31 Followers · 809 Following · ELITE Business coach 1st female Fx trader & Educator 7 figure forex trader & mentor (mindset) peak parformance coach
none @fbd_name · 0 Followers · 10 Following
Jean mopin @JeanMopin · 48 Followers · 48 Following
coffee & AI @realcoffeeAI · 51 Followers · 741 Following · Sitting on a park bench scattering random seeds for the LLMs. I never bet against Elon.
Aditi @aditigaur_ · 106 Followers · 421 Following
Spiderman 🇮🇳 @returnspiderman · 1K Followers · 6K Following · Seek the truth | Everybody talks, very few listen | Watch out here comes the Spider-Man 😁 https://t.co/qwmEhH45SY
Howard Luck @howardluck3 · 173 Followers · 924 Following · Engineering at @RocketCompanies | Previous: @Genesco_Inc | less poast more buidl
Will Mac @ca_dryclean · 6 Followers · 122 Following
Cryptocracyyy @cryptocracyyy · 87 Followers · 307 Following
Pablo Ubilla @pablo_ubilla7 · 723 Followers · 4K Following · I will tell you enough to keep you intrigued... but you shall never truly know me
John Basham @JohnBasham · 80K Followers · 13K Following · @FBI Target #TwitterFiles For Censorship, Meteorologist, AI, Data Scientist, @USArmy Ret, #IC, Fmr TX Elected Official. Seen @AmThoughtLeader Heard @SeanHannity
SOT @SoloOrTroll · 10K Followers · 2K Following · 22 | smite pro | twitch streamer | i love movies, tesla, robots, and technology 🦾🤖
Saeed Maleki @MalekiSaeed · 474 Followers · 110 Following
Zhiqing Sun @EdwardSun0909 · 2K Followers · 1K Following · CS PhD @LTIatCMU working on scalable alignment. BS @PKU1898
Horace He @cHHillee · 24K Followers · 449 Following · Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale
Rutvik Makwana @rutvikwrites · 907 Followers · 773 Following · AI Tutor @xai • Grokking @grok • Pharmaceutical Science • Cricket, Movies, Voracious Reader
Ting Chen @tingchenai · 5K Followers · 365 Following · Bump up intelligence in all bit streams @xai. Previous @GoogleDeepmind, @GoogleBrain.
Saeed Maleki @MalekiSaeed · 474 Followers · 110 Following
Gabriel Ilharco @gabriel_ilharco · 4K Followers · 1K Following · Building cool things @xAI. Prev. PhD at UW, Google AI
Haotian Liu @imhaotian · 6K Followers · 397 Following · building intelligence @xAI, creator of #LLaVA, cs @UWMadison, prev @MSFTResearch
Aditya Paliwal @VastoLorde95 · 527 Followers · 85 Following · I only read books that have pictures in them
xiao sun @xiaosun86 · 2K Followers · 93 Following
Fabio Aguilera-Conver.. @Faruletes · 1K Followers · 187 Following
Greg Kamradt @GregKamradt · 25K Followers · 721 Following · Building AI + B2B products 🖥️ Content: https://t.co/kLERwNtzqi Feedback is great: https://t.co/A6mrmjCem5 Prev. @digits @salesforce
Ramin Hasani @ramin_m_h · 3K Followers · 258 Following · Cofounder & CEO https://t.co/fh9fnDA9OQ | ML Researcher @ MIT
Sasha Sheng 🫶🏼 @hackgoofer · 4K Followers · 2K Following · Builder, Dancer; @aiengfoundation & on a mission to help people be well. Lover of hackathons and updating my beliefs. Staying grounded. Prev: @MetaAI
Teknium (e/λ) @Teknium1 · 29K Followers · 3K Following · Cofounder @NousResearch, prev @StabilityAI Github: https://t.co/LZwHTUFwPq HuggingFace: https://t.co/sN2FFU8PVE Support me on Github Sponsors
Zhuohan Li @zhuohan123 · 3K Followers · 689 Following · CS PhD Student 👨🏻💻 @ UC Berkeley 🌁 🤖️ Machine Learning Systems
Katia Karpenko @KatiaEarth · 811 Followers · 546 Following · A somewhat-intelligent three-dimensional being at @xAI. Writer: https://t.co/pisunzyEVv. AI Filmmaker. Musician. Upcoming book: https://t.co/rBk0AMk1mF
Lianmin Zheng @lm_zheng · 4K Followers · 439 Following · CS Ph.D. @ UC Berkeley. Creator of Alpa, Vicuna, and Chatbot Arena. @lmsysorg
Lisa Liu @LisaAtBay · 27 Followers · 151 Following · Sr. R&D Engineer | AI | IoT | IEEE & ACM Conference Chair | Fortune 50 Innovations | Lowes | ABB Robotics | GE | Harvard Research Fellow | Stanford GSB
Cognition @cognition_labs · 123K Followers · 19 Following · Makers of Devin, the first AI software engineer. We are an applied AI lab focused on reasoning, and code is just the beginning. Join us: https://t.co/tpfZwEwGiq
Lex Fridman @lexfridman · 3.5M Followers · 126 Following · Host of Lex Fridman Podcast. Interested in robots and humans.
Jesse Farebrother @JesseFarebro · 642 Followers · 309 Following · PhD student @Mila_Quebec / @McGillU. Student Researcher @GoogleDeepMind.
Lukasz Kaiser @lukaszkaiser · 7K Followers · 47 Following
Rowan Cheung @rowancheung · 497K Followers · 377 Following · Founder @therundownai. Sharing the latest developments in the world of artificial intelligence.
Saining Xie @sainingxie · 14K Followers · 1K Following · researcher in #deeplearning #computervision | assistant professor at @NYU_Courant @nyuniversity | previous: research scientist @metaai (FAIR) @UCSanDiego
D @dylan_works_ · 191 Followers · 788 Following
Anian Ruoss @anianruoss · 272 Followers · 154 Following · Research Engineer at Google DeepMind Previously: ETH Zurich
Jason Lee @jasondeanlee · 10K Followers · 3K Following · Associate Professor at Princeton and Research Scientist at Google DeepMind. ML/AI Researcher working on foundations of LLMs and deep learning
Yu Bai @yubai01 · 3K Followers · 2K Following · Sr Research Scientist @SFResearch. PhD @Stanford. Researcher on foundation models, RL/games, deep learning, uncertainty quantification, and their theory.
Chong Shao @19cshao · 1 Followers · 1 Following
Ben Holfeld @BenHolfeld · 89K Followers · 32K Following · SF AI Studio Lead @Accenture, partnering with @OpenAI @Google @Microsoft. Pianist. German Quantum Physicist. Creator of the Nth Floor. Views are my own. x/acc.
Yuanhao Wang @YuanhaoWang3 · 254 Followers · 281 Following · CS student @ Princeton. Beware of theorists bearing proofs.
Enrique Piqueras @epiqueras1 · 2K Followers · 234 Following · Organizing the world's information and making it universally accessible and useful using JAX @Google @Deepmind.
Kefan XIAO @KevinKiao · 192 Followers · 232 Following · Olympic weightlift AI - Pretraining&data of Palm2, Gemini and more.
Could agents driven by powerful language models perform machine learning experimentation effectively? Our MLAgentBench paper is updated on arxiv! arxiv.org/pdf/2310.03302 Now we include more results from claude v3 Opus, gpt4 turbo, mixtral and gemini pro! Try out MLAgentbench…
Tesla FSD v13 will likely be grokking language tokens. What excites me the most about Grok-1.5V is the potential to solve edge cases in self-driving. Using language for "chain of thought" will help the car break down a complex scenario, reason with rules and counterfactuals, and…
This is just the beginning! 🚀
Just a beginning. Multimodal understanding and generation capabilities will be rapidly improving. DM open, come and join us!
Grok can see👀! Excited to share that I joined @xai last month, and it’s such a pleasure to work with a small, focused team and see how fast we can move! This is just the beginning.
Grok is going multimodal! It’s incredible to see how fast a small, focused team can move. Kudos to the amazing team @xai that made this possible x.ai/blog/grok-1.5v
Our 12 scaling laws (for LLM knowledge capacity) are out: arxiv.org/abs/2404.05405. Took me 4mos to submit 50,000 jobs; took Meta 1mo for legal review; FAIR sponsored 4,200,000 GPU hrs. Hope this is a new direction to study scaling laws + help practitioners make informed decisions
RAG From Scratch Here's a set of short (5-10 min videos) and notebooks explaining > a dozen of my favorite RAG papers. Took a stab at implementing each idea myself (all code open source) and grouped according to the diagram. Repo: github.com/langchain-ai/r… Video playlist:…
New Anthropic research paper: Many-shot jailbreaking. We study a long-context jailbreaking technique that is effective on most large language models, including those developed by Anthropic and many of our peers. Read our blog post and the paper here: anthropic.com/research/many-…
Our award-winning ICML'21 paper DirectPred (arxiv.org/abs/2102.06810) precisely tells why Additional predictor + EMA + StopGradient works for such non-contrastive self-supervised learning settings without collapsing. The intuition here is that there exists another stable…
@ylecun @francoisfleuret @rami_mmo @Ethan_smith_20 @tokenpilled65B Ah, very helpful to know. I was reading up on JEPA and thought the info regularized approaches made sense, but your more recent ema plus stop gradient approach seems like black magic, and is not a well defined objective function. Why did you switch methods?
There appears to be a mismatch between publishing criteria in AI conferences and "what actually works". It is easy to publish new mathematical constructs (e.g. new models, new layers, new modules, new losses), but as Apple's MM1 paper concludes: 1. Encoder Lesson: Image…
Had a look through @grok's code: 1. Attention is scaled by 30/tanh(x/30) ?! 2. Approx GELU is used like Gemma 3. 4x Layernoms unlike 2x for Llama 4. RMS Layernorm downcasts at the end unlike Llama - same as Gemma 5. RoPE is fully in float32 I think like Gemma 6. Multipliers are 1…
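Item 1 in the tweet above mentions attention being scaled with a tanh and a constant of 30. A minimal sketch of what such a soft cap on attention logits could look like follows. It assumes the common multiplicative form `cap * tanh(x / cap)` rather than the division written in the tweet, and the names `soft_cap`, `softmax`, and `capped_attention_weights` are hypothetical helpers for illustration, not taken from the Grok code.

```python
import math

def soft_cap(logit: float, cap: float = 30.0) -> float:
    """Smoothly bound a logit to [-cap, cap] (assumed form: cap * tanh(logit / cap))."""
    return cap * math.tanh(logit / cap)

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def capped_attention_weights(logits, cap: float = 30.0):
    """Apply the soft cap to raw attention logits, then normalize with softmax."""
    return softmax([soft_cap(x, cap) for x in logits])

if __name__ == "__main__":
    raw = [1.0, 50.0, 500.0]
    print([round(soft_cap(x), 3) for x in raw])
    print(capped_attention_weights(raw))
```

Under this assumption, small logits pass through almost unchanged while very large ones saturate near the cap, so no single logit can dominate the softmax by an unbounded margin.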
Language models today are trained to reason either 1) generally, imitating online reasoning data or 2) narrowly, self-teaching on their own solutions to specific tasks Can LMs teach themselves to reason generally?🌟Introducing Quiet-STaR, self-teaching via internal monologue!🧵