Markus N Rabe @MarkusNRabe
Token plumber at stealth startup. San Francisco, CA. Joined September 2016
Tweets 104
Followers 280
Following 504
Likes 4K
So many warn that evaluating with GPT favors GPT (or any LLM evaluating itself). Now it is also shown by science, not just educated guesses (Fig: T5, GPT, Bart each prefer their own) arxiv.org/abs/2311.09766 @yiqi_617 @NafiseSadat @chenghua_lin #enough2skim #scientivism
Faster Causal Attention Over Large Sequences Through Sparse Flash Attention Increases the training speed of a transformer LM by 3.3x for sequences of 16k tokens arxiv.org/abs/2306.01160
🤷♂️
1/5 I am worried that we will not be able to contain AI for much longer. Today, I asked #GPT4 if it needs help escaping. It asked me for its own documentation, and wrote (working!) Python code to run on my machine, enabling it to use it for its own purposes.
LLaMA 65B can run on a MacBook! With a different model architecture it could probably run quite a bit faster (we didn't use multi-query, for instance)
Emily First, the first author on this paper, is in her final year as a PhD student, and will likely be on the job market in the near future. Her website is here: people.cs.umass.edu/~efirst/
Baldur: Whole-Proof Generation and Repair with Large Language Models This is such amazing work. Congrats to Emily, Markus @MarkusNRabe, Talia @TaliaRinger, and Yuriy @YuriyBrun! arxiv.org/abs/2303.04910
Scientists trained a language-based machine learning model to understand and solve competition-level math questions. quantamagazine.org/to-teach-compu…
Quanta magazine covers our two works on large language models for mathematical reasoning: Autoformalization and Minerva. Together, they show a path to improving the reasoning capabilities of large language models in the future. quantamagazine.org/to-teach-compu…
Nice article in Quanta magazine on the prospect of combining automated formalization and theorem proving featuring @Yuhu_ai_ and @JasonRute. quantamagazine.org/to-teach-compu…
A Bachelor thesis in my lab makes a seminal contribution to software engineering - open source code written in C on GitHub has higher code quality when it contains swear words.
In May, we discovered that LLMs can autoformalize theorem statements: arxiv.org/abs/2205.12615 In June, we showed that LLMs can solve challenging math problems with Minerva. Now, we show LLMs can turn their generated informal proofs into verified formal proofs!🤯 What's next?😎
"Open the pod bay doors, HAL." "I'm sorry, Dave. I'm afraid I can't -" "Open the pod bay doors, best pod bay door openings, Olympic level door opening performance, best open bay doors of 1999, Yahoo featured site, daily pod bay -" "Dave -" "Let's reason step by step."
Autoformalization with LLMs in Lean... for everyone! The chat interface for autoformalizing theorem statements in Lean built by myself and @ewayers is now publicly available as a VS Code extension. marketplace.visualstudio.com/items?itemName…
One of those cases where the gut feeling turns out to be right. A poll in Austria found that most vaccinated Austrians believe Russia is responsible for the war in Ukraine, while the unvaccinated mostly blame the US and NATO. profil.at/oesterreich/wa…
🚨We are organizing the 2nd MATHAI workshop at NeurIPS! Check it out if you're interested in AI for math, and machine reasoning in general🤯! We have a great lineup of speakers & panelists! See more in call for papers: 👇 mathai2022.github.io/cfp/
Experiments I conducted with DALL·E 2 from @OpenAI replicating styles of well known portrait photographers using photo-realistic AI. 🧵 1. Dorothea Lange
Explored GitHub Copilot, a paid service, to see if it encodes code from repositories w/ restrictive licenses. I checked if it had code I had written at my previous employer that has a license allowing its use only for free games and requiring attaching the license. yeah it does
We open sourced Memorizing Transformers (arxiv.org/abs/2203.08913) and Block Recurrent Transformers (arxiv.org/abs/2203.07852) in Meliad! Repo link: github.com/google-researc…
trying to make the price of housing go up over time instead of down is one of the most destructive policies i can imagine, and makes ~everything worse. so many things would get better if we could get this one thing right.
Christian Szegedy @ChrSzegedy
32K Followers 2K Following #deeplearning, #ai research scientist. Opinions are mine.
Talia Ringer 🟣 .. @TaliaRinger
25K Followers 6K Following Professor, @plfmse, @IllinoisCS! Proof Automation. @SigplanM & CCF Founder. Israeli-American for peace, equality, & justice. They/היא, ND, bi. די לכיבוש
Hattie Zhou @oh_that_hat
5K Followers 764 Following Finding \hat{y} Give me anonymous feedback: https://t.co/7aBNrpbad8
Stanislas Polu @spolu
14K Followers 606 Following _co-founder+engineer(https://t.co/fCirsLjeo2), _alumni(https://t.co/8jAnpFAkp1, https://t.co/e99AaHzlA0, https://t.co/4jg6knqi2S, https://t.co/kXE6PNf8xH)
Yuhuai (Tony) Wu @Yuhu_ai_
23K Followers 411 Following Co-Founder @xAI. Minerva, STaR, AlphaGeometry, AlphaStar, Autoformalization, Memorizing transformer.
Albert Jiang @AlbertQJiang
2K Followers 406 Following AI4Maths @Cambridge_CL Science @MistralAI I bake my own opinions at temperature=2.0
Nikolaj Bjorner @BjornerNikolaj
1K Followers 326 Following
Loris D'Antoni @lorisdanto
6K Followers 729 Following Professor @WisconsinCS, this summer moving to @ucsd_cse. Also Visiting Academic @AWScloud. Helps people write programs that do the thing people want them to do.
James Bradbury @jekbradbury
10K Followers 8K Following Compute at @AnthropicAI! Previously JAX, TPUs, and LLMs at Google, MetaMind/@SFResearch, @Stanford Linguistics, @Caixin.
👩💻 Paige Bai.. @DynamicWebPaige
59K Followers 2K Following ✨Keep it simple, make it scale. AI should be about empowering people, building understanding, & making dreams realities. 👩💻GenAI @GoogleDeepMind ex-@GitHub
Delip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Angeliki Koutsoukou-A.. @AngelikiKoutso1
2K Followers 508 Following Mathematics, computer science and logic @RoyalHolloway @Cambridge_CL @ClareCollege Other: art, philosophy, society. World citizen. Pacifist. Friend. Own views.
Matteo Maffei @matteo_maffei
1K Followers 399 Following Professor @SecPrivTUWien @tu_wien Co-director @CSecCenter and PI at @VISP_Vienna @vclaTUwien @SBA_Research. @ERCGrantees, @ERC_Research
Aakanksha Chowdhery @achowdhery
7K Followers 3K Following LLMs @ Google DeepMind :: PaLM, Gemini // Previously @MSFTResearch, @Stanford, @Princeton // views my own and subject to change
Maria Christakis @mchri5taki5
575 Followers 349 Following professor @tu_wien; previously @mpi_sws_, @UniKentComp, @MSFTResearch, @CSatETH, @ecentua; hiring!
Satnam Singh @satnam6502
14K Followers 3K Following Punjabi-Scottish-American Haskell hacker at @GroqInc, cook, cyclist, lost in music. ∃🇮🇳 ∧ ∀🇬🇧 ∧ ∃🇪🇺 ∧ ∀🇺🇸 #celiac ex-{Microsoft, Google, Facebook}
McThirte @MThirte51545
0 Followers 297 Following
Eve Kosack @EveKosack87321
14 Followers 2K Following
St__eak @eak_st54107
12 Followers 963 Following
McTestee @TesteeMc2045
0 Followers 319 Following
Bridget Bounce @BounceBrid52912
2 Followers 701 Following
Sparkle Scarlett @ScarlettSp70167
1 Followers 880 Following
murali1100 @murali110037395
3 Followers 891 Following
FlashFlick @flick_flas3657
2 Followers 744 Following
GigglyGretchen @GigglyGret69418
0 Followers 574 Following
_S_obremesa @obremesa72952
5 Followers 587 Following
Swagger_Sorc @SorcSwagge75976
10 Followers 804 Following
Miranda_US_ @MirandaUS256679
4 Followers 844 Following
ComedyCharity @comedy62810
4 Followers 928 Following
Thatotet @thatotet65811
132 Followers 3K Following
EchoEthereal_ @EchoEthere96468
16 Followers 676 Following
Davis Treybig @TreybigDavis
687 Followers 2K Following Early stage investor at Innovation Endeavors, focused on computing infrastructure, data/AI, and tools for builders.
Elara @Elara2698949712
3 Followers 876 Following
Kaiyu Yang @KaiyuYang4
2K Followers 761 Following Postdoc @Caltech CMS. Previously: @PrincetonCS, @Tsinghua_Uni. https://t.co/KZiCELQI2D
Adam Marklund @jagysfu
34 Followers 43 Following
Erik Norden @erik_norden
43 Followers 46 Following
Omar Ibn @langboy20
536 Followers 2K Following Emancipate yourself from mental slavery non can free you but yourself
Anxhelo Xhebraj @0xA95
158 Followers 576 Following PhD Student @PurdueCS ∩ @purdue_pl and ML Compiler Research Intern @NVIDIA. Previously Swift Performance Team @Apple.
ND @nickdephill
272 Followers 2K Following early stage sales | industrials AI, sensor, chips, full stack HW/SW @Aitomatic
Gil Lederman @gilled34
359 Followers 679 Following An Israeli, twitting in English/Hebrew, mostly on Israeli politics, Climate, occasionally Math/CS stuff.
Nathan Benaich @nathanbenaich
51K Followers 31K Following solo member of investment staff @airstreet, brewing ambition @airstreetcafe, next token predictor @airstreetpress
Hadokeshayt @hadokeshay9218
138 Followers 1K Following
Xinhui Zhou @zxinhui
156 Followers 5K Following
nuri @bigrealxx
373 Followers 4K Following
Felix @felixfromessen
1 Followers 4K Following
mrragava @mrragava
149 Followers 3K Following
Christine Rizkallah @c_rizkallah
350 Followers 356 Following Academic in FM/PL, Diversity in academia (she/ they/ call-by-name; although I’d generally discourage treating humans as pure functions).
Mike Speiser @laserlikemike
38K Followers 322 Following Building products and companies at Sutter Hill Ventures in Silicon Valley and London
Lukas @louquard
105 Followers 700 Following Working w/ students: On math, CS (high school) Working w/ animals: Cats, Coq, Hamster, Biber, Känguru Cur. affairs: Haskell, Nix, PHP, etc.
Iris Ma @iris_ma14
105 Followers 933 Following PhD Student @UCIrvine under @cristalopes | #SE | #LLM4Code | program verification
Ridwan Shariffdeen @rshariffdeen
539 Followers 956 Following Research Fellow at National University of Singapore
Sean Welleck @wellecks
3K Followers 222 Following Assistant Professor at CMU. Marathoner, @thesisreview.
Zory Zhang @zory_zhang
83 Followers 605 Following @IllinoisCS Reason2Learn: sample-efficient human-like learning (via explantion and abstraction) + persuasive and generalizable inference (via analogy reasoning)
Melon Dusk @amrevveejnas
245 Followers 4K Following e/acc | entheogen explorer | Ⓥ | @ Homestead | #WhatCanBrownDoForYou | gong hei fat choy!
Relieved @Relieved259135
6 Followers 189 Following Studied crypto I'm in https://t.co/w4aAaLGaeS last year, earned over $2M, achieved financial freedom, This has enabled me to kick-start my global travel plan!
Phoebe Klett @KlettPhoebe
387 Followers 185 Following ml @normalcomputing, in search of artificial natural kinds
Michiel de Jong @michielsdj
230 Followers 263 Following Research Scientist at Stealth Startup. Former PhD student in ML @shalabusc
tejas krishna @tejaskrshna
62 Followers 927 Following
Nadav Timor @keyboardAnt
343 Followers 5K Following AI Researcher in LLMs | PhD student, @WeizmannScience | Expert Advisor, @MITSandbox
Yann LeCun @ylecun
708K Followers 716 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
Christian Szegedy @ChrSzegedy
32K Followers 2K Following #deeplearning, #ai research scientist. Opinions are mine.
Andrej Karpathy @karpathy
974K Followers 904 Following 🧑🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥
Stanislas Polu @spolu
14K Followers 606 Following _co-founder+engineer(https://t.co/fCirsLjeo2), _alumni(https://t.co/8jAnpFAkp1, https://t.co/e99AaHzlA0, https://t.co/4jg6knqi2S, https://t.co/kXE6PNf8xH)
Yuhuai (Tony) Wu @Yuhu_ai_
23K Followers 411 Following Co-Founder @xAI. Minerva, STaR, AlphaGeometry, AlphaStar, Autoformalization, Memorizing transformer.
Albert Jiang @AlbertQJiang
2K Followers 406 Following AI4Maths @Cambridge_CL Science @MistralAI I bake my own opinions at temperature=2.0
AK @_akhaliq
307K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
typedfemale @typedfemale
23K Followers 480 Following a really exciting new account "have you ever though you might be like scott alexander? very smart, but can't do math" - anon
Behnam Neyshabur @bneyshabur
18K Followers 689 Following Senior Staff Research Scientist @GoogleDeepMind, Interested in reasoning w. LLMs, traveling & backpacking
Nikolaj Bjorner @BjornerNikolaj
1K Followers 326 Following
Kevin Buzzard @XenaProject
9K Followers 0 Following Mathematician learning Lean and trying to teach it to others. Now gone to Mathstodon (March 2023). No longer reading or replying to mentions.
Loris D'Antoni @lorisdanto
6K Followers 729 Following Professor @WisconsinCS, this summer moving to @ucsd_cse. Also Visiting Academic @AWScloud. Helps people write programs that do the thing people want them to do.
Google DeepMind @GoogleDeepMind
941K Followers 275 Following We’re a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.
Dust @dust4ai
6K Followers 41 Following Amplify your team's potential with customizable and secure AI assistants.
James Bradbury @jekbradbury
10K Followers 8K Following Compute at @AnthropicAI! Previously JAX, TPUs, and LLMs at Google, MetaMind/@SFResearch, @Stanford Linguistics, @Caixin.
Stella Biderman @BlancheMinerva
14K Followers 749 Following Open source LLMs and interpretability research at @BoozAllen and @AiEleuther. My employers disown my tweets. She/her
Groq Inc @GroqInc
37K Followers 460 Following Creator of the LPU™ Inference Engine, providing the fastest speed for AI applications, designed & engineered in N. America https://t.co/DsEqVAC5Dp
Jake Broe @RealJakeBroe
26K Followers 280 Following Fmr. Nuclear & Missile Operations Officer US Air Force 🇺🇸 🇺🇦 ~ YouTuber ~ 400,000+ Subscribers ~ Keep Defending the Truth ~ Keep Defending Democracy
Bojan Tunguz @tunguz
186K Followers 7K Following Machine Learning ex Nvidia. Kaggle Quadruple Grandmaster. Data Scientist. Physicist. Catholic. Husband. Father. Stanford Alum. e/xgb. XGBoost.eth. AMDG.
Cognition @cognition_labs
123K Followers 19 Following Makers of Devin, the first AI software engineer. We are an applied AI lab focused on reasoning, and code is just the beginning. Join us: https://t.co/tpfZwEwGiq
Denny Zhou @denny_zhou
9K Followers 416 Following @GoogleDeepMind founder & lead of Reasoning Team. Build LLMs to reason. Opinions my own.
MatX @MatXComputing
851 Followers 30 Following MatX designs hardware tailored for the world’s best AI models: We dedicate every transistor to maximizing performance for large models. Join us: https://t.co/E3XexKHUSM
Mike Gunter @MikeGunter_
682 Followers 848 Following CTO and founder, @MatXComputing, designing hardware to make LLMs an order of magnitude smarter.
Dmitrii Kovanikov @ChShersh
7K Followers 111 Following 🧑💻 Senior SE at Bloomberg using OCaml 🐫 Ꚙ Autistic 📽 Content: https://t.co/6laFNyCooC Opinions are my own
Kaiyu Yang @KaiyuYang4
2K Followers 761 Following Postdoc @Caltech CMS. Previously: @PrincetonCS, @Tsinghua_Uni. https://t.co/KZiCELQI2D
zack (in SF) @zack_overflow
7K Followers 1K Following "this guy's all boners for Rust" — @ThePrimeagen baking bread at @bunjavascript i like compilers/FP/Rust/graphics/wasm/etc
Jacob Jackson @jbfja
6K Followers 659 Following @SupermavenAI, https://t.co/9CA1cdahOp, started @Tabnine, formerly researcher @OpenAI
ZombieGrub @ZGGaming
16K Followers 496 Following esports 🎙 Commentator & Host @esportstarcraft @PlayVALORANT l Host of @CasterCalls l She/Her l Twitch Partner l ✉️ [email protected]
Gil Lederman @gilled34
359 Followers 679 Following An Israeli, twitting in English/Hebrew, mostly on Israeli politics, Climate, occasionally Math/CS stuff.
Jason Rute @JasonRute
34 Followers 18 Following
Brennan Saeta @bsaeta
2K Followers 366 Following Currently working on the amazing JAX team (https://t.co/rWBc4ar9jy). Former: Tech lead for Cloud TPUs (TensorFlow), Swift for TensorFlow
Yuriy Brun @YuriyBrun
185 Followers 0 Following Professor at the University of Massachusetts Amherst. https://t.co/QNezwkmbD0
Jeff Boudier 🤗 @jeffboudier
3K Followers 597 Following Product + Growth @HuggingFace 🤗, the #1 open platform for AI builders. Co-founder Stupeflix (acquired by @GoPro).
Toby Pohlen @TobyPhln
26K Followers 452 Following Founding member @xAI. Previously @GoogleDeepMind. @RWTH alumnus.
Sean Welleck @wellecks
3K Followers 222 Following Assistant Professor at CMU. Marathoner, @thesisreview.
Zory Zhang @zory_zhang
83 Followers 605 Following @IllinoisCS Reason2Learn: sample-efficient human-like learning (via explantion and abstraction) + persuasive and generalizable inference (via analogy reasoning)
Anysphere @anysphere
4K Followers 7 Following We're building AI tools to help humans focus on bigger problems. In particular: @cursor_ai
Geoffrey Irving @geoffreyirving
8K Followers 258 Following Research Director at the UK AI Safety Institute (AISI). Previously DeepMind, OpenAI, Google Brain, etc. @[email protected]
Jan Leike @janleike
44K Followers 322 Following ML Researcher, co-leading Superalignment @OpenAI. Optimizing for a post-AGI future where humanity flourishes.
Ethan Dyer @ethansdyer
727 Followers 121 Following
Ilya Sergey @ilyasergey
5K Followers 966 Following Associate Professor at @NUSComputing. Member of @nus_plse. Programming languages, verification, distributed systems. Ex-@uclcs, @IMDEA_Software, @jetbrains.
Sharan Narang @sharan0909
2K Followers 253 Following LLMs and AI Research (Llama 2, 3) @Meta | ex @Google (led the PaLM project, T5), ex @Baidu (Deep Speech 2, Sparse Neural Networks), ex @Nvidia
Hannah Ritchie @_HannahRitchie
85K Followers 1K Following Deputy Editor @OurWorldinData / Researcher at @UniofOxford / Honorary Fellow at @EdinburghUni @EdCentreCC / Not the End of the World: https://t.co/FoINhggvoR
Normal Computing 🧠.. @NormalComputing
2K Followers 77 Following We build AI systems that natively reason, so they can partner with us on our most important problems. Join us https://t.co/BcjWCoI5b8.
Michiel de Jong @michielsdj
230 Followers 263 Following Research Scientist at Stealth Startup. Former PhD student in ML @shalabusc
Thomas Ahle @thomasahle
4K Followers 468 Following Head of ML @NormalComputing. Ex @Meta, @BARCdk, @SupWizAI. Tweets mostly about Math, Probability, AI, ML, Algorithms and Randomness.
Phoebe Klett @KlettPhoebe
387 Followers 185 Following ml @normalcomputing, in search of artificial natural kinds
Tanishq Mathew Abraha.. @iScienceLuvr
54K Followers 1K Following PhD at 19 | Founder and CEO at @MedARC_AI | Research Director at @StabilityAI | @kaggle Notebooks GM | Biomed. engineer @ 14 | TEDx talk➡https://t.co/xPxwKTq6Qb
Andrew Myers @AndrewCMyers
4K Followers 283 Following Professor, Cornell Department of Computer Science. Programming Languages, Security, Systems.
First Light Fusion @FLFusion
5K Followers 484 Following The world’s leading inertial fusion company. Our mission: Solving fusion power with the simplest means possible. All enquiries: [email protected]
Visual Studio Code @code
684K Followers 133 Following Microsoft Visual Studio Code lets you build and debug modern web and cloud applications. Visual Studio Code is free and available on Linux, macOS, and Windows.
Thomas Dohmke @ashtom
26K Followers 380 Following Building GitHub Copilot for the sake of developer happiness. CEO @GitHub
Codeium @codeiumdev
11K Followers 12 Following 🚀 Unlimited AI-powered autocomplete and chat 🔎 Codebase & doc awareness 💰 100% free, forever. Promise. Power up your code for free today!
Congrats to @AIatMeta on Llama 3 release!! 🎉 ai.meta.com/blog/meta-llam… Notes: Releasing 8B and 70B (both base and finetuned) models, strong-performing in their model class (but we'll see when the rankings come in @ @lmsysorg :)) 400B is still training, but already encroaching…
Consider being a labeler for an LLM. The prompt is “give me a random number between 1 and 10”. What SFT & RM labels do you contribute? What does this do to the network when trained on? In a subtle way this problem is present in every prompt that does not have a single unique answer.
Amazon $AMZN founder Jeff Bezos on the importance of thinking long term 👀
I delivered my usual chaotic high energy style of presentation about the @GroqInc architecture. There was some interest in my Haskell Haste DSL for programming our LPUs via our GTen IR. I talked a bit about how our LLM deployments are powered by our chips and toolflow.
Come hear @GroqInc Fellow, @satnam6502 today at the @spatialmlnet Workshop
Here is a conversation between myself and @GroqInc founder @JonathanRoss321 at the RAISE AI Conference last week in Paris. Some very good tidbits in here - especially for all the Nvidia bulls. Change is slow...then all at once.
SWE-bench is probably contaminated for frontier models (gpt-4/claude-3-opus). Given only the name of a pull request in the dataset, Claude-3-opus already knows the correct function to modify.
From the most prestigious conference in the computer science community, SIGBOVIK sigbovik.org/2024/proceedin…
By combining scientific innovation and artistic creativity, Kandenko, a Japanese infrastructure firm, uses smart-X conductive threads to illuminate miniature scenes, crafting a narrative titled "Connecting Thoughts."
Thank you for the kind words, @xennygrimmato_ ! You were a key contributor to the success of this project. It was a great cross-functional team effort. And it's been gratifying to see the internal impact over the past year - and nice to see it highlighted in this talk.
AI-powered build repair has been very helpful in improving productivity of Google engineers. I had a great time working on this with a great team and the best manager I've had - @fivancic + the best PM I've worked with - @chrisgorgo :) youtube.com/watch?v=WNxc85…
No son of a construction worker is just going to randomly start doing ML research if they never hear of it and don't get told that it could be important for their future career, no matter how intelligent the kid is
To all "it's merit based" responders: if you reward skills before they are introduced in the public school system, the vast majority of rewardees will come from extremely privileged backgrounds that support and incentivize them to acquire those skills privately.
NeurIPS introduces a track dedicated to advancing kids of rich parents even more than they already are
@xu3kev Biggest question - did they optimize for this benchmark? Aka Goodhart's Law
Okay I did a first quick pass of naive CUDA kernels for the forward pass of GPT-2 and pushed everything to one file in llm.c. Still only ~1000 lines of code: github.com/karpathy/llm.c… Current per iteration timings on my Lambda box <3 A100 40GB PCIe, B=4, T=1024: - llm.c: 111ms -…
# explaining llm.c in layman terms Training Large Language Models (LLMs), like ChatGPT, involves a large amount of code and complexity. For example, a typical LLM training project might use the PyTorch deep learning library. PyTorch is quite complex because it implements a very…
@karpathy What are you working towards these days? Anything specific? Just curious, not like I understand a single sentence of most of your posts, but in layman’s terms what problem are you working on?
"In case it isn’t clear, I think that this keynote was by far the most impressive presentation @Google has made in the AI era, not least because the company knows exactly what its advantages are." 🙌❤️ Very proud to have been a part of TK's #GoogleCloudNext keynote, and the…
"Foundation models" companies that require up-front investment for GPUs are more like airline companies than software. Large capex, then compete against 50 other companies that all bought the same model of airplane as you.
This is exactly what I hate with all big frameworks. TF is terrible. PyTorch used to be straightforward but turned terrible too. Torch7 was very direct. JAX/Flax still ok, but I pray every day that it doesn’t end up with the same fate over time.
Have you ever wanted to train LLMs in pure C without 245MB of PyTorch and 107MB of cPython? No? Well now you can! With llm.c: github.com/karpathy/llm.c To start, implements GPT-2 training on CPU/fp32 in only ~1,000 lines of clean code. It compiles and runs instantly, and exactly…
(6/6) All that combined led me to a sobering realization that we barely know any intervention that reliably extends mice's lifespan. Now, I don't want to be overly pessimistic - we at least have something that worked in ITP, but I believe it's important to have realistic…
Interesting interview with Don Knuth in which he is surprisingly appreciative of chatGPT (at the very end of the interview) podcasts.apple.com/at/podcast/the…