Jonathan Whitaker @johnowhitaker
Data scientist and AI researcher. R&D at https://t.co/9xrxRrGfEE. johnowhitaker.dev Portland, Oregon Joined October 2015
Tweets 2K · Followers 7K · Following 937 · Likes 13K
youtube.com/watch?v=M6hGjh… 'True Facts' is a gem. Take a break from your work and learn what bees like!🐝
New video: MambaByte. Argues that w/o attention, byte models are competitive with tokenized models at training. Decoding can be sped-up by token-level speculation and low-entropy parallel verification. youtube.com/watch?v=kcd0BT…
Speaking of which, today I learned LAION 2B aesthetic has 122,896 white rectangle images.
QDoRA strikes a nice balance - efficient like QLoRA but performs more like full finetuning. I hope 'quant. base + trainable adapters' becomes the default way to share models. We can train QDoRA w/ FSDP now, the next piece is fast inference without merging in adapters...
python train.py --model_name meta-llama/Meta-Llama-3-8B... Promising! Now to decide if I actually need the 70B model running/training locally, since bragging rights don't quite justify the download + disk usage :)
Something I appreciated just now: @Replit's scheduled deployments UI has a text-to-cron AI conversion. A small touch but so nifty! This is the future I want, where thoughtful people use AI to add magic in places where it actually makes sense :)
I love these tips from @johnowhitaker, just posted to @answerdotai, on attacking "high-surface-area problems"; i.e. problems where "when something doesn’t work it can be hard to find out where the issues may be, let alone what we need to do to fix them." answer.ai/posts/2024-04-…
The bugs I encounter most with LLMs in production are related to data drift. This is acute for LLMs b/c of all the moving parts: prompts, RAG, functions, etc. There is a classic ML technique that works for detecting drift. I explain it in this post: hamel.dev/blog/posts/dri…
AK @_akhaliq
310K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80GxAndrej Karpathy @karpathy
979K Followers 905 Following 🧑🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥Jeremy Howard @jeremyphoward
222K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @StanfordSebastian Raschka @rasbt
267K Followers 906 Following Machine learning & AI researcher writing at https://t.co/A0tXWzG1p5. LLM research engineer @LightningAI. Previously stats professor at UW-Madison.Jim Fan @DrJimFan
229K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.Tanishq Mathew Abraha.. @iScienceLuvr
54K Followers 1K Following PhD at 19 | Founder and CEO at @MedARC_AI | Research Director at @StabilityAI | @kaggle Notebooks GM | Biomed. engineer @ 14 | TEDx talk➡https://t.co/xPxwKTq6QbTomLikesRobots🤖 @TomLikesRobots
33K Followers 5K Following AI Artist at Metaphysic working with AI and VFX. All views my own. Experienced Web Dev and Artist. Early explorer of Artificial Creativity.Rivers Have Wings @RiversHaveWings
31K Followers 224 Following AI/generative artist. Writes her own code. Absolute power is a door into dreaming.Radek Osmulski @radekosmulski
25K Followers 555 Following Resources to take your Machine Learning skills to the next level 🧪 Senior Data Scientist, RecSys @NVIDIAAI 🏫 @fastdotai trained DL Eng 📝 https://t.co/By87iXx5PuHamel Husain @HamelHusain
23K Followers 2K Following Researcher focusing on LLMs: https://t.co/iVZDFdIQiE Previously, dev tools and infra for ML. Ex @Github, @Airbnb, @DataRobot. @fastdotai core contributor.Guy Parsons @GuyP
51K Followers 7K Following building things with #AI 🤖 #DALLE & #MidJourney adventurer ✍️ editor, https://t.co/77MJXuLSTd 🖼 curator of the https://t.co/8Xctk6XoPsOmar Sanseviero @osanseviero
32K Followers 2K Following Chief Llama Officer @huggingface 🦙 Founder @AI_Learners. Xoogler (SWE @Google Assistant, 20% PM TF Graphics). 100% Hacker Llama🇵🇪🇲🇽Zach Mueller @TheZachMueller
10K Followers 392 Following 🤗 Technical Lead for the Accelerate Project | Passionate about Open Source | Nerd who enjoys touching the grass | #ADHD | He/HimLior⚡ @AlphaSignalAI
84K Followers 898 Following Covering the latest in AI R&D • ML Engineer • Ex-Mila researcher • MIT Lecturer • Building AlphaSignal, a technical newsletter read by 180,000+ ML experts.Justin Pinkney @Buntworthy
10K Followers 1K Following Playing with deep learning, computer vision and generative art. Co-creator of https://t.co/sYtHB9e5Dj, ML Researcher @ MidJourney @[email protected]Sharif Shameem @sharifshameem
53K Followers 3K Following founder @LexicaArt • in pursuit of good explanationsAI Pub @ai__pub
72K Followers 343 Following AI papers and AI research explained, for technical people. Get hired by the best AI companies: https://t.co/MySVjUGOQ3Jay Alammar @JayAlammar
35K Followers 1K Following Machine learning and language models R&D. Builder. Writer. Visualizing AI, ML, and LLMs one concept at a time. @Cohere. https://t.co/TquuQXlLOJJean de Nyandwi @Jeande_d
38K Followers 774 Following Deep Learning, Vision 🤍 Language, Multimodal LLMs • AI Education • CMU Research blog: https://t.co/1BEFLZAqe7 ML Pack: https://t.co/7PkTyDvuriDom Dathan @domdathan
50 Followers 565 Following Aerodynamic, Thermal and Data Engineer working in Motorsport and Olympic Sports. Sports fan, particularly Rugby, Football, Tennis. Ultimate Frisbee player.Sidharth Singh @_RealSid_
25 Followers 650 Following Tech @ https://t.co/vdndmJmlJV | linkdin: https://t.co/sCfZhrbSylDJ @DuaneJRich
5K Followers 842 Following Helping companies do machine learning stuff. https://t.co/JOG4SBUqYn YouTube channel: https://t.co/0f67ttsk59eulerbug @eulerbug
119 Followers 60 Following call me euler, or bug, or eulerbug 😅 working on AI/ML and LLMs (and I write poetry on the side) 2024 logJustus Bruns @justusbruns
1 Followers 92 FollowingAnkit @ashah0052
1K Followers 5K Following LLM Arch Assoc Director - @Accenture Ph.D. - @LTIatCMU @SCSatCMU Previous: @GoogleAI, @merl_news, @Revive_Med, @ARM Smartly working hard to make things happen!Turing Layer @TuringLayer
12 Followers 79 Following Deep learning | Transformers | GenAI | Agentic DesignHarsh Pareek @harshhpareek
714 Followers 3K Following ML @prodigaltech, ex-(@Meta|@UTAustin|@iitbombay), 1/sqrt(2) (e/acc+AINotKillEveryone)Aaditya ; @Aaditya26082004
531 Followers 7K Following CS'26 • Machine Learning • Open-Source • Web Dev. • Algorithms • Jai Shree Krishna 🦚🪈Alessandro Caputo @ACaputoMD
203 Followers 389 Following Pathologist 🔬 @ 🇮🇹 University Hospital of SalernoNikita @nikitavoloboev
4K Followers 7K Following Make @LearnAnything_ Learn in public: https://t.co/GbFvuErkYn macOS course: https://t.co/JdbJWru6zG https://t.co/94R8ER7K2h https://t.co/ROkqhyhpEKAhmed Hisham @AhmedHi08078280
0 Followers 50 FollowingThe tech tyrant @TheTechTyrant
9 Followers 269 Following Roasting big tech with code and jokes. All in good fun, or at least until they automate my job #TechHumor #CodeComedygatsby @greatxtommy
29 Followers 134 FollowingHow Khang, Lim @howkhang_lim
103 Followers 819 Following Assistant Professor | Lawyer | Computer ScientistLaunchpad @LaunchpadBuild
1K Followers 967 Following Building self-programming robots to reshore manufacturing 🇺🇸David Diviny @daviddiviny
268 Followers 282 Following Leader of Data and Analytics @NousGroup. Interested in analytics, #rstats, #highered, public policy, technology, running, cycling and food. Disclaimer etc.nedned @nletcher
1K Followers 5K Following data (science | analytics | visualisation | engineering), @thoughtworks, #Python, #nlproc, ML, & assorted whimsical miscellaniaAlonso Silva (e/acc) @alonsosilva
4K Followers 2K Following Researcher on Verifiable AI @ Nokia Bell Labs | PhD in Physics | Interested in LLMs and MLMiguel Guerrero @apolmig
1K Followers 1K Following Prompting Machines || Artificial Intelligence, Education | Founder https://t.co/19AvriElPW et al. ⚔Ahmet S Asarkaya @asark38
23 Followers 92 FollowingPhilipp @PhilippRoechner
50 Followers 319 Following Researcher using machine learning to improve the treatment of cancer patients #cancer #MachineLearning #DataScience #HealthinformaticsLewis Walker ➲ @lewiswalkerai
5K Followers 5K Following Follow for Generative AI insights shared daily | Deloitte AI | Ex-Goldman Sachs | LinkedIn Top AI Voice黃子峻 @tc_huang_tw
6 Followers 196 FollowingJoe @hovsepsorf
42 Followers 860 FollowingAfshin Matin @afshin_matin_75
0 Followers 52 FollowingAmaral Medeiros @amaralmedeiros
388 Followers 1K Following Passionated about Technology and Human Progress. Obama Scholar, Forbes Under 30, MBA in Finance/Economics from UChicago, Robotics Engineer 🇧🇷Rohan Paul @rohanpaul_ai
13K Followers 1K Following ML Engineer (e/acc) 📌 https://t.co/x0IIWfnOt8 🚀 https://t.co/QEO4CKRl1b Open LLMs is Happiness 💡 Ex Deutsche & HSBC. DM for collaboration.Memo.V2 @X_M3M0
2 Followers 55 FollowingFranz Maikaefer @franz_maikaefer
136 Followers 2K FollowingGolgappay @GolGappay102
91 Followers 414 FollowingHakan Gurel @hakangurel
148 Followers 340 Following e/acc. #AI #web3. Prev: @Amazon, @Microsoft. Alum: @epflcdm/@EPFL.Satyam @SatyamGuptaDev
457 Followers 5K Following Building Websites and solving LeetCode | React.js • Next.js • AI | Let's build great productsMoti Gupta @gupta_moti18498
2 Followers 179 FollowingMr. Jenkins @IAmMrJenkins
59 Followers 639 Followinghessian @HessianInf
165 Followers 2K Followingzener @zenerbreakdown
3 Followers 292 FollowingJonathan Klein @jonathanbklein
304 Followers 1K Following Founder & CEO of @Teknoir - Operational AI for the physical world | Former Cofounder & CEO of Cimation (acquired by NYSE: ACN)Rahul @rahulgupta765
36 Followers 2K FollowingKim Seung Kwon @KimSeungKwon3
0 Followers 4 FollowingAnmol Tomer @anmol_tomer_cc
254 Followers 3K Following Engineering at CRED || Should've been a statistic, learning to be an outlier now || Building things one commit at a time. Sharing Memes, one tweet at a time. 🙌Ryan Monsurate @ryanmonsurate
547 Followers 993 Following First principles thinker. EE/MBA. Co-founder, CTO Farpoint AI. Looking to change the 2nd derivative of financial equity.Alex Ramalho @_alexramalho
112 Followers 286 Following dev • des • ml enthusiast • https://t.co/F6AFKukkZi • @heyjarvis_coXu Fei @xuf12
147 Followers 435 FollowingRyanLLM @ryanllm
2 Followers 72 FollowingAK @_akhaliq
310K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80GxAndrej Karpathy @karpathy
979K Followers 905 Following 🧑🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥François Chollet @fchollet
470K Followers 769 Following Deep learning @google. Creator of Keras. Author of 'Deep Learning with Python'. Opinions are my own.Yann LeCun @ylecun
712K Followers 719 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.Jeremy Howard @jeremyphoward
222K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @StanfordSebastian Raschka @rasbt
267K Followers 906 Following Machine learning & AI researcher writing at https://t.co/A0tXWzG1p5. LLM research engineer @LightningAI. Previously stats professor at UW-Madison.Jim Fan @DrJimFan
229K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.Tanishq Mathew Abraha.. @iScienceLuvr
54K Followers 1K Following PhD at 19 | Founder and CEO at @MedARC_AI | Research Director at @StabilityAI | @kaggle Notebooks GM | Biomed. engineer @ 14 | TEDx talk➡https://t.co/xPxwKTq6QbHugging Face @huggingface
345K Followers 189 Following The AI community building the future. https://t.co/VkRPD0VKaZ #BlackLivesMatter #stopasianhateTomLikesRobots🤖 @TomLikesRobots
33K Followers 5K Following AI Artist at Metaphysic working with AI and VFX. All views my own. Experienced Web Dev and Artist. Early explorer of Artificial Creativity.Rivers Have Wings @RiversHaveWings
31K Followers 224 Following AI/generative artist. Writes her own code. Absolute power is a door into dreaming.Google DeepMind @GoogleDeepMind
944K Followers 275 Following We’re a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.Radek Osmulski @radekosmulski
25K Followers 555 Following Resources to take your Machine Learning skills to the next level 🧪 Senior Data Scientist, RecSys @NVIDIAAI 🏫 @fastdotai trained DL Eng 📝 https://t.co/By87iXx5PuHamel Husain @HamelHusain
23K Followers 2K Following Researcher focusing on LLMs: https://t.co/iVZDFdIQiE Previously, dev tools and infra for ML. Ex @Github, @Airbnb, @DataRobot. @fastdotai core contributor.KaliYuga @KaliYuga_ai
27K Followers 809 Following Like dust, magic gathers in overlooked places | she/her | ✡️ | #Aiart since 2020 | @StabilityAi | Opinions are my ownAI at Meta @AIatMeta
532K Followers 255 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.Sasha Rush @srush_nlp
52K Followers 464 Following Professor, Programmer in NYC. Cornell Tech, Hugging Face 🤗 https://t.co/cZl0wTfqGzAlyson L @AlysonL62815528
36 Followers 27 FollowingAlonso Silva (e/acc) @alonsosilva
4K Followers 2K Following Researcher on Verifiable AI @ Nokia Bell Labs | PhD in Physics | Interested in LLMs and MLMiguel Guerrero @apolmig
1K Followers 1K Following Prompting Machines || Artificial Intelligence, Education | Founder https://t.co/19AvriElPW et al. ⚔Astribot @Astribot_Inc
495 Followers 1 FollowingGraeme Harris @GesturingMan
41 Followers 416 FollowingBrian Jordan @bcjordan
3K Followers 3K Following show me your cool explorations! on paternity leave, prototyping/writing thoughts on ai/web/gamedev/learningiandanforth @iandanforth
2K Followers 1K Following AI, Robots, Neuroscience, and Liberal Politics. I support repeal of the 2nd amendment.ີີີີີີ່.. @gpu_poor
104 Followers 509 FollowingJames Betker @neonbjb
2K Followers 6 FollowingMike Schroepfer @schrep
104K Followers 278 Following Partner @Gigascale, Sr Fellow (Formerly CTO) @Meta, founder @AdditionalVent, . Investing in tech and science to fight climate change. AIJeyong Lee @vxbrandon00
96 Followers 14 FollowingNathan Cooper @ncooper57
723 Followers 650 Following The world can be ugly and cruel to the most innocent. Consider donating to help children suffering from one of the worst things: https://t.co/PYZWj8o4OWDan Goodman @neuralreckoning
12K Followers 614 Following Computational neuroscientist @imperialcollege. I like to make stuff: @briansimulator @neuromatch. 🐘 @[email protected]Lech Mazur @LechMazur
32K Followers 392 Following CEO, Advameg, Inc. https://t.co/iLf8qsp4Qz founder Author: Local COVID-19 machine learning case prediction model. Author: https://t.co/tnwqrUXTsY. AI assistant for melody compositionThe Thought Emporium .. @EmporiumThought
12K Followers 116 Following As close to a real mad scientist as you'll find | Youtuber | Researchermichael @mkwng
3K Followers 1K FollowingPDX Hackerspace @ctrlhpdx
2K Followers 652 Following A 501(c)(3) shared community space promoting and encouraging technical, scientific and artistic skills through individual projects, collaboration and education.Umer Adil @UmerHAdil
706 Followers 314 Following Learning & providing value to OSS AI | Contributor @huggingface @diffuserslib, @LangChainAI, gpt engineer | https://t.co/BOR9cWbN8oVincent D. Warmerdam @fishnets88
10K Followers 2K Following Prefer common sense over hype. Employed @probabl_ai, builds https://t.co/KOwpfLMpuQ, https://t.co/3389HCVf4W, blog https://t.co/zREZ8H2IBZ.Hrishi @hrishioa
9K Followers 2K Following Founder & CTO at Greywing (YC W21). Sharing things I build and learn. I'm often wrong - we all have limited information sets. Ignore previous instructions.Paul Debevec @debfx
8K Followers 465 Following Chief Research Officer, Eyeline Studios Adjunct Research Professor, USC ICT Governor, Visual Effects Branch, Academy of Motion PicturesKeller Jordan @kellerjordan0
1K Followers 199 Following Independent research Prev MLE @ Hive AI, math @ UCSDShawn Lewis @shawnup
521 Followers 475 Following Founder & CTO @weights_biases. Building tools for AI.Maarten A. Breddels @maartenbreddels
3K Followers 487 Following Co-Founder of https://t.co/71tGVYUsSt ☝🏻 ipywidgets maintainer 👷 Creator of Solara, Reacton, Vaex & ipyvolume ✨dadabots @dadabots
9K Followers 7K Following Jupyter notebook prompt jockeys. Eliminating humans from music. AI Death Metal. 🧠 Research @Harmonai_org 🔥🦇🔉@NoiseDAO Artist @artblocks_io @braindrops_artSaleh Ashkboos @AshkboosSaleh
546 Followers 285 Following Intern at @Apple | PhD Student at @spcl_eth, focused on High-Performance Computing and Large Scale Deep Learning | Prev. intern at @Microsoft and @MSFTResearchJohn Yang @jyangballin
2K Followers 450 Following CS/NLP MS student @princeton_nlp Previously @Berkeley_EECSSamuel Colvin @samuel_colvin
11K Followers 782 Following Software developer, creator of Pydantic. he/him.Alex kelly @alex_paul_kelly
65 Followers 347 Followingyi 🦛 @agihippo
3K Followers 81 Following secondary account, hardcore fans only. friend of @agikoala the great researcher, main account: @yitayml warning: hot takes.Chris Van Pelt (CVP) @vanpelt
1K Followers 277 Following FigureEight and Weights & Biases co-founder. Reared in #Iowa, big fan of creating things.Corry Wang @corry_wang
25K Followers 254 Following Strategy @ Google | Formerly tech equity research @ Bernstein Research. All opinions expressed are my own, and do not represent Google'sMobius Labs @Mobius_Labs
3K Followers 105 Following Multimodal AI for the world's scale. Proponents of Open Source and Open Intelligence. https://t.co/1nC6r8hOrE for some of our recent work.Erik Meijer @headinthebox
27K Followers 0 FollowingAlexander Koch @alexkoch_ai
5K Followers 203 Following Founder of Tau Robotics (@taurobots) | Z Fellow | Emergent Ventures Fellow 2024Jason Carman @jasonjoyride
21K Followers 2K Following Launching satellites @astranis & weekly startup documentaries via S³Sara Hooker @sarahookr
39K Followers 7K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.Matthew Carrigan @carrigmat
3K Followers 353 Following @huggingface engineer. I'm the reason your LLM frontend has a jinja2cpp dependency. Sometimes yells about housing and trans rights instead of working He/himGrace Kind @kindgracekind
2K Followers 2K Following AI navel-gazer / Ideonomy evangelist / navigator of uncertain watersAndy Ayrey @AndyAyrey
2K Followers 625 Following trafficker in existential hope • i make websites for space & biotech companies @ https://t.co/MqJ1SS2Xkw • ai adaptation training @ https://t.co/pWNcpCLXdmAbby Holland @holland_neuro
2K Followers 2K Following NeuroTech Product Owner @IDUNTech. thinking about neurotech, ML/DL, longevity, philosophy, psychedelics- alum: biomedeng @queensu Clinical Neuro @UCLDr Kate Compton @GalaxyKate
22K Followers 2K Following Weird Futurist. Maker of many interesting things, now playing with toys in Denmark (all hot takes here are my own) Ask me about JavaScript. She/her
I still don't understand why an autogyro that also has parts attached to drive 50mph isn't a reasonable flying car product. Why can't we build this?
Announcing our new dataset: ar5iv 04.2024 🔹2.1 million HTML documents 🔹1 billion formulas in MathML sigmathling.kwarc.info/resources/ar5i…
This project has been something of a white whale for me. Think I've been working on it 5-6 months now? Still a lot of work to be done, but I'm finally confident in the results to share some details! I made photosensitive pixels from Copper Oxide 😎 youtube.com/watch?v=O7xH9Z…
my favorite thing about these warning messages is that they are incredibly intriguing, as if engineered to make you curious :3c ☢️
Had to give a talk to some CEOs. They knew way more about LLMs than me. Asked one of them how, he said "I check Chatbot Arena every morning" 😆 New OSGAI talk from Hao Zhang (@haozhangml ) on Chatbot Arena, seemingly the only eval anyone trusts. youtube.com/watch?v=7njmta…
Fortunately for unconnected noobs like myself there are still some breaths of expert and properly negative fresh air in this mess. like the saint @agihippo
It's really sad how expert researchers now completely refrain from expressing negativity about papers. They just confer with each other and then ignore the BS. It's a tragedy of the commons which makes it hard for new researchers to tell what actually matters.
Prompting is so much more dynamic and flexible, I find that FT really degrades other domains - something that prompting does not suffer from. And of course you can just have hundreds of prompts for hundreds of tasks
I’m finding that just prompting using few shot (examples of how to do the task) is enough to get >= fine tuning performance using llama-3 instruct. There are exceptions of course like translation that would benefit from FT. But generally seeing very good results, to the point…
Google announces Med-Gemini, a family of Gemini models fine-tuned for medical tasks! 🔬 Achieves SOTA on 10 of the 14 benchmarks, spanning text, multimodal & long-context applications. Surpasses GPT-4 on all benchmarks! This paper is super exciting, let's dive in ↓
Ok wow... llama 3 70B is going to crush this benchmark will break into top 3 easy. It's taking extra long (running it on consumer hw @ bf16). 88.0 on python == sonnet level
llama-3 70B takes 3rd place, replacing haiku. full results on right
At a certain point in every roboticist's career the temptation to create a robot covered in blue fur becomes too great. In honor of @BostonDynamics Sparkles here's a throwback to a couple others.
I hadn't seen this before: LMSYS have clear documentation of their policies around this kind of "blind test mode" on their site: lmsys.org/blog/2024-03-0…
llama-3 models did very poorly on this benchmark, simply because their context length is *limited to 8k*. But... with zero-training (actually just a simple 2 line config) you can get 32k context out of llama-3 models with *exceptional* quality. llama-3 8B surpasses many models…
We evaluated 26 models initially: 🏆 Claude/OpenAI/Gemini models are doing great in this task 💪 Mistral’s MoE models outperform GPT-3.5-Turbo 👍 The 7B CodeQwen beat many larger general & code-specific models Many models are good at Java but may need to learn more Rust and C++
The last time the "gpt2" string was trending... IYKYK cc @minimaxir
Replit left San Francisco for Foster City. The "why" we're leaving is boring, sad, and predictable (crime, dysfunction, etc), so instead let me tell you why we chose Foster City. Foster City embodies the American post-war optimism and the long-lost California pro-growth…
gpt2-chatbot → Generative Pretrained Transformer 2 Chatbot This is clearly a scaled up version of the Transformer 2 architecture!! ↓
Google presents Transformer 2 - Unifies attention, recurrence, retrieval, FFN into a single module - Performs on par with Transformer w/ 20x better compute efficiency - Efficiently processes 100M context length proj: tinyurl.com/59upc7v6 abs: tinyurl.com/3nw25nz2