Vithu Thangarasa @vithursant19
Machine Learning Research at @CerebrasSystems, previously at @Tesla and @UberAILabs, and grad student at @uoguelph_mlrg and @VectorInst. Thamilan ௐ. vithursant.com · San Francisco, CA · Joined August 2018
Tweets: 277 · Followers: 405 · Following: 507 · Likes: 902
Phi-3 mini model released under MIT! 🚀 Last Week Llama 3, this week Phi-3 🤯 @Microsoft Phi-3 comes in 3 different sizes: mini (3.8B), small (7B) & medium (14B). Phi-3-mini was released today, claiming to match Llama 3 8B performance! 🚀 3.8B TL;DR: 2️⃣ Instruct Versions with 4k…
Not to mention this insane point-cloud of a plot for their Figure 1 (aka main result) that draws 3-lines for each of the "Approaches" to claim that all 3 yield "mostly similar" results. Since when did a 10x difference become "mostly similar" in literature? [6/7]
The Chinchilla scaling paper by Hoffmann et al. has been highly influential in the language modeling community. We tried to replicate a key part of their work and discovered discrepancies. Here's what we found. (1/9)
Introducing v0.5 of the AI Safety Benchmark from MLCommons This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use
More silicon news: @intel lifted the lid on Gaudi 3 today at #IntelVision. OAM and PCIe versions. ➡️TSMC N5 ➡️PCIe 600W ➡️OAM 900W air, 900W+ liquid ➡️128 GB HBM2e ➡️64 Tensor Cores ➡️24x200 GbE ➡️PCIe 5.0 x16 ➡️Supports clusters up to 8192 ➡️2xFP8, 4xBF16 vs Gaudi 2 ➡️10 tiles
It’s only been a day 😅
Introducing DBRX: A New Standard for Open LLM 🔔 databricks.com/blog/introduci… 💻 DBRX is a 16x 12B MoE LLM trained on 📜 12T tokens 🧠DBRX sets a new standard for open LLMs, outperforming established models on various benchmarks. Is this thread mostly written by DBRX? Yes! 🧵
Today is the beginning of our moonshot to solve embodied AGI in the physical world. I’m so excited to announce Project GR00T, our new initiative to create a general-purpose foundation model for humanoid robot learning. The GR00T model will enable a robot to understand multimodal…
Cerebras and G42 said they have broken ground on Condor Galaxy 3, an AI supercomputer that can hit eight exaFLOPs of performance. venturebeat.com/ai/cerebras-br…
"Making science fiction a reality." @CerebrasSystems CEO Andrew Feldman makes a major announcement and discusses how the company aims to lead the charge in AI innovation on #TakingStock with @trinitychavez #TSTC
📣ANNOUNCING THE FASTEST AI CHIP ON EARTH📣 Cerebras proudly announces CS-3: the fastest AI accelerator in the world. The CS-3 can train up to 24 trillion parameter models on a single device. The world has never seen AI at this scale. CS-3 specs: ⚙ 46,225 mm² silicon | 4…
How many cores? OVER 900000! @CerebrasSystems just announced its new generation Wafer Scale Engine 3, a chip as big as your head - four trillion transistors, double the performance of WSE-2, built with TSMC N5. 125 PetaFLOPs of FP16/BF16 compute. TASTY. youtu.be/f4Dly8I8lMY
📣 ANNOUNCEMENT DAY AT CEREBRAS 📣 Today, we are thrilled to share some of the biggest announcements in our company’s history. 📢 Cerebras announces CS-3, the world’s fastest AI Chip with a whopping 4 trillion transistors 📢 Cerebras selects Qualcomm to deliver unprecedented…
Google presents: Stealing Part of a Production Language Model - Extracts the projection matrix of OpenAI’s ada and babbage LMs for <$20 - Confirms that their hidden dim is 1024 and 2048, respectively - Also recovers the exact hidden dim size of gpt-3.5-turbo…
Good stuff going on here. #MediSwift biomedical training benchmarks.
(1/n) Introducing MediSwift, the first suite of biomedical language models that employ sparse pre-training techniques to significantly reduce computational costs, while outperforming existing models up to 7B parameters on benchmark tasks such as PubMedQA. Paper:…
It's only been 5 hours since OpenAI announced Sora, and people are going crazy over it. Here are 10 wild examples you don't want to miss: 1. Snow dogs
🎉 Cerebras overtakes peers as #1 AI Semiconductor Startup 🎉 In the freshly-updated 2023 State of AI Report Compute Index from Air Street Capital and Zeta Alpha, Cerebras is highlighted for leading all AI semiconductor startups in research publication counts, open source…
This is neat! Explore PyTorch models effortlessly with TorchTrail! It allows you to easily trace and visualize your model's execution, seamlessly extracting torch function and module graphs💯github.com/arakhmati/torc…
Collected some of the amazing projects people are building with MLX in one place: github.com/ml-explore/mlx… Looking at that list, it's hard to believe MLX is just 2 months old.
An edgy post... 1 trillion edges, in fact!
Graph clustering merges similar items into groups to better understand relationships in data. Today, read about our recent works, including key techniques that enabled us to scale a high-quality algorithm that can cluster trillion-edge graphs. Read more → goo.gle/3y1iXMs
Delighted to share that I've been promoted to Professor (aka “Full Professor”) A huge thanks to my wife, students, collaborators, colleagues, family, & friends for everything. It's been an exhilarating, wondrous, fascinating climb, and what a view! Now, which peak to climb next?
All the Chinchilla scaling laws are wrong?!??!😱😱😱
The Chinchilla scaling paper by Hoffmann et al. has been highly influential in the language modeling community. We tried to replicate a key part of their work and discovered discrepancies. Here's what we found. (1/9)
Thanks, @JRussonHPC for this excellent writeup about the @MLCommons AI Safety v0.5 benchmark we announced earlier this week! If you want to learn more and contribute to our efforts to make AI safer for everyone, join our #AISafety working group: mlcommons.org/working-groups…
MLCommons Launches New AI Safety Benchmark Initiative ow.ly/V6NQ50RhsMv
Be sure to check out this coverage from @Business_AI of the MLCommons AI Safety v0.5 proof-of-concept benchmark that we announced earlier in the week to make AI safer for everyone: aibusiness.com/responsible-ai…
Want to dig deeper into the details of the @MLCommons AI Safety v0.5 benchmark proof of concept that we announced this week? Learn more about the approach, platform, and tests created by our open consortium for this first step toward evaluating AI safety: arxiv.org/abs/2404.12241
One of my favorite things about MLX is it helps put ML research back in the hands of a single bold hobbyist. Don’t need a supercomputer to invent - just a nice laptop, a vision, and some persistence (and maybe pip install mlx 😉)
Showcasing the powerful Idefics2, latest Vision LLM from Hugging Face, on a robot! 🚀
Time for the open-source AI robots revolution 🚀 We’ve been playing with a low-cost DJI robot controlled by 3 local open-source AI models (Whisper, Idefics2, Parler-TTS - all Apache2) & orchestrated by Dora-cs. In the comments: a 250-line code gist to build on top of it => enjoy!!
🌟 Cerebras is thrilled to be selected on the 2024 Forbes AI 50! 🌟 Here are a few reasons why we made the cut: 🎉 Cerebras is the only AI chip startup on this year’s list. Learn more about our latest generation of hardware, the CS-3: cerebras.net/product-system/ 🎉 We enable top…
I highly recommend this tutorial on Mamba and related models. Full of insights on model design and hardware-aware implementation!
A Mamba Primer (w/ Yair Schiff youtube.com/watch?v=dVH1dR… ) Mamba is a nice jumping off point to summarize foundational ideas in sequence modeling, parallel algorithms, continuous-time representations, and GPU aware algorithms. We try to put these together in the context of LMs.
If you have apple silicon and > 70GB of RAM, you can run DBRX on your laptop!! Kudos to @awnihannun :) huggingface.co/mlx-community/…
Databricks DBRX model is AMAZING, generally great but CRUSHES code. 132B parameters, 12T tokens, 16 experts, 4 per forward, 36B active. ~2.6e24 HumanEval5, 0-Shot DBRX - 70.1% GPT-4 - 67% Gemini 1.5 Pro - 71.9% Mixtral - 54.8% Grok - 63.2% LLAMA 2 - 32.2% databricks.com/blog/introduci…
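The "~2.6e24" figure in the post above is not explained there. One plausible reading, and this is an assumption rather than anything the post states, is that it is a training-compute estimate from the common rule of thumb C ≈ 6·N·D, with N taken as the 36B active parameters and D as the 12T training tokens. A minimal Python sketch of that arithmetic (the function name `training_flops` is hypothetical, chosen for illustration):

```python
# Rough training-compute estimate using the common C ≈ 6 * N * D rule of thumb.
# Assumption: for an MoE model, N is the number of *active* parameters per token,
# not the 132B total. The post does not say how its ~2.6e24 figure was derived.

def training_flops(active_params: float, tokens: float) -> float:
    """Approximate training FLOPs: about 6 FLOPs per active parameter per token."""
    return 6.0 * active_params * tokens

dbrx_active_params = 36e9   # 36B active parameters (from the post)
dbrx_tokens = 12e12         # 12T training tokens (from the post)

estimate = training_flops(dbrx_active_params, dbrx_tokens)
print(f"{estimate:.2e}")  # prints 2.59e+24
```

Under these assumptions the estimate comes to about 2.6e24 FLOPs, which matches the number quoted in the post. Using the 132B total parameter count instead would give roughly 9.5e24, so the active-parameter reading fits better.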
Meet DBRX, a new sota open llm from @databricks. It's a 132B MoE with 36B active params trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and - as an MoE - inference is blazingly fast. Simply put, it's the model your data has been waiting for.