Karan Goel @krandiash
Founder @cartesia_ai, Machine Learning PhD at @StanfordAILab, CMU / IIT-Delhi alum.
krandiash.github.io
Stanford, CA
Joined January 2010
Tweets: 884
Followers: 3K
Following: 882
Likes: 2K
Awesome work from @ScottWu46 and team, Devin is the beginning of real progress in deeper, long context reasoning for code. Can’t wait to see what this looks like in a year
We’re throwing a Guardrails company launch party in SF tomorrow. If you’re working on building LLM apps and care about AI reliability, come celebrate with us! partiful.com/e/Eo3dfJLif9pe… We’re saving some limited seats for AI builders and tinkerers
Someone pointed me to this fragment from Jensen's Wired article -- amazing to see the support around SSMs (and really cool that he's so technically plugged in)
Mamba vs Transformer
I'm beyond thrilled to make two pretty substantial announcements: 1. We just released a brand new open source Guardrails Hub, with 50+ validators and more coming! 2. We raised a seed funding round to execute on our vision of open source AI reliability 🧵
I've had an interesting year to say the least -- almost exactly a year ago I started moonlighting on @guardrails_ai in my spare time. To celebrate my ~1 year anniversary, I'm giving the keynote at the AI in production conference, with some 4000+ registrations already(!) Why you…
This was a really inspiring and IMO milestone paper in the space of efficient attention / RNNs! coming soon: a strong generalization of this from the view of SSMs ;)
1/ With @tri_dao, we’re collaborating with @cartesia_ai and @togethercompute and we’re releasing a Mamba 3B model trained on 600B tokens on the SlimPajama dataset. Mamba scales well with data size, matching some of the strongest 3B Transformers out there.
We've got a new 2.8B Mamba model trained to 600B tokens on SlimPJ out by @_albertgu @tri_dao with @cartesia_ai and @togethercompute And it's Apache 2.0 so use it however you like 🚀
Excited about this incredible SSM from @_albertgu and @tri_dao, and excited to be working with @_albertgu on scaling SSMs at @cartesia_ai. Stay tuned for more.
Mamba: Linear-Time Sequence Modeling with Selective State Spaces paper page: huggingface.co/papers/2312.00… Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module.…
One of my favorite validators in @guardrails_ai is the Provenance Guardrails, or the anti-hallucination guardrail. Today, I'll deep dive into how it works under the hood. The core idea behind Provenance is that establishing provenance (i.e. source/origin) of any LLM utterance in…
Excited about models that are sub-quadratic in sequence length and model dimension? Our Monarch Mixer paper is now on arXiv -- and super excited to present it as an oral at #NeurIPS2023! Let's dive in to what's new with the paper and the new goodies from this release: Monarch…
what if learning about a topic was as braindead as scrolling tiktok? showed up to a hackathon 2 hours before submissions were due with no idea and no code furiously coded an app that spits out an engagement-maxxed tiktok feed for topics you want to learn about won + stole all…
Check out our interactive demo collagediffusion.stanford.edu and make your own collages! Thanks @_akhaliq for sharing our work! Tutorial: youtube.com/watch?v=BX4ZW9… Code: github.com/linden-li/coll… Blog: vsanimator.github.io/collage_diffus… Work with @lindensli, @ardenwma, @HazyResearch, @kayvonf
🎉 Excited to announce Guardrails AI v0.2.0 is now live!! This was a huge release (blog with more details below), but here are the highlights ✅ Full @pydantic support ✅ String validation (!!!) ✅ Better interfaces for custom validators ✅ Many paper cuts fixed (contd.)
This is a substantial release, with a ton of new changes across documentation, validation, string templating, etc. However, the two substantial features that were most often requested are: 1. Fully native @pydantic support, and 2. String validation More tutorials coming soon!
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Jim Fan @DrJimFan
230K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
Shreya Shankar @sh_reya
39K Followers 589 Following I study ML & AI engineers and try to make their lives a little better. PhD-ing in databases & HCI @Berkeley_EECS @UCBEPIC and MLOps-ing around town. She/they.
Delip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Alex Ratner @ajratner
5K Followers 551 Following @SnorkelAI @uwcse / prev @StanfordAILab – Interested in data management systems for machine learning, weak supervision, and impactful applications.
Ananya Kumar @ananyaku
4K Followers 472 Following Researcher at @openai Previously PhD at Stanford University (@StanfordAILab) advised by Percy Liang and Tengyu Ma
Sander Dieleman @sedielem
50K Followers 2K Following Research Scientist at Google DeepMind. I tweet about deep learning (research + software), music, generative models (personal account).
rishi @RishiBommasani
4K Followers 2K Following Stanford CS PhD @StanfordCRFM @StanfordNLP @StanfordAILab @StanfordHAI Advisers: @percyliang @jurafsky Previous: @CornellCIS @clairecardie #FoundationModels
Tri Dao @tri_dao
19K Followers 365 Following Incoming Asst. Prof @PrincetonCS, Chief Scientist @togethercompute. Machine learning & systems.
Beidi Chen @BeidiChen
6K Followers 343 Following Asst. Prof @CarnegieMellon, Visiting Researcher @Meta, Postdoc @Stanford, Ph.D. @RiceUniversity, Large-Scale ML, a fan of Dota2.
Sara Hooker @sarahookr
39K Followers 7K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.
rohan anil @_arohan_
12K Followers 2K Following Principal Engineer, @GoogleDeepMind Gemini. prev PaLM-2. Tinkering with optimization and distributed systems. opinions are my own.
Dan Fu @realDanFu
4K Followers 176 Following CS PhD Candidate at Stanford, systems for machine learning. Sometimes YouTuber/podcaster. Academic Partner, @togethercompute.
Jacob Andreas @jacobandreas
14K Followers 958 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
Michael Bronstein @mmbronstein
43K Followers 4K Following #DeepMind Professor of #AI @UniofOxford / Fellow @ExeterCollegeOx / ML Lead @ProjectCETI / https://t.co/kZpGpDzYeV
Aakanksha Chowdhery @achowdhery
7K Followers 3K Following LLMs @ Google DeepMind :: PaLM, Gemini // Previously @MSFTResearch, @Stanford, @Princeton // views my own and subject to change
Matei Zaharia @matei_zaharia
39K Followers 1K Following CTO at @Databricks and CS prof at @UCBerkeley. Working on data+AI, including @ApacheSpark, @DeltaLakeOSS, @MLflow, https://t.co/94gROE5Xa0. https://t.co/nmRYAKG0LZ
Aditya Grover @adityagrover_
8K Followers 412 Following CS Prof @UCLA. AI, ML, Climate. Prev: Postdoc @berkeley_ai, PhD @StanfordAILab, bachelors @IITDelhi.
Kathleen @kathleen18leyen
165 Followers 3K Following
Anh Nguyen @AnhNguyenWho
74 Followers 2K Following startup stalker | current @tobikodata | prev. intern @netflix, @snap, @confluentinc
FayTitus @lRpAMyCv06f5p8n
0 Followers 20 Following
Vivian Cheng @vcheng11
734 Followers 501 Following investing in all things app layer AI partner @next47; previously @uber @crv
Zhiyong Wang @Zhiyong16403503
411 Followers 2K Following Visiting Ph.D. student at Cornell University. Ph.D. candidate at CUHK. Working on bandits and reinforcement learning theory.
EmilyLawson @y98qO1H32kE8s
1 Followers 120 Following
Rohan Paul @rohanpaul_ai
13K Followers 1K Following ML Engineer (e/acc) 📌 https://t.co/x0IIWfnOt8 🚀 https://t.co/QEO4CKRl1b Open LLMs is Happiness 💡 Ex Deutsche & HSBC. DM for collaboration.
Abhinav Asthana @a85
11K Followers 631 Following CEO and Founder, Postman (@getpostman). Building the API-first world. https://t.co/tNjfcfmSYx
Agamdeep Singh @agammessi10
51 Followers 738 Following Building data comprehension agents and delivering them to your website at Scuba. IISER Bhopal, India.
aubrey quarcoo @ahene90
313 Followers 6K Following Ghanaian orgin, Freelance C++ fixed income developer. Founder of GeorgeTown Analytics, using Erlang and Esper for messaging and Nosql. Web isolation
Harsh Desai @dreamerharsh
1 Followers 3K Following
Amit Raja Naik @AmitRajaNaik
431 Followers 2K Following AI Human @Analyticsindiam, The Belamy, Sector 6 (formerly AIM Daily XO)
Mohammed Amine BEN CH.. @AmineLehocine
35 Followers 1K Following
Kyla Kelly @_Kyla_Kelly_
28 Followers 175 Following AI Enthusiast - Opinions and statements are my own.
eg6fhv7dh15gxr @3hpu3pjc9ho6gf
17 Followers 799 Following We first transfer USDT to you TRC20, you return 90% to BEP20, you get 10% , 2K per day Our co hv a large amt of USDT need to from TRC20 convert to BEP20 network
Manas Joglekar @ManasJoglekar
199 Followers 242 Following
Conviction @w_conviction
5K Followers 130 Following a early-stage venture capital firm purpose-built for Software 3.0 🔧
Shiladitya Biswas @shiladitya997
4 Followers 174 Following
jackson @aslijiasheng
748 Followers 4K Following
Cris Salvi @CristopherSalvi
419 Followers 121 Following Assistant Professor in Mathematics and Machine Learning @ImperialCollege. Rough analysis, deep learning.
Luis Ceze @luisceze
3K Followers 2K Following computer architect. marveled by biology. professor @uwcse. ceo @OctoAICloud. venture partner @madronaventures.
Eva Louise Marie Gabr.. @e681554349
9 Followers 3K Following
dwight @dwightchurchill
1K Followers 739 Following cofounder of @getcaptionsapp -- fmr @goldmansachs + alum @recursecenter. from the great state of nh.
Kinjal Nandy @itsKinjalNandy
1K Followers 2K Following i wanna talk to chatGPT like i talk to a human
Martin Fan @perfectoid_ai
397 Followers 8K Following
Some_Guy @Confused_guy007
3 Followers 59 Following
Makya @Makya12345678
6 Followers 962 Following
teja g @tejag255
58 Followers 762 Following
Tejaa Chintaluri @tcluri_
23 Followers 463 Following
Swati Mardia @swatimardia
79 Followers 552 Following
Aryan Deshpande @aryanscript
141 Followers 1K Following computers and humans, i love em both. diving into szn 5 @_buildspace. building autonomous agents @ somewhere
Cloud Twitt @Twitt2Cloud
129 Followers 388 Following
Curtis McKee @CurtisMcKee
237 Followers 1K Following Partner at Third Point Ventures, prev Arista, Intel Capital, Engineer. Dig tech, sports, & outdoors. tweets are my own.
Tony Liu @tdliu
370 Followers 625 Following Partner at @costanoavc. Previously @databricks Love data, coffee, food, wine, and basketball. 🇹🇼🇳🇿🇺🇸
Percy Liang @percyliang
49K Followers 408 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Jim Fan @DrJimFan
230K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K Following
Shreya Shankar @sh_reya
39K Followers 589 Following I study ML & AI engineers and try to make their lives a little better. PhD-ing in databases & HCI @Berkeley_EECS @UCBEPIC and MLOps-ing around town. She/they.
Delip Rao e/σ @deliprao
46K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Christopher Manning @chrmanning
127K Followers 116 Following Director, @StanfordAILab. Assoc. Director, @StanfordHAI. Founder, @stanfordnlp. Prof. CS & Linguistics, @Stanford. IP @aixventureshq. 🇦🇺 Do #NLProc & #AI. 👋
Soumith Chintala @soumithchintala
187K Followers 883 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Zachary Lipton @zacharylipton
59K Followers 2K Following Professor: CMU/@acmi_lab, CTO / CSO: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Animesh Garg @animesh_garg
21K Followers 1K Following Foundation Models for Generalizable Autonomy. Assistant Professor in AI Robotics @GeorgiaTech + @NvidiaAI. prev @Stanford @berkeley_ai @UofTCompSci
Richard Socher @RichardSocher
101K Followers 971 Following CEO @youSearchEngine Investing at @aixventuresHQ Before: Stanford Adj Prof in AI/NLP, Chief Scientist at Salesforce, MetaMind
Alex Ratner @ajratner
5K Followers 551 Following @SnorkelAI @uwcse / prev @StanfordAILab – Interested in data management systems for machine learning, weak supervision, and impactful applications.
Natasha Jaques @natashajaques
25K Followers 1K Following Senior Research Scientist at @GoogleAI and Assistant Professor @uwcse. Social Reinforcement Learning in multi-agent and human-AI interactions. PhD from @MIT.
Ananya Kumar @ananyaku
4K Followers 472 Following Researcher at @openai Previously PhD at Stanford University (@StanfordAILab) advised by Percy Liang and Tengyu Ma
Ben Recht @beenwrekt
26K Followers 365 Following optimization. machine learning. uc berkeley. I blog at https://t.co/fkJujOPsJb The world won't end.
Sergey Levine @svlevine
80K Followers 122 Following Associate Professor at UC Berkeley Co-founder, Physical Intelligence
Dan Roy @roydanroy
45K Followers 2K Following ML / AI researcher, emphasis on theory. Research Director and Canada CIFAR AI Chair, @VectorInst Professor, @UofT (Statistics/CS)
Jonathan Frankle @jefrankle
16K Followers 684 Following Chief Scientist, Neural Networks @Databricks via MosaicML. PhD @MIT_CSAIL. BS/MS @PrincetonCS. DC area native. Making AI efficient for everyone at @DbrxMosaicAI
Jürgen Schmidhuber @SchmidhuberAI
107K Followers 0 Following Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Sander Dieleman @sedielem
50K Followers 2K Following Research Scientist at Google DeepMind. I tweet about deep learning (research + software), music, generative models (personal account).
Abhinav Asthana @a85
11K Followers 631 Following CEO and Founder, Postman (@getpostman). Building the API-first world. https://t.co/tNjfcfmSYx
Conviction @w_conviction
5K Followers 130 Following a early-stage venture capital firm purpose-built for Software 3.0 🔧
Kevin Hartz @kevinhartz
16K Followers 335 Following Co-Founder & General Partner A*, Chairman & Co-Founder of Eventbrite (NYSE: EB), Co-Founder and Board Member of Xoom (IPO 2013, acquired by PayPal)
dharmesh @dharmesh
320K Followers 731 Following Co-founder/CTO, HubSpot. Mission: Help millions grow better. Write articles about startups, scaleups and growth at https://t.co/OOZjd773nV (free subscription).
Ankur Goyal @ankrgyl
4K Followers 591 Following CEO @Braintrustdata & Intern @Basecasevc; prev: ML @Figma, Founder @ImpiraHQ. Views my own.
Rowan Cheung @rowancheung
497K Followers 377 Following Founder @therundownai. Sharing the latest developments in the world of artificial intelligence.
Waikit Lau @waikit
2K Followers 1K Following Startups | Remote | 2 Acquisitions, 1 IPO (NASDAQ: MGNI) | Investor | @MIT | Started AIDL, world’s largest AI community
Alexandre TL @AlexandreTL2
329 Followers 215 Following
unusual_whales @unusual_whales
1.7M Followers 2K Following Stocks/Options/Crypto/Market News +Tools. Not advice 🐳 who changed 🏛️. Get $50-$5000 to trade: https://t.co/wGf2ZdlXpw Discord: https://t.co/0xJ9e0ZYYG More: https://t.co/nsxZlPV0pC
Jacqueline Wibowo @jayquelynnnn
335 Followers 326 Following Business @Cartesia_ai. VC/Angel Investor in all things b2b. prior @Tofu_HQ @whalerock @insightpartners @signalfire
Ishani Thakur @ishanit5
220 Followers 2K Following " 'How's the water?' And the two young fish swim on for a bit, and then eventually one of them looks over at the other and goes 'What the hell is water?'" - DFW
Daniel Havir @danielhavir
259 Followers 203 Following
Nishad Singhi @nishadsinghi
256 Followers 2K Following Robust ML (@wielandbr), Explainable ML (@zeynepakata), Rationality Enhancement (@FalkLieder) @MPI_IS, @uni_tue | Prev: @ucla; EE undergrad @iitdelhi
Songlin Yang @SonglinYang4
2K Followers 2K Following PhD student @MIT_CSAIL. Prev. @ShanghaiTechUni @SUSTechSZ. Working on scalable and principled methods in #ML & #NLProc. INTP | 5w4 | sx/sp | she/her
Cartesia @cartesia_ai
1K Followers 8 Following Cartesia is training next-gen foundation models with subquadratic deep learning architectures. Sign up for early access at https://t.co/c5og0yF1Pz
Ron Conway @RonConway
114K Followers 75 Following
Arthur Mensch @arthurmensch
40K Followers 874 Following Co-founder and CEO @MistralAI. Apply https://t.co/yHGRZAtjcx
Vaibhav (VB) Srivasta.. @reach_vb
11K Followers 169 Following GPU poor @Huggingface | F1 fan | Here for @at_sofdog’s wisdom | *opinions my own
Pika @pika_labs
116K Followers 53 Following Video on command. Website: https://t.co/G5bjmrMQsx Discord: https://t.co/bX68ThPTQH About: https://t.co/atvdcgbe9S
Valeriy M., PhD, MBA,.. @predict_addict
17K Followers 3K Following PhD in machine learning | conformal prediction | time-series | author of bestselling Practical Guide to Applied Conformal-Prediction https://t.co/ugR9TtXd29
Oikolab @oikoweather
152 Followers 254 Following Instant access to 80+ years of hourly global weather and climate datasets. Try it now!
Steve Mussmann @MussmannSteve
409 Followers 116 Following Incoming assistant professor (Fall 2024) at @GeorgiaTech @gatech_scs. Currently, an ML researcher at @CoactiveAI.
derek guy @dieworkwear
849K Followers 963 Following Menswear writer. Editor at @putthison. Creator of @RLGoesHard. Bylines at The New York Times, The Washington Post, The Financial Times, Esquire, and Mr. Porter
Rylan Schaeffer @RylanSchaeffer
3K Followers 979 Following CS PhD student with @sanmikoyejo at @stai_research @StanfordAILab
Mohak Mangal @mohakmangal
69K Followers 576 Following YouTuber but doing an MBA to make my 👨👩👧👦 happy 😊 @StanfordGSB | Prev @WorldBank @JPAL_SA
Zeel Patel @ZeelMPatel
305 Followers 726 Following 🇨🇦🇮🇳🇺🇸 | @Citadel | @BroadInstitute, @MSFTResearch, @Harvard CS
Arjun Narayan @narayanarjun
4K Followers 737 Following Boris Babayan follower. "There is only one project, architecture, operating system and languages, compiler, it's only one project."
Shishir Patil @shishirpatil_
3K Followers 850 Following CS PhD @ UC Berkeley. Creator of Gorilla, GoEx, RAFT, OpenFunctions and Berkeley Function Calling Leaderboard. Previously researcher @GoogleAI @MSFTResearch
Ben Parr @benparr
68K Followers 4K Following I tweet about AI, VC, startups. Co-founder @OctaneAI | Columnist @TheInformation | Author @Captivology | Formerly @Mashable @CNET.
NRG Demon1 @Demon1___
170K Followers 189 Following Professional VALORANT Player for @NRGgg @Yukiaim | https://t.co/n4XYlpw4Lk | https://t.co/O8PaYgyAfq | Business: [email protected] | @katarinafps 💍
John Hewitt @johnhewtt
4K Followers 22 Following CS PhD @stanford with @stanfordnlp. Frmr. @penn, intern @deepmind, @googleai, ++. Understanding and improving neural learning from language. Co-teach CS 224n.
Abhay Parasnis @parasnis
2K Followers 944 Following Founder & CEO @Typefaceai | Board Member @SchneiderElec | Board Member @Dropbox | Former CTO & CPO @Adobe | Investor, Advisor
Sean J. Taylor @seanjtaylor
46K Followers 4K Following Building @MotifAnalytics. Formerly @Lyft and @Facebook. Keywords: Experiments, Causal Inference, Statistics, Machine Learning, Economics.
Every time I answer why I'm building Guardrails, or why work on AI reliability instead of solving vertical problems in AI... At the end of the day, all tech needs to drive value. The current AI hype will only work out if $$$ on AI >> the ROI of AI on company's bottom line. The…
We just shipped the latest Guardrails release with a ton of major QoL improvements. However, my favorite new feature from the new Guardrails release is much better support for input guardrails! First, what are input guardrails? These are guardrails that give you the ability to…
Indian govt doing its bit to keep up the population of Parsis.
Daily reminder that the Indian Government sponsors rizz classes for Parsi (Indian Zoroastrian) men who are struggling to meet women.
Enjoying all the spring flowers all over Delhi’s gardens, parks and lawns! Clear skies, good weather!
Stoked to be sharing Based! We find that the simple combo of linear and sliding window attention can enable 24x higher throughput than Transformers. Had a ton of fun diving deep on the tradeoffs that govern these recurrent models! arxiv.org/abs/2402.18668 github.com/HazyResearch/b…
Excited to release Based, an architecture that combines two✌️ simple, familiar, attention-like primitives – short (size-64) sliding window attention and softmax-approximating linear attention – to enable high quality and efficient inference! 💨 🚀 joint w/ @EyubogluSabri,…
Excited to share a tool @a13xba and I have been building! <| 𝗲𝗻𝗱𝗼𝗳𝘁𝗲𝘅𝘁 |> is an AI-powered prompt editor that helps you write better prompts with suggested edits, automatic rewrites, and test case generation. We'd love to get your feedback: endoftext.app
Video ReCap: Recursive Captioning of Hour-Long Videos Presents a recursive video captioning model that can process video inputs from 1 second to 2 hours and output video captions at multiple hierarchy levels proj: sites.google.com/view/vidrecap abs: arxiv.org/abs/2402.13250
any cs major born after 2003 can’t code… all they know is ChatGPT, charge their phone, prompt LLM, eat Huel and lie
This was a really inspiring and IMO milestone paper in the space of efficient attention / RNNs! coming soon: a strong generalization of this from the view of SSMs ;)
Our 2020 paper "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" with @angeloskath @apoorv2904 and @nik0spapp reached 1000 citations! proceedings.mlr.press/v119/katharopo…
Interesting work! We've observed similar phenomena. Mamba trained on longer ctx (here 8192) extrapolates much better
We've been exploring context extrapolation with Mamba and managed to make it (state-spaces/mamba-2.8b-slimpj) retrieve nearly perfectly on a window of 16384. Here's a brief overview of what we've found so far:
Experiment: MambaByte (arxiv.org/abs/2401.13660) If you have linear-time memory, what's the point of tokenization?
Claypot AI is joining Voltron Data! AI starts from data. By joining forces, we can further help companies leverage both batch and real-time data for AI applications, on top of Voltron Data’s GPU-native distributed engine Theseus. venturebeat.com/data-infrastru… For AI, GPUs are mostly…
Experiment: Triton-Autodiff (github.com/srush/triton-a…) Source-to-source autodiff of Triton GPU code. Uses tangent to produce working backward code you can edit. (Been writing Mamba in Triton and hate having to do this part manually)
Someone made an AI murder mystery 🕵️♀️ You upscale the crime scene photo to find clues, uncover the murderer, and infer the motive.
My 6yo daughter has somehow acquired a “mushrooms of the redwood coast” and is now cataloguing her favorite shrooms, asking for shrooms as a pet, and asking to go foraging. what do I do
googling "moe-mamba" returns results for mo bamba 👌
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts Reaches the same performance as Mamba in 2.2x fewer training steps while preserving the inference performance gains of Mamba against the Transformer arxiv.org/abs/2401.04081