Ayush Kaushal @_AyushKaushal
Open Source LLMs @Mila_Quebec and nolanoAI Z Fellows Former Research @Google, @UTAustin, @IITKGP
Joined February 2020
Tweets: 86
Followers: 472
Following: 101
Likes: 186
AI community, 2014: recognized cramming^ as a bottleneck to modeling language. We got transformers/attention. AI community, 2024: will* recognize fixed, downscaled image resolution as a bottleneck to modeling vision+language. ^ cs.utexas.edu/~mooney/crammi… * OpenAI already realized this >=2 yrs ago
Finally, an OpenAI alternative.
Excited to share our work on LLM continual pretraining. What excites me the most: - Continual pretraining can be used to extend a model's capabilities to new domains and languages. - When done right, it can avoid catastrophic forgetting of English (see CodeLlaMa and LeoLM).
LoRD provides an alternative to quantization for LLM compression. The compressed model is differentiable and can use existing (float) GEMMs in PyTorch. Can also be combined with quantization. Monolingual Code LLMs can be decomposed in one-shot without need for retraining.
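A minimal sketch of the idea, assuming LoRD refers to low-rank decomposition of weight matrices (the tweet does not spell this out): one dense matrix is replaced by two thin factors, so the forward pass is still ordinary float matrix multiplication. The helper names and the 4096/512 sizes are hypothetical, and the toy matrix is rank 1 by construction so the factorization is exact; real weights would only be approximated.

```python
def dense_params(rows: int, cols: int) -> int:
    """Parameter count of a dense rows x cols weight matrix."""
    return rows * cols


def low_rank_params(rows: int, cols: int, rank: int) -> int:
    """Parameter count of two factors: (rows x rank) and (rank x cols)."""
    return rank * (rows + cols)


def matmul(a, b):
    """Plain float matrix multiply on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Toy example: W = U @ V is rank 1 by construction (illustrative values).
u = [[1.0], [2.0], [3.0], [4.0]]   # 4 x 1 factor
v = [[1.0, 0.5, 0.25, 2.0]]        # 1 x 4 factor
w = matmul(u, v)                   # the dense 4 x 4 matrix being replaced

x = [[1.0, 1.0, 1.0, 1.0]]         # one input row
y_dense = matmul(x, w)                  # one big matmul
y_factored = matmul(matmul(x, u), v)    # two thin matmuls, same result here

# Hypothetical layer size: 4096 x 4096 kept at rank 512.
ratio = low_rank_params(4096, 4096, 512) / dense_params(4096, 4096)
```

Both paths use only multiplications and additions, which is why such a model stays differentiable and could still be quantized afterwards.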
Adam, a 9-yr old optimizer, is the go-to for training LLMs (eg, GPT-3, OPT, LLAMA). Introducing Sophia, a new optimizer that is 2x faster than Adam on LLMs. Just a few more lines of code could cut your costs from $2M to $1M (if scaling laws hold). arxiv.org/abs/2305.14342 🧵⬇️
That is true. But Copilot and ChatGPT have crossed the threshold required to productize AI. We now see exponential growth in *products* using AI. AI is rapidly becoming pervasive. I remember Geoff Hinton saying GPT2 impressed him, not ChatGPT/GPT3. (youtube.com/watch?v=qpoRO3…)
ELIZA still shows through in ChatGPT. It just takes longer to see it.
The weirdest part of LLaMa's architecture is that it doesn't have any additive parameter terms. It's missing bias in the MLP since it uses a gated FFN. Its RMSNorm, unlike LayerNorm, only has a scaling factor. But residual & RoPE (and matmuls internally) are doing addition operations.
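The point about additive terms can be sketched in plain Python. This is an illustrative sketch, not LLaMa's actual code: the function names, the eps default, and the SiLU-gated form of the FFN are assumptions. LayerNorm carries a learned bias, while RMSNorm and a bias-free gated FFN have only multiplicative parameters, so a zero input stays zero.

```python
import math


def layer_norm(x, gain, bias, eps=1e-5):
    """LayerNorm: centers, rescales, then applies a learned gain AND bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [g * (v - mean) / math.sqrt(var + eps) + b
            for v, g, b in zip(x, gain, bias)]


def rms_norm(x, gain, eps=1e-5):
    """RMSNorm: rescales by the root mean square; learned gain only, no bias."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for v, g in zip(x, gain)]


def silu(v):
    """SiLU activation, v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(w, x):
    """Bias-free linear map: each output is a dot product with one row of w."""
    return [sum(a * b for a, b in zip(row, x)) for row in w]


def gated_ffn(x, w_gate, w_up, w_down):
    """Gated feed-forward block with three weight matrices and no bias terms."""
    gate = [silu(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    return matvec(w_down, [g * u for g, u in zip(gate, up)])


zeros = [0.0, 0.0, 0.0]
ln_out = layer_norm(zeros, gain=[1.0] * 3, bias=[0.5, -0.5, 0.1])   # bias shows through
rms_out = rms_norm(zeros, gain=[1.0] * 3)                           # stays all zero
ffn_out = gated_ffn(zeros, [[1.0] * 3] * 2, [[2.0] * 3] * 2, [[1.0] * 2] * 3)
```

The additions the tweet mentions still happen inside the dot products and in the residual stream; they are just not learned parameters.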
This is either an intentional April Fool's prank or an unwitting error—LLaMa is not sparse. The reported 4GB RAM usage is a measurement error (check github.com/ggerganov/llam…); on my 16GB RAM M1-CPU, it leads to poor CPU utilization & more time spent accessing memory/swap-space.
It's been less than half an hour since Twitter's codebase release, and 56 issues and 11 pull requests have already been opened. This will be exciting, though most of these are requests for rewrites in C++/Rust, and the PRs have minor README corrections. https://t.co/RHEFqmjIcR
Three new Open Source chat models released today. - Updated 20B GPT-NeoX ( huggingface.co/togethercomput… ) and new 7B Pythia based chat models ( huggingface.co/togethercomput… ) by @togethercompute - Vicuna by @lmsysorg . Here's their Alpaca-style demo: chat.lmsys.org
"Open" AI research is closing in on OpenAI/DeepMind Research. But there are a few things missing: - Larger Context Window. - Human Feedback signals. Latter can now be addressed in a decentralized solution involving models running on personal devices.
We are making it easier to build applications on LLMs that run locally, via a Python interface to fast C++ inference. Check out: github.com/NolanoOrg/cfor… In the next 24 hrs we will also be adding CodeGen and LLaMa/Alpaca. Let us know how we can make it easier for you to use.
Results from experiments on quantizing LLMs. - int3 GPTQ quantized 13B LLaMa outperforms FP16 7B LLaMa. - GPTQ may not always be better than rounding-to-nearest when Zero-offset is fixed. - 2-bit quantization is still a longshot for 13B LLaMa, but it's better for larger models.
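For readers unfamiliar with the baseline, round-to-nearest quantization can be sketched as follows. This is a simplified illustration, not the setup used in the experiments: it assumes a symmetric scheme with the zero offset fixed at 0 and a single scale for the whole weight group, and the function names and sample weights are hypothetical. GPTQ itself is not shown.

```python
def quantize_rtn(weights, bits):
    """Round-to-nearest quantization with the zero offset fixed at 0.

    Returns integer codes in [-2**(bits-1), 2**(bits-1) - 1] and the scale.
    Python's round() rounds exact ties to the nearest even integer.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


weights = [-0.9, -0.3, 0.0, 0.2, 0.6, 0.9]   # illustrative values only

codes3, scale3 = quantize_rtn(weights, bits=3)
codes2, scale2 = quantize_rtn(weights, bits=2)
err3 = mean_squared_error(weights, dequantize(codes3, scale3))
err2 = mean_squared_error(weights, dequantize(codes2, scale2))
```

On this toy group the 2-bit reconstruction error is visibly larger than the 3-bit one, which matches the general direction of the results above without saying anything about their magnitude on real models.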
Soon, personalized models more powerful than ChatGPT will be residing and running locally on personal devices - every PC, tablet and smartphone. Get excited! We will be sharing more exciting news soon.
The price of the ChatGPT/GPT-3.5-Turbo API is the same as Curie (13B params), 10x cheaper than the current Davinci (175B params). This suggests that ChatGPT is significantly smaller than 175B parameters. Perhaps even 13 billion parameters.
SayIt speaks your language - now supporting Hinglish and more!
The most important takeaway from the LLaMa paper by Meta is that even LLaMa 7B didn't converge after 1 trillion tokens.
Fe_ijoa7 @Ijoa7Fe72305
0 Followers 421 Following Nice to meet you. My hobbies are reading, food and sports. I like cats😘 I like to meet new friends while traveling🎉🎉🎉
Martin Fan @perfectoid_ai
373 Followers 8K Following
Akash @sakash321
101 Followers 824 Following
Alexandre Castanet Pr.. @castanet_drane
131 Followers 327 Following Project officer at DANE Aix-Marseille #JeCodePourMaPlanète #LeCodeSEnvole
qjnyshc4kw87r1 @6udw7oiqq
16 Followers 959 Following We first transfer USDT to you TRC20, you return 90% to BEP20, you get 10% , 2K per day Our co hv a large amt of USDT need to from TRC20 convert to BEP20 network
Burak Yuksel @burakeyuksel
10 Followers 108 Following
Jitendra Sharma @jkumarsharma998
758 Followers 6K Following Curious about Research in AI. NLP and Computer Vision Interest me. Curious about truth and existence. Views are personal.
Ellie @Ellie0089291356
11 Followers 502 Following
Pankaj Gupta @pankaj_ipynb
25 Followers 843 Following The English language can not fully capture the depth and complexity of my thoughts. So I'm incorporating Emoji into my speech to better express myself 😉.
Arif Ahmad @ArifAhm92263086
164 Followers 5K Following All things AI, Computer Science and Circuits! Prev. @GoogleAI
Sinchani Chakraborty @sinchani
285 Followers 3K Following Pursuing Masters in CSE @IIT_Kharagpur Interests: NLP, DL, ML
Tony Ojelel @fire_tony123
690 Followers 4K Following Experienced Software Engineer & Mechanical Engineer. Algorithms & Open source enthusiast. Proficient in multiple programming languages.
Sai Bhavana @saibhavana20
23 Followers 116 Following
Sundara Valli Natchiy.. @06e458d6d7b74ff
31 Followers 692 Following
Red Panda @RedPandaya
62 Followers 1K Following
Chipmunk @AlvinMesser
118 Followers 783 Following
3v2k @3v2k1
4 Followers 338 Following Undergrad student, Interested in AI/ML & DL Fascinated by intuitive mathematics behind deep learning algorithms.
Peter Morales @PeterMoralesX
212 Followers 2K Following Founder of funded Stealth AI Startup. Interested in AI development at the edge? DM.
Callum Higgins @higgins40499
14 Followers 195 Following
crk1014 @crk1014
16 Followers 52 Following
Sigmally IO @SigmallyI
33 Followers 142 Following
Sparsh Burman @SparshBurman001
19 Followers 225 Following
ashwin deshpande @ashwindeshpand6
11 Followers 457 Following
Prashant Dixit @Prashant_Dixit0
116 Followers 762 Following AI/Computer Vision/LLM Researcher | Open-source ML | Building cool and exciting Stuff Connect- https://t.co/8wrqNPc2kP
Siddharth VS @SiddharthVS6
8 Followers 63 Following Engineer, Thinker, Artist, Reader. IIT Kharagpur Alumnus. Developer.
Saurav Sahay @sauravsahay
511 Followers 1K Following Research Science Manager, Multimodal Dialogue and Interactions at Intel Labs
Swaroop CH @swaroopch
3K Followers 2K Following
Tushar @TKrishnia60980
86 Followers 285 Following Full Stack Web Dev, Ethical Hacking, Open Source⌨️🖱️ Building a Startup CS VIT(2023-27) DM is open📧
Aviv Sinai @avivsinai
384 Followers 2K Following
raul @raul314314
178 Followers 3K Following "The intelligent man seeks a quiet, modest life, sheltered from misfortune; and if he is a very superior spirit, he will choose solitude" Arthur Schopenhauer
Devr Inc. @DevrOfficial
260 Followers 5K Following Devr is a new Internet protocol for the governance of decentralized privacy networks (DPN), powering a new era for data sharing economies
Shashank @5hv5hvnk
165 Followers 860 Following pre doc @prosemsft working mostly on ml, little on pl. | TIET23
⚡️Neil⚡️ @starshipneil
456 Followers 1K Following infosec/ai, machine learning, electrical engineering, computer science
JP Pugliese @JPPugliese
546 Followers 3K Following
Hariharan @harihar90
0 Followers 1K Following
Pedro Azevedo @Eppie_vux
164 Followers 1K Following
Sanyam Chaturvedi @SanyamChat15281
202 Followers 2K Following
Aaron Defazio @aaron_defazio
6K Followers 356 Following Research Scientist at Meta working on optimization. Fundamental AI Research (FAIR) team
martin_casado @martin_casado
49K Followers 2K Following GP @ a16z ... questionable heuristics in a grossly underdetermined world
Andrew Curran @AndrewCurran_
10K Followers 7K Following Atypically Friendly - I write about AI and human creativity. Will periodically make extremely unusual arguments.
Quintin Pope @QuintinPope5
3K Followers 185 Following ML researcher focusing on natural language modeling and alignment.
Arc Institute @arcinstitute
22K Followers 24 Following A new scientific institution for curiosity-driven biomedical science and technology.
Irina Rish @irinarish
9K Followers 992 Following prof UdeM/Mila; Canada Excellence Research Chair; AAI Lab head https://t.co/UzlrC7ZrGF; INCITE project PI https://t.co/0rV7szd7rH; CSO https://t.co/XDhj6MEtUj
George Hotz 🌑 @realGeorgeHotz
247K Followers 172 Following President @comma_ai. Founder @__tinygrad__
Vipul Ved Prakash @vipulved
5K Followers 839 Following Building an AI supercomputer out of spare internet parts. Founder, CEO @togethercompute
TDM (e/λ) @cto_junior
10K Followers 615 Following L-12 in ZIRP | Larry Ellison's bloodboy | One-man soonicorn | Softbank & FIITJEEs love child Google doc notes ⬇️
Beff Jezos — e/acc .. @BasedBeffJezos
101K Followers 2K Following chief accelerator & founder @ e/acc // thermodynamic priest // Kardashev gradient climber // memetic warlord // building @extropic_ai
Balaji @balajis
1.0M Followers 4K Following Immutable money, infinite frontier, eternal life. #Bitcoin
Bojan Tunguz @tunguz
186K Followers 7K Following Machine Learning ex Nvidia. Kaggle Quadruple Grandmaster. Data Scientist. Physicist. Catholic. Husband. Father. Stanford Alum. e/xgb. XGBoost.eth. AMDG.
Del Complex @DelComplex
5K Followers 67 Following An Alternate Reality Corporation accelerating human potential through AI, neural prosthetics, clean energy, fundamental scientific research
Guillaume Verdon @GillVerd
52K Followers 3K Following Founder & CEO @Extropic_AI • prev: Physics & AI R&D @ (Alphabet X / Google) • Founder @ TensorFlow Quantum • (PhD(ABD) + MMath) @ (IQC / UWaterloo / PI) • e/acc
Extropic @Extropic_AI
28K Followers 27 Following ... . .-.. ..-. -....- .- ... ... . -- -... .-.. .. -. --. / .. -. - . .-.. .-.. .. --. . -. -.-. . / ..-. .-. --- -- / - .... . / ..-. ..- - ..- .-. .
Aydin @aydinwastaken
3K Followers 39 Following Electronic warfare by day, experimental cybernetics by night | @Saronic @8VC @GenCybernetics | 20
DuckAI @TheDuckAI
622 Followers 8 Following An open-source ML research community at Discord: https://t.co/7YDTo6Mo1G
Ilya Sutskever @ilyasut
370K Followers 2 Following towards a plurality of humanity loving AGIs @openai
kyutai @kyutai_labs
6K Followers 6 Following
📅 ThursdAI @thursdai_pod
2K Followers 140 Following Welcome to 🎙️ ThursdAI Your weekly AI spaces, newsletter, podcasts and community Hosted by @altryne and available on https://t.co/xaPyX72Yel
Skunkworks AI @skunkworks_ai
3K Followers 7 Following Accelerating Open-Source AI https://t.co/B5v2ohlIbH https://t.co/9TNVZeJYjd no website for our group
Voyage AI @Voyage_AI_
2K Followers 164 Following Building embedding/vectorization models, customized for your domain and company, for better retrieval quality https://t.co/MEAhTpBQqd
Chris Lattner @clattner_llvm
79K Followers 181 Following Building beautiful things like Mojo🔥 and MAX @Modular, lifting the world of production AI/ML software into a new phase of innovation. We’re hiring! 🚀🧠
Hattie Zhou @oh_that_hat
5K Followers 764 Following Finding \hat{y} Give me anonymous feedback: https://t.co/7aBNrpbad8
Ideogram @ideogram_ai
38K Followers 0 Following Helping people become more creative. It's pronounced eye-diogram. Join our lovely community at https://t.co/aKDNl4OOQf.
main @main_horse
8K Followers 464 Following AGI Believer. Haven't applied @OpenAI. Likes are not always endorsement.
Nikhil Thorat @nsthorat
10K Followers 2K Following Co-founder of Lilac AI (@lilac_ai), now joining @databricks. Past: Co-created TensorFlow.js and Know Your Data. Google Brain // PAIR // Responsible AI
Alignment Lab AI @alignment_lab
11K Followers 3K Following Devoted to addressing alignment. We develop state of the art open sourced AI. https://t.co/6aJDLUvuU5
Sergey Levine @svlevine
79K Followers 122 Following Associate Professor at UC Berkeley Co-founder, Physical Intelligence
Jeremy Howard @jeremyphoward
221K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
Alexia Jolicoeur-Mart.. @jm_alexia
10K Followers 1K Following AI Researcher at the Samsung SAIT AI Lab 🐱💻
Susan Zhang @suchenzang
20K Followers 504 Following @ Google Deepmind. Past: @MetaAI, @OpenAI, @unitygames, @losalamosnatlab, @Princeton etc. Always hungry for compute.
turboderp @turboderp_
369 Followers 18 Following
Naveen Rao @NaveenGRao
28K Followers 782 Following VP GenAI @Databricks. Former CEO/cofounder MosaicML & Nervana/IntelAI. Neuro + CS. I like to build stuff that will eventually learn how to build other stuff.
the tiny corp @__tinygrad__
33K Followers 63 Following We make tinygrad. Our mission is to commoditize the petaflop.
cloud @cloud11665
4K Followers 1K Following SIMD fan | ctf player | ex OI-er | accelerate. CEO and co-founder @figura_labs DM FOR PRIVATE BETA ACCESS
Nous Research @NousResearch
18K Followers 30 Following The AI Accelerator Company. https://t.co/vrD0aDJeto
@karpathy One goal of tinygrad is that it can *output* this code. We currently output an EfficientNet runner in C, no reason we can't do training. With a bit of work we can make the outputted code quite readable.
Engineers can do humanities better than humanities students can do humanities
Reward model and finetuning (SFT & RLHF) dataset are what we really need, but corporations make you think that you want LLM.
@GrantSlatton Same, I want to buy a home just so I can put solar panels on it lol
should we crowdfund openai to hire an engineer to keep us logged in?
should we crowdfund apple to hire an engineer to make personal hotspot work reliably?
AI will magnify the already great difference in knowledge between the people who are eager to learn and those who aren't.
Socialists are economic incels. They cherish the idea that the game is rigged so much that they'd rather talk about that than about how to improve their situation.
The reality is we could've easily mastered energy, food, and material abundance with 1970s era Technology. Instead, we loaded up on virtue signaling, stakeholder engagement, and regulatory capture. I call this general phenomenon "The Blight"
@AravSrinivas Or play the longer game and invest in Extropic 😉
A significant number of major tech figures have given TERRIBLE activists a LOT of money to do HORRENDOUS things over the last decade. What are they going to do to fix it? This should be question #1 at every public event they go to.
Vitalik had a chance to unify the AI and crypto camps and he just fumbled massively. Will I have to step up and unite the tribes myself? It seems so.
The masculine urge to disappear and go into a hyperbolic time chamber to cook alien-level tech and re-emerge with a multi-decadal lead.
China interfered constantly with GitHub to make it barely usable for Chinese residents: DNS poisoning, throttling, blocking subdomains. They don't let us run our apps there. We shouldn't let them run important apps in our country.
@itsandrewgao call me when it starts closing good first issues
Everyone is piling on Google, but I am personally impressed: it’s rare to see a company of that size testing in production these days.
Good fucking lord. What a travesty. Requiring government approval to deploy a model. This is the inevitable outcome of rhetoric like Vinod’s. It’s anti-innovation. It’s anti-public. And we all lose. Keep AI open!!!!