Jade @Euclaise_
⋅ Video game statistician ⋅ Soclib cyberanarchist? ⋅ C, Plan 9, LLMs, etc ⋅ Researcher w/ @NousResearch ⋅ she/they
Purdue University, IN ⋅ Joined December 2020
Tweets: 9K ⋅ Followers: 2K ⋅ Following: 351 ⋅ Likes: 31K
Training a transformer on this x.com/f4micom/status…
arxiv.org/abs/2312.08874 Not really sure how to generalize this to causal attention, except for chunking
Why don't we train RL models to adaptively tune the sampling params of LLMs?
arxiv.org/abs/2404.15702 Doesn't seem to perform incredibly, but they have a lot of neat details about the training pipeline
How does self-correction affect problem solving? In a toy transformer model that was trained to solve mazes, I found that performance reliably improved (!) by inserting mistakes and self-corrections into the training data.
Microsoft just released Phi-3 - phi-3-mini: 3.8B model trained on 3.3T tokens rivals Mixtral 8x7B and GPT-3.5 - phi-3-medium: 14B model trained on 4.8T tokens w/ 78% on MMLU and 8.9 on MT-bench arxiv.org/abs/2404.14219
I just noticed - H2O's Danube2 1.8B is the first base model to outperform Phi 1.5 (1.4b) at a ~similar param count
I recall seeing a comment complaining about a similar issue way back when AI Dungeon started censoring - the NSFW filter would apparently engage on the presence of trans characters in SFW settings
🚨 Iterative Reasoning Preference Optimization 🚨 - Iterative algorithm for reasoning tasks: generate pairs & apply DPO+NLL - Improves accuracy over iterations on GSM8K, MATH, ARC & beats baselines E.g. Llama2-70B GSM8K: 55.6%->81.6% (88.7% maj32) arxiv.org/abs/2404.19733 🧵(1/5)
KAN: Kolmogorov–Arnold Networks Proposes an alternative to MLP that outperforms in terms of accuracy and interpretability arxiv.org/abs/2404.19756
Plan 9 users, it's our turn to get the fancy logo. github.com/SAWARATSUKI/Se…
yeah. if youre writing a shell script it needs to be posix compatible
llama-3 models did very poorly on this benchmark, simply because their context length is *limited to 8k*. But... with zero-training (actually just a simple 2 line config) you can get 32k context out of llama-3 models with *exceptional* quality. llama-3 8B surpasses many models…
We evaluated 26 models initially: 🏆 Claude/OpenAI/Gemini models are doing great in this task 💪 Mistral’s MoE models outperform GPT-3.5-Turbo 👍 The 7B CodeQwen beat many larger general & code-specific models Many models are good at Java but may need to learn more Rust and C++
suddenly surrounded by gpt2... then u tell me gpt2 is not gpt-2 but gpt-4.5? ridiculous world...
Releasing StarCoder2 Instruct! 🚀 Achieves 72% HumanEval score using only self-generated content without any GPT-3.5/4 data. This work demonstrates that self-instruct works already well at the 15B scale without data from proprietary models! Read more: huggingface.co/blog/sc2-instr…
Well... two problems: (1) SIX best math students in the USA get to compete. (2) If I were an IMO judge, the solution would receive a 3 out of 7. A stricter judge might give a 2. A more generous judge might give a 4, but I would protest anything more than that. Context:…
uh.... gpt2-chatbot just solved an International Math Olympiad (IMO) problem in one-shot the IMO is insanely hard. only the FOUR best math students in the USA get to compete prompt + its thoughts 🧵
someday i would really love to see a @usgraphics take on a precision-engineered, high-legibility proportional typeface for typesetting one’s technical reports, engineering textbooks, and so on. like a “Boca Raton Serif”, maybe…
@kareem_carr I'm pretty sure my own thinking is just pattern matching
@wireless_anon @dmvaldman Yeah the implied lower bound is enough parameters to achieve the same loss as gpt4, of course 1 parameter isn’t expressive enough to get that loss
learning how to say something in 30 seconds that takes most people 5 minutes is a big unlock
@JustineTunney Misleading stat. 824 t/s to read the prompt, but only 18 t/s to generate a response. Roughly on par with a 3060 running Mistral 7B. Maybe you could upload a video of the model generating a response to feel what kind of performance we’re really looking at.
@JustineTunney usually people talk about token generation speed, not prompt eval speed. this is confusing people!
This is prompt eval figure, not generation. P.s. this is second time I get tricked by this
This is what I work on for the last 6 months. Paper has very nice insight. Lot of effort on the data engineering side, We had custom streaming library to be able to change the weight of each dataset we trained on on the fly ( multiplexing). No LLM company published info or…