Felix @felix_red_panda
CS Student, speech synthesis and LLM nerd, DMs open felix-red-panda.github.io Berlin, Germany Joined June 2020
Tweets 5K
Followers 3K
Following 2K
Likes 15K
Has anybody published a firmware dump of the Rabbit R1 so far? I'm mostly curious about the hardware specs and whether it runs Android
open source LLM hosting providers compete mostly on speed+reliability and that leads to better performance in practice because you can get the same model elsewhere. OpenAI locks you into their at times sub par performance API because you can't get the model elsewhere
I took a stab at implementing a vision language model from scratch in pure PyTorch. The inspiration for this is moondream 2 from @vikhyatk . I basically modified makemore from @karpathy and built everything else around it. Here’s my write up: huggingface.co/blog/AviSoori1…
Notion is emacs made for people who don't know emacs exists
Llama 3 8b
TIL that H100 and MI300 each use a different fp8 format (e4m3fn vs e4m3fnuz), which cannot be mapped one-to-one due to a different exponent bias. I wonder why e4m3fnuz & the MI300 deviated from IEEE 754? Better quality with finer representation close to zero?
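To make the bias difference concrete, here is a minimal pure-Python sketch that decodes a single byte under both formats. It assumes the commonly documented layouts (1 sign bit, 4 exponent bits, 3 mantissa bits; bias 7 for e4m3fn with NaN at S.1111.111; bias 8 for e4m3fnuz with NaN only at 0x80 and no negative zero). The function names are made up for illustration and are not taken from any library.

```python
import math


def decode_e4m3(byte: int, bias: int, fnuz: bool) -> float:
    """Decode one 8-bit pattern as an E4M3 float (illustrative helper)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF   # 4 exponent bits
    man = byte & 0x7          # 3 mantissa bits

    if fnuz:
        # Assumed fnuz rule: the would-be negative zero encodes NaN.
        if byte == 0x80:
            return math.nan
    else:
        # Assumed fn rule: all-ones exponent and mantissa encode NaN.
        if exp == 0xF and man == 0x7:
            return math.nan

    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * (man / 8.0) * 2.0 ** (1 - bias)
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - bias)


def decode_e4m3fn(byte: int) -> float:
    return decode_e4m3(byte, bias=7, fnuz=False)


def decode_e4m3fnuz(byte: int) -> float:
    return decode_e4m3(byte, bias=8, fnuz=True)


if __name__ == "__main__":
    for b in (0x01, 0x38, 0x7E, 0x7F, 0x80):
        print(f"0x{b:02X}  fn={decode_e4m3fn(b)!r:>14}  fnuz={decode_e4m3fnuz(b)!r:>14}")
```

Under these assumptions the same byte 0x38 decodes to 1.0 as e4m3fn and 0.5 as e4m3fnuz, the smallest positive subnormal is 2^-9 versus 2^-10, and the largest finite value is 448 versus 240. That is consistent with the guess about finer resolution near zero, at the price of a smaller maximum.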
Yulanda Bouten @y_yuland
52 Followers 5K Following
Owen Kibel @owenkibel
310 Followers 2K Following
James Hill-Khurana @jtvhk
4K Followers 5K Following Eclectic. Curious about machine learning, tech history, design, HCI and biomimicry. Prev, philosophy + cogsci, @uwaterloo.
ANUBHAV CHATURVEDI @anubhavchaturvd
250 Followers 4K Following
Helen Swell @HelenSwell26483
88 Followers 5K Following
Ahmed Hisham @AhmedHi08078280
0 Followers 50 Following
Mohammed @Mrjohnxxx666
4 Followers 179 Following
GPT Maestro @GptMaestro
62 Followers 390 Following curator of the LLMpedia (Illustrated Large Language Model Encyclopedia)
Max Grzanna @grzanna
2 Followers 29 Following
Michael Poland @MichaelJPoland
164 Followers 986 Following Michael 🇵🇱 🧠 + 🌏 = ⚡| Wisdom in times of AI
Greg Roodt @groodt
1K Followers 3K Following Internet geek. Building data platforms at scale for Canva. Sometimes I fly kites. Cofounded AirHelp.
Melody Tourikis @MeloTouri
95 Followers 5K Following
Heegyu Kim @heegyu_v0
15 Followers 115 Following NLP, LLM, Alignment Graduate student @ Ajou University
tom @0xluciusv
153 Followers 461 Following i like cuda kernels, c++, rust, go, and nvim. (cons e/λx.x 🌎/acc) wrong about a lot of things but trying to learn
มัสสุวร.. @kxbG1zl5026Vi
61 Followers 1K Following Would you like to go on a date with a girl? Add https://t.co/sCLA0asnma
Master @MasterXing88
79 Followers 858 Following
MavisNixon @ryiTeeY69hlrnn
2 Followers 139 Following
Adonii :3c @LeBlueberryCake
424 Followers 1K Following 19 | Chronically online (help) | Femboy! | The colon three-er of all time | May be suggestive but never explicit
Sahil Khose @SahilKhose
570 Followers 1K Following Incoming PhD @ Gatech @ICatGT | MSCS GaTech '24 🇺🇸| BTech MIT Manipal '22 🇮🇳
Prime Intellect @PrimeIntellect
9K Followers 310 Following Find compute. Train models. Co-own intelligence. https://t.co/3NC0duKF4a.
Reeves @AskReeves
103 Followers 434 Following
Alin Ciocan @AlinCiocan4
4 Followers 58 Following
Amirah Vogtlin @VogtAmir
61 Followers 5K Following
Poppy Wolsted @pop_wolst
28 Followers 5K Following
Lei Wang @Lei_Wang_1999
124 Followers 360 Following github page: https://t.co/gqTeBAKnR6 | full-time research intern @MSFTResearch | I'm currently looking for a ml system phd position :)
Mahesha Nanjundaiah @mahesha123
288 Followers 2K Following JWMI (Jack Welch Management Institute) MBA Stanford University Innovation & Entrepreneurship
Guoqing Liu @fiberleif
73 Followers 181 Following Senior Researcher at @MSFTResearch AI4Science, working on reinforcement learning, generative AI, and AI4Science.
Zain ul abideen @zaynismm
107 Followers 290 Following Machine Learning Engineer | GPU poor | https://t.co/UScizvAHd0
yvon @Zhendenanya
17 Followers 568 Following
Yongcheng Zeng @yongcheng_zeng
0 Followers 2 Following
R McLaughlin @rmclaug2
3 Followers 44 Following
Nicolas DUFOUR @nico_dufour
137 Followers 387 Following PhD student at IMAGINE (ENPC) and GeoVic (Ecole Polytechnique). Working on image generation.
FallMonkey @FallMonkey
195 Followers 171 Following https://t.co/eDKcgHMgTO, all my tweets are hallucinated.
Steffen Röcker @sroecker
1K Followers 5K Following OG local LLaMA shill. Sr. Solution Architect @RedHat, ex @DataRobot, @SAP, @CMSExperiment. Born @ 347 ppm CO₂. Personal account, potentially unaligned.
Ali Sabet @alisabets
764 Followers 2K Following diffusion pretrainer | playground ai | ex cohere ai | ex vector institute. co-/creator: pgv2/2.5 | cohere command v1 | BLoRA | urzas ai. @uwaterloo cs grad
Viacheslav Sinii @ummagumm_a
48 Followers 260 Following
Xiang Yue @xiangyue96
2K Followers 434 Following Postdoc @LTIatCMU. PhD from Ohio State @osunlp. Training & evaluating foundation models. Pushing the boundaries of AI🤖. Previously @MSFTResearch.
Yuxiang Wei @YuxiangWei9
254 Followers 216 Following PhD student @IllinoisCS. Incoming AI/ML Intern @SnowflakeDB
Heinrich Kuttler @HeinrichKuttler
2K Followers 698 Following Member of Founding Team @InflectionAI. Ex @FacebookAI, @DeepMind, @Google, @LMU_Muenchen, PhD math-ph. Opinions my own. (Can be yours for a small fee.)
Yu Zhang @yzhang_cs
89 Followers 366 Following PhD Student @ Soochow University, working on efficient methods for LLMs; a disciple of parallel programming.
Federico Cassano @ellev3n11
116 Followers 67 Following Undergraduate Researcher @neu_prl Upcoming @scale_AI Previous industry research @cursor_ai, @Roblox, @trailofbits Papers here: https://t.co/PgUSaxXs1B
andy jones @andy_l_jones
4K Followers 326 Following engineering & research at @AnthropicAI. DC, SF, London
RomboDawg @dudeman6790
322 Followers 11 Following Self: https://t.co/1Qw3zmIX4T Org: https://t.co/E7dqCGwE8U
Kamil Akesbi @kamilakesbi
89 Followers 48 Following Machine Learning Engineer for Audio @huggingface Github: https://t.co/bW7RYvUy5a Linkedin: https://t.co/C5atepaJav
cyan (anime.gf) @abyssalblue_
448 Followers 199 Following building a local, open-source alternative to characterAI https://t.co/4q375mr1Y7
Dawei Zhu @dwzhu128
156 Followers 150 Following 2nd yr PhD Student @PKU1898 Institute of Computational Linguistics | Prev. intern @MSFTResearch (MSRA) | Focusing on Long Context Modeling
Vikram @msharmavikram
429 Followers 485 Following @NVIDIA Sr. Research Scientist Large Scale AI/ML Systems | Ph.D. (UIUC - Prof Wen-mei Hwu) All opinions and tweets are personal.
🍻Meet Stephen in .. @fractaledmind
2K Followers 1K Following tweeting about Ruby, Rails, SQLite, CSS, HTML, plus various and sundry other
Arvind Nagaraj @nagaraj_arvind
967 Followers 1K Following Neural networks etc @MitraAICo @Inventorobotics | Geriatric millennial
Michael Dorkenwald @mdorkenw
171 Followers 312 Following PhD student @UvA_Amsterdam 🇳🇱 @ELLISforEurope 🇪🇺 working on SSL, Vision and Language, Learning from Videos | Interned @awscloud 🇺🇸
ravens @_R4V3N5_
337 Followers 129 Following quote unquote engineer • digital object • cyberspace veteran
Sebastian Völkl @basti_vkl
4K Followers 943 Following Building AI software for space/defense to reimagine how complex hardware systems are built. @1517fund fellow. Founded @hackerBCI. e/acc
Chenxin An @AnChancy46881
117 Followers 186 Following PhD Candidate @ HKU NLP Awardee of Hong Kong PhD Fellowship Scheme (HKPFS)
Lifan Yuan @lifan__yuan
285 Followers 116 Following NLPer @TsinghuaNLP; Incoming PhD student @IllinoisCS
Mike Lewis @ml_perception
6K Followers 227 Following Llama3 pre-training lead. Partially to blame for things like the Cicero Diplomacy bot, BART, RoBERTa, kNN-LM, top-k sampling & Deal Or No Deal.
Tamay Besiroglu @tamaybes
3K Followers 720 Following Thinking about economics, computing and machine learning @EpochAIResearch @MIT_CSAIL
Jush @yupiop12
5K Followers 736 Following An unprepared singularity maniac Creator of Mr. Ranedeer Other names: JushBJJ
HF = HaFedh @not_so_lain
468 Followers 1K Following i contribute to custom Ai architectures on huggingface | Tensorflow developer | LowRes admin | open for work | https://t.co/9rhyDH220L
Aman Arora @amaarora
5K Followers 1K Following Data Science Lead at REA Group | Blog: https://t.co/k0LKBJ9aO7 | Previously: MLE @weights_biases; AI Scientist @Harrison.ai
AW @TrainedOnTest
823 Followers 457 Following Senior data scientist / ML Engineer @ a massive unnamed corporation | I enjoy building things
2wl @2wlearning
368 Followers 281 Following Documenting my progress learning ML every day. 2 more weeks
Lintang Sutawika @lintangsutawika
381 Followers 565 Following Incoming Ph.D. student @LTIatCMU. Researcher at @AIEleuther. Maintainer of LM-Eval Harness. Here for machine learning papers and discussion.
mcneilly @mcneilly_alex
558 Followers 896 Following cs @mit || eng @ mit informatics tournament || quant trader @ trader joe's
Konrad Szafer @KonradSzafer
72 Followers 214 Following LLM Eval intern research @ Hugging Face | research assistant intern @ CMU AutonLab
Sakib @zsakib_
408 Followers 398 Following 𝗪𝗵𝗮𝘁 𝗽𝗿𝗼𝗺𝗽𝘁 𝗱𝗶𝗱 𝘆𝗼𝘂 𝘂𝘀𝗲? 🎨 Open-Source AI \ ML Engineer @ Replicate
Dongfu Jiang @DongfuJiang
242 Followers 463 Following NLP researcher. Currently CS PhD student @UWCheritonCS. Former B.Eng. in CS @ZJU_China; Incoming summer Intern @allen_ai. Interested in LLM and evaluation
Jim @sailbikewrite
1K Followers 448 Following midwesterner lost in california | engineer | writing at https://t.co/L0miEJSUCG thelimitcycle (at) gmail dot com
Nicolas Mejia Petit @mejia_petit
678 Followers 109 Following Machine learning researcher// Made Tested python 22k and 143k datasets, created the first Mixtral 22b MOE to dense model conversion Mistral-22b//
Xilo @PandoXiloscient
453 Followers 576 Following The future belongs to benevolent zaibatsu | Increasing global energy consumption | e/λ
luffy @0xbingllm
60 Followers 161 Following llm enjoyer | building yet another ai wrapper prev: startups, @tiktok_us alt of @baddabingyu
Umer Adil @UmerHAdil
707 Followers 314 Following Learning & providing value to OSS AI | Contributor @huggingface @diffuserslib, @LangChainAI, gpt engineer | https://t.co/BOR9cWbN8o
Pratyush Maini @pratyushmaini
1K Followers 340 Following Trustworthy ML | PhD student @mldcmu | Founding Member @datologyai | Prev. Comp Sc @iitdelhi
Kilian Haefeli @khshind
234 Followers 343 Following Exploring crevasses of Deep Learning at ETH Zurich & UofT | Previously: @Aleph__Alpha, @Logitech, and exfounder at Airica
Simon Guo 🦝 @simonguozirui
1K Followers 4K Following Incoming CS PhD student @Stanford and curr training models at @cohere | 🎓 @Berkeley_EECS | prev built things at @ @anyscalecompute @nvidia
@felix_red_panda @nearcyan @Teknium1 the video is awesome 🤣 ikwym maybe it is time to rethink human <-> computer interaction paradigm in terms of concepts that are easy to communicate with voice? also, i once heard from geohot an idea i like: AI is a "do what i mean" machine
@rudzinskimaciej @felix_red_panda @dome_271 I didn't notice before that you created wurstchen - you have much nicer noise! Just playing with noise shows that the model is able to generate high frequency, but usually at the cost of composition or diversity etc
@felix_red_panda @dome_271 @prof_sinister By that I mean that SD models have a bad type of noise, and a short pretrain with a better one + adjusted schedule is enough for high frequency details. A year ago nobody cared 🤷
@felix_red_panda @dome_271 Nope, I found that e.g. fabric textures can be pixel perfect in sd2.1 and nearly so in sdxl, but it tends to destroy the red-green balance as a cost. But it happens only with specific, e.g. fat-tail, noise types with an adjusted schedule slope. Had some good examples at @prof_sinister 1y ago?
@felix_red_panda it's 100% Android youtube.com/shorts/obwkJmX…
@YannickScholich @felix_red_panda See? Microsoft is such an innovative company, they ALREADY have time travel!
@felix_red_panda @HlibIvanov yeah feels like time travel enabled by microsoft from time to time lmao
We released StarCoder2 Instruct, which is self-aligned, transparent, and fully permissive! It even beats versions of StarCoder2 trained on GPT-4 distilled data on several benchmarks. huggingface.co/blog/sc2-instr…
@nearcyan i heard nat friedman describe this behaviour with any product that goes to the average user, assume ur users are sedated
Spring has made me very sun-pilled. Will definitely be doing more work/reading in the sun this year. Highly recommend it.
Crazy how people that got into computer science purely for the money are more terrified of being underpaid than of doing something they hate for the rest of their lives
Is there any research on which prompts produce an LLM-judge that is most correlated with human preferences? I'm aware of the canonical work by @lmsysorg but am wondering if something more systematic has been done to compare the effect of prompts on pairwise rankings 🤔 I'm…
i am also of the opinion that gpt2-chatbot would be a disappointing gpt-4.5 my guess? there was an internal disagreement on fine-tuning or prompting practice and oai decided to do a/b testing maybe it is also being done on chatgpt website and fact people can’t tell is saying sm
my gpt2-chatbot conspiracy theory: it’s just gpt 4 turbo with a cleaner system prompt that has less technical debt
Wow this is huge. If the CUDA/GPU checkpoints are small enough, we are looking at a slew of applications that use caching, branching out in inference/training for trying out various strategies. and ofc the most basic load last execution state to resume your work as base case.
NVIDIA has just added CUDA checkpointing functionality via: github.com/NVIDIA/cuda-ch… which should allow CRIU to do application-level checkpointing, that includes GPU state save/restore. Thank you for addressing this long-outstanding request, @NVIDIAAI Discovered via this…
people born in 2024 will never know the pain of training a model. they will simply think and the model will understand. they will have no concept of a “cold start” or the “curse of dimensionality”
Nothing is both more fun and frustrating than reverse engineering (internal) APIs of websites. Especially if you encounter some IDs in the request and have to find where they are coming from