Yuri Campbell @yuricampbll
Senior Scientist & Co-PI @Fraunhofer: LLMs, NLP & GraphML in Knowledge Transfer, Lecturer in AI @FOMHochschule. Former: Quantum Statistics @mpiMathSci 🇩🇪 Joined February 2019
Tweets 4K · Followers 488 · Following 1K · Likes 51K
📢New paper: "In-Context Principle Learning from Mistakes" Instead of prompting using only *correct* few-shot examples, we intentionally make *mistakes*, and then learn "principles" or "lessons" from them. Led by @tianjun_zhang @aman_madaan @luyu_gao arxiv.org/pdf/2402.05403…
we are so back in the open RLHF land, @lcastricato is back. a great example of synthetic data for a specific open ended problem (e.g. not code/math)
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT The M2-BERT retrieval encoder on LoCoV1 outperforms competitive baselines by up to 23.3 points, despite containing 5 to 90x fewer parameters. arxiv.org/abs/2402.07440
Introducing GenAI-Arena! Inspired by the awesome Chatbot Arena, we built a web demo on @huggingface for testing Image generation/editing models in the wild. We include all the T2I models like SDXL and editing models like PNP, Prompt2Prompt, etc. We will include video models soon.
"The boundary between trainable and untrainable neural network hyperparameter configurations is fractal" Amazing.
Do you want to translate texts for a quantitative analysis, but using Google Translate & Co. is too expensive? Then use an open-source translation model instead – @RonjaSczepanski, @MoritzLaurer, @Jrn_rz, and I show that they are a reliable alternative: doi.org/10.31219/osf.i…
📢We are excited to announce our paper " Deep Learning-based Computational Job Market Analysis: A Survey on Skill Extraction and Classification from Job Postings" has been accepted to #EACL2024 at the @nlp4hr workshop. In cooperation with @NLPnorth and @MaiNLPlab.
Weight averaging and model merging for LLMs seem to be the most interesting themes in 2024 so far. What are the benefits? Combining multiple models (or checkpoints) into a single one can improve training convergence, overall performance, and also robustness. I will probably do…
InternLM-Math demo: huggingface.co/spaces/internl… 7B and 20B Chinese and English Math LMs with better than ChatGPT performances. InternLM2-Math are continued pretrained from InternLM2-Base with ~100B high quality math-related tokens and SFT with ~2M bilingual math supervised data. We…
🚀 A game-changer benchmark: LLM-Uncertainty-Bench 🌟 📚 We introduce "Benchmarking LLMs via Uncertainty Quantification", which challenges the status quo in LLM evaluation. 💡 Uncertainty matters too: we propose a novel uncertainty-aware metric, which tests 8 LLMs across 5…
A new 7B LLM, Snorkel-Mistral-PairRM-DPO (@SnorkelAI), achieves SOTA on AlpacaEval 2.0! What's the key? They use a reward model (RM) to generate synthetic preference data for iterative DPO! In this case, they use our, PairRM-0.4B, to do rejection sampling and iteratively…
State Space Models: A Modern Approach This is an interactive textbook on state space models (SSM) using the JAX Python library. probml.github.io/ssm-book/root.…
Knowledge Fusion of LLMs Is it possible to merge existing models into a more potent model? We have already seen a few ways that show the potential to effectively do this using approaches like weight merging and ensembling of models. This work proposes FuseLLM with the core…
@lateinteraction Literally true even in theory
📢Tasks with > 10k classes (e.g. information extraction) are hard for in-context learning: typically a tuned retriever or many in-context calls per input are used ($$$) Infer-Retrieve-Rank (IReRa) is a SotA program using 1 frozen retriever with a query predictor and reranker.
Why should learning rate depend on network depth? Take a simple linear neural network, a product of d random matrices with standard init. Plot Hessian singular values as function of depth, and you'll see them grow, a classical Random Matrix Theory result
The Conference on Language Modeling 🦙 (colmweb.org) has the mission of "creating a community of researchers with expertise in different disciplines, focused on understanding, improving, and critiquing the development of LM technology." 🧵 Here are 17 papers from 17…
SFT+KTO ~= SFT+DPO, matching our own observations. Worth noting that this is on the same data: it doesn't yet take advantage of the fact that KTO can access way more data IRL. Also, if your base model is good enough, we've found that you can skip SFT before KTO!
Bojan Tunguz @tunguz
187K Followers 8K Following Machine Learning ex Nvidia. Kaggle Quadruple Grandmaster. Data Scientist. Physicist. Catholic. Husband. Father. Stanford Alum. e/xgb. XGBoost.eth. AMDG.
Nathan Benaich @nathanbenaich
51K Followers 32K Following solo member of investment staff @airstreet, brewing ambition @airstreetcafe, next token predictor @airstreetpress
merve @mervenoyann
55K Followers 4K Following open-sourceress at @huggingface 🧙🏻‍♀️ proud mediterrenean 🍋 I do TL;DR on ML papers sometimes. RTs != endorsements
Jeremy Howard @jeremyphoward
221K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
Pasquale Minervini .. @PMinervini
7K Followers 4K Following Researcher in ML/NLP at the University of Edinburgh (faculty @InfAtEd @EdinburghNLP), @ELLISforEurope, @UCL_NLP, PI for @Clarify2020, https://t.co/WydvfU8ugz he/they
Ramsri Goutham Golla @ramsri_goutham
10K Followers 3K Following Shares learnings from bootstrapping 2 AI SaaS Apps to $100k ARR with no employees: https://t.co/fU8yoiYVDc https://t.co/DTyILliHVm My NLP courses: https://t.co/MYUyOxGSkA
TuringPost @TheTuringPost
62K Followers 16K Following Newsletter exploring AI & ML - Weekly trends - LLM/FM insights - Unicorn spotlights - Global dynamics - History Led by @kseniase_ Elevate your AI game 👇🏼
yobibyte @y0b1byte
15K Followers 2K Following Kurin ViTaly, senior research scientist @IsomorphicLabs, ML PhD from @UniofOxford on RL, Multitask learning & Graphs
Robert Scoble @Scobleizer
504K Followers 72K Following Follow me on my new podcast with AI startups, Unaligned. Tech industry color commentator since 1993. Author/Blogger. Former strategist @Microsoft.
Martin Görner @martin_gorner
12K Followers 6K Following Product Manager for Keras and Tensorflow high-level APIs. Previously worked on Cloud TPUs (Tensor Processing Units). Passionate about democratizing ML.
Carlos E. Perez @IntuitMachine
30K Followers 4K Following Artificial Intuition, Fluency & Empathy, DL Playbook, Patterns for Generative AI, Patterns for Agentic AI https://t.co/fhXw0zjxXp
Quinn @quinn95789
2 Followers 1K Following
Kai-Fu Lee @kaiifulee
1K Followers 1K Following #AI Expert, CEO of @01ai_yi and Chairman of 创新工场 @sinovationvc , former President of Google China, Author of AI 2041 and NYT Bestseller AI Superpowers
Tusenet @Tusenet13243
1 Followers 472 Following
Vyag7_hard @vyag764146
3 Followers 1K Following
Elaine_SD @ElaineSD1206
7 Followers 1K Following
miniahem @miniahem91830
0 Followers 712 Following
Bla__ckpudd @BCkpudd64777
6 Followers 926 Following
jacqueline zimmermann @jacquel84353275
6 Followers 84 Following always looking for unexpected insights.
Charles @nightblue1922
71 Followers 381 Following
Sungnyun Kim @kim_sungnyun
160 Followers 222 Following PhD student at OSI LAB @kaist_ai. Ex-intern @AmazonScience, @naver_webtoon. Computer Vision, Representation Learning, SSL, GenAI.
Elena Senger @ElenaSngr
13 Followers 39 Following Scientist at @Fraunhofer, @inno_data and PhD student at @LMU_Muenchen, @MaiNLPlab.
Devr Inc. @DevrOfficial
264 Followers 5K Following Devr is a new Internet protocol for the governance of decentralized privacy networks (DPN), powering a new era for data sharing economies
Gokul @gokstudio
1K Followers 5K Following ML Engineer @Apple, ex-IBM Research. Ex Intern, Google Research. MSc CS at @ETH. Views my own, Retweet != Endorsement.
Alexander Kalian @AlexanderKalian
780 Followers 1K Following PhD candidate building AI-driven #QSAR models ⚕️🧬💊📈 @KingsCollegeLon #cheminformatics Passionate about #AI, #biotech + other innovation. Views are my own!
Unsloth AI @UnslothAI
3K Followers 245 Following Making AI & LLMs more accessible + faster for everyone! 🦥 Github: https://t.co/2kXqhhvLsb Discord: https://t.co/1Gmc1SDElj
Carlos Reyes @CarlosReyesAI
210 Followers 800 Following PhD Student, Artificial Intelligence and Engineering
EmbeddedLLM @EmbeddedLLM
210 Followers 594 Following Your open-source AI ally. We specialize in integrating LLM into your business. Expert consulting, robust support and AMD ROCm platform optimization.
Matt Barta @MtBarta
154 Followers 680 Following Founder at @DeployQL, specializing in RAG. Get in touch!
Daniel Han @danielhanchen
7K Followers 930 Following Building @UnslothAI. Finetune LLMs 30x faster https://t.co/aRyAAgKOR7. Prev ML at NVIDIA. Hyperlearn used by NASA. I like maths, making code go fast
Dr Martin Hiesboeck @MHiesboeck01
101 Followers 5K Following #blockchain | Investor | Private Account, Public Account @MHiesboeck
Michel @Moi39017963
332 Followers 3K Following Just a curious, optimist and worried man in a changing world. My opinion may - and will- change over time
Blaze (Balázs Galamb.. @gblazex
1K Followers 967 Following A Smooth Guy; Developer of SmoothScroll for macOS, Windows & Google Chrome.
Rahul Soni 🇮🇳.. @rahulsoni9
26 Followers 227 Following
Junyang Lin @JustinLin610
5K Followers 1K Following Chief Evangelist Officer of Qwen Team & OpenDevin, building LLM and LMM. Now @Alibaba_Qwen . Previously @PKU1898 LANCO group. ❤️ 🍵 ☕️ 🍷 🥃
EarnestLagomarsino @EarnestLag39328
140 Followers 2K Following
Sugar Mommy @sugar_momm1285
2K Followers 5K Following
Vivianne @Neight689899
10 Followers 2K Following I don’t dare to expect koi-like luck, but I hope everything goes smoothly. The road is long and the future is promising.
Judit 🤍 @ha5xxx
367 Followers 5K Following ✨ Software Developer Astronomer, trying to be useful to the humanity. I like AI tools, I plan to move in new directions. DM 🤍
Puneet Dhanuka @DhanukaPuneet
319 Followers 4K Following Software @ PhonePe👨‍💻 |Tech + Entrepreneur, cricket + badminton | Now Health + Fitness enthusiast + Travel | Pictian.
Ana Rojo-Echeburúa @arojomaths
726 Followers 3K Following Data Science & AI || PhD in Applied Mathematics || Spanish living in Scotland || Crossfit Athlete || Content Creator
Darko @Darko1521056
278 Followers 4K Following
Andreas Köpf @neurosp1ke
5K Followers 452 Following Exploring ways to algorithmically model our world.
Anne Kreuter @AnneKreuter_
4 Followers 33 Following Working on LLMs @fraunhofer 🧑‍🔬 MSc Computational Linguistics @UniStuttgart Institut für maschinelle Sprachverarbeitung 🎓
Alexei Karamazov @CousinKaramazov
122 Followers 2K Following
Sicong @Leon_L_S_C
96 Followers 576 Following CS Graduate, Ph.D. majoring in Multi-Modality AI Research, Alibaba-NTU Talent Programme.
Chiara Maria Cervetta @ChiaraCervetta
1K Followers 5K Following Florist | @Google Local Guides Guiding Star on @GoogleMaps | Feedback Aid Enhancing User Experience of AI Services | @kinpersonalai Beta Tester
Yuntian Deng @yuntiandeng
3K Followers 3K Following #NLProc Postdoc @ai2_mosaic | Assistant Professor @UWaterloo '24 | Faculty Affiliate @VectorInst '24 | PhD @Harvard
Naman Jain @StringChaos
899 Followers 892 Following CS PhD @UCBerkeley | Projects - R2E, LiveCodeBench, Chatbot-Arena Coding, RAFT, Data Quality | Past: @AWS @MSFTResearch @iitbombay
Jimin Zhou @jimin_zhou95722
31 Followers 135 Following
Varun Kumethi @xissaxknife
765 Followers 1K Following Futurist | Business Process Automations | AI/LLM Researcher | Co:mmunity Champion @Cohere | My views are my own.
Mrs Carey Mulligan @CareyyMulligann
10 Followers 424 Following
Maksura Jalal @jalalmaksura
117 Followers 1K Following #Women #Entrepreneur (#Cothing #business & #Digital #marketer, #Blogging, #Teespring, #Pinterest, #Fiverr #Virtual assistant)
Chaeeun Kim @chaechaek1214
87 Followers 144 Following My research interest lies in Retrieval-Augmented Generation. I am currently a research intern at KAIST AI.
Yann LeCun @ylecun
709K Followers 718 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
François Chollet @fchollet
469K Followers 770 Following Deep learning @google. Creator of Keras. Author of 'Deep Learning with Python'. Opinions are my own.
Sebastian Raschka @rasbt
266K Followers 905 Following Machine learning & AI researcher writing at https://t.co/A0tXWzG1p5. LLM research engineer @LightningAI. Previously stats professor at UW-Madison.
AK @_akhaliq
309K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo follow on Hugging Face: https://t.co/q2Qoey80Gx
Andrej Karpathy @karpathy
977K Followers 904 Following 🧑‍🍳. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥
Bojan Tunguz @tunguz
187K Followers 8K Following Machine Learning ex Nvidia. Kaggle Quadruple Grandmaster. Data Scientist. Physicist. Catholic. Husband. Father. Stanford Alum. e/xgb. XGBoost.eth. AMDG.
Google DeepMind @GoogleDeepMind
942K Followers 275 Following We’re a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.
Jim Fan @DrJimFan
229K Followers 3K Following @NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.
Hugging Face @huggingface
342K Followers 189 Following The AI community building the future. https://t.co/VkRPD0VKaZ #BlackLivesMatter #stopasianhate
Omar Sanseviero @osanseviero
31K Followers 2K Following Chief Llama Officer @huggingface 🦙 Founder @AI_Learners. Xoogler (SWE @Google Assistant, 20% PM TF Graphics). 100% Hacker Llama🇵🇪🇲🇽
abhishek @abhi1thakur
81K Followers 662 Following 🤗 I build AutoTrain @huggingface 👨🏽‍💻 World's First 4x Grand Master @kaggle 🎥 YouTube 100k+: https://t.co/BHnem8fTu5 ⭐ GitHub Star
Ross Wightman @wightmanr
18K Followers 1K Following Computer Vision @ 🤗. Ex head of Software, Firmware Engineering at a Canadian 🦄. Currently building ML, AI systems or investing in startups that do it better.
Julien Chaumond @julien_c
46K Followers 1K Following Co-founder and CTO at @huggingface 🤗. ML/AI for everyone, building products to propel communities fwd. @Stanford + @Polytechnique
merve @mervenoyann
55K Followers 4K Following open-sourceress at @huggingface 🧙🏻‍♀️ proud mediterrenean 🍋 I do TL;DR on ML papers sometimes. RTs != endorsements
Michael Bronstein @mmbronstein
43K Followers 4K Following #DeepMind Professor of #AI @UniofOxford / Fellow @ExeterCollegeOx / ML Lead @ProjectCETI / https://t.co/kZpGpDzYeV
Thomas G. Dietterich @tdietterich
50K Followers 505 Following Distinguished Professor (Emeritus), Oregon State Univ.; Former President, Assoc. for the Adv. of Artificial Intelligence; Robust AI & Comput. Sustainability
Logan Hallee @Logan_Hallee
38 Followers 57 Following Bioinformatics and Data Science PhD Student @UDelaware in @GleghornLab. Norway, ME - Newark, DE
Elena Senger @ElenaSngr
13 Followers 39 Following Scientist at @Fraunhofer, @inno_data and PhD student at @LMU_Muenchen, @MaiNLPlab.
Barbara Plank @barbara_plank
9K Followers 1K Following Prof, Chair for AI & Computational Linguistics @LMU_Muenchen, Head of @MaiNLPlab & co-director @CisLMU Prof at @ITUkbh @NLPnorth @ELLISforEurope scholar
Unsloth AI @UnslothAI
3K Followers 245 Following Making AI & LLMs more accessible + faster for everyone! 🦥 Github: https://t.co/2kXqhhvLsb Discord: https://t.co/1Gmc1SDElj
Conference on Languag.. @COLM_conf
2K Followers 6 Following https://t.co/GhGCMEoa4A Abstract submission: March 22, 2024
Data Science for Inno.. @inno_data
14 Followers 32 Following Research unit Data Science for Innovation @Fraunhofer IMW. We make knowledge accessible, understandable and useful for science, industry, politics, and society.
Daniel Han @danielhanchen
7K Followers 930 Following Building @UnslothAI. Finetune LLMs 30x faster https://t.co/aRyAAgKOR7. Prev ML at NVIDIA. Hyperlearn used by NASA. I like maths, making code go fast
Jie Huang @jefffhj
4K Followers 566 Following Ph.D. Candidate at UIUC🌽; Formerly @GoogleDeepmind @NVIDIAAI @AmazonScience. #NLProc Large Language Models
OpenChat @OpenChatDev
2K Followers 42 Following Advancing Open Source LLMs with Mixed Quality Data through offline RL-inspired C-RLFT. 𝗣𝗿𝗼𝗷𝗲𝗰𝘁 𝗟𝗲𝗮𝗱: Guan Wang, @AlpayAriyak
OpenAI Developers @OpenAIDevs
71K Followers 0 Following Official @OpenAI account for anyone building on our APIs. Join us in building the future of AI. We ❤️ developers!
LLM360 @llm360
1K Followers 50 Following A framework for open-source LLMs to foster transparency, trust, and collaborative research.
Together AI @togethercompute
27K Followers 303 Following The future of AI is open-source. Let's build together.
Nous Research @NousResearch
18K Followers 30 Following The AI Accelerator Company. https://t.co/vrD0aDJeto
Yuntian Deng @yuntiandeng
3K Followers 3K Following #NLProc Postdoc @ai2_mosaic | Assistant Professor @UWaterloo '24 | Faculty Affiliate @VectorInst '24 | PhD @Harvard
Kyle Lo @kylelostat
2K Followers 1K Following #nlproc #hci leading data research @allen_ai, he/him, bluesky https://t.co/5Hm9cx3Urz
Anne Kreuter @AnneKreuter_
4 Followers 33 Following Working on LLMs @fraunhofer 🧑‍🔬 MSc Computational Linguistics @UniStuttgart Institut für maschinelle Sprachverarbeitung 🎓
tomaarsen @tomaarsen
678 Followers 120 Following Sentence Transformers, SetFit & NLTK maintainer Machine Learning Engineer at 🤗 Hugging Face
Julian Michael @_julianmichael_
1K Followers 121 Following Researching stuff @NYUDataScience. he/him
Bindu Reddy @bindureddy
124K Followers 337 Following CEO of @abacusai, using Gen AI to build Applied AI and LLM agents and systems at scale, ex-AWS / Google, passionate about human behavior and open-source AGI
Alvaro Bartolome @alvarobartt
759 Followers 411 Following ml @argilla_io / open-source and machine learning
Junyang Lin @JustinLin610
5K Followers 1K Following Chief Evangelist Officer of Qwen Team & OpenDevin, building LLM and LMM. Now @Alibaba_Qwen . Previously @PKU1898 LANCO group. ❤️ 🍵 ☕️ 🍷 🥃
Binyuan Hui @huybery
5K Followers 315 Following 🤔 Core maintainer at Qwen and OpenDevin. || Code Generation, Text-to-SQL, Large Language Models.
DeepSeek @deepseek_ai
4K Followers 0 Following Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
Maxime Labonne @maximelabonne
12K Followers 433 Following Author of Hands-On Graph Neural Networks https://t.co/Q8victWUmR • Machine Learning Scientist
Keiran Paster @keirp1
1K Followers 634 Following Currently PhD at the University of Toronto. Fall 2023 student researcher at Google. Training sequence models. Recent: APE, STEVE-1, OpenWebMath, Llemma.
Bobby @bobby_he
520 Followers 225 Following Machine Learning postdoc @ETH. PhD from @UniofOxford and former research intern @DeepMind/@samsungresearch
Adam Tauman Kalai @adamfungi
1K Followers 85 Following
Sheng Zhang @sheng_zh
491 Followers 152 Following Principal Researcher @MSFTResearch, PhD @JohnsHopkins @JHUCLSP | Building & using (multimodal) foundation models for real-world applications.
jack morris @jxmnop
10K Followers 758 Following getting my phd in nlp @cornell_tech 🚠 // academic optimist // tweeting from the snack aisle at trader joes
Zhiting Hu @ZhitingHu
3K Followers 352 Following Assist. Prof. at UC San Diego; Artificial Intelligence, Machine Learning, Natural Language Processing
Raj Movva @rajivmovva
646 Followers 484 Following CS PhD student at @Cornell_Tech, previously @MIT CS. interested in NLP, algorithmic fairness, and social justice. cooking & tennis fan. he/him
Zain Hasan @ZainHasan6
2K Followers 1K Following Learning & sharing ML | Paper Summaries | ML @weaviate_io, @UofT EngSci, Data Scientist, Lecturer, Digital Health/Biomed🇨🇦🇵🇰
Yi-01.AI @01AI_Yi
5K Followers 8 Following A global company building AI 2.0 platform and applications
Humane @Humane
109K Followers 3 Following The small tech company with the eclipse logo. Order your Ai Pin today: https://t.co/w6NwmmThsO
David van Niekerk @dpvanniekerk
169 Followers 1K Following Machine Learning Engineer. Interested in how learning is done in the brain and what neuroscience can teach AI.
xAI @xai
998K Followers 36 Following
Thomas Ahle @thomasahle
4K Followers 468 Following Head of ML @NormalComputing. Ex @Meta, @BARCdk, @SupWizAI. Tweets mostly about Math, Probability, AI, ML, Algorithms and Randomness.
Charles Packer @charlespacker
663 Followers 306 Following Building https://t.co/RKVR6kpMCl 📚🦙 | PhD student at @berkeley_ai @ucbrise @BerkeleySky
Zhangir Azerbayev @zhangir_azerbay
895 Followers 543 Following Building an artificial mathematician @PrincetonCS.
Mistral AI @MistralAI
90K Followers 0 Following Fast, open-source and secure language models. Join us https://t.co/INALdNGvCP
Nathan Godey @nthngdy
534 Followers 840 Following 3rd year PhD student @InriaParisNLP Working on the representations of language models, architectures, and pretraining methods
Benoît Sagot @bensagot
629 Followers 186 Following
Martin Tutek @mtutek
395 Followers 733 Following Postdoc @ Technion | previously postdoc @ UKP Lab, TU Darmstadt | PhD @ TakeLab, UniZG | Working on interpretability & safety of LLMs.
LDJ @ldjconfirmed
5K Followers 201 Following e/λ Currently: Working on something new Prev: @NousResearch @TTSLabsAI DM for business/consulting or interesting conversations.
Sora is here! It's a diffusion transformer that can generate up to a minute of 1080p video with great coherence and quality. @_tim_brooks and I have been working on this at @OpenAI for a year, and we're pumped about pursuing AGI by simulating everything! openai.com/sora
Here is my selection of papers for today (15 Feb) on Hugging Face huggingface.co/papers Computing Power and the Governance of Artificial Intelligence Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers PRDP: Proximal Reward Difference Prediction for…
Shanghai 🇨🇳 AI Lab Achieved 1st Version of Karpathy's AI Operating System > Nov 2023: @karpathy proposes LLM OS > Feb 2024: Chinese (+ Princeton) team proposes self learning operating system What did they do? > Built an agent using a mix of Python code and GPT-4 language model…
1/ Is this thing on..🎙️🎤 New paper from the lab (a thread 🧶): arxiv.org/abs/2401.15713
@karpathy Googles cunning plan 1. Get you back making more YouTube videos 2. Train model on said videos …. 3. Profit / rename Gemini-parthy to Google Chat as that name hasn’t been used enough
@darshilistired I started the next one two days ago!
In-Context Principle Learning from Mistakes paper page: huggingface.co/papers/2402.05… In-context learning (ICL, also known as few-shot prompting) has been the standard method of adapting LLMs to downstream tasks, by learning from a few input-output examples. Nonetheless, all…
@jeremyphoward @thecharlieblake @cloneofsimo Thanks, I completely forgot about RAdam. Plotted the factors and indeed r_t just looks like an automatic warmup of ~b2 "lifetime", so... meh? It's actually a bit more obscure, and indeed there's nothing one can control when it happens to _not_ work.
Announcing our latest release, Nous Hermes 2 Llama-2 70B. This is our largest available model trained on the Nous Hermes 2 dataset, with over 1,000,000 entries of primarily synthetic data. huggingface.co/NousResearch/N…
What Algorithms can Transformers Learn? (arxiv.org/abs/2310.16028) has an interesting discussion about indexes and sharp restrictions on their usage. (I didn't really think about this much in my version, but it is a worthwhile point. positional embeddings are a bit gnarly.)
We could have AGI but our learning rates are off by 1e-15.
Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Blueish colors correspond to hyperparameters for which training converges, redish colors to hyperparameters for which training diverges.
Bunny A family of lightweight multimodal models local demo: github.com/BAAI-DCAI/Bunn… online demo: bunny.dataoptim.org Bunny is a family of lightweight but powerful multimodal models. It offers multiple plug-and-play vision encoders, like EVA-CLIP, SigLIP and language…
We're thrilled to share our latest study, titled "Prompt-Time Symbolic Knowledge Capture with Large Language Models," on arXiv. This research explores approaches that enable Large Language Models (LLMs) to learn directly from user interactions in real-time. It utilizes symbolic…
🗼TowerLLM 13B 🗼 We are releasing TowerLLM 13B a SOTA open-source LLM for translation related tasks! You can check the model on our @huggingface collection page: huggingface.co/collections/Un… Updated results can be found here: unbabel.com/nl/announcing-… Paper will be out soon...
We show Transformers generalize on complex data by using shared attention patterns for similar structures BUT how to avoid overfitting on low-complexity data? 🚨SQ-Transformer explicitly quantizes embeddings structurally & learns systematic attention arxiv.org/abs/2402.06492 🧵
Unveiling 📈Moirai* 📈, our cutting-edge time series foundation model capable of universal forecasting! As a Large Time Series Model, it can tackle any forecasting challenge, across various domains, multiple frequencies, and any-variate in a zero-shot manner. To enable this,…
1/7 Introducing Mamba-ND: the latest advancement in the Mamba. Mamba-ND extends Mamba to multi-dimensional data such as images and videos, outperforming transformer benchmarks with far fewer parameters and achieving linear complexity. arxiv.org/abs/2402.05892
PINK ELEPHANTS! 🐘 Now, don’t think about it. Chatbots also find this supremely difficult. Ask one of the most popular open source models NOT to talk about pink elephants, and it will fail 34% of the time. In our new paper, we address this problem. arxiv.org/abs/2402.07896 1/N
Microsoft presents Language Feedback Models (LFM) - LFMs identify desirable behaviour for imitation learning in instruction following - LFMs outperform using LLMs as experts to directly predict actions and generalize to unseen environments arxiv.org/abs/2402.07876