PhD student @unipotsdam, supervised by @davidschlangen. Working on NLP, ML and CogSci. Prev @LstSaar. Former NLP engineer. limengnlp.github.io · Joined December 2021
Ilya Sutskever: bald
Demis Hassabis: bald
Noam Shazeer: bald
Greg Brockman: bald
forget AGI.
forget curing cancer.
cure baldness now.
My hairline is on gradient descent.
⏰ We introduce Reinforcement Pre-Training (RPT🍒)
— reframing next-token prediction as a reasoning task using RLVR
✅ General-purpose reasoning
📑 Scalable RL on web corpus
📈 Stronger pre-training + RLVR results
🚀 Allows allocating more compute to specific tokens
In Hinton's NN class, there is an interesting tip to get a geometric view of high dimensional space. I think authors of interpretability papers did the opposite; they stare at LLMs and pray in their minds that it's linear and interpretable.
I just read this WSJ article on why Europe's tech scene is so much smaller than the US's and China's.
I'm afraid that, like most articles on this topic, it largely misses the mark.
Which in itself illustrates a key reason why Europe is lagging behind: when you fail to…
📢 I am on the JOB market this year 📢
I am looking for both faculty and research scientist positions.
My research makes AI agents useful and safe for humans. I enable them to effectively convey uncertainty, ask for help, learn from human feedback, and pursue goals that benefit…
Excited to be at #NAACL2025! Let’s meet (and grab a Char's Zaku sticker 🚀).
📅 May 4, 11–12, RepL4NLP: "Amuro&Char: Analyzing the Relationship between Pre-Training and Fine-Tuning"
📅 May 2, 12 PM, Ballroom B: "SHADES: Towards a Multilingual Assessment of Stereotypes in LLMs"
🚀 Day 0: Warming up for #OpenSourceWeek!
We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency.
These humble building blocks in our online service have been documented,…
🚀 DeepSeek-R1 is here!
⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!
🌐 Website & API are live now! Try DeepThink at chat.deepseek.com today!
🐋 1/n
The #NobelPrizeinPhysics2024 for Hopfield & Hinton rewards plagiarism and incorrect attribution in computer science. It's mostly about Amari's "Hopfield network" and the "Boltzmann Machine."
1. The Lenz-Ising recurrent architecture with neuron-like elements was published in…
My Bet: Strawberry is algorithm distillation/procedural cloning. Everyone right now is coming up with ways to distill System 2 into System 1, but that will always be limited. We need to train the model to run the algorithms, not just outputs (and post-train with RL of course).
Good Scientific American piece on the idea of AGI. I think, and argue here, that it's incoherent: there is no general intelligence, natural or artificial, only different cognitive abilities that often trade off.
scientificamerican.com/article/what-d…
cognitive scientist: so the lesson of Clever Hans is we need..
engineer: more horses
cognitive scientist:
engineer: stacked horses. parallel horses. pooled horses. horse dropout. RL with horses in the loop.
cognitive scientist:
engineer: Hans is All You Need
1K Followers 796 Following · Staff Researcher @AlibabaGroup. Previously @MBZUAI, PhD from @ml_labs_irl and @dcucomputing @dcu interested in Large Language Models (LLMs).
40K Followers 28K Following · Biologist at The Sainsbury Lab; passionate about plant pathogens and evolution; open science advocate; loves travel, food and sports; nomad and hunter-gatherer.
3K Followers 6K Following · nlab fan account, arxiv surveyor, pubmed enjoyer, two culture bridger, vacuous high gossiper, dearth of any domain expertise, reluctant g theorist, gpu poor,
448 Followers 1K Following · PhD Student @cvml_mpiinf at the Max Planck Institute for Informatics, @SIC_Saar. Member of @neuroexplicit. Explainability in Computer Vision. @cse_iith alumnus.
14K Followers 3K Following · research @MIT_CSAIL @thinkymachines. work on scalable and principled algorithms in #LLM and #MLSys. in open-sourcing I trust 🐳. she/her/hers
1K Followers 2K Following · Research scientist @AIatMeta. PhD @NYUDataScience. Interested in AI & CogSci, specifically in goals and their representations in minds and machines (he/him).
21K Followers 19K Following · Inspired by Algorithms, Powered by Imagination: Unleashing the Potential of Generative AI.
#GenerativeAI #deeplearning #AI #MachineLearning
299 Followers 840 Following · @ELLISforEurope Ph.D. Student in Natural Language Processing at @CisLmu, supervised by @HinrichSchuetze and @andre_t_martins.
On the job market.
92K Followers 369 Following · Senior AI Product Manager at Google | Open Source Awesome LLM Apps Repo (#1 GitHub with 70k+ stars) | Author of books on GPT-3 & Neural Search in Production
2K Followers 736 Following · Passionately in love with Science, mostly Altruistic, Engineer, Amateur Astronomer & Critical thinker. Current Research focus: ▫️Mechanistic Interpretability▫️
18K Followers 20 Following · A high-throughput and memory-efficient inference and serving engine for LLMs. Join https://t.co/lxJ0SfX5pJ to discuss together with the community!
2K Followers 824 Following · Assistant Professor, Peking University (@PKU1898) | Former AP @UofAInfoSci | Postdoc @ucsbNLP | Ph.D. @NUSingapore | Researcher in NLP, LLMs & Reasoning
38K Followers 485 Following · Digital Geometer, Assoc. Prof. of Computer Science & Robotics @CarnegieMellon @SCSatCMU and member of the @GeomCollective. There are four lights.
4K Followers 197 Following · UCL Deciding, Acting, and Reasoning with Knowledge (DARK) Lab at @AI_UCL led by @_rockt, @egrefen, @robertarail, and @jparkerholder.
265K Followers 679 Following · Building with AI agents @dair_ai • Prev: Meta AI, Galactica LLM, Elastic, PaperswithCode, PhD • I share insights on how to build with AI Agents ↓
2K Followers 532 Following · Assistant Professor at @TelAvivUni and Research Scientist at @GoogleResearch; previously postdoc at @GoogleDeepMind and @allen_ai
1K Followers 34 Following · developing embodied AI agents that empower users to use language to interact with digital and physical environments to carry out real-world tasks.
28K Followers 173 Following · A North Star for open AGI. Co-founders: @fchollet @mikeknoop. President: @gregkamradt. Help support the mission - make a donation today.
57K Followers 561 Following · Co-founder & CTO @hyperbolic_labs cooking fun AI systems. Prev: OctoAI (acquired by @nvidia) building Apache TVM, PhD @ University of Washington.