Shom @ShomLinEd
language model | sequence modeling | education | HCI Web Joined September 2021
609 Tweets 305 Followers 2K Following 29K Likes
A simple take on the Transformer: MLP layers are for long-term memory. Attention is for short term memory. The state-of-the-art for efficient MLP layers is the switch-style MoE. The state-of-the-art for efficient attention is likely sliding window attention with sinks. I’m…
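To make "sliding window attention with sinks" concrete, here is a minimal sketch of the attention mask it implies. The function name and the parameters `window` and `num_sinks` are illustrative assumptions, not taken from the tweet or from any particular library: each query attends causally to its most recent `window` positions plus the first `num_sinks` positions of the sequence.

```python
def sliding_window_sink_mask(seq_len, window, num_sinks):
    """Return mask[q][k] == True when query position q may attend to key position k.

    Hypothetical helper for illustration: causal attention restricted to the
    last `window` positions, plus the first `num_sinks` "sink" positions.
    """
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            causal = k <= q              # never attend to future positions
            in_window = q - k < window   # key is among the most recent `window` tokens
            is_sink = k < num_sinks      # key is one of the always-visible sink tokens
            row.append(causal and (in_window or is_sink))
        mask.append(row)
    return mask


mask = sliding_window_sink_mask(seq_len=6, window=2, num_sinks=1)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x.....
# xx....
# xxx...
# x.xx..
# x..xx.
# x...xx
```

Per query, the number of visible keys is bounded by `window + num_sinks` regardless of sequence length, which is where the efficiency would come from under these assumptions.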
claude code wrapping bash usage in python subprocess calls is interesting and worrying...
True
Hire the right Chinese.
I'd like to see Meta building a lean LLM team around Narang, Allen-Zhu, Mike Lewis, Zettlemoyer and Sukhbaatar and giving them all the budget and power.
Since its fifth generation, RWKV's main progress -- outer product states, data dependent decay and delta rules -- has come only after works like RetNet, Mamba and DeltaNet with a few adjustments. I respect his efforts of training models, but he could use some more credit.
i didn't play with o3 as much, but judging from my experience with claude, its love of printing probably stems from having to print out results to be collected and judged in the RL loop. Its abuse of .get("key") and try/catch may be caused by an error penalty.
We will be presenting "APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding", a novel encoding method that enables: 🚀Pre-caching Contexts for Fast Inference 🐍Re-using Positions for Long Context Our poster session is located in Hall 3 and Hall 2B,… https://t.co/oqnOCeWV7V
New Article: "Against The Achilles' Heel: A Survey on Red Teaming for Generative Models" by Lin, Mu, Zhai, Wang, Wang, Wang, Gao, Zhang, Che, Baldwin, Han, and Li jair.org/index.php/jair…
DeepSeek in Jan 2025 is going through the ChatGPT moment of Dec 2022. Servers going down, user base surging, RL techniques making model performance rise.
📝Please fill in your information to get a free pass before they’re gone: only 3 days left to register! ⬇️Check the comments for the link to our questionnaire. Let’s meet and talk about innovation, AI, and opportunities! #LibrAI #AI #GITEX #FreePass #GITEX2024 #ExpandNorthStar
HOC's Fast Discrete Program Search (DPS) HOC will soon (EOY?) launch an API for our DPS solution. The interface will be simple: - You give us a set of examples (input/output pairs) - We'll give you a (Python?) function that models it And that's it. It will be an universal…
Tired: transformer captures long term dependency Wired: fractal exhibits long term dependency Inspired: Memory processes and 2D Ising models characterize long term dependency
Hear me out, universal structured format with thought tag and result tag
Just updated taxonomy & covered more papers 😄 github.com/Libr-AI/OpenRe…

Tianfu Fu @TianfuF
1K Followers 243 Following Member of Technical Staff @OpenAI MIT McGovern Institute for Brain Research @mcgovernmit Ex-Research Scientist @Meta
WallisBurke @DR5lsM61k80W7
0 Followers 306 Following
Robert Scoble @Scobleizer
543K Followers 24K Following The best from ML/AI community | Ex-Microsoft, Rackspace, Fast Company | Wrote eight books about the future | Silicon Valley robots, holodecks, BCIs, & startups.
Saber Darabi @SADarabi
306 Followers 7K Following
Mason Wang @masonwang025
771 Followers 399 Following cs @stanford. prev cto @tilderesearch & research @stanfordnlp.
Thibaut Boissin @ThibautBoissin
251 Followers 206 Following
Qian Liu @sivil_taram
4K Followers 753 Following Researcher @ TikTok 🇸🇬 📄 Sailor / StarCoder / OpenCoder 💼 Past: Research Scientist @SeaAIL; PhD @MSFTResearch 🧠 Contribution: @XlangNLP @BigCodeProject
Junxuan Wang @JunxuanWang0929
84 Followers 91 Following PhD student, Fudan University, Interpretability
Shangbin Feng @shangbinfeng
4K Followers 2K Following PhD student @uwcse @uwnlp. Model collaboration, for compositional intelligence and collaborative development. #水文学家
Elora @7OD415B1ghpJ8
27 Followers 908 Following
Men1scus @Men1scus
170 Followers 4K Following Junior@Nankai University | Major in CS | Research in GenAI & Infra | Full Stack Developer | Beginner in Crypto | Runner, Cyclist, Gym-goer | Rap enthusiast
JingyuanLiu @JingyuanLiu123
3K Followers 428 Following https://t.co/D7zLeTZRMh is all you need | Opinions are my own
Nathan Chen @nathancgy4
1K Followers 644 Following understanding models @tilderesearch, (hardware-aligned) ml & open-source, 16
Xiaosen Zheng @xszheng2020
607 Followers 2K Following Researcher @ TikTok 📄 RegMix 💼 Past: PhD @sgSMU | Intern @SeaAIL 🧠 Interests: Data-Centric AI | Code AI
Zeyuan Allen-Zhu, Sc.... @ZeyuanAllenZhu
21K Followers 465 Following physics of language models @ Meta (FAIR, not GenAI, not TBD) 🎓:Tsinghua Physics — MIT CSAIL — Princeton/IAS 🏅:IOI x 2 — ACM-ICPC — USACO — Codejam — math MCM
Wangchunshu Zhou @wangchunshu
3K Followers 2K Following Building personal superintelligence @OPPO, previously @AIWaves_inc. Former CS PhD student at ETHZ. Former researcher at ByteDance, Intern at MSRA and PYI at AI2
hear hill @HearHills
108 Followers 3K Following
Zhixuan Lin @zhxlin
455 Followers 635 Following PhD student at @Mila_Quebec and @UMontreal. Working on (linear complexity) long-context sequence models and RL.
Zhanpeng Zhou @zhanpeng_zhou
273 Followers 382 Following Ph.D. candidate @sjtu1896 | Exploring the theoretical foundations of deep learning.
Zhang Ruichong @ZhangRuichong
53 Followers 168 Following
Bowen Li @BowenLi2121
182 Followers 182 Following 🤔 NLP Researcher at Shanghai AI Lab. Large Language Models, Semantic Parsing
michielh.eth @michieldoteth
5K Followers 3K Following 25 | Building @4Mlabs | Sharing insights on Business & AI | Tweets are my opinions.
Yifan Zhang @yifan_zhang_
391 Followers 514 Following PhD student at @Princeton University, focusing on LLMs. Language Modeling and Pretraining, LLM Reasoning and RL. Prev @UCLA, @Tsinghua_IIIS
Snoarhabit @sonarforce
39 Followers 144 Following college student who majors in math | Anki user | star wars fan | studying English with LingQ
Xinyu Yang @Xinyu2ML
1K Followers 1K Following Ph.D. @CarnegieMellon. Working on agentic foundation model systems. Founder of the FM-Wild workshop series and the ASAP seminar series. They/Them
Daniel Sosebee @dnsosebee
262 Followers 1K Following At @recursecenter 🐙, learning automated learning & creating automated creativity ➿, playing piano 🎹, governing @sneaky_town ♟️
Rasurs @RasursooMSX
54 Followers 876 Following
𝗛𝗔𝗥⚡︎... @harsha_gv
26 Followers 2K Following Namaste ★✨ Cybersecurity | Cloud DevSecOps Engineer✨ Passionate about programming and security✨ Design Thinker✨ @vhsindia member✨ Love All, Serve All ♡✨
Calc Consulting @CalcCon
4K Followers 2K Following Calculation Consulting is a boutique consultancy that specializes in machine learning, AI, and data science
Honglin Mu @honglin_mu
5 Followers 126 Following
Xiaokang Chen @PKUCXK
2K Followers 28 Following Researcher @deepseek_ai | Previously Ph.D at Peking University @PKU1898 Projects: #JanusPro, #DeepSeekVL2
anandmaj @Almondgodd
2K Followers 397 Following path of childhood's end | gap @penn | prev ai @tesla_optimus @dynarobotics
Jeremy Bernstein @jxbz
7K Followers 615 Following 🧪 @thinkymachines ✍️ anon feedback @ https://t.co/RIhBhjMRdD
Franz Srambical (not ... @lemergenz
224 Followers 426 Following slowly, then suddenly. agi @prob_doom
Scott Gray @scottgray76
9K Followers 793 Following GPU Geek at @OpenAI. I have a long standing interest in neuroscience and its application to machine learning. He/Him.
Spectral Labs @spectral_hq
1K Followers 8 Following Spectral Labs is a spatial intelligence company building novel foundation models for the next generation of engineering design.
Zhaopeng Tu @tuzhaopeng
2K Followers 192 Following Tech Lead, Digital Human Center, Tencent Multimodal Department
Fiction.live @ficlive
903 Followers 36 Following Read and control interactive stories Talk to writers. Suggest your own ideas and debate with other fans. Vote for what happens next.
Math, Inc. @mathematics_inc
6K Followers 0 Following A new company dedicated to autoformalization and the creation of verified superintelligence.
QPomelo @realQPomelo
4K Followers 410 Following 🌈 This is Pomelo! / Eng Profile: @isQPomelo / Daily-life account @lifeQPomelo
Nuance Labs @nuance_ai
733 Followers 5 Following Building multimodal emotionally intelligent conversational AI that feels as natural to engage with as a human
Rupesh Srivastava @rupspace
2K Followers 689 Following Doer of Technical Stuff. (Co)developed Highway Networks, Upside-Down RL, Bayesian Flow Networks, EvoTorch 📜 Learning is compression.
Yuanlin Lin @yuaanlin
2K Followers 615 Following Founder & CEO at @zeaburapp / This Twitter account is for Simplified Chinese content; Traditional Chinese content is posted on Threads
Zeabur @zeaburapp
3K Followers 12 Following The DevOps AI Agent for Vibe Coders. ☁️ https://t.co/GHY8ioGWJD
Ondřej Čertík @OndrejCertik
1K Followers 319 Following At @Microsoft, previously @gsitechnology, @LosAlamosNatLab. Original author of @SymPy, SymEngine, @LFortranorg, LPython, co-founder of @fortranlang org.
Lifan Yuan @lifan__yuan
2K Followers 137 Following PhD student @uiuc_nlp @GoogleDeepMind. Prev: @TsinghuaNLP
Edward Z. Yang @ezyang
14K Followers 1K Following I work on PyTorch at Meta. Chatty alt at @difficultyang.
hud @hud_evals
1K Followers 6 Following RL environments + evals for agents | @ycombinator | we're hiring!
Ofir Press @OfirPress
15K Followers 7K Following I build tough benchmarks for LMs and then I get the LMs to solve them. SWE-bench & SWE-agent. Postdoc @Princeton. PhD @nlpnoah @UW.
Yanzhe Zhang @StevenyzZhang
501 Followers 240 Following 张彦哲, Computer Science Ph.D. student @ICatGT @GeorgiaTech @SALT_NLP Previously Intern @AdobeResearch CS undergrad @ZJU_China
Ben @SolidlySheafy
286 Followers 343 Following Understanding intelligence @tilderesearch // prev math @Penn and @Cambridge_Uni
Ai2 @allen_ai
74K Followers 410 Following Breakthrough AI to solve the world's biggest problems. › Join us: https://t.co/MjUpZpKPXJ › Newsletter: https://t.co/k9gGznstwj
Zeyi Sun @sunzeyi6
44 Followers 167 Following Research Intern in Shanghai AI Lab in CV PhD student in SJTU
泓君Jane @hongjun60
5K Followers 448 Following Founder of Valley 101(硅谷101)|Podcaster @thevalley101 @web3_101 https://t.co/dBbudITzrA https://t.co/RgIxFn11wx https://t.co/QQ49UEhkSp
Guangxuan Xiao @Guangxuan_Xiao
3K Followers 716 Following Ph.D. student at @MITEECS Prev: CS & Finance @Tsinghua_Uni
Ian Goodfellow @goodfellow_ian
348K Followers 1K Following DeepMind Research Scientist. Opinions my own. Inventor of GANs. Lead author of https://t.co/M6vl8pEQ4I Founding chairman of @pubhealthaction
Zephyr @zephyr_z9
32K Followers 505 Following Tech, AI, Semiconductors, Stocks, Finance. DMs are open
Xander Chin @XanderChin
1K Followers 432 Following inference @groqinc | eng @westernu @schulichleaders | building and learning for fun
Huazi @HeyHuazi
2K Followers 478 Following 👨🎨UI/UX designer | ⛱️On a gap break | 🦲Unemployed refugee | ✨ Curious about everything | 🧩Vector logo collection → https://t.co/LpAWVDEj2Y | 🎙️Podcast 《设计漫谈》 | 📰 Writing 《设计漫步周刊》 → https://t.co/7TbNPjMtxe
Igor Babuschkin @ibab
103K Followers 855 Following Maybe the real ASI was the friends we made along the way. Co-founder @xAI, Research & Engineering
Shuchao Bi @shuchaobi
13K Followers 692 Following Research @Meta Superintelligence Labs, RL/post-training/agents; Previously Research @OpenAI on multimodal and RL; Opinions are my own.
Humanloop @humanloop
10K Followers 532 Following Humanloop is the LLM evals platform for enterprises. Trusted by Gusto, Vanta and Duolingo to ship reliable AI products.
Yi Wu @jxwuyi
1K Followers 103 Following AI/RL researcher, Assistant Prof. at @Tsinghua_Uni, leading the RL lab at @AntResearch_, PhD at @berkeley_ai, frequent flyer and milk tea lover.