Lei Li @_TobiasLee
Ph.D. student @HKUNLP. Previously LANCO@PKU. lilei-nlp.github.io · Hong Kong · Joined August 2015
164 Tweets · 737 Followers · 617 Following · 963 Likes
model = learn(data) Synthetic data is great, but it’s not data. It’s an intermediate quantity created by learn(). Data is created by people and has privacy and copyright considerations. Synthetic “data” does not - it’s internal to learn().
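The tweet's framing can be made concrete with a toy sketch (all function names and the "model" here are hypothetical stand-ins, not from the tweet): synthetic samples are created and consumed entirely inside learn(), so only people-made data ever crosses the API boundary.

```python
# Toy sketch: synthetic "data" as an intermediate quantity inside learn().

def augment(real_data):
    """Generate synthetic samples from real data (e.g., paraphrases).
    Here: a trivial stand-in that doubles each value."""
    return [2 * x for x in real_data]

def learn(real_data):
    """Training consumes people-made data; synthetic samples are created
    and used internally, so they never leave this function."""
    synthetic = augment(real_data)      # intermediate, not "data"
    corpus = real_data + synthetic
    # stand-in for fitting: the "model" is just the corpus mean
    return sum(corpus) / len(corpus)

model = learn([1.0, 2.0, 3.0])          # only real data crosses the boundary
```

Privacy and copyright attach to the `real_data` argument; the synthetic corpus is never exposed outside `learn()`.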
wow, this price is so amazing given the performance!
It's not PPO > DPO; it's policy-generated data > stale data. In this paper, we answer this question by performing a rigorous analysis of a number of fine-tuning techniques on didactic and full-scale LLM problems. Our main finding is that, in general, approaches that use…
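The quoted finding can be illustrated with a deliberately simplified bandit-style sketch (hypothetical setup, not the paper's actual method): a learner that keeps regenerating data from its current policy eventually explores the good action, while a learner replaying a stale dataset only ever re-fits what the initial policy did.

```python
# Toy sketch: policy-generated (fresh) data vs. a fixed stale dataset.

def reward(action):
    return 1.0 if action == "good" else 0.0

def greedy(q):
    # act on the highest current value estimate
    return max(q, key=q.get)

def run(pick_action, steps=10):
    q = {"good": 0.1, "bad": 0.5}          # initial value estimates
    for _ in range(steps):
        a = pick_action(q)
        q[a] += 0.5 * (reward(a) - q[a])   # move estimate toward observed reward
    return q

# On-policy: each step acts from the *current* estimates, so once "bad" is
# downgraded the learner starts trying (and reinforcing) "good".
fresh = run(greedy)

# Stale data: every action was chosen under the *initial* estimates, so the
# learner keeps re-fitting "bad" and never observes a reward for "good".
initial = {"good": 0.1, "bad": 0.5}
stale = run(lambda q: greedy(initial))
```

After training, the fresh learner values "good" highly, while the stale learner's estimate for "good" is untouched: the difference comes from where the data came from, not the update rule.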
Reasoning is the core ability of LLMs. Super excited to see this comprehensive library by @Ber18791531 for popular reasoning methods!
Meet Reka Core, our best and most capable multimodal language model yet. 🔮 It’s been a busy few months training this model and we are glad to finally ship it! 💪 Core has a lot of capabilities, and one of them is understanding video --- let’s see what Core thinks of the 3 body…
Awesome enhancements! Super helpful for users from different countries to pick up models!
a significant step towards web-browsing visual agents!
A good day. Testing our new ✨Reka Core✨ model and it's showing promising capabilities. Complex table understanding is one of them. Lmk if you are interested in early access @RekaAILabs
compute = intelligence. LLMs have two dimensions to adaptively adjust the compute. - Depth & Width: the model params are essential for certain ability; MoE & MoD extends this idea more dynamically; - Temporal: CoT gives LLMs more tokens to think, so better results.
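The two compute knobs above can be put into a back-of-the-envelope formula, using the common ~2 × params FLOPs-per-token rule of thumb for a forward pass (an approximation of my own, not from the tweet):

```python
# Back-of-the-envelope sketch of the two compute dimensions.

def inference_flops(params, tokens_generated):
    """Depth & width set FLOPs per token; CoT raises tokens generated."""
    return 2 * params * tokens_generated

base   = inference_flops(params=7e9,  tokens_generated=50)   # short answer
cot    = inference_flops(params=7e9,  tokens_generated=500)  # chain-of-thought
bigger = inference_flops(params=70e9, tokens_generated=50)   # scale the model instead
```

In this toy accounting, a 10x longer CoT spends exactly as much extra compute as a 10x larger model: "temporal" and "depth & width" are two routes to the same total FLOPs.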
the qwen team is so amazing!!
New Research: a lot of talk today about "what happens" inside a language model, since they spend the exact same amount of compute on each token, regardless of difficulty. we touch on this question in our new theory paper, Do Language Models Plan for Future Tokens?
very inspiring work and valuable benchmarks for LVLMs!
[75min talk] i finally recorded this lecture I gave two weeks ago because people kept asking me for a video so here it is, enjoy "The Little guide to building Large Language Models in 2024" tried to keep it short and comprehensive – focusing on concepts that are crucial for…
🚀Our new paper on training details, official code, and FAQ of the "The-Era-of-1-bit-LLM" paper is public. github.com/microsoft/unil… 🔥We provide additional experiments and results that were not reported in the original paper. 📢Join in our discussion at huggingface.co/papers/2402.17…
A very timely survey for the interesting knowledge conflict problem.
🚨🚨🚨 ONE image can plant a BOMB into LVLMs to bypass the safety alignment!
Research impacts ≫ getting papers published. Impactful research stems from tackling important questions. This blog offers insightful tips. Personal addons: overfitting as a code sanity check & intuition verification through oracle experiments.
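The "overfitting as a code sanity check" addon mentioned above can be sketched in a few lines (a minimal stand-in model, no framework assumed): if your training loop cannot drive loss to ~0 on a handful of examples, suspect the pipeline before the idea.

```python
# Overfit-one-batch sanity check: fit y = 2x on three points with plain SGD.

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]          # true relation: y = 2x

w, lr = 0.0, 0.05
for _ in range(200):
    # gradient of mean squared error w.r.t. the single weight w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
# loss should now be vanishingly small; if it isn't, debug the code first
```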
Yucheng Zhou @iyczhou
18 Followers 132 Following Ph.D. student at University of Macau | MS at Fudan University | prev. intern @MSFTResearch | Research on LLM and VL
Elon musk @elon__musk0909
1 Followers 114 Following
Zhenwen Liang @LiangZhenwen
201 Followers 223 Following PhD student in NLP, University of Notre Dame. Previous intern at Aristo, AI2 and Tencent AI Lab.
Ziqi Jin @Philaspp
3 Followers 44 Following Researcher in StatNLP Lab (Singapore University of Technology and Design). Focus on LLMs reasoning ability.
Ashutosh Mehra @ashutoshmehra
2K Followers 5K Following Senior Principal Scientist at Adobe. Working on Acrobat AI Assistant, LLMs, and document ML.
huskydoge @huskydogewoof
43 Followers 169 Following Undergraduate in IEEE-CS at SJTU, passionate about Explainable AI, NLP, and AIGC. Actively seeking PhD opportunities for 2025 and summer research intern
Ali Athar @AliAthar1401
74 Followers 367 Following 🌟 AI PhD student in South Korea | Researching AI, NLP, and healthcare applications 💻 | MS degree from NUST 🎓 | Travel lover.
Eason Shaw @DeepSeek @EasonShaw3
9 Followers 176 Following #WeAreHiring #LLM Find me through ‘[email protected]’
Yuyi Li @NYU_liyuyi
0 Followers 5 Following
arash ramedani @arash_ramedani
50 Followers 906 Following
Rui Zhang @ruizhang_nlp
2K Followers 979 Following Researcher in #NLProc | Assistant Professor @PennStateEECS
Frank Xu @frankxu2004
692 Followers 566 Following language and computer stuff, phd student @ltiatcmu
J. Shen @JennyShen056
19 Followers 341 Following MSDS student @DukeU | NLP & RL & Explainable AI research
Nikhil Sharma @nikhilsksharma
252 Followers 643 Following Incoming PhD in HAI @JohnsHopkins | Information Seeking | Disinformation Agents | Copilots for Social Good | PhD @JHUCLSP @JHUMCEH #NLProc
Alyssa, Yi CHENG @YiCheng77783310
95 Followers 212 Following Ph.D. student, working on NLP for social good and conversational AI.
Zirui Wu @WilliamZR7
46 Followers 226 Following Master Student at PIE Lab @pielabpku, Peking University, China | NLP
Xiang Yue @xiangyue96
2K Followers 439 Following Postdoc @LTIatCMU. PhD from Ohio State @osunlp. Training & evaluating foundation models. Pushing the boundaries of AI🤖. Previously @MSFTResearch.
Coap Pink @coaprinwal20541
36 Followers 165 Following
Mingkai Deng @mdeng34
324 Followers 280 Following PhD student @LTIatCMU | MSML @mldcmu | BA Math-Stats + CS @Columbia | CV, RL, NLP | He/His
Radoslav Krivak @rdkbio
356 Followers 5K Following Structural Bioinformatics / AI for Drug Discovery / Geometric DL (@IOCBPrague, prev. PhD @cusbg)
Aman Bansal @logisticloon
5 Followers 376 Following UMass Amherst | Ex Goldman Sachs | IIT Kharagpur
UCR Computer Science .. @UCR_CSE
1K Followers 2K Following The official Twitter account for the Computer Science and Engineering Department at UC Riverside
Marshall D. Willman @dionysianyawp
403 Followers 2K Following AI | LLMs | ML | Python | CEO @egocraftai | prev faculty @NYIT | PhD math logic, NL analysis | typus logicus: my hounds are machines
A Xin @L5cBt3oVmeAG629
59 Followers 1K Following There is no dress rehearsal in life, every day is a live broadcast.
Georgina @Georgin4704946
19 Followers 368 Following
Dong Zhang @dongzha35524835
87 Followers 276 Following Speech Language Models | MS Student at FudanNLP Lab @FudanUniv | Looking for Ph.D. in 2025 fall
Alexander Wan @alexwan55
474 Followers 944 Following CS at Berkeley; @BerkeleyML @BerkeleyNLP; NLP research
Abhishek Mukherjee @eceabhishek
20 Followers 556 Following Human ML Engineer || NLP || Generative AI || IIT(ISM) Dhanbad || UToledo
Ehsan Aghazadeh @AghazadeehEhsan
47 Followers 281 Following PhD student at UMass @manningcics #NLProc #Machinelearning The city I live in is not at all the shape of the city that lives in me.
INAM KHAN @inamullahnaseeh
184 Followers 4K Following 🚀 BSCS grad 🎓 | Passionate about AI, Machine Learning, and Data Science 💻 | Eagerly seeking internships to dive into the world of cutting-edge tech!
Weixi Feng @weixi_feng
395 Followers 292 Following CS Ph.D. candidate @UCSB @UCSBNLP. Ex-research intern @Adobe, @Amazon. #Multimodality #ComputerVision #NLProc.
Ted Xiao @xiao_ted
11K Followers 682 Following I teach robots to be smarter @GoogleDeepMind. Tweets about robot learning, scaling, and large models. Opinions my own.
Hao Liu @haoliuhl
4K Followers 155 Following phd student @berkeley_ai https://t.co/ZNJawlrerS machine learning, neural networks.
Chris Paxton @chris_j_paxton
8K Followers 2K Following Mostly posting about robots. Embodied AI @hellorobotinc, formerly @AIatMeta, @NVIDIAAI, @zoox. All views my own.
Fan Zhou @FaZhou_998
180 Followers 406 Following AI Research at Shanghai AI Lab | GAIR RA @XLangNLP @HKUniversity | Ex Intern @MSFTResearch Undergrad & M.S. @sjtu1896
Yangqing Jia @jiayq
13K Followers 263 Following Founder @leptonai. @UCBerkeley alumni. ex @google & @facebook. ex vp @AlibabaGroup. Open source work on caffe, @pytorch, @tensorflow, & @onnxai.
HaoyueBai @haoyue_bai
945 Followers 849 Following Ph.D. student at Computer Science Department @UWMadisonCS, MPhil @HKUSTCSE.
Junpeng Liu @jeepliu1212
50 Followers 81 Following Ph.D. student @CUHKofficial, supervised by Prof. Wai LAM. (Multimodal) Large Language Model
Wei-Lin Chiang @infwinston
3K Followers 853 Following CS PhD student at UC Berkeley. co-lead of Chatbot Arena @lmsysorg
Abhi Venigalla @abhi_venigalla
5K Followers 1K Following Researcher @Databricks. Former @MosaicML, @CerebrasSystems. Addicted to all things compute.
Tri Dao @tri_dao
19K Followers 365 Following Incoming Asst. Prof @PrincetonCS, Chief Scientist @togethercompute. Machine learning & systems.
jason @agikoala
2K Followers 24 Following secondary account (main is @_jasonwei) @agihippo is a buddy of mine
Ge Zhang @GeZhang86038849
749 Followers 448 Following Founder: M-A-P(https://t.co/CGWz8Jr9K9) Incoming Ph.D. student: Computer Science @UWaterloo MSc: ECE & DS @UMich BSc: Computer Science @ BUPT
Huaxiu Yao @HuaxiuYaoML
3K Followers 527 Following Assistant Professor of Computer Science @UNC @unccs @uncsdss | Postdoc @StanfordAILab | Ph.D. @PennState | #foundationmodels, #AISafety, #AIforScience | he/him
Adam Santoro @santoroAI
10K Followers 240 Following Research Scientist in artificial intelligence at DeepMind
SpaceX @SpaceX
34.7M Followers 114 Following SpaceX designs, manufactures and launches the world’s most advanced rockets and spacecraft
Qintong Li @qintong_li
233 Followers 244 Following A PhD student interested in NLP and ML. I’m working on text generation and its downstream tasks.
Yifei Wang @yifeiwang77
431 Followers 724 Following Postdoc @MIT_CSAIL working on self-supervised learning. I prompt myself.
Hải @hai_t_pham
182 Followers 718 Following Member of Technical Staff at @RekaAILabs, Ph.D. from CMU LTI/SCS.
Piotr Padlewski @PiotrPadlewski
2K Followers 320 Following Chief Meme Officer @ https://t.co/CtBrcKmliI, ex-Google Deepmind/Brain Zurich
Qi Liu @leuchine
384 Followers 402 Following Cofounder @RekaAILabs, Assistant Professor @HKUniversity Past: @DeepMind, FAIR (@MetaAI), @MSFTResearch, PhD @UniofOxford
Zhihong Shao @zhs05232838
265 Followers 574 Following Ph.D. Student @TsinghuaCoAI on LLMs and Reasoning | Ex. @MSFTResearch | Recent: DeepSeekMath, ToRA.
Zhongkai Zhu @ZhongkaiZhu
86 Followers 135 Following
Max Bain @maxhbain
2K Followers 519 Following multimodal @RekaAILabs | prev: phd @Oxford_VGG hardwork-pilled
Pan Lu @lupantech
4K Followers 1K Following PhD @CS_UCLA @uclanlp | Amazon/Bloomberg/Qualcomm/UCLA Fellows | Ex @Tsinghua_Uni @MSFTResearch @allen_ai @Adobe | #NLPoc, LLMs, Reasoning, AI4Math, AI4Science
Haotian Liu @imhaotian
6K Followers 398 Following building intelligence @xAI, creator of #LLaVA, cs @UWMadison, prev @MSFTResearch
Bailin Wang @bailin_28
501 Followers 2K Following NLP researcher (w. latent variables, discrete structures/grammars, sequence models)
John Schulman @johnschulman2
39K Followers 611 Following Cofounder @openai, lead post-training for ChatGPT and the API. Interested in reinforcement learning, alignment, birds, jazz music
Ben Newhouse @newhouseb
7K Followers 955 Following @openai, https://t.co/i3YR3e9UMT, former head of sync @ dropbox (till 2018), cofounded bubbli (acquired by dropbox), previously made yelp monocle.
Lior⚡ @AlphaSignalAI
84K Followers 901 Following Covering the latest in AI R&D • ML Engineer • Ex-Mila researcher • MIT Lecturer • Building AlphaSignal, a technical newsletter read by 180,000+ ML experts.
Hao Peng @HaoPengNLP
10 Followers 88 Following
Yue Yang @YueYangAI
310 Followers 240 Following PhD student @upennnlp, interested in vision and language.
Wenting Zhao @wzhao_nlp
820 Followers 359 Following PhD student @cornell_tech Food for life, NLP for soul!
DeepSeek @deepseek_ai
4K Followers 0 Following Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
Shumin Deng @dsmall2apple1
261 Followers 294 Following Research Fellow at NUS Research Interests: NLP, Structured Prediction, IE, KG, Neuro Symbolic Reasoning, Multi-Agent Collaboration, Knowledge Editing for LLMs
Lewis Tunstall @_lewtun
9K Followers 425 Following 🤗 LLM engineering & research @huggingface 📖 Co-author of "NLP with Transformers" book 💥 Ex-particle physicist 🤘 Occasional guitarist 🇦🇺 in 🇨🇭
Saining Xie @sainingxie
14K Followers 1K Following researcher in #deeplearning #computervision | assistant professor at @NYU_Courant @nyuniversity | previous: research scientist @metaai (FAIR) @UCSanDiego
Qian Liu 🔭 @sivil_taram
2K Followers 434 Following ⚓️ Sailor / LoraHub / TAPEX / OctoPack / 💫 StarCoder 1/2 🐚 Research Scientist @SeaAIL 🇸🇬 𝐩𝐫𝐞𝐯 @MSFTResearch Contribution @XlangNLP @BigCodeProject
Junxian He @junxian_he
787 Followers 383 Following Assist. Prof @hkust. NLP/ML PhD @LTIatCMU. prev. @MetaAI @SFResearch.
Chuang Gan @gan_chuang
4K Followers 455 Following Faculty Member at UMass Amherst; Principal researcher at MIT-IBM Watson AI Lab; Homepage: https://t.co/oXP6pqXCpo
Leo Gao @nabla_theta
5K Followers 356 Following Alignment researcher. cofounder & head of alignment memes @ EleutherAI. currently RE @ OpenAI. Let's make the future awesome.
Jacob Andreas @jacobandreas
14K Followers 958 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
In-context learning provides an LLM with a few examples to improve accuracy. But with long-context LLMs, we can now use *thousands* of examples in-context. We find that this long-context ICL paradigm is surprisingly effective– and differs in behavior from short-context ICL! 🧵
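The many-shot idea in the thread above can be sketched as prompt construction under a context budget (the whitespace "token" count and the Input/Label format below are illustrative assumptions, not from the thread):

```python
# Toy sketch: pack as many labeled examples as a rough token budget allows.

def build_many_shot_prompt(examples, query, token_budget=4000):
    """examples: list of (input, label) pairs; crude token count = word count."""
    parts, used = [], 0
    for x, y in examples:
        shot = f"Input: {x}\nLabel: {y}\n"
        cost = len(shot.split())
        if used + cost > token_budget:
            break
        parts.append(shot)
        used += cost
    parts.append(f"Input: {query}\nLabel:")
    return "".join(parts), len(parts) - 1   # prompt, number of shots packed

examples = [(f"example {i}", "pos" if i % 2 else "neg") for i in range(5000)]
prompt, n_shots = build_many_shot_prompt(examples, "a new case")
```

With a short-context model the budget caps this at a handful of shots; with a long-context model the same loop packs hundreds or thousands, which is the regime the thread studies.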
New paper from @RekaAILabs 🔥 (yes an actual paper). This time we're releasing part of our internal evals which we call Vibe-Eval 😃 It comprises a hard set which imo is pretty challenging for frontier models today. The fun part here is that we constructed it by trying to…
Evaluation is such a hard and underappreciated problem. My own internal ranking is in large part driven by weak signals like brand credibility (aka "rings a bell, sounds legit, gonna use that"). Way to go Reka team! (would be great to have the eval set on the hf hub 😎)
Model evals are hard, that is why we are shedding some light on how we do it at Reka. Along with the paper, we are releasing a dataset of challenging prompts with a golden reference and evaluation protocol using Reka Core as a judge.
As benchmarks continue to get saturated, it's great to see a no-frills benchmark of 387 challenging math problems: github.com/protagolabs/od… GPT-4 is 66% on high-school subset, 42% on college subset, and only 11% on high-school competition subset.
If you are working on RAG with Text and Images (Multimodal RAG), you might be familiar with the above pipeline: maintain 2 sets of models and store 2 or 3 sets of embeddings to be able to utilise the images. Why? The CLIP text encoder is a weak text encoder, you need a separate…
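The two-store pipeline the tweet describes can be sketched with toy vectors (hypothetical store contents and scores; a real system would embed with an actual text encoder and a CLIP-style model):

```python
# Toy sketch of multimodal RAG with separate text and image embedding stores.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# stand-ins for embeddings produced by two different encoders
text_store  = {"doc:pricing table": [0.9, 0.1], "doc:intro":    [0.1, 0.9]}
image_store = {"img:chart.png":     [0.8, 0.2], "img:logo.png": [0.0, 1.0]}

def retrieve(query_text_vec, query_clip_vec, k=2):
    # score each store with the query vector from its *own* encoder,
    # then merge the candidates across modalities
    scored  = [(dot(query_text_vec, v), key) for key, v in text_store.items()]
    scored += [(dot(query_clip_vec, v), key) for key, v in image_store.items()]
    return [key for score, key in sorted(scored, reverse=True)[:k]]

# the same query must be embedded twice, once per encoder
hits = retrieve(query_text_vec=[1.0, 0.0], query_clip_vec=[1.0, 0.0])
```

The duplication the tweet complains about is visible here: two stores, two query embeddings, and a merge step, all because one encoder cannot serve both modalities well.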
@DrJimFan Multimodal LLM arena from @allen_ai —> WildVision-Arena: huggingface.co/spaces/WildVis…
Glad to see Idefics2 making its way into the awesome OpenVLM Leaderboard which ranks VLMs. 2nd in its category (<10B parameters and open weights)! While InternLM-XComposer2 uses proprietary data, Idefics2 is built solely using openly available data. Leaderboard:…
In AI research there is tremendous value in intuitions on what makes things work. In fact, this skill is what makes “yolo runs” successful, and can accelerate your team tremendously. However, there’s no track record on how good someone’s intuition is. A fun way to do this is…
Some personal updates: I joined OpenAI a few months ago, working on all things robustness/safety/privacy. Also, we are working to publish more of our safety work. See my first project here below, where we make initial progress on prompt injections and other attacks!
Introducing the Instruction Hierarchy, our latest safety research to advance robustness for prompt injections and other ways of tricking LLMs into executing unsafe actions. More details: arxiv.org/abs/2404.13208
🎉 Starting today, the DeepSeek APIs offer pay-as-you-go options at attractive low prices! ✨ Sign up for 5M free tokens. Want more? Purchase 1M tokens for only $0.14 to $0.28! 👉 Kick off your seamless DeepSeek API journey now: platform.deepseek.com #DeepSeek #DeepSeekAPI
📢 New paper: Compared to 𝐌𝐮𝐥𝐭𝐢-𝐦𝐨𝐝𝐚𝐥 𝐂𝐨𝐓, We found 𝐃𝐞𝐬𝐜𝐫𝐢𝐛𝐞 (visual description generation)-then-𝐑𝐞𝐚𝐬𝐨𝐧 (generating 𝐌𝐮𝐥𝐭𝐢-𝐦𝐨𝐝𝐚𝐥 𝐂𝐨𝐓 with the assistance of descriptions) could greatly improve math reasoning on MathVista and MathVerse.…
Excited to announce that our knowledge editing tool EasyEdit now supports llama3-8b! 🚀 Currently rocking ROME editing method. We'll keep it fresh with updates and more editing methods, but remember: Transformers gotta level up to 4.40.0. Got questions or requests? Hit us up with…
Llama-3 is closing the gap with GPT-4, but multimodal models gotta catch up. Vision capabilities of open models like LLaVA are far, far behind GPT-4V. Video models are even worse. They hallucinate all the time and fail to give detailed descriptions of complex scenes and actions.…
We have just released 🍷 FineWeb: 15 trillion tokens of high quality web data. We filtered and deduplicated all CommonCrawl between 2013 and 2024. Models trained on FineWeb outperform RefinedWeb, C4, DolmaV1.6, The Pile and SlimPajama!
🚀Excited to share our new paper "LongEmbed: Extending Embedding Models for Long Context Retrieval". We introduce the LongEmbed benchmark, explore context extension of existing embedding models, and release E5-Base-4k & E5-RoPE-Base. Paper: arxiv.org/abs/2404.12096
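One widely used context-extension trick for RoPE-based models is position interpolation, i.e. scaling positions so a longer sequence reuses the rotation range seen in training. Whether LongEmbed adopts exactly this method is not stated in the tweet, so treat the sketch below as a generic illustration with made-up dimensions:

```python
# Toy sketch of RoPE position interpolation for context extension.

def rope_angle(pos, dim_pair, head_dim=64, base=10000.0, scale=1.0):
    """Rotation angle for one (position, dimension-pair); `scale` > 1
    compresses positions so long inputs stay in the trained range."""
    inv_freq = base ** (-2.0 * dim_pair / head_dim)
    return (pos / scale) * inv_freq

# extending a 512-token model to 4096 tokens: scale positions by 8x, so
# position 4096 lands on the angle the model learned for position 512
orig     = rope_angle(512, dim_pair=0)
extended = rope_angle(4096, dim_pair=0, scale=8.0)
```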
Excited to be part of 🦙, more to come! ai.meta.com/blog/meta-llam…
forget frontier-tier llm, on the quest for frontier-class coffee. anyone have recommendations?