Lei Li @_TobiasLee
Ph.D. student @HKUNLP. Previously LANCO@PKU. lilei-nlp.github.io · Hong Kong · Joined August 2015
164 Tweets · 737 Followers · 617 Following · 963 Likes
model = learn(data) Synthetic data is great, but it’s not data. It’s an intermediate quantity created by learn(). Data is created by people and has privacy and copyright considerations. Synthetic “data” does not - it’s internal to learn().
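The tweet's framing can be made concrete with a toy sketch (all function names and the "model" here are hypothetical stand-ins, not from the tweet): synthetic samples are created and consumed entirely inside learn(), so only people-made data ever crosses the API boundary.

```python
# Toy sketch: synthetic "data" as an intermediate quantity inside learn().

def augment(real_data):
    """Generate synthetic samples from real data (e.g., paraphrases).
    Here: a trivial stand-in that doubles each value."""
    return [2 * x for x in real_data]

def learn(real_data):
    """Training consumes people-made data; synthetic samples are created
    and used internally, so they never leave this function."""
    synthetic = augment(real_data)      # intermediate, not "data"
    corpus = real_data + synthetic
    # stand-in for fitting: the "model" is just the corpus mean
    return sum(corpus) / len(corpus)

model = learn([1.0, 2.0, 3.0])          # only real data crosses the boundary
```

Privacy and copyright attach to the `real_data` argument; the synthetic corpus is never exposed outside `learn()`.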
wow, this price is so amazing given the performance!
It's not PPO > DPO; it's policy-generated data > stale data. In this paper, we answer this question by performing a rigorous analysis of a number of fine-tuning techniques on didactic and full-scale LLM problems. Our main finding is that, in general, approaches that use…
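The quoted finding can be illustrated with a deliberately simplified bandit-style sketch (hypothetical setup, not the paper's actual method): a learner that keeps regenerating data from its current policy eventually explores the good action, while a learner replaying a stale dataset only ever re-fits what the initial policy did.

```python
# Toy sketch: policy-generated (fresh) data vs. a fixed stale dataset.

def reward(action):
    return 1.0 if action == "good" else 0.0

def greedy(q):
    # act on the highest current value estimate
    return max(q, key=q.get)

def run(pick_action, steps=10):
    q = {"good": 0.1, "bad": 0.5}          # initial value estimates
    for _ in range(steps):
        a = pick_action(q)
        q[a] += 0.5 * (reward(a) - q[a])   # move estimate toward observed reward
    return q

# On-policy: each step acts from the *current* estimates, so once "bad" is
# downgraded the learner starts trying (and reinforcing) "good".
fresh = run(greedy)

# Stale data: every action was chosen under the *initial* estimates, so the
# learner keeps re-fitting "bad" and never observes a reward for "good".
initial = {"good": 0.1, "bad": 0.5}
stale = run(lambda q: greedy(initial))
```

After training, the fresh learner values "good" highly, while the stale learner's estimate for "good" is untouched: the difference comes from where the data came from, not the update rule.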
Reasoning is the core ability of LLMs. Super excited to see this comprehensive library by @Ber18791531 for popular reasoning methods!
Meet Reka Core, our best and most capable multimodal language model yet. 🔮 It’s been a busy few months training this model and we are glad to finally ship it! 💪 Core has a lot of capabilities, and one of them is understanding video --- let’s see what Core thinks of the 3 body…
Awesome enhancements! Super helpful for users from different countries to pick up models!
a significant step towards web-browsing visual agents!
A good day. Testing our new ✨Reka Core✨ model and it's showing promising capabilities. Complex table understanding is one of them. Lmk if you are interested in early access @RekaAILabs
compute = intelligence. LLMs have two dimensions to adaptively adjust the compute. - Depth & Width: the model params are essential for certain ability; MoE & MoD extends this idea more dynamically; - Temporal: CoT gives LLMs more tokens to think, so better results.
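The two compute knobs above can be put into a back-of-the-envelope formula, using the common ~2 × params FLOPs-per-token rule of thumb for a forward pass (an approximation of my own, not from the tweet):

```python
# Back-of-the-envelope sketch of the two compute dimensions.

def inference_flops(params, tokens_generated):
    """Depth & width set FLOPs per token; CoT raises tokens generated."""
    return 2 * params * tokens_generated

base   = inference_flops(params=7e9,  tokens_generated=50)   # short answer
cot    = inference_flops(params=7e9,  tokens_generated=500)  # chain-of-thought
bigger = inference_flops(params=70e9, tokens_generated=50)   # scale the model instead
```

In this toy accounting, a 10x longer CoT spends exactly as much extra compute as a 10x larger model: "temporal" and "depth & width" are two routes to the same total FLOPs.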
the qwen team is so amazing!!
New Research: a lot of talk today about "what happens" inside a language model, since they spend the exact same amount of compute on each token, regardless of difficulty. we touch on this question in our new theory paper, Do Language Models Plan for Future Tokens?
very inspiring work and valuable benchmarks for LVLMs!
[75min talk] i finally recorded this lecture I gave two weeks ago because people kept asking me for a video so here it is, enjoy "The Little guide to building Large Language Models in 2024" tried to keep it short and comprehensive – focusing on concepts that are crucial for…
🚀Our new paper on training details, official code, and FAQ of the "The-Era-of-1-bit-LLM" paper is public. github.com/microsoft/unil… 🔥We provide additional experiments and results that were not reported in the original paper. 📢Join in our discussion at huggingface.co/papers/2402.17…
A very timely survey for the interesting knowledge conflict problem.
🚨🚨🚨 ONE image can plant a BOMB into LVLMs to bypass the safety alignment!
Research impacts ≫ getting papers published. Impactful research stems from tackling important questions. This blog offers insightful tips. Personal addons: overfitting as a code sanity check & intuition verification through oracle experiments.
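The "overfitting as a code sanity check" addon mentioned above can be sketched in a few lines (a minimal stand-in model, no framework assumed): if your training loop cannot drive loss to ~0 on a handful of examples, suspect the pipeline before the idea.

```python
# Overfit-one-batch sanity check: fit y = 2x on three points with plain SGD.

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]          # true relation: y = 2x

w, lr = 0.0, 0.05
for _ in range(200):
    # gradient of mean squared error w.r.t. the single weight w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
# loss should now be vanishingly small; if it isn't, debug the code first
```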
Yucheng Zhou @iyczhou
18 Followers 132 Following Ph.D. student at University of Macau | MS at Fudan University | prev. intern @MSFTResearch | Research on LLM and VL
Elon musk @elon__musk0909
1 Followers 114 Following
Zhenwen Liang @LiangZhenwen
201 Followers 223 Following PhD student in NLP, University of Notre Dame. Previous intern at Aristo, AI2 and Tencent AI Lab.
Ziqi Jin @Philaspp
3 Followers 44 Following Researcher in StatNLP Lab (Singapore University of Technology and Design). Focus on LLMs reasoning ability.
Ashutosh Mehra @ashutoshmehra
2K Followers 5K Following Senior Principal Scientist at Adobe. Working on Acrobat AI Assistant, LLMs, and document ML.
huskydoge @huskydogewoof
43 Followers 169 Following Undergraduate in IEEE-CS at SJTU, passionate about Explainable AI, NLP, and AIGC. Actively seeking PhD opportunities for 2025 and summer research intern
Ali Athar @AliAthar1401
74 Followers 367 Following 🌟 AI PhD student in South Korea | Researching AI, NLP, and healthcare applications 💻 | MS degree from NUST 🎓 | Travel lover.
Eason Shaw @DeepSeek @EasonShaw3
9 Followers 176 Following #WeAreHiring #LLM Find me through ‘[email protected]’
Yuyi Li @NYU_liyuyi
0 Followers 5 Following
arash ramedani @arash_ramedani
50 Followers 906 Following
Rui Zhang @ruizhang_nlp
2K Followers 979 Following Researcher in #NLProc | Assistant Professor @PennStateEECS
Frank Xu @frankxu2004
692 Followers 566 Following language and computer stuff, phd student @ltiatcmu
J. Shen @JennyShen056
19 Followers 341 Following MSDS student @DukeU | NLP & RL & Explainable AI research
Nikhil Sharma @nikhilsksharma
252 Followers 643 Following Incoming PhD in HAI @JohnsHopkins | Information Seeking | Disinformation Agents | Copilots for Social Good | PhD @JHUCLSP @JHUMCEH #NLProc
Alyssa, Yi CHENG @YiCheng77783310
95 Followers 212 Following Ph.D. student, working on NLP for social good and conversational AI.
Zirui Wu @WilliamZR7
46 Followers 226 Following Master Student at PIE Lab @pielabpku, Peking University, China | NLP
Xiang Yue @xiangyue96
2K Followers 439 Following Postdoc @LTIatCMU. PhD from Ohio State @osunlp. Training & evaluating foundation models. Pushing the boundaries of AI🤖. Previously @MSFTResearch.
Coap Pink @coaprinwal20541
36 Followers 165 Following
Mingkai Deng @mdeng34
324 Followers 280 Following PhD student @LTIatCMU | MSML @mldcmu | BA Math-Stats + CS @Columbia | CV, RL, NLP | He/His
Radoslav Krivak @rdkbio
356 Followers 5K Following Structural Bioinformatics / AI for Drug Discovery / Geometric DL (@IOCBPrague, prev. PhD @cusbg)
Aman Bansal @logisticloon
5 Followers 376 Following UMass Amherst | Ex Goldman Sachs | IIT Kharagpur
UCR Computer Science .. @UCR_CSE
1K Followers 2K Following The official Twitter account for the Computer Science and Engineering Department at UC Riverside
Marshall D. Willman @dionysianyawp
403 Followers 2K Following AI | LLMs | ML | Python | CEO @egocraftai | prev faculty @NYIT | PhD math logic, NL analysis | typus logicus: my hounds are machines
A Xin @L5cBt3oVmeAG629
59 Followers 1K Following There is no dress rehearsal in life, every day is a live broadcast.
Georgina @Georgin4704946
19 Followers 368 Following
Dong Zhang @dongzha35524835
87 Followers 276 Following Speech Language Models | MS Student at FudanNLP Lab @FudanUniv | Looking for Ph.D. in 2025 fall
Alexander Wan @alexwan55
474 Followers 944 Following CS at Berkeley; @BerkeleyML @BerkeleyNLP; NLP research
Abhishek Mukherjee @eceabhishek
20 Followers 556 Following Human ML Engineer || NLP || Generative AI || IIT(ISM) Dhanbad || UToledo
Ehsan Aghazadeh @AghazadeehEhsan
47 Followers 281 Following PhD student at UMass @manningcics #NLProc #Machinelearning The city I live in is not at all the shape of the city that lives in me.
INAM KHAN @inamullahnaseeh
184 Followers 4K Following 🚀 BSCS grad 🎓 | Passionate about AI, Machine Learning, and Data Science 💻 | Eagerly seeking internships to dive into the world of cutting-edge tech!
Weixi Feng @weixi_feng
395 Followers 292 Following CS Ph.D. candidate @UCSB @UCSBNLP. Ex-research intern @Adobe, @Amazon. #Multimodality #ComputerVision #NLProc.
Ted Xiao @xiao_ted
11K Followers 682 Following I teach robots to be smarter @GoogleDeepMind. Tweets about robot learning, scaling, and large models. Opinions my own.
Hao Liu @haoliuhl
4K Followers 155 Following phd student @berkeley_ai https://t.co/ZNJawlrerS machine learning, neural networks.
Chris Paxton @chris_j_paxton
8K Followers 2K Following Mostly posting about robots. Embodied AI @hellorobotinc, formerly @AIatMeta, @NVIDIAAI, @zoox. All views my own.
Fan Zhou @FaZhou_998
180 Followers 406 Following AI Research at Shanghai AI Lab | GAIR RA @XLangNLP @HKUniversity | Ex Intern @MSFTResearch Undergrad & M.S. @sjtu1896
Yangqing Jia @jiayq
13K Followers 263 Following Founder @leptonai. @UCBerkeley alumni. ex @google & @facebook. ex vp @AlibabaGroup. Open source work on caffe, @pytorch, @tensorflow, & @onnxai.
HaoyueBai @haoyue_bai
945 Followers 849 Following Ph.D. student at Computer Science Department @UWMadisonCS, MPhil @HKUSTCSE.
Junpeng Liu @jeepliu1212
50 Followers 81 Following Ph.D. student @CUHKofficial, supervised by Prof. Wai LAM. (Multimodal) Large Language Model
Wei-Lin Chiang @infwinston
3K Followers 853 Following CS PhD student at UC Berkeley. co-lead of Chatbot Arena @lmsysorg
Abhi Venigalla @abhi_venigalla
5K Followers 1K Following Researcher @Databricks. Former @MosaicML, @CerebrasSystems. Addicted to all things compute.
Tri Dao @tri_dao
19K Followers 365 Following Incoming Asst. Prof @PrincetonCS, Chief Scientist @togethercompute. Machine learning & systems.
jason @agikoala
2K Followers 24 Following secondary account (main is @_jasonwei) @agihippo is a buddy of mine
Ge Zhang @GeZhang86038849
749 Followers 448 Following Founder: M-A-P(https://t.co/CGWz8Jr9K9) Incoming Ph.D. student: Computer Science @UWaterloo MSc: ECE & DS @UMich BSc: Computer Science @ BUPT
Huaxiu Yao @HuaxiuYaoML
3K Followers 527 Following Assistant Professor of Computer Science @UNC @unccs @uncsdss | Postdoc @StanfordAILab | Ph.D. @PennState | #foundationmodels, #AISafety, #AIforScience | he/him
Adam Santoro @santoroAI
10K Followers 240 Following Research Scientist in artificial intelligence at DeepMind
SpaceX @SpaceX
34.7M Followers 114 Following SpaceX designs, manufactures and launches the world’s most advanced rockets and spacecraft
Qintong Li @qintong_li
233 Followers 244 Following A PhD student interested in NLP and ML. I’m working on text generation and its downstream tasks.
Yifei Wang @yifeiwang77
431 Followers 724 Following Postdoc @MIT_CSAIL working on self-supervised learning. I prompt myself.
Hải @hai_t_pham
182 Followers 718 Following Member of Technical Staff at @RekaAILabs, Ph.D. from CMU LTI/SCS.
Piotr Padlewski @PiotrPadlewski
2K Followers 320 Following Chief Meme Officer @ https://t.co/CtBrcKmliI, ex-Google Deepmind/Brain Zurich
Qi Liu @leuchine
384 Followers 402 Following Cofounder @RekaAILabs, Assistant Professor @HKUniversity Past: @DeepMind, FAIR (@MetaAI), @MSFTResearch, PhD @UniofOxford
Zhihong Shao @zhs05232838
265 Followers 574 Following Ph.D. Student @TsinghuaCoAI on LLMs and Reasoning | Ex. @MSFTResearch | Recent: DeepSeekMath, ToRA.
Zhongkai Zhu @ZhongkaiZhu
86 Followers 135 Following
Max Bain @maxhbain
2K Followers 519 Following multimodal @RekaAILabs | prev: phd @Oxford_VGG hardwork-pilled
Pan Lu @lupantech
4K Followers 1K Following PhD @CS_UCLA @uclanlp | Amazon/Bloomberg/Qualcomm/UCLA Fellows | Ex @Tsinghua_Uni @MSFTResearch @allen_ai @Adobe | #NLPoc, LLMs, Reasoning, AI4Math, AI4Science
Haotian Liu @imhaotian
6K Followers 398 Following building intelligence @xAI, creator of #LLaVA, cs @UWMadison, prev @MSFTResearch
Bailin Wang @bailin_28
501 Followers 2K Following NLP researcher (w. latent variables, discrete structures/grammars, sequence models)
John Schulman @johnschulman2
39K Followers 611 Following Cofounder @openai, lead post-training for ChatGPT and the API. Interested in reinforcement learning, alignment, birds, jazz music
Ben Newhouse @newhouseb
7K Followers 955 Following @openai, https://t.co/i3YR3e9UMT, former head of sync @ dropbox (till 2018), cofounded bubbli (acquired by dropbox), previously made yelp monocle.
Lior⚡ @AlphaSignalAI
84K Followers 901 Following Covering the latest in AI R&D • ML Engineer • Ex-Mila researcher • MIT Lecturer • Building AlphaSignal, a technical newsletter read by 180,000+ ML experts.
Hao Peng @HaoPengNLP
10 Followers 88 Following
Yue Yang @YueYangAI
310 Followers 240 Following PhD student @upennnlp, interested in vision and language.
Wenting Zhao @wzhao_nlp
820 Followers 359 Following PhD student @cornell_tech Food for life, NLP for soul!
DeepSeek @deepseek_ai
4K Followers 0 Following Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
Shumin Deng @dsmall2apple1
261 Followers 294 Following Research Fellow at NUS Research Interests: NLP, Structured Prediction, IE, KG, Neuro Symbolic Reasoning, Multi-Agent Collaboration, Knowledge Editing for LLMs
Lewis Tunstall @_lewtun
9K Followers 425 Following 🤗 LLM engineering & research @huggingface 📖 Co-author of "NLP with Transformers" book 💥 Ex-particle physicist 🤘 Occasional guitarist 🇦🇺 in 🇨🇭
Saining Xie @sainingxie
14K Followers 1K Following researcher in #deeplearning #computervision | assistant professor at @NYU_Courant @nyuniversity | previous: research scientist @metaai (FAIR) @UCSanDiego
Qian Liu 🔭 @sivil_taram
2K Followers 434 Following ⚓️ Sailor / LoraHub / TAPEX / OctoPack / 💫 StarCoder 1/2 🐚 Research Scientist @SeaAIL 🇸🇬 𝐩𝐫𝐞𝐯 @MSFTResearch Contribution @XlangNLP @BigCodeProject
Junxian He @junxian_he
787 Followers 383 Following Assist. Prof @hkust. NLP/ML PhD @LTIatCMU. prev. @MetaAI @SFResearch.
Chuang Gan @gan_chuang
4K Followers 455 Following Faculty Member at UMass Amherst; Principal researcher at MIT-IBM Watson AI Lab; Homepage: https://t.co/oXP6pqXCpo
Leo Gao @nabla_theta
5K Followers 356 Following Alignment researcher. cofounder & head of alignment memes @ EleutherAI. currently RE @ OpenAI. Let's make the future awesome.
Jacob Andreas @jacobandreas
14K Followers 958 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
In-context learning provides an LLM with a few examples to improve accuracy. But with long-context LLMs, we can now use *thousands* of examples in-context. We find that this long-context ICL paradigm is surprisingly effective– and differs in behavior from short-context ICL! 🧵
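The many-shot idea in the thread above can be sketched as prompt construction under a context budget (the whitespace "token" count and the Input/Label format below are illustrative assumptions, not from the thread):

```python
# Toy sketch: pack as many labeled examples as a rough token budget allows.

def build_many_shot_prompt(examples, query, token_budget=4000):
    """examples: list of (input, label) pairs; crude token count = word count."""
    parts, used = [], 0
    for x, y in examples:
        shot = f"Input: {x}\nLabel: {y}\n"
        cost = len(shot.split())
        if used + cost > token_budget:
            break
        parts.append(shot)
        used += cost
    parts.append(f"Input: {query}\nLabel:")
    return "".join(parts), len(parts) - 1   # prompt, number of shots packed

examples = [(f"example {i}", "pos" if i % 2 else "neg") for i in range(5000)]
prompt, n_shots = build_many_shot_prompt(examples, "a new case")
```

With a short-context model the budget caps this at a handful of shots; with a long-context model the same loop packs hundreds or thousands, which is the regime the thread studies.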
New paper from @RekaAILabs 🔥 (yes an actual paper). This time we're releasing part of our internal evals which we call Vibe-Eval 😃 It comprises a hard set which imo is pretty challenging for frontier models today. The fun part here is that we constructed it by trying to…
Evaluation is such a hard and underappreciated problem. My own internal ranking is in large part driven by weak signals like brand credibility (aka "rings a bell, sounds legit, gonna use that"). Way to go Reka team! (would be great to have the eval set on the hf hub 😎)
Model evals are hard, that is why we are shedding some light on how we do it at Reka. Along with the paper, we are releasing a dataset of challenging prompts with a golden reference and evaluation protocol using Reka Core as a judge.
As benchmarks continue to get saturated, it's great to see a no-frills benchmark of 387 challenging math problems: github.com/protagolabs/od… GPT-4 is 66% on high-school subset, 42% on college subset, and only 11% on high-school competition subset.
If you are working on RAG with Text and Images (Multimodal RAG), you might be familiar with the above pipeline: maintain 2 sets of models and store 2 or 3 sets of embeddings to be able to utilise the images. Why? The CLIP text encoder is a weak text encoder, you need a separate…
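The two-store pipeline the tweet describes can be sketched with toy vectors (hypothetical store contents and scores; a real system would embed with an actual text encoder and a CLIP-style model):

```python
# Toy sketch of multimodal RAG with separate text and image embedding stores.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# stand-ins for embeddings produced by two different encoders
text_store  = {"doc:pricing table": [0.9, 0.1], "doc:intro":    [0.1, 0.9]}
image_store = {"img:chart.png":     [0.8, 0.2], "img:logo.png": [0.0, 1.0]}

def retrieve(query_text_vec, query_clip_vec, k=2):
    # score each store with the query vector from its *own* encoder,
    # then merge the candidates across modalities
    scored  = [(dot(query_text_vec, v), key) for key, v in text_store.items()]
    scored += [(dot(query_clip_vec, v), key) for key, v in image_store.items()]
    return [key for score, key in sorted(scored, reverse=True)[:k]]

# the same query must be embedded twice, once per encoder
hits = retrieve(query_text_vec=[1.0, 0.0], query_clip_vec=[1.0, 0.0])
```

The duplication the tweet complains about is visible here: two stores, two query embeddings, and a merge step, all because one encoder cannot serve both modalities well.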
@DrJimFan Multimodal LLM arena from @allen_ai —> WildVision-Arena: huggingface.co/spaces/WildVis…
Glad to see Idefics2 making its way into the awesome OpenVLM Leaderboard which ranks VLMs. 2nd in its category (<10B parameters and open weights)! While InternLM-XComposer2 uses proprietary data, Idefics2 is built solely using openly available data. Leaderboard:…
In AI research there is tremendous value in intuitions on what makes things work. In fact, this skill is what makes “yolo runs” successful, and can accelerate your team tremendously. However, there’s no track record on how good someone’s intuition is. A fun way to do this is…
Some personal updates: I joined OpenAI a few months ago, working on all things robustness/safety/privacy. Also, we are working to publish more of our safety work. See my first project here below, where we make initial progress on prompt injections and other attacks!
Introducing the Instruction Hierarchy, our latest safety research to advance robustness for prompt injections and other ways of tricking LLMs into executing unsafe actions. More details: arxiv.org/abs/2404.13208
🎉 Starting today, the DeepSeek APIs offer pay-as-you-go options at attractive low prices! ✨ Sign up for 5M free tokens. Want more? Purchase 1M tokens for only $0.14 to $0.28! 👉 Kick off your seamless DeepSeek API journey now: platform.deepseek.com #DeepSeek #DeepSeekAPI
📢 New paper: Compared to 𝐌𝐮𝐥𝐭𝐢-𝐦𝐨𝐝𝐚𝐥 𝐂𝐨𝐓, We found 𝐃𝐞𝐬𝐜𝐫𝐢𝐛𝐞 (visual description generation)-then-𝐑𝐞𝐚𝐬𝐨𝐧 (generating 𝐌𝐮𝐥𝐭𝐢-𝐦𝐨𝐝𝐚𝐥 𝐂𝐨𝐓 with the assistance of descriptions) could greatly improve math reasoning on MathVista and MathVerse.…
Excited to announce that our knowledge editing tool EasyEdit now supports llama3-8b! 🚀 Currently rocking ROME editing method. We'll keep it fresh with updates and more editing methods, but remember: Transformers gotta level up to 4.40.0. Got questions or requests? Hit us up with…
Llama-3 is closing the gap with GPT-4, but multimodal models gotta catch up. Vision capabilities of open models like LLaVA are far, far behind GPT-4V. Video models are even worse. They hallucinate all the time and fail to give detailed descriptions of complex scenes and actions.…
We have just released 🍷 FineWeb: 15 trillion tokens of high quality web data. We filtered and deduplicated all CommonCrawl between 2013 and 2024. Models trained on FineWeb outperform RefinedWeb, C4, DolmaV1.6, The Pile and SlimPajama!
🚀Excited to share our new paper "LongEmbed: Extending Embedding Models for Long Context Retrieval". We introduce the LongEmbed benchmark, explore context extension of existing embedding models, and release E5-Base-4k & E5-RoPE-Base. Paper: arxiv.org/abs/2404.12096
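One widely used context-extension trick for RoPE-based models is position interpolation, i.e. scaling positions so a longer sequence reuses the rotation range seen in training. Whether LongEmbed adopts exactly this method is not stated in the tweet, so treat the sketch below as a generic illustration with made-up dimensions:

```python
# Toy sketch of RoPE position interpolation for context extension.

def rope_angle(pos, dim_pair, head_dim=64, base=10000.0, scale=1.0):
    """Rotation angle for one (position, dimension-pair); `scale` > 1
    compresses positions so long inputs stay in the trained range."""
    inv_freq = base ** (-2.0 * dim_pair / head_dim)
    return (pos / scale) * inv_freq

# extending a 512-token model to 4096 tokens: scale positions by 8x, so
# position 4096 lands on the angle the model learned for position 512
orig     = rope_angle(512, dim_pair=0)
extended = rope_angle(4096, dim_pair=0, scale=8.0)
```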
Excited to be part of 🦙, more to come! ai.meta.com/blog/meta-llam…
forget frontier-tier llm, on the quest for frontier-class coffee. anyone have recommendations?