Sungmin Cha @_sungmin_cha

Faculty Fellow @nyuniversity | PhD @SeoulNatlUni | sites.google.com/view/sungmin-c… | Manhattan, NY | Joined July 2019

Tweets: 349 | Followers: 420 | Following: 278 | Likes: 2K

Jason Weston @jaseweston

2 days ago

🌀 New work: Era of Real-World Human Interaction 🌀
📝: arxiv.org/abs/2509.25137
- RL *directly* from User Conversations
- Organic replies + long-term history are learning signal
- Trained on WildChat, beats RLHF at *user* level
-> the future for personal Super Intelligence? 🧵1/6

6 63 320 28K 258

Andrej Karpathy @karpathy

20 hours ago

Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea…

Dwarkesh Patel @dwarkesh_sp

6 days ago

242 556 4K 2.0M 4K

318 821 6K 964K 6K

Zhou Yu @Zhou_Yu_AI

22 hours ago

It was great to have @kchonyc Kyunghyun Cho visit us at Columbia University and talk about metrics and evaluation. We should rethink our benchmarks and eval metrics rather than just following other popular papers!

0 1 29 3K 5

Sungmin Cha @_sungmin_cha

22 hours ago

We offered a minimal working explanation for a similar phenomenon: low-recall reference models structurally limit alignment, no matter how strong the preference signal! Check details here: x.com/_sungmin_cha/s…

Etash Guha @etash_guha

a week ago

11 10 200 69K 104

0 0 1 134 0

Sungmin Cha @_sungmin_cha

22 hours ago

This resonates with our findings! Once a model’s recall is limited, alignment signals can’t fully align it. Larger, better-pretrained models preserve recall — making downstream RL much more effective! Check more details here: x.com/_sungmin_cha/s…

Nathan Lambert @natolambert

a week ago

8 40 182 22K 54

0 0 1 86 0

Nathan Lambert @natolambert

23 hours ago

"XYZ Last Exam" becoming a recurring benchmark name was not on my bingo card, man. Progress is going to go right past these in a few years and we'll be onto the next evals before we know it.

Dr. Datta M.D. (AIIMS Delhi) @DrDatta_AIIMS

a day ago

27 75 395 44K 103

5 68 154 11K 22

Kyunghyun Cho @kchonyc

2 days ago

perhaps this is what so-called frontier labs do: RL before KD. 🧐 stay tuned for a detailed 🧵 from @_sungmin_cha ! the preprint link ⬇️

4 24 247 15K 188

Sungmin Cha @_sungmin_cha

2 days ago

🚨 New preprint out! 🚨 Our paper, “Why Alignment Must Precede Distillation: A Minimal Working Explanation” (with @kchonyc ), challenges a common workflow in aligning #LLMs. #LLM #Alignment #RLHF #DPO

2 3 25 3K 20

Sungmin Cha @_sungmin_cha

2 days ago

We show why the common #KD → #Align workflow for LLMs fails: low-recall references create a trap that blocks rare but important behaviors. ✅ Solution: #Align → #KD. Align first, then distill. Better rewards, precision, and stability!😀