Weiting (Steven) Tan @weiting_nlp

Ph.D. Candidate at @jhuclsp, Student Researcher @Bytedance Seed | Prev @AIatMeta, @Amazon Alexa AI | steventan0110.github.io | USA | Joined July 2021

Tweets: 76 | Followers: 210 | Following: 295 | Likes: 164

Sanxing Chen @sanxing_chen

3 days ago

Most RL for LLMs today is single-step optimization on a given state (e.g., an instruction), which is essentially a bandit setup. But to learn a meta-policy that can solve various bandit problems via in-context trial and error, you need true multi-turn RL over a long horizon. So,…

1 11 24 4K 7

Jacob Austin @jacobaustin132

8 months ago

Making LLMs run efficiently can feel scary, but scaling isn’t magic, it’s math! We wanted to demystify the “systems view” of LLMs and wrote a little textbook called “How To Scale Your Model” which we’re releasing today. 1/n

25 389 2K 450K 3K

Dongfu Jiang @DongfuJiang

a month ago

🚀 Excited to finally share our paper on VerlTool, released today after months of work since the initial release in late May! VerlTool is a high-efficiency, easy-to-use framework for Agentic RL with Tool use (ARLT), built on top of VeRL. It currently supports a wide range of…

Dongfu Jiang @DongfuJiang

4 months ago

5 74 384 76K 263

2 36 155 16K 91

Jason Weston @jaseweston

a month ago

🌀Diversity Aware RL (DARLING)🌀 📝: arxiv.org/abs/2509.02534 - Jointly optimizes for quality & diversity using a learned partition function - Outperforms standard RL in quality AND diversity metrics, e.g. higher pass@1/p@k - Works for both non-verifiable & verifiable tasks 🧵1/5

5 87 423 83K 346

Benjamin Van Durme @ben_vandurme

7 months ago

Our latest on compressed representations: Key-Value Distillation (KVD). Query-independent transformer compression, with offline supervised distillation.

2 30 135 13K 72

DeepSeek @deepseek_ai

9 months ago

🛠️ DeepSeek-R1: Technical Highlights 📈 Large-scale RL in post-training 🏆 Significant performance boost with minimal labeled data 🔢 Math, code, and reasoning tasks on par with OpenAI-o1 📄 More details: github.com/deepseek-ai/De… 🐋 4/n

242 825 5K 1.8M 928

JHU Computer Science @JHUCompSci

9 months ago

Congratulations to Prof. Philipp Koehn on being named a Fellow of the @aclmeeting! cs.jhu.edu/news/philipp-k…

0 4 30 5K 0

Weiting (Steven) Tan @weiting_nlp

10 months ago

I had a great time helping host MASC-SLL at Hopkins last year. MASC-SLL is a great opportunity to connect with fellow AI/NLP/Speech researchers. If your organization is in the Mid-Atlantic region and is interested in hosting the event, please reach out!

MASC-SLL Conference @MASC_Conference

10 months ago

1 17 14 5K 0

0 1 4 1K 0

Tianjian Li @tli104

10 months ago

I have written a blog post offering an explanation of why both the chosen and the rejected log-probabilities decrease during DPO, and more interestingly, why this is a desired phenomenon to some extent. Link: tianjianl.github.io/blog/2024/dpo/

0 6 13 3K 4

Sherjil Ozair @sherjilozair

10 months ago

Very happy to hear that GANs are getting the test of time award at NeurIPS 2024. The NeurIPS test of time awards are given to papers which have stood the test of time for a decade. I took some time to reminisce about how GANs came about and how AI has evolved in the last decade.

16 120 981 218K 377

Weiting (Steven) Tan @weiting_nlp

12 months ago

Excited to see that SpiritLM is fully open-sourced now. It supports speech and text as both input and output. Please consider trying it at: github.com/facebookresear…

AI at Meta @AIatMeta

12 months ago

22 119 628 149K 208

0 1 4 659 0

Saining Xie @sainingxie

12 months ago

Representation matters. Representation matters. Representation matters, even for generative models. We might've been training our diffusion models the wrong way this whole time. Meet REPA: Training Diffusion Transformers is easier than you think! sihyun.me/REPA/ (🧵1/n)

29 268 2K 371K 1K

Haoran Xu @fe1ixxu

a year ago

Multilingual models are usually heavily skewed in favor of high-resource languages. We change this with X-ALMA: an LLM-based translator committed to ensuring top-tier performance across 50 diverse languages, regardless of their resource levels! Paper: arxiv.org/pdf/2410.03115