Kevin Lin 林冠言 @nlpkevinl

research @Letta_AI @berkeleynlp @ucbrise @ai2_allennlp kevinlin.io Joined September 2017

Tweets: 55

Followers: 481

Following: 308

Likes: 1K

Charles Packer @charlespacker

2 weeks ago

Prior to GPT-5, Sonnet & Opus were the undisputed kings of AI coding. It turns out the GPT-5 is significantly better than Sonnet in one key way: the ability to recover from mistakes. Today we're excited to release our latest research at @Letta_AI on Recovery-Bench, a new…

19 37 300 37K 127

Alex Shaw @alexgshaw

a month ago

Congrats to @Letta_AI for building the best performing open source agent on Terminal-Bench!

Letta @Letta_AI

a month ago

2 7 28 8K 8

1 3 15 2K 2

Alex Shaw @alexgshaw

2 months ago

Evaluating agents on benchmarks is a pain. Each benchmark comes with its own harness, scoring scripts, and environments and integrating can take days. We're introducing the Terminal-Bench dataset registry to solve this problem. Think of it as the npm of agent benchmarks. Now…

1 22 99 11K 41

Jessy Lin @realJessyLin

2 months ago

User simulators bridge RL with real-world interaction // jessylin.com/2025/07/10/use… How do we get the RL paradigm to work on tasks beyond math & code? Instead of designing datasets, RL requires designing environments. Given that most non-trivial real-world tasks involve…

10 46 334 33K 287

Sarah Wooders @sarahwooders

3 months ago

This has been in the works for a very long time, so very excited to share it with more people! The first and only model agnostic Agent API that is packaged with the beautiful ADE

Letta @Letta_AI

3 months ago

3 9 43 17K 26

1 1 15 1K 0

Charles Packer @charlespacker

3 months ago

introducing the Open AI agent cloud 🚀 unlike OpenAI Assistants / Responses API, Mistral Agents API, etc, you have complete control and can freely move state + memory from local <-> cloud (using agent file!)

Letta @Letta_AI

3 months ago

3 9 43 17K 26

1 4 16 2K 2

Yizhong Wang @yizhongwyz

3 months ago

Thrilled to announce that I will be joining @UTAustin @UTCompSci as an assistant professor in fall 2026! I will continue working on language models, data challenges, learning paradigms, & AI for innovation. Looking forward to teaming up with new students & colleagues! 🤠🤘

103 53 672 74K 73

Cursor @cursor_ai

3 months ago

A conversation on the optimal reward for coding agents, infinite context models, and real-time RL

59 144 2K 277K 1K

Letta @Letta_AI

3 months ago

🚀 Just launched: The first benchmark for LLMs on agentic memory management! Memory (or context) management is the final bottleneck. As models get smarter, the agents that win will be those that can intelligently manage what they remember, forget, and retrieve.

4 7 34 5K 17

CHAI AI. William Beauchamp. @chai_research

4 months ago

Interesting idea of “sleep-time compute”
- Activates deep thinking in idle periods
- Reduces response latency
- 6.4x fewer tokens per query
- 5.2x speedup in response time

5 2 15 5K 1

Letta @Letta_AI

4 months ago

Our next Stateful Agents meetup will be hosted in SF by @IndexVentures 5/15! Come by to learn about stateful agents and MCP🔌(Model Context Protocol) - integrating MCP and stateful agents is a powerful way to connect external context and tools to your agents.

1 3 12 3K 1

Letta @Letta_AI

5 months ago

MemGPT 2.0 with sleep-time compute is now available in @Letta_AI ! 💤🧠 You can now create agents with “sleep-time” enabled - they are just like normal agents, but have one or more sleep-time agents continuously re-writing their in-context memory, separate from the main process.

4 9 35 5K 17

Jiayi Pan @jiayi_pirate

5 months ago

We explore a new dimension in scaling reasoning models in Adaptive Parallel Reasoning. APR lets LMs learn to orchestrate both serial & parallel compute E2E via supervised training + RL, w/ better efficiency and scalability than long CoT on Countdown 🧵 arxiv.org/abs/2504.15466

18 73 337 60K 172

Letta @Letta_AI

5 months ago

We're excited to release our latest paper, “Sleep-time Compute: Beyond Inference Scaling at Test-Time”, a collaboration with @sea_snell from UC Berkeley and @Letta_AI advisors / UC Berkeley faculty Ion Stoica and @profjoeyg letta.com/blog/sleep-tim…

11 35 154 57K 99

AK @_akhaliq

5 months ago

Sleep-time Compute: Beyond Inference Scaling at Test-time

9 76 603 69K 413

kalomaze @kalomaze

5 months ago

the most underused, underappreciated, and underrated trick that is basically only used by Anthropic; context distillation

16 46 1K 140K 587

Belinda Li @belindazli

6 months ago

Past work has shown that world state is linearly decodable from LMs trained on text and games like Othello. But how do LMs *compute* these states? We investigate state tracking using permutation composition as a model problem, and discover interpretable, controllable procedures🧵

3 45 227 41K 155

Catherine Chen @cathychen23

a year ago

How does the human brain represent semantic information from different languages? Our new preprint suggests that bilingual language comprehension relies on shared semantic representations that are systematically modulated by each language! 1/n

bioRxiv @biorxivpreprint

a year ago

1 7 18 53K 13

5 87 351 50K 237

Orion Weller @orionweller

a year ago

LLMs can use complex instructions - why can’t retrieval models? We build FollowIR, a training/test set of real-world human retrieval instructions. Our FollowIR-7B is the best IR model for instruct-following, even beating @cohere @OpenAI retrievers 🤯 📝 arxiv.org/abs/2403.15246

4 42 249 56K 142

Frances Ding @FrancesDing

2 years ago

Protein language models (pLMs) can give protein sequences likelihood scores, which are commonly used as a proxy for fitness in protein engineering. But what do likelihoods encode? In a new paper (w/ @JacobSteinhardt) we find that pLM likelihoods have a strong species bias! 1/