🧪 Open-Source Team that maintains LMCache and Production Stack
🤖 Democratizing AI by providing efficient LLM serving for ALL
lmcache.ai · GitHub · Joined September 2024
Wow… LLMs can now get insane speed & memory boosts.
This open-source trick makes any large language model faster than you thought possible...
LMCache caches and reuses key-value data across instances and hardware, so your AI:
– Remembers context
– Handles multi-round Q&A…
The fastest serving engine for LLMs is here (open-source)!
LMCache is an LLM serving engine designed to reduce time-to-first-token and increase throughput, especially under long-context scenarios.
It boosts vLLM with 7x faster access to 100x more KV caches.
100% open-source!
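For readers who want to try the vLLM integration mentioned above, here is a minimal launch sketch based on the LMCache quickstart. The exact flag names, connector name, and model ID are assumptions that may differ in your release, so check the repo docs before copying:

```shell
# Install vLLM and LMCache (versions assumed compatible; see the LMCache README).
pip install vllm lmcache

# Serve a model with vLLM, delegating KV-cache storage and reuse to LMCache
# via the KV-connector interface. The connector name "LMCacheConnectorV1"
# and role "kv_both" are taken from the quickstart and may vary by version.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --kv-transfer-config \
  '{"kv_connector": "LMCacheConnectorV1", "kv_role": "kv_both"}'
```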
I got deep respect for niche open-source projects in AI.
You gotta have deep expertise and a good heart to run those.
Kudos to teams like LMCache for keeping the open-source dream alive ❤️
Fastest inference engine for LLMs!
LMCache is an LLM serving engine that reduces Time to First Token (TTFT) and increases throughput, especially under long-context scenarios.
100% Open Source
LLM responses don’t have to be so slow.
Spending minutes just to get a basic answer.
Meet LMCache, an open-source KV-cache layer that reduces time-to-first-token and boosts throughput for long-context chat, RAG, and doc QA.
- Reuses KV across GPU/CPU/Disk → skips redundant…
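The bullet above is the core idea: requests that share a token prefix can reuse the KV cache computed for that prefix instead of prefilling it again. Here is a toy, self-contained sketch of that mechanism. It is not LMCache's actual code; `PrefixKVCache`, `CHUNK`, and the string payloads are illustrative stand-ins for real attention KV tensors:

```python
import hashlib

CHUNK = 4  # tokens per cache chunk (real systems use larger chunks, e.g. 256)

class PrefixKVCache:
    """Toy KV store keyed by hashes of token-prefix chunks."""
    def __init__(self):
        self.store = {}       # chunk-hash -> precomputed "KV" payload
        self.recomputed = 0   # chunks we had to prefill from scratch

    def _key(self, prefix):
        return hashlib.sha256(repr(prefix).encode()).hexdigest()

    def prefill(self, tokens):
        """Return KV payloads for every chunk, reusing cached prefixes."""
        out = []
        for i in range(0, len(tokens), CHUNK):
            # A chunk's identity depends on the full prefix before it,
            # because attention KV for position i depends on tokens 0..i.
            prefix = tuple(tokens[: i + CHUNK])
            k = self._key(prefix)
            if k not in self.store:
                self.recomputed += 1
                self.store[k] = f"kv({prefix})"  # stand-in for KV tensors
            out.append(self.store[k])
        return out

cache = PrefixKVCache()
doc = list(range(12))             # a shared 12-token document context
cache.prefill(doc + [100, 101])   # first request: computes all 4 chunks
before = cache.recomputed
cache.prefill(doc + [200, 201])   # second request: shared-prefix chunks hit
print(cache.recomputed - before)  # → 1 (only the divergent tail chunk)
```

The second request with a different question suffix only recomputes the final chunk, which is why multi-turn and RAG workloads benefit so much.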
This is an interesting open‑source repo.
LMCache makes LLM serving more efficient by reusing KV caches across GPU, CPU memory, and disk.
- Works well for long contexts
- 3–10× faster responses in RAG and multi‑turn workloads
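To make the GPU/CPU/disk reuse concrete, here is a toy tier hierarchy, again not LMCache's API. All names (`TieredKV`, the tier capacities) are hypothetical. Cached chunks spill to slower tiers under memory pressure and are promoted back to the fast tier on a hit:

```python
from collections import OrderedDict

class TieredKV:
    """Toy GPU -> CPU -> disk KV store with LRU spill and promotion on hit."""
    def __init__(self, gpu_cap=2, cpu_cap=4):
        self.gpu = OrderedDict()   # fastest, smallest tier
        self.cpu = OrderedDict()   # larger, slower tier
        self.disk = {}             # unbounded, slowest tier
        self.gpu_cap, self.cpu_cap = gpu_cap, cpu_cap

    def put(self, key, val):
        self.gpu[key] = val
        self._spill()

    def _spill(self):
        while len(self.gpu) > self.gpu_cap:       # demote LRU GPU entry
            k, v = self.gpu.popitem(last=False)
            self.cpu[k] = v
        while len(self.cpu) > self.cpu_cap:       # demote LRU CPU entry
            k, v = self.cpu.popitem(last=False)
            self.disk[k] = v

    def get(self, key):
        for tier in (self.gpu, self.cpu, self.disk):
            if key in tier:
                val = tier.pop(key)
                self.gpu[key] = val               # promote hit back to GPU
                self._spill()
                return val
        return None

kv = TieredKV()
for i in range(7):
    kv.put(f"chunk{i}", i)
# chunk0 has been demoted all the way to disk; a hit promotes it back.
assert "chunk0" in kv.disk
assert kv.get("chunk0") == 0 and "chunk0" in kv.gpu
```

A hot chunk thus stays in GPU memory, while cold context spills outward, which is the intuition behind serving "100x more KV caches" than GPU memory alone could hold.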
Not satisfied with your AI's ability to process long documents?
LLMs do not need to take forever. With LMCache, you can reduce time-to-first-token by up to 15x through our state-of-the-art KV-cache management, saving crucial time and money.
Try us here: github.com/LMCache/LMCache
LMCache highlighted by CEO of Redis @rowantrollope at Redis Released SF 2025! 🎉
We’re thrilled to partner with Redis, bringing KV cache acceleration to the infra ecosystem.
#Redis #LMCache #AIInfra #LLM #Caching #SFTech #RedisReleased2025
📌 PS: Our team @kobe_eee (Kobe) &…
Join us at SIGCOMM 2025 (conferences.sigcomm.org/sigcomm/2025/t…) for our full-day tutorial on LMCache, an intelligent caching middleware that makes LLM inference faster & cheaper!
📅 Sept 8, 2025
8:45 AM – 6:00 PM (Portugal Time / WEST)
= 12:45 AM – 10:00 AM (PDT)
What you’ll learn:
🔹 KV-cache…