🧪 Open-Source Team that maintains LMCache and Production Stack
🤖 Democratizing AI by providing efficient LLM serving for ALL
lmcache.ai · GitHub · Joined September 2024
Wow… LLMs can now get insane speed & memory boosts.
This open-source trick makes any large language model faster than you thought possible...
LMCache caches and reuses key-value data across instances and hardware, so your AI:
– Remembers context
– Handles multi-round Q&A…
The fastest serving engine for LLMs is here (open-source)!
LMCache is an LLM serving engine designed to reduce time-to-first-token and increase throughput, especially under long-context scenarios.
It boosts vLLM with 7x faster access to 100x more KV caches.
100% open-source!
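For readers who want to try the vLLM integration mentioned above, here is a minimal launch sketch based on the LMCache quickstart. The exact flag names, connector name, and model ID are assumptions that may differ in your release, so check the repo docs before copying:

```shell
# Install vLLM and LMCache (versions assumed compatible; see the LMCache README).
pip install vllm lmcache

# Serve a model with vLLM, delegating KV-cache storage and reuse to LMCache
# via the KV-connector interface. The connector name "LMCacheConnectorV1"
# and role "kv_both" are taken from the quickstart and may vary by version.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --kv-transfer-config \
  '{"kv_connector": "LMCacheConnectorV1", "kv_role": "kv_both"}'
```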
I got deep respect for niche open-source projects in AI.
You gotta have deep expertise and a good heart to run those.
Kudos to teams like LMCache for keeping the open-source dream alive ❤️
Fastest inference engine for LLMs!
LMCache is an LLM serving engine that reduces Time to First Token (TTFT) and increases throughput, especially under long-context scenarios.
100% Open Source
LLM responses don’t have to be so slow.
Spending minutes just to get a basic answer.
Meet LMCache, an open-source KV-cache layer that reduces time-to-first-token and boosts throughput for long-context chat, RAG, and doc QA.
- Reuses KV across GPU/CPU/Disk → skips redundant…
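The bullet above is the core idea: requests that share a token prefix can reuse the KV cache computed for that prefix instead of prefilling it again. Here is a toy, self-contained sketch of that mechanism. It is not LMCache's actual code; `PrefixKVCache`, `CHUNK`, and the string payloads are illustrative stand-ins for real attention KV tensors:

```python
import hashlib

CHUNK = 4  # tokens per cache chunk (real systems use larger chunks, e.g. 256)

class PrefixKVCache:
    """Toy KV store keyed by hashes of token-prefix chunks."""
    def __init__(self):
        self.store = {}       # chunk-hash -> precomputed "KV" payload
        self.recomputed = 0   # chunks we had to prefill from scratch

    def _key(self, prefix):
        return hashlib.sha256(repr(prefix).encode()).hexdigest()

    def prefill(self, tokens):
        """Return KV payloads for every chunk, reusing cached prefixes."""
        out = []
        for i in range(0, len(tokens), CHUNK):
            # A chunk's identity depends on the full prefix before it,
            # because attention KV for position i depends on tokens 0..i.
            prefix = tuple(tokens[: i + CHUNK])
            k = self._key(prefix)
            if k not in self.store:
                self.recomputed += 1
                self.store[k] = f"kv({prefix})"  # stand-in for KV tensors
            out.append(self.store[k])
        return out

cache = PrefixKVCache()
doc = list(range(12))             # a shared 12-token document context
cache.prefill(doc + [100, 101])   # first request: computes all 4 chunks
before = cache.recomputed
cache.prefill(doc + [200, 201])   # second request: shared-prefix chunks hit
print(cache.recomputed - before)  # → 1 (only the divergent tail chunk)
```

The second request with a different question suffix only recomputes the final chunk, which is why multi-turn and RAG workloads benefit so much.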
This is an interesting open‑source repo.
LMCache makes LLM serving more efficient by reusing KV caches across GPU, CPU memory, and disk.
- Works well for long contexts
- 3–10× faster responses in RAG and multi‑turn workloads
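To make the GPU/CPU/disk reuse concrete, here is a toy tier hierarchy, again not LMCache's API. All names (`TieredKV`, the tier capacities) are hypothetical. Cached chunks spill to slower tiers under memory pressure and are promoted back to the fast tier on a hit:

```python
from collections import OrderedDict

class TieredKV:
    """Toy GPU -> CPU -> disk KV store with LRU spill and promotion on hit."""
    def __init__(self, gpu_cap=2, cpu_cap=4):
        self.gpu = OrderedDict()   # fastest, smallest tier
        self.cpu = OrderedDict()   # larger, slower tier
        self.disk = {}             # unbounded, slowest tier
        self.gpu_cap, self.cpu_cap = gpu_cap, cpu_cap

    def put(self, key, val):
        self.gpu[key] = val
        self._spill()

    def _spill(self):
        while len(self.gpu) > self.gpu_cap:       # demote LRU GPU entry
            k, v = self.gpu.popitem(last=False)
            self.cpu[k] = v
        while len(self.cpu) > self.cpu_cap:       # demote LRU CPU entry
            k, v = self.cpu.popitem(last=False)
            self.disk[k] = v

    def get(self, key):
        for tier in (self.gpu, self.cpu, self.disk):
            if key in tier:
                val = tier.pop(key)
                self.gpu[key] = val               # promote hit back to GPU
                self._spill()
                return val
        return None

kv = TieredKV()
for i in range(7):
    kv.put(f"chunk{i}", i)
# chunk0 has been demoted all the way to disk; a hit promotes it back.
assert "chunk0" in kv.disk
assert kv.get("chunk0") == 0 and "chunk0" in kv.gpu
```

A hot chunk thus stays in GPU memory, while cold context spills outward, which is the intuition behind serving "100x more KV caches" than GPU memory alone could hold.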
Not satisfied with your AI's ability to process long documents?
LLMs do not need to take forever. With LMCache, you can reduce time-to-first-token by up to 15x through our state-of-the-art KV-cache management, saving crucial time and money.
Try us here: github.com/LMCache/LMCache
LMCache highlighted by CEO of Redis @rowantrollope at Redis Released SF 2025! 🎉
We’re thrilled to partner with Redis, bringing KV cache acceleration to the infra ecosystem.
#Redis #LMCache #AIInfra #LLM #Caching #SFTech #RedisReleased2025
📌 PS: Our team @kobe_eee (Kobe) &…
Join us at SIGCOMM 2025 (conferences.sigcomm.org/sigcomm/2025/t…) for our full-day tutorial on LMCache, an intelligent caching middleware that makes LLM inference faster & cheaper!
📅 Sept 8, 2025
8:45 AM – 6:00 PM (Portugal Time / WEST)
= 12:45 AM – 10:00 AM (PDT)
What you’ll learn:
🔹 KV-cache…