jina-embeddings-v4 is a multimodal embedding model, but v4-GGUF wasn't multimodal until now. We've finally cracked how to generate multimodal embeddings using llama.cpp & GGUF.
We fixed two main issues. First, in the language model part, we corrected the attention mask in the transformer block so it properly…
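For anyone who wants to try it, here's a minimal sketch of pulling an embedding out of a GGUF model with llama-cpp-python. The model filename is a placeholder, and only the text side is shown; image inputs go through the multimodal fixes described above.

```python
# A hedged sketch, not the official recipe: embeddings from a GGUF
# model via llama-cpp-python. The filename below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="jina-embeddings-v4-text-matching-Q4_K_M.gguf",  # placeholder
    embedding=True,  # run the model in embedding mode
    n_ctx=8192,      # context window; adjust to your quantization and RAM
)

result = llm.create_embedding("A photo of a cat sitting on a windowsill")
vector = result["data"][0]["embedding"]
print(len(vector))  # embedding dimensionality
```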
mmBERT: Massively Multilingual BERT
Trained on 3T+ tokens across 1,833 languages, mmBERT surpasses XLM-R on standard NLU and retrieval benchmarks and is competitive with English-only encoders; in throughput tests it runs 2–4× faster than prior multilingual encoders under…
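A quick sketch of what using it looks like with Hugging Face transformers and mean pooling; the checkpoint id is an assumption, check the model card for the exact name.

```python
# Hedged sketch: encode multilingual text with mmBERT via transformers.
# The checkpoint id is assumed, not confirmed by the announcement.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "jhu-clsp/mmBERT-base"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

texts = ["Hello world", "Hallo Welt", "Bonjour le monde"]
batch = tokenizer(texts, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq, dim)

# Mean-pool over non-padding tokens to get one vector per sentence.
mask = batch["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(1) / mask.sum(1)
print(embeddings.shape)
```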
Today we're releasing jina-code-embeddings, a new suite of code embedding models in two sizes (0.5B and 1.5B parameters), along with 1- to 4-bit GGUF quantizations for both. Built on the latest code generation LLMs, these models achieve SOTA retrieval performance despite their compact size.…
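A minimal retrieval sketch via sentence-transformers; the model id and the trust_remote_code flag are assumptions, see the model card for the exact prompts and usage.

```python
# Hedged sketch: natural-language-to-code retrieval with the 0.5B model.
# Model id and loading flags are assumptions, not confirmed details.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jinaai/jina-code-embeddings-0.5b", trust_remote_code=True)

query = "function that deduplicates a list while preserving order"
snippets = [
    "def dedup(xs):\n    seen = set()\n    return [x for x in xs if not (x in seen or seen.add(x))]",
    "def reverse(xs):\n    return xs[::-1]",
]

q_emb = model.encode([query])
s_emb = model.encode(snippets)
scores = model.similarity(q_emb, s_emb)  # cosine similarity by default
print(scores)  # the dedup snippet should score highest
```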
We are at @qdrant_engine's Vector Space Day 🚀 in Berlin on Sep 26. We'll talk about "Vision-Language Models: A New Architecture for Multi-Modal Embedding Models" and also share some insights and lessons learned from training jina-embeddings-v4.
🎫 lu.ma/p7w9uqtz
Got a Mac with an M-series chip? You can now train Gemma 3 270M locally as a multilingual embedding or reranker model using our mlx-retrieval project, at 4,000 tokens/s on an M3 Ultra - that's actually usable speed. We've implemented some standard…
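This isn't the mlx-retrieval API itself, just a hedged sketch of the core training step such a project would implement: an in-batch InfoNCE contrastive loss over (query, positive) pairs, written with Apple's MLX.

```python
# Hedged sketch of an in-batch InfoNCE loss in MLX; not taken from
# mlx-retrieval, just the standard contrastive objective it likely uses.
import mlx.core as mx
import mlx.nn as nn

def info_nce(q, p, temperature=0.05):
    """q, p: (batch, dim) L2-normalized embeddings of query/positive pairs."""
    logits = (q @ p.T) / temperature  # (batch, batch) similarity matrix
    targets = mx.arange(q.shape[0])   # the diagonal holds the true pairs
    return nn.losses.cross_entropy(logits, targets).mean()

# Toy usage with random stand-in "embeddings".
q = mx.random.normal((8, 256))
p = mx.random.normal((8, 256))
q = q / mx.linalg.norm(q, axis=-1, keepdims=True)
p = p / mx.linalg.norm(p, axis=-1, keepdims=True)
print(info_nce(q, p))
```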
Two weeks ago, we released jina-embeddings-v4-GGUF with dynamic quantizations. During our experiments, we found some interesting quirks while converting and running GGUF embedding models. Since most of the llama.cpp community focuses on LLMs, we thought it'd be valuable to share this from…
I went to SIGIR this year together with @bo_wangbo; we wrote a blog post with our highlights and summaries of the AI and neural IR papers we found interesting at the conference
jina.ai/news/what-we-l…
Our official MCP server with read, search, embed, and rerank tools is up at mcp.jina.ai, where we've optimized embedding and reranker usage specifically for LLM context engineering.
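A hedged sketch of connecting with the official Python MCP SDK and listing the server's tools; the SSE endpoint path and transport are assumptions, check the docs for the exact connection details.

```python
# Hedged sketch: list the tools exposed by the Jina MCP server using the
# official Python MCP SDK. The endpoint URL below is an assumption.
import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    async with sse_client("https://mcp.jina.ai/sse") as (read, write):  # assumed endpoint
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])  # expect read/search/embed/rerank tools

asyncio.run(main())
```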
😎 I just published Sentence Transformers v5.1.0, and it's a big one. 2x-3x speedups of SparseEncoder models via ONNX and/or OpenVINO backends, easier distillation data preparation with hard negatives mining, and more!
See 🧵 for the deets:
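A hedged sketch of the two headline features from the release notes: a SparseEncoder on the ONNX backend and hard-negative mining. Model and dataset names below are illustrative, not prescribed.

```python
# Hedged sketch per the v5.1.0 release notes; model/dataset ids are
# illustrative placeholders, not recommendations.
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SparseEncoder
from sentence_transformers.util import mine_hard_negatives

# SparseEncoder now accepts backend="onnx" (or "openvino") for the speedups
sparse = SparseEncoder("naver/splade-v3", backend="onnx")
embs = sparse.encode(["sparse vectors, now faster"])

# Easier hard-negative mining: (anchor, positive) pairs plus a dense model
dense = SentenceTransformer("all-MiniLM-L6-v2")
pairs = load_dataset("sentence-transformers/natural-questions", split="train[:1000]")
triplets = mine_hard_negatives(pairs, dense, num_negatives=5, range_min=10, range_max=100)
```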
Resolution is important for image embeddings - especially for visual document retrieval. jina-embeddings-v4 supports inputs up to 16+ MP (the default is much lower). We wrote a blog post about how resolution affects performance across benchmarks
jina.ai/news/how-image…
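One practical takeaway, as a hedged sketch: control the effective resolution yourself by capping the megapixels of a page scan before embedding, rather than relying on the model's default downscaling. The helper below is hypothetical, not part of the v4 API.

```python
# Hypothetical helper: keep an input image under a megapixel budget
# while preserving aspect ratio, before passing it to an embedding model.
from PIL import Image

def cap_megapixels(img: Image.Image, max_mp: float = 16.0) -> Image.Image:
    """Downscale img so it stays under max_mp megapixels."""
    w, h = img.size
    mp = (w * h) / 1e6
    if mp <= max_mp:
        return img
    scale = (max_mp / mp) ** 0.5
    return img.resize((int(w * scale), int(h * scale)), Image.LANCZOS)

page = cap_megapixels(Image.open("invoice_scan.png"), max_mp=16.0)
```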
We created a new benchmark for visual document retrieval with diverse visually rich documents (more than linear paginated PDFs) and more query types than just questions
github.com/jina-ai/jina-v…
We've just released 100+ intermediate checkpoints and our training logs from SmolLM3-3B training.
We hope this can be useful to researchers working on mech interp, training dynamics, RL, and other topics :)
Training logs:
-> Usual training loss (the gaps in the loss are due…
Context engineering means curating the most relevant information to pack the context window just right. Text selection and passage reranking are integral components of it. In part 2 of our Submodularity Series, we show that both text selection and passage reranking yield to…
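As a hedged illustration of the idea, here's the classic greedy maximizer for a facility-location submodular objective over passage similarities; greedy selection on a monotone submodular function carries the well-known (1 - 1/e) approximation guarantee.

```python
# Hedged sketch, not the blog post's exact formulation: select k passages
# by greedily maximizing a facility-location objective
#   f(S) = sum_j max_{i in S} sim[i, j]
import numpy as np

def greedy_select(sim: np.ndarray, k: int) -> list[int]:
    """sim: (n, n) passage-passage similarity matrix; returns k passage indices."""
    selected: list[int] = []
    coverage = np.zeros(sim.shape[0])  # best similarity to the selected set so far
    for _ in range(k):
        # Marginal gain of each candidate: improvement in total coverage.
        gains = np.maximum(sim, coverage).sum(axis=1) - coverage.sum()
        if selected:
            gains[selected] = -np.inf  # don't pick the same passage twice
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, sim[best])
    return selected
```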
We just arrived at @SIGIRConf! If you're here or are interested in an internship at @JinaAI_ training the following search foundation models, feel free to reach out to me:
- Embedding / Dense Retrieval Models
- Rerankers
- Small LMs (<2B) for document cleaning, extraction, etc.
Our paper "Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models" has been accepted at the Robust IR Workshop @ SIGIR 2025! 🌠
📅 I'll present it on July 17th
📝 Pre-print: arxiv.org/abs/2409.04701
🔗 Workshop: …-2025-workshop-on-robust-ir.github.io
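The core trick in the paper, as a hedged sketch: encode the whole document once, then mean-pool the contextualized token embeddings within each chunk's span, instead of embedding chunks in isolation. The model id below is illustrative; any long-context encoder with a fast tokenizer works.

```python
# Hedged sketch of late chunking: chunk embeddings that see full-document
# context. Model id is illustrative; see the paper for the exact setup.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "jinaai/jina-embeddings-v2-small-en"  # illustrative long-context encoder
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

def late_chunk(text: str, spans: list[tuple[int, int]]) -> torch.Tensor:
    """spans: (start, end) character offsets of each chunk within text."""
    enc = tokenizer(text, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0]         # (seq, 2) char offsets per token
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq, dim), full-document context
    chunks = []
    for start, end in spans:
        in_span = (offsets[:, 0] >= start) & (offsets[:, 1] <= end)
        chunks.append(hidden[in_span].mean(dim=0))  # pool tokens inside the span
    return torch.stack(chunks)
```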