Reranking
Cross-encoders and post-retrieval ranking improvements.
Stories
Sentence Transformers 5.5.0 adds an agent skill for fine-tuning embeddings, rerankers, and sparse encoders from Claude Code, Codex, Cursor, and Gemini CLI. The author reports a one-shot German embedding run rising from 0.6720 to 0.8856 NDCG@10 on a local PC.
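The skill presumably builds on the library's standard trainer API rather than inventing a new one. A minimal sketch of that underlying API, with a hypothetical base model and a toy (anchor, positive) pair standing in for a real German retrieval dataset:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Hypothetical base model and toy data; a real run would use a German
# dataset with thousands of (anchor, positive) pairs.
model = SentenceTransformer("intfloat/multilingual-e5-small")
train_dataset = Dataset.from_dict({
    "anchor": ["Wie hoch ist die Zugspitze?"],
    "positive": ["Die Zugspitze ist mit 2962 Metern der höchste Berg Deutschlands."],
})

# In-batch negatives: every other positive in the batch acts as a negative.
loss = MultipleNegativesRankingLoss(model)
trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```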
LightOn open-sourced DenseOn and LateOn plus the training pipeline behind them, including 1.4 billion query-document pairs and decontaminated BEIR results. Teams can use the small open retrieval models and reproduced data mixtures instead of opaque closed-data baselines.
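If the released checkpoints follow the usual Sentence Transformers packaging (an assumption; the exact Hugging Face ids are not given here, so the one below is hypothetical), dropping the dense model into a retrieval pipeline is a few lines:

```python
from sentence_transformers import SentenceTransformer

# Hypothetical id; substitute the actual DenseOn checkpoint from the release.
model = SentenceTransformer("lighton/denseon-small")

queries = ["what is late interaction retrieval?"]
docs = [
    "Late interaction scores query and document token embeddings with MaxSim.",
    "Dense retrieval compresses each text into a single vector.",
]
q_emb = model.encode(queries)
d_emb = model.encode(docs)
print(model.similarity(q_emb, d_emb))  # one row of scores per query
```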
Sentence Transformers v5.4 adds one encode API for text, image, audio, and video, plus multimodal reranking and a modular CrossEncoder stack. It also flattens Flash Attention 2 inputs for text workloads, reducing padding waste and VRAM use.
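The reranking path itself is unchanged in spirit: score (query, document) pairs with a CrossEncoder and reorder by score. A text-only sketch using a long-standing public checkpoint; whether the v5.4 multimodal rerankers expose the identical interface is an assumption here:

```python
from sentence_transformers import CrossEncoder

# Established text reranker; multimodal variants are assumed to follow suit.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "how do cross-encoders rerank search results?"
candidates = [
    "Cross-encoders jointly encode the query and document for a relevance score.",
    "Bi-encoders embed the query and document separately.",
    "The weather in Paris is mild in spring.",
]
# rank() scores every (query, candidate) pair and returns them sorted.
for hit in reranker.rank(query, candidates):
    print(f"{hit['score']:.3f}  {candidates[hit['corpus_id']]}")
```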
LightOn says its 150M multi-vector retriever is pushing BrowseComp-Plus close to saturation, with results showing that search-call behavior and retriever choice matter nearly as much as model size. Retrieval engineers should watch the multi-hop setup and tool-calling limits before copying the benchmark configuration.
LightOn’s late-interaction retriever paired with GPT-5 reached 87.59% accuracy on BrowseComp-Plus while using fewer search calls than larger baselines. The result suggests that deep-research quality may now hinge more on retrieval architecture than on swapping in ever-larger LLMs.
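Late interaction, assuming the standard ColBERT-style formulation, means MaxSim scoring: each query token picks its best-matching document token, and the per-token maxima are summed. A self-contained sketch of the scoring rule:

```python
import torch

def maxsim(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """ColBERT-style late-interaction score.

    query_emb: (num_query_tokens, dim), L2-normalized token embeddings
    doc_emb:   (num_doc_tokens, dim),  L2-normalized token embeddings
    """
    sim = query_emb @ doc_emb.T          # (nq, nd) token-level cosine similarities
    return sim.max(dim=1).values.sum()   # best doc token per query token, summed

q = torch.nn.functional.normalize(torch.randn(8, 128), dim=-1)
d = torch.nn.functional.normalize(torch.randn(40, 128), dim=-1)
print(maxsim(q, d))
```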
Mixedbread introduced Wholembed v3 as a retrieval model for text, image, video, audio, and multilingual search. Benchmark it on fine-grained retrieval tasks if single-vector embeddings have been collapsing in your pipeline.
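One quick way to run that check is recall@k on pairs where the current single-vector model has been confusing near-duplicates. A text-only sketch with a hypothetical model id (substitute the actual Wholembed v3 checkpoint) and hand-made hard negatives:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical id; replace with the released Wholembed v3 checkpoint.
model = SentenceTransformer("mixedbread-ai/wholembed-v3")

queries = ["red 2019 hatchback with sunroof"]
corpus = [
    "2019 red hatchback, panoramic sunroof, one owner",  # relevant
    "2019 red hatchback, no sunroof",                    # hard negative
    "blue 2021 sedan with sunroof",                      # distractor
]
relevant = {0: {0}}  # query index -> set of relevant corpus indices

q_emb = model.encode(queries)
c_emb = model.encode(corpus)
scores = np.asarray(model.similarity(q_emb, c_emb))  # (num_queries, num_docs)

k = 1
topk = np.argsort(-scores, axis=1)[:, :k]
recall = np.mean([len(set(topk[i]) & relevant[i]) / len(relevant[i]) for i in relevant])
print(f"recall@{k} = {recall:.2f}")
```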