
Reranker, the precision multiplier.

Retrieval gets the top-k passages close to the query. A reranker then reads each one against the query and reorders them by relevance. It is often the single highest-leverage component in a RAG pipeline. It runs on-device with the same SDK and plugs in after any retriever.

Second-pass scoring · Any retriever · On-device

Class

Reranker

Pass a query and a list of passages. Get back relevance scores per passage.

Use case

Precision boost

Recall comes from the retriever. Precision comes from the reranker.

Integrates with

Every retriever

Vector search, BM25, hybrid, custom. Just feed it the top-k results.

Why a reranker

Retrieval is not enough.

Vector similarity is fast and approximate. BM25 is fast and lexical. Both surface candidate passages, but neither reads the query carefully and judges each passage on its actual merits. A reranker does. The cost is one extra inference pass on the top-k; the benefit is significantly higher precision at the top of the list.

01 · Fixes false positives

Vector hits with the right shape, wrong meaning

Dense embeddings group passages by topical similarity. A passage about "Spring Framework" can rank highly for "spring season" because both share vocabulary. The reranker reads the query and each passage together and demotes the wrong one.

02 · Surfaces buried answers

Pulls the right passage from position 8 to position 1

Often the best passage is somewhere in the top 20 from vector search but not at position 1. A reranker promotes it. This is one of the most consistent quality lifts in RAG pipelines.

03 · Composable

Works after any retriever

Plug it after VectorRetrievalStrategy, Bm25RetrievalStrategy, HybridRetrievalStrategy, or your own retriever. The reranker only needs a query and a list of candidate passages.
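
The retrieve-then-rerank shape can be sketched without any SDK types. This is a minimal illustration, not the LM-Kit.NET API: Words, Retrieve, ToyScore and RetrieveThenRerank are hypothetical helpers, and the word-overlap scorer is only a stand-in for a real reranking model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy tokenizer (assumption): lowercase words, basic punctuation stripped.
HashSet<string> Words(string text) =>
    text.ToLowerInvariant()
        .Split(new[] { ' ', ',', '.', '?' }, StringSplitOptions.RemoveEmptyEntries)
        .ToHashSet();

// Stage 1, hypothetical retriever: keep passages sharing any word with the query.
List<string> Retrieve(string query, IEnumerable<string> passages, int topK)
{
    var q = Words(query);
    return passages.Where(p => Words(p).Overlaps(q)).Take(topK).ToList();
}

// Stage 2, stand-in scorer: fraction of query words found in the passage.
// A real reranker would run model inference here instead.
double ToyScore(string query, string passage)
{
    var q = Words(query);
    var w = Words(passage);
    return q.Count == 0 ? 0.0 : (double)q.Count(t => w.Contains(t)) / q.Count;
}

// Glue: accepts any retriever and any scorer with these shapes.
List<(string Passage, double Score)> RetrieveThenRerank(
    string query,
    Func<string, List<string>> retrieve,
    Func<string, string, double> score,
    int topN)
{
    return retrieve(query)
        .Select(p => (Passage: p, Score: score(query, p)))
        .OrderByDescending(x => x.Score)
        .Take(topN)
        .ToList();
}

var corpus = new[]
{
    "Spring framework is a popular Java web framework.",
    "In spring, trees regrow leaves and birds return.",
    "Hibernate is a Java ORM.",
};

var results = RetrieveThenRerank(
    "when do birds return in spring",
    text => Retrieve(text, corpus, topK: 20),
    ToyScore,
    topN: 2);

foreach (var hit in results)
{
    Console.WriteLine($"{hit.Score:F3}  {hit.Passage}");
}
```

Because the second stage only sees a query and a candidate list, swapping the retriever or the scorer does not change the glue code.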

04 · Local + bounded cost

One on-device inference per pipeline

The reranker scores top-k passages (typically 20-100) in a single batched pass. Latency is bounded, predictable, and stays inside your process. No per-call cloud cost.

How it works

Three patterns, same Reranker.

The simplest pattern: you already have candidate passages from any source (file system scan, SQL query, prior search). Score them against a query, sort, and take the top-N.

StandaloneReranker.cs
using System;
using System.Linq;
using LMKit.Model;
using LMKit.Embeddings;

// Load the reranking model once and reuse the Reranker across queries.
var reranker = new Reranker(LM.LoadFromModelID("bge-m3-reranker"));

// Candidate passages can come from any source.
var candidates = new[]
{
    "Spring framework is a popular Java web framework.",
    "In spring, trees regrow leaves and birds return.",
    "Hibernate is a Java ORM.",
};

// Score every candidate against the query.
var scores = await reranker.RerankAsync("when does spring start", candidates);

// Sort by descending relevance.
foreach (var hit in scores.OrderByDescending(s => s.Score))
{
    Console.WriteLine($"{hit.Score:F3}  {hit.Passage}");
}
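
The "take the top-N" step can be sketched with plain LINQ. The scores below are hypothetical placeholders rather than model output, and the topN and minScore values are assumptions to tune on your own data.

```csharp
using System;
using System.Linq;

// Hypothetical scores standing in for the result of a reranking pass.
var scored = new (string Passage, double Score)[]
{
    ("Spring framework is a popular Java web framework.", 0.12),
    ("In spring, trees regrow leaves and birds return.", 0.91),
    ("Hibernate is a Java ORM.", 0.03),
};

const int topN = 2;            // assumed number of passages passed to the LLM
const double minScore = 0.10;  // assumed relevance cutoff

// Drop weak candidates, sort by descending relevance, keep the best N.
var context = scored
    .Where(s => s.Score >= minScore)
    .OrderByDescending(s => s.Score)
    .Take(topN)
    .Select(s => s.Passage)
    .ToArray();

Console.WriteLine(string.Join(Environment.NewLine, context));
```

A score cutoff in addition to top-N keeps clearly irrelevant passages out of the prompt even when fewer than N good candidates exist.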

When to use it

Reranker checklist.

Add it when:

  • The LLM gets context that doesn't actually answer the user's question
  • Top-5 retrieval hits are similar in vector space but only some are relevant
  • You have headroom for one extra inference pass (typically 50-200 ms)
  • You need higher answer quality more than higher recall
  • You are running benchmarks and want to compare retrieval-only vs retrieval+rerank

Skip it when:

  • The retriever already produces near-perfect top-3 (rare)
  • End-to-end latency budget is below 100 ms with no spare headroom
  • You can fit all candidates in the LLM context and let the model self-rank
  • The corpus is so small that a single LLM pass over everything is cheaper
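
For the benchmark item in the checklist, one way to compare retrieval-only against retrieval+rerank is to compute precision@k and reciprocal rank on both orderings. This is a sketch under assumptions: PrecisionAtK and ReciprocalRank are hypothetical helpers, and the passage IDs are made-up placeholders, not measured results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Fraction of the first k results that are relevant.
double PrecisionAtK(IReadOnlyList<string> ranked, ISet<string> relevant, int k) =>
    ranked.Take(k).Count(id => relevant.Contains(id)) / (double)k;

// 1 / rank of the first relevant result, or 0 if none appears.
double ReciprocalRank(IReadOnlyList<string> ranked, ISet<string> relevant)
{
    for (int i = 0; i < ranked.Count; i++)
    {
        if (relevant.Contains(ranked[i])) return 1.0 / (i + 1);
    }
    return 0.0;
}

// Hypothetical labels and orderings for a single query.
var relevantIds = new HashSet<string> { "p3", "p7" };
var retrievalOnly = new[] { "p1", "p4", "p3", "p9", "p7" };
var withRerank = new[] { "p3", "p7", "p1", "p4", "p9" };

Console.WriteLine($"P@2 retrieval-only:    {PrecisionAtK(retrievalOnly, relevantIds, 2):F2}");
Console.WriteLine($"P@2 retrieval+rerank:  {PrecisionAtK(withRerank, relevantIds, 2):F2}");
Console.WriteLine($"RR  retrieval-only:    {ReciprocalRank(retrievalOnly, relevantIds):F2}");
Console.WriteLine($"RR  retrieval+rerank:  {ReciprocalRank(withRerank, relevantIds):F2}");
```

Averaging these values over a labeled query set gives a retrieval-only versus retrieval+rerank comparison on your own corpus.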

LM-Kit.NET pillars

Seven pillars, one foundation.

The seven pillars of LM-Kit.NET, plus the local runtime they share.

The foundation

Every capability above runs on this runtime.

Foundation

Local Inference

The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR and classifiers, accelerated on CPU, AVX2, CUDA 12/13, Vulkan or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.


One extra pass, much better answers.
