Solutions · RAG & Knowledge · Embeddings

Embeddings, the foundation of retrieval.

Turn any text passage, image, or document into a dense vector that captures meaning. Embedder is the unified API behind every search, RAG, classification, and clustering workflow in LM-Kit.NET, and it runs entirely on-device with any embedding model in the catalog.

Text + image · Batch + async · Any vector store
Engine

Embedder

One class, one constructor, one method (with overloads).

Helpers

VectorOperations

Cosine similarity, dot product, and Euclidean distance, plus normalisation utilities.

Output

float[] & float[][]

Plain arrays. Drop directly into any vector store or do raw vector math.
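Because the output is plain arrays, scoring two embeddings is a one-liner. A minimal sketch, assuming a `CosineSimilarity` method on `VectorOperations` (the exact method names are assumptions based on the description above; check the API reference for the real signatures):

```csharp
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("embeddinggemma-300m");
var embedder = new Embedder(model);

float[] a = embedder.GetEmbeddings("solar panel installation");
float[] b = embedder.GetEmbeddings("photovoltaic system setup");

// Plain-array math, no vector store required. CosineSimilarity is
// illustrative -- consult the VectorOperations reference for exact names.
float score = VectorOperations.CosineSimilarity(a, b);
Console.WriteLine($"similarity: {score:F3}");
```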

What you get

One API, every embedding workflow.

01 · Unified

Text and image, same surface

Load nomic-embed-text for text, nomic-embed-vision for images, bge-m3 for multilingual. Same Embedder constructor. Same GetEmbeddings call.

02 · Batched

Single text, batches, tokenized input

Overloads accept a single string, an IEnumerable&lt;string&gt;, pre-tokenised arrays, or ImageBuffer sequences for images. On GPU, throughput scales with batch size.
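A batch call might look like the sketch below. It assumes the batch overload returns one vector per input, in input order, as `float[][]`:

```csharp
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("embeddinggemma-300m");
var embedder = new Embedder(model);

string[] passages =
{
    "Machine learning is fascinating.",
    "Neural networks learn representations.",
    "The weather is nice today."
};

// One call, one vector per passage, in input order.
float[][] vectors = embedder.GetEmbeddings(passages);
Console.WriteLine($"{vectors.Length} vectors of {embedder.EmbeddingSize} dimensions");
```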

03 · Async-first

GetEmbeddingsAsync + cancellation

Every overload has an async sibling. Pass a CancellationToken, get back a Task. Wire straight into server pipelines without blocking the request thread.
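In a server pipeline that pattern might look like this sketch, which bounds the call with a timeout-backed CancellationToken (the single-string async overload shape is assumed from the description above):

```csharp
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("embeddinggemma-300m");
var embedder = new Embedder(model);

// Bound the call so a slow embedding cannot stall the request pipeline.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

float[] vec = await embedder.GetEmbeddingsAsync(
    "Machine learning is fascinating.", cts.Token);
Console.WriteLine($"dimensions: {vec.Length}");
```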

04 · Cross-modal

Text-to-image and image-to-text

Nomic Embed Vision aligns its image vector space with nomic-embed-text. A text query retrieves matching images; an image query retrieves matching text passages. One vector store, two modalities.
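A cross-modal pairing might be sketched as follows. The `ImageBuffer` construction shown here is an assumption, as is using one Embedder per model; consult the API reference for the actual image-loading factory:

```csharp
using LMKit.Model;
using LMKit.Embeddings;

// One Embedder per model: the two models share an aligned vector space.
var textModel   = LM.LoadFromModelID("nomic-embed-text");
var visionModel = LM.LoadFromModelID("nomic-embed-vision");

var textEmbedder  = new Embedder(textModel);
var imageEmbedder = new Embedder(visionModel);

// ImageBuffer loading is sketched -- check the reference for the exact API.
float[] imageVec = imageEmbedder.GetEmbeddings(new ImageBuffer("sunset.jpg"));
float[] queryVec = textEmbedder.GetEmbeddings("a sunset over the ocean");

// Because the spaces are aligned, the two vectors are directly comparable,
// so text queries can rank images stored in the same index.
```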

05 · Multilingual

100+ languages with bge-m3

bge-m3 embeds across 100+ languages in one model. Index mixed-language corpora, query in any language, and retrieve cross-lingually without per-language pipelines.
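Cross-lingual retrieval falls out of a shared vector space: the same sentence in two languages lands close together. A minimal sketch (the sample sentences are illustrative):

```csharp
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("bge-m3");
var embedder = new Embedder(model);

// The same sentence in English and French embeds to nearby vectors,
// so a French query can retrieve English passages and vice versa.
float[] en = embedder.GetEmbeddings("The contract was signed yesterday.");
float[] fr = embedder.GetEmbeddings("Le contrat a été signé hier.");

Console.WriteLine($"both vectors have {embedder.EmbeddingSize} dimensions");
```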

06 · Plug-in stores

Any vector store, same API

In-memory, FileSystemVectorStore, Qdrant via connector, or a custom IVectorStore implementation. float[] in, similarity hits out. No vendor lock-in.
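Whatever store you choose, the contract is the same: float[] in, ranked similarity hits out. The store-free sketch below shows that contract with plain arrays and cosine scoring, assuming no LM-Kit store API at all:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

float[] query = { 0.1f, 0.9f, 0.0f };
var index = new List<float[]>
{
    new float[] { 0.1f, 0.8f, 0.1f },
    new float[] { 0.9f, 0.0f, 0.4f },
};

// Rank stored vectors against the query, highest similarity first --
// the same shape of result any IVectorStore implementation returns.
foreach (var (i, score) in Search(query, index))
    Console.WriteLine($"doc {i}: {score:F3}");

static float Cosine(float[] a, float[] b)
{
    float dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(na) * MathF.Sqrt(nb));
}

static IEnumerable<(int Index, float Score)> Search(float[] q, IReadOnlyList<float[]> idx) =>
    idx.Select((v, i) => (Index: i, Score: Cosine(q, v)))
       .OrderByDescending(h => h.Score);
```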

How it works

Three patterns, same Embedder.

Single, batch, and cross-modal, each through the same Embedder instance.

The simplest pattern. One model, one call, one vector. The EmbeddingSize property tells you the dimension ahead of time so you can size your vector store correctly.

SingleText.cs
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("embeddinggemma-300m");
var embedder = new Embedder(model);

float[] vec = embedder.GetEmbeddings("Machine learning is fascinating.");
Console.WriteLine($"dimensions: {vec.Length} (matches embedder.EmbeddingSize = {embedder.EmbeddingSize})");
Available models

Embedding models in the catalog.

Text

embeddinggemma-300m

Compact 300M-parameter multilingual embedder. Runs comfortably on CPU. Default choice for laptop-grade workloads and air-gapped servers.

Text

nomic-embed-text

Strong text embedder; pair with nomic-embed-vision for cross-modal workflows. Same vector dimensionality, shared semantic space.

Multilingual

bge-m3

100+ languages in one model. Dense AND sparse output (so it can drive lexical-style retrieval too). The default choice for international corpora.

Text

qwen3-embedding:0.6b / 4b / 8b

Qwen3 embedding family across three sizes. Choose based on accuracy / memory trade-off; same API.

Vision

nomic-embed-vision

Image vectors aligned with nomic-embed-text. Same vector space, two modalities, one index. Powers vision-grounded retrieval and multimodal search.

Bring your own

Load any GGUF embedder

If a model is in GGUF format, LM.LoadFromFile handles it. The Embedder wrapper works the same way.
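Loading a custom GGUF embedder might look like this sketch; the file path is illustrative:

```csharp
using LMKit.Model;
using LMKit.Embeddings;

// Any GGUF embedding model works; the path here is a placeholder.
var model    = LM.LoadFromFile("models/my-embedder.Q8_0.gguf");
var embedder = new Embedder(model);

float[] vec = embedder.GetEmbeddings("same API, custom model");
Console.WriteLine($"dimensions: {embedder.EmbeddingSize}");
```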

LM-Kit.NET pillars

Seven pillars, one foundation.

The seven pillars of LM-Kit.NET, plus the local runtime they share. Embeddings is the pillar covered on this page.

The foundation

Every capability above runs on this runtime.

Foundation

Local Inference

The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR and classifiers, accelerated via AVX2 on CPU, CUDA 12/13, Vulkan, or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.

Explore the foundation

Vectors, on your hardware.

Start in 5 minutes · RAG & Knowledge hub