Solutions · RAG & Knowledge · Embeddings

Embeddings, the foundation of retrieval.

Turn any text passage, image, or document into a dense vector that captures meaning. Embedder is the unified API behind every search, RAG, classification, and clustering workflow in LM-Kit.NET, and it runs entirely on-device with any embedding model in the catalog.

Text + image · Batch + async · Any vector store
Engine

Embedder

One class, one constructor, one method (with overloads).

Helpers

VectorOperations

Cosine similarity, dot product, and Euclidean distance, plus normalisation utilities.

Output

float[] & float[][]

Plain arrays. Drop directly into any vector store or do raw vector math.
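Because the output is plain arrays, scoring two embeddings is a one-liner. A minimal sketch, assuming a `CosineSimilarity` method on `VectorOperations` (the exact method names are assumptions based on the description above; check the API reference for the real signatures):

```csharp
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("embeddinggemma-300m");
var embedder = new Embedder(model);

float[] a = embedder.GetEmbeddings("solar panel installation");
float[] b = embedder.GetEmbeddings("photovoltaic system setup");

// Plain-array math, no vector store required. CosineSimilarity is
// illustrative -- consult the VectorOperations reference for exact names.
float score = VectorOperations.CosineSimilarity(a, b);
Console.WriteLine($"similarity: {score:F3}");
```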

What you get

One API, every embedding workflow.

01 · Unified

Text and image, same surface

Load nomic-embed-text for text, nomic-embed-vision for images, bge-m3 for multilingual. Same Embedder constructor. Same GetEmbeddings call.

02 · Batched

Single text, batches, tokenized input

Overloads accept a single string, an IEnumerable&lt;string&gt;, pre-tokenised arrays, or ImageBuffer sequences for images. On GPU, throughput scales with batch size.
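A batch call might look like the sketch below. It assumes the batch overload returns one vector per input, in input order, as `float[][]`:

```csharp
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("embeddinggemma-300m");
var embedder = new Embedder(model);

string[] passages =
{
    "Machine learning is fascinating.",
    "Neural networks learn representations.",
    "The weather is nice today."
};

// One call, one vector per passage, in input order.
float[][] vectors = embedder.GetEmbeddings(passages);
Console.WriteLine($"{vectors.Length} vectors of {embedder.EmbeddingSize} dimensions");
```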

03 · Async-first

GetEmbeddingsAsync + cancellation

Every overload has an async sibling. Pass a CancellationToken, get back a Task. Wire straight into server pipelines without blocking the request thread.
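In a server pipeline that pattern might look like this sketch, which bounds the call with a timeout-backed CancellationToken (the single-string async overload shape is assumed from the description above):

```csharp
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("embeddinggemma-300m");
var embedder = new Embedder(model);

// Bound the call so a slow embedding cannot stall the request pipeline.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

float[] vec = await embedder.GetEmbeddingsAsync(
    "Machine learning is fascinating.", cts.Token);
Console.WriteLine($"dimensions: {vec.Length}");
```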

04 · Cross-modal

Text-to-image and image-to-text

Nomic Embed Vision aligns its image vector space with nomic-embed-text. A text query retrieves matching images; an image query retrieves matching text passages. One vector store, two modalities.
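A cross-modal pairing might be sketched as follows. The `ImageBuffer` construction shown here is an assumption, as is using one Embedder per model; consult the API reference for the actual image-loading factory:

```csharp
using LMKit.Model;
using LMKit.Embeddings;

// One Embedder per model: the two models share an aligned vector space.
var textModel   = LM.LoadFromModelID("nomic-embed-text");
var visionModel = LM.LoadFromModelID("nomic-embed-vision");

var textEmbedder  = new Embedder(textModel);
var imageEmbedder = new Embedder(visionModel);

// ImageBuffer loading is sketched -- check the reference for the exact API.
float[] imageVec = imageEmbedder.GetEmbeddings(new ImageBuffer("sunset.jpg"));
float[] queryVec = textEmbedder.GetEmbeddings("a sunset over the ocean");

// Because the spaces are aligned, the two vectors are directly comparable,
// so text queries can rank images stored in the same index.
```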

05 · Multilingual

100+ languages with bge-m3

bge-m3 embeds across 100+ languages in one model. Index mixed-language corpora, query in any language, and retrieve cross-lingually without per-language pipelines.
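Cross-lingual retrieval falls out of a shared vector space: the same sentence in two languages lands close together. A minimal sketch (the sample sentences are illustrative):

```csharp
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("bge-m3");
var embedder = new Embedder(model);

// The same sentence in English and French embeds to nearby vectors,
// so a French query can retrieve English passages and vice versa.
float[] en = embedder.GetEmbeddings("The contract was signed yesterday.");
float[] fr = embedder.GetEmbeddings("Le contrat a été signé hier.");

Console.WriteLine($"both vectors have {embedder.EmbeddingSize} dimensions");
```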

06 · Plug-in stores

Any vector store, same API

In-memory, FileSystemVectorStore, Qdrant via connector, or a custom IVectorStore implementation. float[] in, similarity hits out. No vendor lock-in.
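Whatever store you choose, the contract is the same: float[] in, ranked similarity hits out. The store-free sketch below shows that contract with plain arrays and cosine scoring, assuming no LM-Kit store API at all:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

float[] query = { 0.1f, 0.9f, 0.0f };
var index = new List<float[]>
{
    new float[] { 0.1f, 0.8f, 0.1f },
    new float[] { 0.9f, 0.0f, 0.4f },
};

// Rank stored vectors against the query, highest similarity first --
// the same shape of result any IVectorStore implementation returns.
foreach (var (i, score) in Search(query, index))
    Console.WriteLine($"doc {i}: {score:F3}");

static float Cosine(float[] a, float[] b)
{
    float dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(na) * MathF.Sqrt(nb));
}

static IEnumerable<(int Index, float Score)> Search(float[] q, IReadOnlyList<float[]> idx) =>
    idx.Select((v, i) => (Index: i, Score: Cosine(q, v)))
       .OrderByDescending(h => h.Score);
```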

How it works

Three patterns, same Embedder.

Single, batch, and cross-modal, each through the same Embedder instance.

The simplest pattern. One model, one call, one vector. The EmbeddingSize property tells you the dimension ahead of time so you can size your vector store correctly.

SingleText.cs
using LMKit.Model;
using LMKit.Embeddings;

var model    = LM.LoadFromModelID("embeddinggemma-300m");
var embedder = new Embedder(model);

float[] vec = embedder.GetEmbeddings("Machine learning is fascinating.");
Console.WriteLine($"dimensions: {vec.Length} (matches embedder.EmbeddingSize = {embedder.EmbeddingSize})");
Available models

Embedding models in the catalog.

Text

embeddinggemma-300m

Compact 300M-parameter multilingual embedder. Runs comfortably on CPU. Default choice for laptop-grade workloads and air-gapped servers.

Text

nomic-embed-text

Strong text embedder; pair with nomic-embed-vision for cross-modal workflows. Same vector dimensionality, shared semantic space.

Multilingual

bge-m3

100+ languages in one model. Dense AND sparse output (so it can drive lexical-style retrieval too). The default choice for international corpora.

Text

qwen3-embedding:0.6b / 4b / 8b

Qwen3 embedding family across three sizes. Choose based on accuracy / memory trade-off; same API.

Vision

nomic-embed-vision

Image vectors aligned with nomic-embed-text. Same vector space, two modalities, one index. Powers vision-grounded retrieval and multimodal search.

Bring your own

Load any GGUF embedder

If a model is in GGUF format, LM.LoadFromFile handles it. The Embedder wrapper works the same way.
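Loading a custom GGUF embedder might look like this sketch; the file path is illustrative:

```csharp
using LMKit.Model;
using LMKit.Embeddings;

// Any GGUF embedding model works; the path here is a placeholder.
var model    = LM.LoadFromFile("models/my-embedder.Q8_0.gguf");
var embedder = new Embedder(model);

float[] vec = embedder.GetEmbeddings("same API, custom model");
Console.WriteLine($"dimensions: {embedder.EmbeddingSize}");
```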

LM-Kit.NET pillars

Seven pillars, one foundation.

The seven pillars of LM-Kit.NET, plus the local runtime they share. Embeddings is the pillar covered on this page.

The foundation

Every capability above runs on this runtime.

Foundation

Local Inference

The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR and classifiers, accelerated via AVX2 on CPU, CUDA 12/13, Vulkan, or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.

Explore the foundation

Vectors, on your hardware.

Start in 5 minutes · RAG & Knowledge hub