LLMs hallucinate. They confidently cite documents that don't exist and invent facts that sound plausible. LM-Kit's RAG engine solves this by grounding every response in your actual documents, with page-level citations you can verify.
DocumentRag: Multi-page processing with OCR and VLM-based document understanding. Preserves layout, tables, and structure.
PdfChat: Conversational Q&A over documents. Multi-turn dialogue with automatic context management and caching.
RagChat: Multi-turn conversational Q&A over custom knowledge bases. 4 query generation modes, tool calling, and agent memory.
IVectorStore: 4 storage backends: in-memory, built-in DB, Qdrant, or custom. Switch without changing code.
Traditional LLMs hallucinate. LM-Kit.NET's RAG engine grounds every response in your actual documents, databases, and knowledge bases. Semantic retrieval finds the most relevant passages, then generation synthesizes accurate, cited answers.
From simple text files to complex multi-page PDFs with tables, forms, and scanned
content, LM-Kit.NET handles it all with intelligent document understanding, OCR,
and vision-based parsing. Need multi-turn conversations? RagChat
adds conversational Q&A with four query generation modes over any knowledge base.
100% on-device processing. Your documents never leave your infrastructure. Meet GDPR, HIPAA, and data residency requirements by design.
// Document-centric RAG with full lifecycle management
var docRag = new DocumentRag(embeddingModel);

// Enable OCR for scanned documents
docRag.OcrEngine = new OcrEngine();

// Enable VLM for complex layouts
docRag.VisionParser = new VlmOcr(visionModel);

// Import with metadata for lifecycle tracking
var metadata = new DocumentMetadata(attachment, id: "report-2024-q4");
await docRag.ImportDocumentAsync(attachment, metadata, "reports");

// Query with source references
var result = await docRag.QueryPartitionsAsync(
    "What was Q4 revenue?", matches, conversation);

foreach (var reference in result.SourceReferences)
    Console.WriteLine($"Page {reference.PageNumber}");
DocumentRag: beyond simple text retrieval. Multi-page document processing with OCR, vision-based understanding, and complete document lifecycle management.
DocumentRag Class
DocumentRag extends RagEngine with specialized handling
for multi-page documents. It automatically extracts text page-by-page, handles
mixed content types, and maintains document structure for accurate retrieval.
Multi-page
Automatic page-by-page extraction with structure preservation. Handles PDFs, images, and multi-page formats seamlessly.
OCR
Built-in OCR engine extracts text from scanned documents and image-based pages. No external dependencies required.
VLM
VisionParser uses VLMs for advanced document understanding, preserving layout and structure as markdown for complex documents.
Lifecycle
Import, update, and delete documents with explicit IDs. Track document versions and manage your knowledge base programmatically.
Every response includes source references with document names and page numbers. Build trust with your users by showing exactly where information comes from.
Page-level
Know exactly which page contains the source information. Enable users to verify and explore original documents.
Events
Monitor document import with real-time progress callbacks. Track page processing, embedding generation, and indexing status.
Filtering
Attach custom metadata to documents and filter queries by category, date, author, or any custom attribute.
Choose the optimal strategy for your document types, or let Auto mode select the best approach per page.
Recommended
Intelligent per-page selection. Automatically chooses the best processing strategy based on content type and available engines.
Fast
Traditional text extraction with optional OCR for image-based pages. Fast and efficient for text-heavy documents.
VLM
Vision language models for advanced parsing. Preserves layout, tables, and structure as markdown.
PdfChat: chat with your documents. A complete conversational interface for document question-answering. Multi-turn dialogue, automatic context management, and intelligent retrieval in one class.
Multi-turn
Maintain context across questions. Follow-up queries understand conversation history for natural dialogue flow.
Context
Small documents load in full for complete context. Large documents use passage retrieval to inject only relevant excerpts.
Cache
Vector store caching for fast subsequent queries. Load a document once, query it indefinitely.
Tools
Register custom tools the model can invoke during conversation. Extend document Q&A with calculations, lookups, or external APIs.
Reranking
Optional reranker refines passage retrieval results for higher precision. Get the most relevant content every time.
Reasoning
Adjust reasoning depth for models that support extended thinking. Balance response quality with latency.
Memory
Connect to AgentMemory for RAG-backed persistent context that survives across conversation sessions.
Events
CacheAccessed, PassageRetrievalCompleted, ResponseGenerationStarted, and more. Full observability into the RAG pipeline.
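Taken together, the cards above imply a usage pattern roughly like the following. This is a hedged sketch, not a verified API: the constructor argument and the LoadDocument/Submit member names are assumptions made for illustration; only the AfterTextCompletion event is actually named on this page.

```csharp
// SKETCH ONLY: member names below (PdfChat(model), LoadDocument, Submit)
// are illustrative assumptions, not confirmed LM-Kit.NET signatures.
var chat = new PdfChat(model);

// Small documents load in full; large ones fall back to passage retrieval.
chat.LoadDocument("annual-report.pdf");

// Stream tokens as they are generated (event name taken from this page).
chat.AfterTextCompletion += (sender, e) => Console.Write(e.Text);

var answer = chat.Submit("What was Q4 revenue?");
var followUp = chat.Submit("How does that compare to Q3?"); // history retained
```

The second question relies on the multi-turn behavior described above: the class resolves "that" from conversation history rather than treating each query in isolation.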
RagChat: multi-turn Q&A over any knowledge base.
Turn any pre-populated RagEngine into a conversational interface
with automatic query contextualization, tool calling, and agent memory.
Mode 01
Send the user's question directly to retrieval. Zero overhead, fastest path.
Mode 02
Rewrites follow-up questions into self-contained queries using conversation history.
Mode 03
Generates multiple query variants and fuses results via Reciprocal Rank Fusion.
Mode 04
Generates a hypothetical answer first, then retrieves passages similar to that answer. Bridges the vocabulary gap between questions and documents.
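Mode 03's Reciprocal Rank Fusion is a standard, library-independent fusion rule: every passage earns a score of 1/(k + rank) from each variant result list it appears in, and the summed scores decide the final order. A minimal self-contained sketch (the constant k = 60 and the string-keyed lists are common illustrative choices, not LM-Kit.NET internals):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Rrf
{
    // Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank(d)),
    // where rank is 1-based within each list.
    public static List<string> Fuse(IEnumerable<IList<string>> rankedLists, int k = 60)
    {
        var scores = new Dictionary<string, double>();
        foreach (var list in rankedLists)
            for (int rank = 0; rank < list.Count; rank++)
                scores[list[rank]] = scores.GetValueOrDefault(list[rank])
                                   + 1.0 / (k + rank + 1);
        return scores.OrderByDescending(p => p.Value).Select(p => p.Key).ToList();
    }
}
```

The effect: a passage that appears near the top of several query variants outranks one that tops only a single variant, which is why multi-query fusion is more robust than any one phrasing of the question.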
Choose the storage that fits your application's lifecycle. Switch between
backends seamlessly via the IVectorStore interface.
In-memory
Fast prototyping, live classification, and immediate feedback. Embeddings stored in RAM with optional serialization to disk.
Built-in
SQLite for vectors. File-based persistence with zero external dependencies. Handles millions of vectors on standard hardware.
Qdrant
Enterprise-scale vector search. HNSW indexing, automatic sharding, and distributed deployment via open-source connector.
Custom
Implement the IVectorStore interface to connect any proprietary database, internal API, or specialized storage system.
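The real IVectorStore contract ships in the LM-Kit.NET package; the interface below is a stand-in invented for illustration. What it shows is the shape of the work any custom backend must do: persist vectors under stable IDs and answer nearest-neighbor queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// ILLUSTRATIVE ONLY: this is NOT the actual IVectorStore interface,
// just a minimal stand-in with the same kind of responsibilities.
interface ISimpleVectorStore
{
    void Upsert(string id, float[] vector);
    IEnumerable<(string Id, double Score)> Search(float[] query, int topK);
}

class InMemoryStore : ISimpleVectorStore
{
    private readonly Dictionary<string, float[]> _vectors = new();

    public void Upsert(string id, float[] vector) => _vectors[id] = vector;

    // Brute-force cosine-similarity scan; a production backend would
    // delegate this to an index such as HNSW (as Qdrant does).
    public IEnumerable<(string Id, double Score)> Search(float[] query, int topK) =>
        _vectors.Select(p => (p.Key, Cosine(query, p.Value)))
                .OrderByDescending(t => t.Item2)
                .Take(topK);

    private static double Cosine(float[] a, float[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na  += a[i] * a[i];
            nb  += b[i] * b[i];
        }
        return dot / (Math.Sqrt(na) * Math.Sqrt(nb) + 1e-12);
    }
}
```

Because retrieval code talks only to the interface, swapping this toy store for the built-in SQLite backend or the Qdrant connector changes configuration, not query logic.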
Everything you need to build enterprise-grade retrieval systems.
Reranking
Cross-encoder rerankers refine initial retrieval results for significantly higher precision.
Chunking
Markdown-aware, semantic, and layout-based chunking strategies. IChunking interface for custom implementations.
Multimodal
Retrieve relevant content from both text and images. Image embeddings enable visual similarity search.
Filtering
Attach custom metadata to partitions. Filter queries by category, date range, author, or any attribute.
Memory
RAG-backed persistent memory for conversational agents. Store and recall context across sessions.
Privacy
100% on-device processing. Documents never leave your infrastructure. GDPR, HIPAA compliant by design.
APIs
Every method available in both synchronous and asynchronous variants. Build responsive UIs or batch processes.
Streaming
Real-time token streaming for responsive user experiences. AfterTextCompletion event for incremental updates.
Templates
Configure how retrieved context is presented to the model. Optimize prompts for your specific use case.
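The AfterTextCompletion event named above is what makes streaming UIs possible: subscribe before querying and render each fragment as it arrives. A hedged sketch, with the caveat that the event-args member (e.Text) and the QueryPartitionsAsync call shape are assumptions modeled on the sample earlier on this page:

```csharp
// SKETCH: AfterTextCompletion is named on this page; e.Text and the
// exact query signature are assumptions for illustration.
ragEngine.AfterTextCompletion += (sender, e) =>
{
    Console.Write(e.Text); // render each token incrementally
};

var result = await ragEngine.QueryPartitionsAsync(
    "Summarize the key findings.", matches, conversation);
```

For batch pipelines the same call works without the subscription; the synchronous variants mentioned above fit non-interactive workloads.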
Comprehensive API documentation for building custom RAG pipelines.
DocumentRag: Document-centric RAG with OCR, VLM parsing, and lifecycle management.
PdfChat: Conversational Q&A over PDFs with multi-turn dialogue and caching.
RagChat: Multi-turn conversational Q&A over custom knowledge bases with 4 query modes.
RagEngine: Core retrieval-augmented generation engine with data source management.
DataSource: Content repository for text partitions with section organization.
Embedder: Generate text and image embeddings for similarity search.
TextChunking: Recursive text partitioning with configurable strategies.
IVectorStore: Interface for pluggable vector storage backends.
QdrantEmbeddingStore: Qdrant vector database integration via IVectorStore.
AgentMemory: RAG-backed persistent memory for conversational agents.
Clone working examples from our GitHub repository and customize for your use case.
Multi-turn RAG with RagChat, query contextualization, and streaming responses.
Build a knowledge-grounded Q&A system using RagEngine with file-based persistence.
Enterprise-scale RAG using Qdrant for vector storage and search.
Production-grade RAG system with category-scoped search and markdown-aware chunking.
Compare Vector, BM25, and Hybrid search with reranking and MMR diversity filtering.
Conversational Q&A over PDF documents with PdfChat class.
Multimodal RAG with image embeddings for visual content retrieval.
All RAG samples on GitHub
Browse the complete collection of RAG samples and examples.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.
Console demo: ingest, embed, retrieve, ground answers with citations.
Open on GitHub →
How-to guide: End-to-end how-to for production RAG in .NET.
Read the guide →
How-to guide: Hybrid retrieval plus rerank for precision-critical workloads.
Read the guide →
How-to guide: Dense plus sparse plus lexical signals in one query.
Read the guide →
The seven pillars of LM-Kit.NET, plus the local runtime they share. Highlighted card is where you are now.
01 · AI Agents
ReAct planning, supervisors, parallel and pipeline orchestrators, persistent memory, MCP clients, custom tools.
AI Agents
02 · Document Intelligence
PDF text and table extraction, on-device OCR reaching SOTA benchmark scores, structured field extraction with grammar-constrained generation.
Document Intelligence
03 · Vision & Multimodal
Image understanding, classification, labeling, multimodal chat, image embeddings, VLM-OCR, background removal. Same conversation surface as LLMs.
Vision & Multimodal
04 · RAG & Knowledge
Built-in vector store, Qdrant connector, embeddings, hybrid retrieval, document chunking, source citations.
You are here
05 · Text Analysis
Built-in classifiers and an extractor that emits typed C# objects via grammar-constrained sampling. Sentiment, keywords, language detection.
Text Analysis
06 · Speech & Audio
A growing local speech-to-text stack: hallucination suppression, Voice Activity Detection, real-time translation, streaming output, 100+ languages.
Speech & Audio
07 · Text Generation
Single-turn, multi-turn, and stateless conversation primitives. Translate, correct, rewrite, summarize. Prompt templates, streaming, grammar-constrained outputs.
Text Generation
The foundation
Every capability above runs on this runtime.
Foundation
The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR, and classifiers, accelerated via AVX2 on CPU, CUDA 12/13, Vulkan, or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.
Add retrieval-augmented generation to your .NET application with a single NuGet package. No cloud dependencies. No external services.