Retrieval-Augmented Generation

Your AI Keeps Making Things Up. Ground It in Real Data.

LLMs hallucinate. They confidently cite documents that don't exist and invent facts that sound plausible. LM-Kit's RAG engine solves this by grounding every response in your actual documents, with page-level citations you can verify.

Typical RAG stacks fall short in familiar ways:

  • PDFs with tables and complex layouts lose structure
  • Scanned documents are completely ignored
  • Cloud RAG services see your confidential data
  • No way to verify which page the answer came from

100% On-Device · OCR + VLM Support · Page-Level Citations

DocumentRag · Core

Multi-page processing with OCR and VLM-based document understanding. Preserves layout, tables, and structure.

PdfChat · Ready

Conversational Q&A over documents. Multi-turn dialogue with automatic context management and caching.

IVectorStore · Flexible

4 storage backends: in-memory, built-in DB, Qdrant, or custom. Switch without changing code.

0 Cloud Dependencies · 4 Vector Backends · 3 Processing Modes

Ground Your AI in Real Data

Traditional LLMs hallucinate. LM-Kit.NET's RAG engine grounds every response in your actual documents, databases, and knowledge bases. Semantic retrieval finds the most relevant passages, then generation synthesizes accurate, cited answers.

From simple text files to complex multi-page PDFs with tables, forms, and scanned content, LM-Kit.NET handles it all with intelligent document understanding, OCR, and vision-based parsing.

100% on-device processing. Your documents never leave your infrastructure. Meet GDPR, HIPAA, and data residency requirements by design.

DocumentRag.cs
// Document-centric RAG with full lifecycle management.
// Assumes embeddingModel and visionModel are previously loaded models,
// attachment is the document to import, matches holds the partitions
// retrieved for this query, and conversation is the active chat session.
var docRag = new DocumentRag(embeddingModel);

// Enable OCR for scanned documents
docRag.OcrEngine = new OcrEngine();

// Enable VLM for complex layouts
docRag.VisionParser = new VlmOcr(visionModel);

// Import with metadata for lifecycle tracking
var metadata = new DocumentMetadata(
    attachment, id: "report-2024-q4");
await docRag.ImportDocumentAsync(
    attachment, metadata, "reports");

// Query with source references
var result = await docRag.QueryPartitionsAsync(
    "What was Q4 revenue?", matches, conversation);

// Show page-level citations alongside the answer
foreach (var reference in result.SourceReferences)
    Console.WriteLine(
        $"Page {reference.PageNumber}");

DocumentRag: Beyond Simple Text Retrieval

Multi-page document processing with OCR, vision-based understanding, and complete document lifecycle management.

DocumentRag Class

Intelligent Document Processing

DocumentRag extends RagEngine with specialized handling for multi-page documents. It automatically extracts text page-by-page, handles mixed content types, and maintains document structure for accurate retrieval.

  • Multi-Page Processing

    Automatic page-by-page extraction with structure preservation. Handles PDFs, images, and multi-page formats seamlessly.

  • OCR Integration

    Built-in OCR engine extracts text from scanned documents and image-based pages. No external dependencies required.

  • Vision-Based Understanding

    VisionParser uses VLMs for advanced document understanding, preserving layout and structure as markdown for complex documents.

  • Document Lifecycle Management

    Import, update, and delete documents with explicit IDs. Track document versions and manage your knowledge base programmatically.
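
A minimal lifecycle sketch, building on the import call from the example above. ImportDocumentAsync, DocumentMetadata, and the explicit document ID appear earlier on this page; the re-import-to-update pattern and the RemoveDocumentAsync call are assumed API shapes used for illustration, so check the DocumentRag reference for the exact member names.

// Lifecycle sketch (update/delete member names are assumed, not confirmed)

// Import a document under a stable, explicit ID
var metadata = new DocumentMetadata(attachment, id: "handbook-v1");
await docRag.ImportDocumentAsync(attachment, metadata, "policies");

// Update: re-import the revised file under the same ID so its partitions
// replace the previous version (assumed behavior)
var revision = new DocumentMetadata(updatedAttachment, id: "handbook-v1");
await docRag.ImportDocumentAsync(updatedAttachment, revision, "policies");

// Delete: drop a document from the knowledge base by ID
// (hypothetical method name)
await docRag.RemoveDocumentAsync("handbook-v1", "policies");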

Source References

Grounded Answers with Citations

Every response includes source references with document names and page numbers. Build trust with your users by showing exactly where information comes from.

  • Page-Level Attribution

    Know exactly which page contains the source information. Enable users to verify and explore original documents.

  • Progress Events

    Monitor document import with real-time progress callbacks. Track page processing, embedding generation, and indexing status.

  • Metadata Filtering

    Attach custom metadata to documents and filter queries by category, date, author, or any custom attribute.
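
A hedged sketch of metadata filtering. DocumentMetadata, ImportDocumentAsync, and QueryPartitionsAsync appear in the example above; the custom-attribute dictionary and the filter argument below are hypothetical names used only to illustrate the pattern of tagging documents at import time and narrowing retrieval at query time.

// Metadata filtering sketch (attribute and filter names are assumed)
var meta = new DocumentMetadata(attachment, id: "memo-2024-03");
meta.CustomAttributes["department"] = "finance";   // hypothetical property
meta.CustomAttributes["year"] = "2024";
await docRag.ImportDocumentAsync(attachment, meta, "memos");

// Restrict retrieval to finance documents (hypothetical filter parameter)
var filtered = await docRag.QueryPartitionsAsync(
    "What did the March memo say about travel costs?",
    matches, conversation,
    filter: m => m.CustomAttributes["department"] == "finance");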

Three Intelligent Processing Modes

Choose the optimal strategy for your document types, or let Auto mode select the best approach per page.

Auto Mode

Intelligent per-page selection. Automatically chooses the best processing strategy based on content type and available engines.

  • Detects text vs. image-based pages
  • Falls back gracefully
  • Optimal quality/speed balance
  • Recommended for mixed documents

Text Extraction

Traditional text extraction with optional OCR for image-based pages. Fast and efficient for text-heavy documents.

  • Fastest processing speed
  • OCR for scanned content
  • Low resource usage
  • Best for simple layouts

Document Understanding

Vision language models for advanced parsing. Preserves layout, tables, and structure as markdown.

  • VLM-powered analysis
  • Layout preservation
  • Table structure extraction
  • Complex document handling
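
A configuration sketch covering the three modes above. The OcrEngine and VisionParser assignments come from the example earlier on this page; the ProcessingMode property and DocumentProcessingMode enum are illustrative names, not confirmed API.

// Processing-mode sketch (property and enum names are assumed)
var docRag = new DocumentRag(embeddingModel);
docRag.OcrEngine = new OcrEngine();              // needed for scanned pages
docRag.VisionParser = new VlmOcr(visionModel);   // needed for VLM parsing

// Auto: per-page choice between text extraction, OCR, and VLM parsing
docRag.ProcessingMode = DocumentProcessingMode.Auto;
// Or force fast text extraction for clean, text-only PDFs:
// docRag.ProcessingMode = DocumentProcessingMode.TextExtraction;
// Or force VLM parsing for complex, table-heavy layouts:
// docRag.ProcessingMode = DocumentProcessingMode.DocumentUnderstanding;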

PdfChat: Chat With Your Documents

A complete conversational interface for document question-answering. Multi-turn dialogue, automatic context management, and intelligent retrieval in one class.

Multi-Turn Conversation

Maintain context across questions. Follow-up queries understand conversation history for natural dialogue flow.

Smart Context Management

Small documents load in full for complete context. Large documents use passage retrieval to inject only relevant excerpts.

Document Caching

Vector store caching for fast subsequent queries. Load a document once, query it indefinitely.

Tool Calling Support

Register custom tools the model can invoke during conversation. Extend document Q&A with calculations, lookups, or external APIs.

Semantic Reranking

Optional reranker refines passage retrieval results for higher precision. Get the most relevant content every time.

Reasoning Control

Adjust reasoning depth for models that support extended thinking. Balance response quality with latency.

Agent Memory Integration

Connect to AgentMemory for RAG-backed persistent context that survives across conversation sessions.

Comprehensive Events

CacheAccessed, PassageRetrievalCompleted, ResponseGenerationStarted, and more. Full observability into the RAG pipeline.
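
A usage sketch for PdfChat. The class name and the PassageRetrievalCompleted event come from this page; the constructor argument, LoadDocument, and SubmitAsync are assumed member names standing in for the real API, so consult the PdfChat reference before copying this verbatim.

// PdfChat sketch (LoadDocument / SubmitAsync are assumed member names)
var pdfChat = new PdfChat(chatModel);            // chatModel: a loaded LLM

pdfChat.PassageRetrievalCompleted += (sender, e) =>
    Console.WriteLine("Passages retrieved for this turn.");

pdfChat.LoadDocument("annual-report.pdf");       // cached for later queries

// Multi-turn dialogue: follow-up questions reuse conversation history
var first  = await pdfChat.SubmitAsync("What was total revenue in 2024?");
var second = await pdfChat.SubmitAsync("How does that compare to 2023?");
Console.WriteLine(second);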

Four Flexible Storage Strategies

Choose the storage that fits your application's lifecycle. Switch between backends seamlessly via the IVectorStore interface.

In-Memory

Fast prototyping, live classification, and immediate feedback. Embeddings stored in RAM with optional serialization to disk.

  • Zero setup required
  • Instant feedback
  • Serializable to disk
Best for: Prototypes, testing

Built-in Vector DB

SQLite-backed vector storage with file-based persistence and zero external dependencies. Handles millions of vectors on standard hardware.

  • No infrastructure needed
  • Portable and shareable
  • Millions of vectors
Best for: Desktop, offline, air-gapped

Qdrant Integration

Enterprise-scale vector search. HNSW indexing, automatic sharding, and distributed deployment via open-source connector.

  • Billions of vectors
  • Cloud or local Docker
  • Sub-second search
Best for: Distributed, production

Custom via IVectorStore

Implement the IVectorStore interface to connect any proprietary database, internal API, or specialized storage system.

  • Full backend control
  • Custom storage logic
  • Future-proof architecture
Best for: Proprietary systems
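
A backend-swap sketch via IVectorStore. QdrantEmbeddingStore is the connector listed in the API section below; its constructor arguments and the way a store is handed to the engine are assumptions, so treat the signatures as illustrative rather than exact.

// Backend swap sketch (constructor and wiring are assumed)
IVectorStore store = new QdrantEmbeddingStore(
    new Uri("http://localhost:6334"), "documents");   // Qdrant endpoint + collection

// The same retrieval code runs against any IVectorStore implementation,
// including your own custom backend.
var rag = new RagEngine(embeddingModel, store);       // assumed overload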

Production-Ready RAG Features

Everything you need to build enterprise-grade retrieval systems.

Semantic Reranking

Cross-encoder rerankers refine initial retrieval results for significantly higher precision on complex queries.

Advanced Chunking

Markdown-aware, semantic, and layout-based chunking strategies. IChunking interface for custom implementations.

Multimodal RAG

Retrieve relevant content from both text and images. Image embeddings enable visual similarity search.

Metadata Filtering

Attach custom metadata to partitions. Filter queries by category, date range, author, or any attribute.

Agent Memory

RAG-backed persistent memory for conversational agents. Store and recall context across sessions.

Data Privacy

100% on-device processing. Documents never leave your infrastructure. GDPR and HIPAA compliant by design.

Async/Sync APIs

Every method available in both synchronous and asynchronous variants. Build responsive UIs or batch processes.

Streaming Responses

Real-time token streaming for responsive user experiences. The AfterTextCompletion event delivers incremental updates (see the sketch at the end of this section).

Custom Prompt Templates

Configure how retrieved context is presented to the model. Optimize prompts for your specific use case.
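
A streaming sketch built around the AfterTextCompletion event named in the card above. Which object exposes the event (the conversation, PdfChat, or the RAG engine) and the property that carries the new tokens (e.Text below) are assumptions to verify against the API reference.

// Streaming sketch (event owner and e.Text are assumed)
pdfChat.AfterTextCompletion += (sender, e) =>
    Console.Write(e.Text);                       // print each increment as it arrives

var answer = await pdfChat.SubmitAsync("Summarize the security policy.");
Console.WriteLine();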

Core RAG Classes

Comprehensive API documentation for building custom RAG pipelines.

DocumentRag
Document-centric RAG with OCR, VLM parsing, and lifecycle management.
View Documentation
PdfChat
Conversational Q&A over PDFs with multi-turn dialogue and caching.
View Documentation
RagEngine
Core retrieval-augmented generation engine with data source management.
View Documentation
DataSource
Content repository for text partitions with section organization.
View Documentation
Embedder
Generate text and image embeddings for similarity search.
View Documentation
TextChunking
Recursive text partitioning with configurable strategies.
View Documentation
IVectorStore
Interface for pluggable vector storage backends.
View Documentation
QdrantEmbeddingStore
Qdrant vector database integration via IVectorStore.
View Documentation
AgentMemory
RAG-backed persistent memory for conversational agents.
View Documentation

Get Started in Minutes

Clone working examples from our GitHub repository and customize for your use case.

Custom Chatbot with RAG

Build a knowledge-grounded chatbot using RagEngine with multiple data sources.

View Sample

RAG with Qdrant Vector Store

Enterprise-scale RAG using Qdrant for vector storage and search.

View Sample

PDF Chat Demo

Conversational Q&A over PDF documents with PdfChat class.

View Sample

Image Similarity Search

Multimodal RAG with image embeddings for visual content retrieval.

View Sample

Research Assistant

Multi-agent RAG workflow with web search, analysis, and synthesis.

View Sample

All Samples on GitHub

Browse the complete collection of RAG samples and examples.

View Repository

Build Context-Aware AI Today

Add retrieval-augmented generation to your .NET application with a single NuGet package. No cloud dependencies. No external services.