Document Intelligence for .NET, Local AI Document Processing Platform, LM-Kit

Understand & query

Turn documents into answers.

Conversational Q&A, retrieval-augmented generation, structured field extraction, classification, summarisation, and intelligent splitting. Six capabilities, one SDK, all on-device.

PdfChat

Document chat & Q&A

Load any document and ask questions in natural language. Semantic RAG with adaptive layout analysis and multi-turn memory, with every answer traced to its document, page, and passage.

Explore document chat

DocumentRag

Document RAG engine

The lower-level RAG engine. Explicit lifecycle management, pluggable vector stores (filesystem, Qdrant, custom), configurable chunking, progress events.

Explore document RAG

TextExtraction

Structured data extraction

Define a schema, feed in a document, get structured JSON. Dynamic Sampling and symbolic validation are designed to suppress hallucinations. Invoices, contracts, forms, IDs, handwriting.

Explore data extraction

DocumentSplitting

Intelligent document splitting

Detect logical document boundaries within multi-page PDFs. VLM-powered, template-free, with auto-labels and confidence scoring.

Explore document splitting

Categorization

Document classification

Zero-shot classification into 30+ predefined categories or custom labels. Confidence scoring, parallel batch throughput. Mailroom-scale.

Explore classification

Summarizer

Document summarisation

Recursive summarisation handles documents of any size. Three intents (executive, bullet, narrative), auto-title, vision-aware for scans.

Explore summarisation

Process & convert

The complete document toolkit.

Industry-grade OCR, universal Markdown conversion, format-to-format converters, full PDF manipulation, email archive parsing. The infrastructure beneath every document workflow, exposed as first-class .NET APIs and as agent tools.

Foundation

Layout understanding

Deterministic multi-layer pipeline. Connected-component analysis, paragraph detection, reading-order recovery, layout-aware search across six modes. The R&D foundation under conversion, extraction, RAG, classification.

Explore layout

Flagship

OCR

Native CPU-efficient engine plus VLM OCR (PaddleOCR-VL, GLM-OCR, LightOnOCR). 34+ languages. Tables, formulas, charts, seals, bounding boxes. SOTA benchmark accuracy, on-device, no per-page cost.

Explore OCR

ImageBuffer

Image processing

The pipeline behind every accurate OCR run. Deskew, smart binarize, despeckle, auto-crop, blank detection. Plus Canvas drawing API and image-similarity search.

Explore image processing

DocumentToMarkdown

Document to Markdown

Universal converter. PDF, DOCX, PPTX, XLSX, HTML, EML, MBOX, images. Three strategies (TextExtraction, VlmOcr, Hybrid) auto-pick per page. LLM-ready output.

Explore Markdown conversion

15+ converters

Document conversion

Markdown to PDF / DOCX / HTML, HTML to Markdown, EML to PDF, image to PDF, PDF to image, MBOX to Markdown. Bidirectional. Pure .NET.

Explore conversion

PdfDocument

PDF toolkit

Merge, split, render, search, highlight search hits, generate searchable PDFs from scans, unlock encrypted files, extract text and images, inspect metadata.

Explore PDF toolkit

PdfAConverter

PDF/A conversion

Convert existing PDFs to archival PDF/A-1b / 2b / 3b (ISO 19005). Fonts embedded and reconciled, colours calibrated, prohibited constructs removed, XMP rebuilt. 99.5%+ veraPDF conformance.

Explore PDF/A conversion

EmlDocument

Email processing

Parse EML, MBOX, ICS. Headers, bodies, attachments, calendar events. RAG over inboxes, compliance archives, auto-triage.

Explore email processing

Adaptive processing

Three AI engines, one API.

Every page in every document is different. A digital PDF has clean text layers. A scanned invoice needs OCR. A complex form with tables and columns needs visual understanding. LM-Kit.NET's adaptive engine analyzes each page individually and selects the optimal extraction strategy automatically.

This content-aware approach means you never have to classify documents upfront or write format-specific code. One API call handles a 500-page batch containing digital contracts, scanned receipts, and image-heavy reports.

Built by IDP pioneers: This isn't a wrapper around generic RAG. It's purpose-built document intelligence from a team with 20+ years of experience processing billions of documents in production worldwide.

Recommended · Auto

`PageProcessingMode.Auto`

Analyzes each page and automatically selects the best strategy. Uses VLM for image-heavy pages, direct text extraction for digital content, OCR for scanned text. Zero configuration required.

Mode

`TextExtraction`

Extracts text directly from PDF structure with OCR fallback for scanned pages. Fastest processing, lowest resource usage. Ideal for clean digital documents.

Mode

`DocumentUnderstanding`

Vision Language Models analyze pages visually to understand layout, structure, tables, and relationships. Outputs structured markdown. Best for complex layouts, forms, and multi-column content.
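The per-page decision described above can be pictured with a small sketch. It is not LM-Kit.NET's actual heuristic, which this page does not describe: the `PageStats` record, the `Strategy` enum, the `PageRouter` helper, and every threshold are hypothetical, chosen only to show each page being routed independently.

```csharp
using System;
using System.Collections.Generic;

// Illustrative batch: a digital page, a full-page scan, and a chart-heavy page.
var pages = new List<PageStats>
{
    new PageStats(1, 2400, 0.05),
    new PageStats(2, 0, 1.00),
    new PageStats(3, 300, 0.70),
};

foreach (var page in pages)
    Console.WriteLine($"page {page.Number}: {PageRouter.Choose(page)}");

// Hypothetical per-page measurements; these are not LM-Kit.NET types.
public record PageStats(int Number, int TextLayerChars, double ImageCoverage);

public enum Strategy { DirectText, Ocr, Vision }

public static class PageRouter
{
    // Assumed thresholds, for illustration only.
    private const int MinTextLayerChars = 50;
    private const double ImageHeavy = 0.5;
    private const double FullPageScan = 0.9;

    public static Strategy Choose(PageStats p)
    {
        bool hasTextLayer = p.TextLayerChars >= MinTextLayerChars;

        // Clean digital content: read the embedded text layer directly.
        if (hasTextLayer && p.ImageCoverage < ImageHeavy)
            return Strategy.DirectText;

        // No text layer and one image covering the page: treat as a scan.
        if (!hasTextLayer && p.ImageCoverage >= FullPageScan)
            return Strategy.Ocr;

        // Everything else (mixed or image-heavy layouts) goes to the vision model.
        return Strategy.Vision;
    }
}
```

With these assumed thresholds the three sample pages are routed to `DirectText`, `Ocr`, and `Vision` respectively. In the SDK itself the equivalent choice is made for you when `PageProcessingMode.Auto` is selected.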

Quickstart

Up and running in minutes.

LM-Kit.NET is a single NuGet package. No microservices, no Docker, no API keys. Load a model, point at a document, and start extracting intelligence.

1. Install the NuGet package and load your preferred AI models (chat, embedding, vision).
2. Create a PdfChat instance and feed it any document: PDF, Word, Excel, images, HTML.
3. Ask questions in natural language and get grounded answers with source attribution.

The same models power every capability on this page. Switch from document Q&A to data extraction to document splitting by changing one class.

DocumentIntelligence.cs

using LMKit.Data;        // assumed namespace for Attachment
using LMKit.Extraction;
using LMKit.Model;
using LMKit.Retrieval;

// Load a chat model and an embedding model by model ID
var chat  = LM.LoadFromModelID("gemma4:e4b");
var embed = LM.LoadFromModelID("embeddinggemma-300m");

// Document Q&A: index the document, then ask a question
using var pdfChat = new PdfChat(chat, embed);
await pdfChat.LoadDocumentAsync("report.pdf");
var answer = await pdfChat.SubmitAsync(
    "What were the Q4 results?");

// Structured extraction
// (define the target fields on the extractor before parsing; schema omitted here)
var extractor = new TextExtraction(chat);
extractor.SetContent(new Attachment("invoice.pdf"));
var result = extractor.Parse();

// Document splitting with a vision-capable model
var vision = LM.LoadFromModelID("qwen3.5:4b");
var splitter = new DocumentSplitting(vision);
var segments = splitter.Split(
    new Attachment("batch_scan.pdf"));

Platform capabilities

Enterprise-grade document processing.

Built for production workloads that demand accuracy, traceability, and compliance.

Capability

Layout analysis engine

Deep document structure understanding: columns, paragraphs, lines, text regions, reading order. Purpose-built algorithms for real-world document layouts.

Capability

Source attribution

Every answer and extracted value is traced to its source document, page number, and passage. Full audit trail for compliance and verification.

Capability

Intelligent caching

Processed document embeddings are cached via IVectorStore, so subsequent loads reuse them instead of recomputing. Supports filesystem, Qdrant, and custom backends.
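The idea behind such a cache can be sketched in a few lines. This is an illustration under stated assumptions, not the `IVectorStore` interface: the `EmbeddingCache` class, its in-memory dictionary, and the stand-in embedding function are all hypothetical. It keys each vector by a SHA-256 hash of the content, so identical content is embedded only once.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

int embedCalls = 0;

// Stand-in for a real embedding model: counts calls, returns a dummy vector.
float[] FakeEmbed(string text)
{
    embedCalls++;
    return new[] { (float)text.Length };
}

var cache = new EmbeddingCache(FakeEmbed);
cache.GetOrCompute("Quarterly report, page 1");
cache.GetOrCompute("Quarterly report, page 1"); // second load is served from the cache
Console.WriteLine($"embedding calls: {embedCalls}");

// Hypothetical cache keyed by content hash; not an LM-Kit.NET type.
public sealed class EmbeddingCache
{
    private readonly Dictionary<string, float[]> _store = new();
    private readonly Func<string, float[]> _embed;

    public EmbeddingCache(Func<string, float[]> embed) => _embed = embed;

    public float[] GetOrCompute(string content)
    {
        // Identical content always hashes to the same key.
        string key = Convert.ToHexString(
            SHA256.HashData(Encoding.UTF8.GetBytes(content)));

        if (!_store.TryGetValue(key, out var vector))
        {
            vector = _embed(content);   // cache miss: compute once
            _store[key] = vector;
        }
        return vector;
    }
}
```

A persistent backend would replace the dictionary with a store on disk or in a vector database; the lookup-before-compute pattern stays the same.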

Capability

Vision Language Models

VlmOcr uses multimodal AI to transcribe pages as structured markdown. Understands tables, forms, multi-column layouts, and handwritten notes visually.

Capability

100% on-device

All processing runs on your infrastructure, and documents never leave it. Suited to air-gapped deployments and to HIPAA, GDPR, and SOC 2 compliance programs.

Capability

Neuro-symbolic validation

Dynamic Sampling combined with symbolic validation layers is designed to suppress LLM hallucinations. Confidence scores on every extraction support production-grade reliability.
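The symbolic side of this can be pictured as deterministic rules applied to the model's output. The sketch below is an assumption about what one such rule might look like, not LM-Kit.NET's validation layer: the `LineItem` and `InvoiceDraft` records and the `InvoiceRules.Validate` helper are hypothetical. It checks that an extracted invoice total equals the sum of its line items plus tax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Fields as a model might have extracted them from an invoice.
var draft = new InvoiceDraft(
    new List<LineItem>
    {
        new LineItem("Widget", 2, 19.99m),
        new LineItem("Shipping", 1, 5.00m),
    },
    Tax: 4.50m,
    Total: 49.48m);

var problems = InvoiceRules.Validate(draft);
Console.WriteLine(problems.Count == 0 ? "valid" : string.Join("; ", problems));

// Hypothetical types for illustration; not LM-Kit.NET types.
public record LineItem(string Description, int Quantity, decimal UnitPrice);
public record InvoiceDraft(IReadOnlyList<LineItem> Items, decimal Tax, decimal Total);

public static class InvoiceRules
{
    // Returns a list of rule violations; an empty list means the draft passed.
    public static List<string> Validate(InvoiceDraft d)
    {
        var problems = new List<string>();

        if (d.Items.Count == 0)
            problems.Add("no line items");

        if (d.Items.Any(i => i.Quantity <= 0 || i.UnitPrice < 0))
            problems.Add("line item with non-positive quantity or negative price");

        // Arithmetic consistency: items plus tax must match the stated total.
        decimal expected = d.Items.Sum(i => i.Quantity * i.UnitPrice) + d.Tax;
        if (Math.Abs(expected - d.Total) > 0.01m)
            problems.Add("total does not match line items plus tax");

        return problems;
    }
}
```

The sample draft passes and prints `valid`. A rule like this cannot be satisfied by a plausible-looking but invented number, which is why arithmetic and format checks complement model confidence scores.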

File format support

Process any document format.

Native support for the most common document types in enterprise workflows.

.pdf

PDF documents

.docx

Word documents

.xlsx

Excel spreadsheets

.pptx

PowerPoint slides

.html

HTML pages

.md

Markdown files

.png .jpg

Images & scans

.txt

Plain text

Use cases

Built for real-world document workflows.

From mailroom automation to compliance audits, LM-Kit.NET handles the document intelligence that matters.

Use case

Invoice & receipt processing

Extract vendor, amounts, line items, tax, and payment terms from any invoice format. Schema-driven extraction with zero hallucinations.

Use case

Contract analysis

Query legal agreements for clauses, obligations, termination conditions, and payment terms. Multi-document comparison with full source attribution.

Use case

Compliance & audit

Verify regulatory compliance across document collections. Traceable source references create audit trails for HIPAA, GDPR, and SOC 2.

Use case

Mailroom automation

Split multi-document scans into individual files, classify each automatically, and route to the correct workflow. No templates needed.

Use case

Knowledge base & research

Build searchable knowledge bases from technical manuals, research papers, and specifications. Semantic search across thousands of documents.

Use case

Customer support automation

Ingest product documentation and answer customer questions automatically. Grounded responses ensure accuracy with zero fabrication.

Ready-to-run demos

See document intelligence in action.

Every capability ships with a complete, runnable console application. Download, build, and explore. Full source code on GitHub.

Featured demo

Chat with PDF

Interactive console application for conversational Q&A over PDF documents. Load one or more files, choose between standard and vision processing modes, and ask questions with real-time streaming, source references, and generation stats.

Model selection with automatic download
Standard text extraction or VLM-based understanding
Multi-document loading with embedding cache
Interactive commands: /help, /status, /add, /clear

Sample guide · GitHub source

Featured demo

Invoice data extraction

Extract structured fields from invoice documents (PDF and images) using vision language models. Outputs vendor details, line items, totals, tax, and payment terms as clean JSON. Includes sample invoices and a customizable extraction schema.

Schema-driven extraction with JSON output
PDF and image support (PNG, JPG, TIFF)
Optional OCR with auto language detection
Sample invoices included for immediate testing

Sample guide · GitHub source

Document splitting

Detect logical boundaries in multi-page PDFs and split them into separate files. Vision-based analysis with labels and confidence scores.

Guide & GitHub

Structured data extraction

Define custom schemas and extract typed fields from text documents. Supports invoices, job offers, medical records, and more.

Guide & GitHub

Document to Markdown

Convert PDFs, images, and scans to structured Markdown using VLMs. Preserves tables, formatting, and document structure.

Guide & GitHub

Document processing agent

An AI agent driven by natural language, with 9 built-in tools: PDF split, merge, render, inspect, OCR, deskew, crop, resize, and text extraction.

Guide & GitHub

Document summarizer

Generate titles and concise summaries from PDFs, images, and text files. Customizable summary length and style guidance.

Guide & GitHub

Language detection from document

Detect the language of PDF and image documents using VLMs. Multilingual support with fast processing and performance metrics.

Guide & GitHub

Explore all LM-Kit.NET samples

40+ console demos covering agents, chat, classification, extraction, embeddings, RAG, speech, vision, and more.

Samples overview · All samples on GitHub

API reference

Core classes for document intelligence.

The building blocks for every document workflow in your .NET application.

Class

`PdfChat`

High-level conversational document agent. Load documents, ask questions, get grounded answers with source references. Supports tool calling and MCP.

View docs

Class

`DocumentRag`

Lower-level RAG engine with full control over processing modes, chunking, vector stores, and document lifecycle management.

View docs

Class

`TextExtraction`

Schema-driven structured data extraction from text, images, PDFs, and Office documents. JSON output with typed fields.

View docs

Class

`DocumentSplitting`

VLM-powered boundary detection within multi-page PDFs. Returns page ranges, labels, and confidence scores.

View docs

Class

`VlmOcr`

Vision-based document parser using multimodal LMs. Transcribes pages as structured markdown preserving layout and structure.

View docs

Interface

`IVectorStore`

Interface for embedding storage and caching. Built-in filesystem backend. Pluggable: Qdrant, PostgreSQL, or custom implementations.

View docs

Demos & docs

Build it. Read it. Try it.

Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.

Demo

Chat with PDF

Console demo: drop a PDF, ask questions, get cited answers.

Open on GitHub →

Demo

Invoice data extraction

Console demo: schema-driven extraction from invoice scans.

Open on GitHub →

Demo

Document to Markdown

Console demo: convert PDFs, DOCX, HTML to clean Markdown.

Open on GitHub →

How-to guide

Build a private document Q&A

End-to-end how-to: load, index, chat with citations.

Read the guide →

LM-Kit.NET pillars

Seven pillars, one foundation.

The seven pillars of LM-Kit.NET, plus the local runtime they share. The card marked "You are here" is the current page.

01 · AI Agents

Orchestration patterns

ReAct planning, supervisors, parallel and pipeline orchestrators, persistent memory, MCP clients, custom tools.

AI Agents

02 · Document Intelligence

Parse PDFs, images, EML

PDF text and table extraction, on-device OCR reaching SOTA benchmark scores, structured field extraction with grammar-constrained generation.

You are here

03 · Vision & Multimodal

VLMs, image classification, chat with image

Image understanding, classification, labeling, multimodal chat, image embeddings, VLM-OCR, background removal. Same conversation surface as LLMs.

Vision & Multimodal

04 · RAG & Knowledge

Vector search and retrieval

Built-in vector store, Qdrant and pgvector connectors, embeddings, hybrid retrieval, document chunking, source citations.

RAG & Knowledge

05 · Text Analysis

Classification, NER, PII, sentiment

Built-in classifiers and an extractor that emits typed C# objects via grammar-constrained sampling. Sentiment, keywords, language detection.

Text Analysis

06 · Speech & Audio

Audio transcription, STT

A growing local speech-to-text stack: hallucination suppression, Voice Activity Detection, real-time translation, streaming output, 100+ languages.

Speech & Audio

07 · Text Generation

Conversations, rewriting, summaries

Single-turn, multi-turn, and stateless conversation primitives. Translate, correct, rewrite, summarise. Prompt templates, streaming, grammar-constrained outputs.

Text Generation

The foundation

Every capability above runs on this runtime.

Foundation

Local Inference

The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR and classifiers, accelerated on CPU, AVX2, CUDA 12/13, Vulkan or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.

Explore the foundation

Install the SDK

Ready to build document intelligence?

The most advanced local document processing platform for .NET. From chat to extraction to splitting. 100% on your infrastructure.

Download free · View pricing

The complete local document intelligence platform.
