- Ingest any document: PDF, DOCX, XLSX, PPTX, HTML, Markdown, images
- Adaptive analysis: auto-selects OCR, VLM, or text extraction per page
- Structured output: chat answers, extracted fields, document segments, RAG results
The complete local document intelligence platform.
Turn any document into structured, searchable, actionable data. Chat with PDFs, extract fields from invoices, split multi-document scans, and build RAG pipelines. All powered by on-device AI with adaptive processing that combines OCR, Vision Language Models, and layout analysis. Zero cloud dependency.
Turn documents into answers.
Conversational Q&A, retrieval-augmented generation, structured field extraction, classification, summarisation, and intelligent splitting. Six capabilities, one SDK, all on-device.
PdfChat
Document chat & Q&A
Load any document and ask questions in natural language. Semantic RAG with adaptive layout analysis, multi-turn memory, every answer traced to document, page and passage.
Explore document chat
DocumentRag
Document RAG engine
The lower-level RAG engine. Explicit lifecycle management, pluggable vector stores (filesystem, Qdrant, custom), configurable chunking, progress events.
Explore document RAG
TextExtraction
Structured data extraction
Define a schema, feed in a document, get structured JSON. Dynamic Sampling and symbolic validation eliminate hallucinations. Invoices, contracts, forms, IDs, handwriting.
Explore data extraction
DocumentSplitter
Intelligent document splitting
Detect logical document boundaries within multi-page PDFs. VLM-powered, template-free, with auto-labels and confidence scoring.
Explore document splitting
Categorization
Document classification
Zero-shot classification into 30+ predefined categories or custom labels. Confidence scoring, parallel batch throughput. Mailroom-scale.
Explore classification
Summarizer
Document summarisation
Recursive summarisation handles documents of any size. Three intents (executive, bullet, narrative), auto-title, vision-aware for scans.
Explore summarisation
The complete document toolkit.
Industry-grade OCR, universal Markdown conversion, format-to-format converters, full PDF manipulation, email archive parsing. The infrastructure beneath every document workflow, exposed as first-class .NET APIs and as agent tools.
Foundation
Layout understanding
Deterministic multi-layer pipeline. Connected-component analysis, paragraph detection, reading-order recovery, layout-aware search across six modes. The R&D foundation under conversion, extraction, RAG, classification.
Explore layout
Flagship
OCR
Native CPU-efficient engine plus VLM OCR (PaddleOCR-VL, GLM-OCR, LightOnOCR). 34+ languages. Tables, formulas, charts, seals, bounding boxes. SOTA benchmark accuracy, on-device, no per-page cost.
Explore OCR
ImageBuffer
Image processing
The pipeline behind every accurate OCR run. Deskew, smart binarize, despeckle, auto-crop, blank detection. Plus Canvas drawing API and image-similarity search.
DocumentToMarkdown
Document to Markdown
Universal converter. PDF, DOCX, PPTX, XLSX, HTML, EML, MBOX, images. Three strategies (TextExtraction, VlmOcr, Hybrid) auto-pick per page. LLM-ready output.
Explore Markdown conversion
15+ converters
Document conversion
Markdown to PDF / DOCX / HTML, HTML to Markdown, EML to PDF, image to PDF, PDF to image, MBOX to Markdown. Bidirectional. Pure .NET.
Explore conversion
PdfDocument
PDF toolkit
Merge, split, render, search, search-highlight, generate searchable PDFs from scans, unlock encrypted, extract text and images, inspect metadata.
Explore PDF toolkit
EmlDocument
Email processing
Parse EML, MBOX, ICS. Headers, bodies, attachments, calendar events. RAG over inboxes, compliance archives, auto-triage.
Explore email processing
Three AI engines, one API.
Every page in every document is different. A digital PDF has clean text layers. A scanned invoice needs OCR. A complex form with tables and columns needs visual understanding. LM-Kit.NET's adaptive engine analyzes each page individually and selects the optimal extraction strategy automatically.
This content-aware approach means you never have to classify documents upfront or write format-specific code. One API call handles a 500-page batch containing digital contracts, scanned receipts, and image-heavy reports.
Built by IDP pioneers: This isn't a wrapper around generic RAG. It's purpose-built document intelligence from a team with 20+ years of experience processing billions of documents in production worldwide.
PageProcessingMode.Auto
Analyzes each page and automatically selects the best strategy. Uses VLM for image-heavy pages, direct text extraction for digital content, OCR for scanned text. Zero configuration required.
TextExtraction
Extracts text directly from PDF structure with OCR fallback for scanned pages. Fastest processing, lowest resource usage. Ideal for clean digital documents.
DocumentUnderstanding
Vision Language Models analyze pages visually to understand layout, structure, tables, and relationships. Outputs structured markdown. Best for complex layouts, forms, and multi-column content.
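The per-page selection described above can be pictured with a small, language-neutral sketch. This is not LM-Kit.NET code and does not reflect how the SDK decides internally: `PageStats`, `choose_strategy`, and both thresholds are hypothetical names and values chosen only to illustrate the idea of routing each page by its content.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    TEXT_EXTRACTION = "text_extraction"
    OCR = "ocr"
    VLM = "vlm"


@dataclass
class PageStats:
    """Hypothetical per-page measurements; not an LM-Kit.NET type."""
    text_layer_chars: int    # characters found in the embedded text layer
    image_area_ratio: float  # fraction of the page covered by images (0.0 to 1.0)
    complex_layout: bool     # tables, forms, or multiple columns detected


# Assumed thresholds, picked for illustration only.
MIN_TEXT_CHARS = 200
IMAGE_HEAVY_RATIO = 0.5


def choose_strategy(page: PageStats) -> Strategy:
    has_text_layer = page.text_layer_chars >= MIN_TEXT_CHARS
    image_heavy = page.image_area_ratio >= IMAGE_HEAVY_RATIO

    # Clean digital page: read the text layer directly (fastest path).
    if has_text_layer and not image_heavy and not page.complex_layout:
        return Strategy.TEXT_EXTRACTION
    # Complex layout, or digital text mixed with large images: use vision.
    if page.complex_layout or (has_text_layer and image_heavy):
        return Strategy.VLM
    # No usable text layer and a simple layout: treat the page as a scan.
    return Strategy.OCR


pages = {
    "digital contract": PageStats(3000, 0.0, False),
    "scanned receipt": PageStats(0, 1.0, False),
    "image-heavy report": PageStats(400, 0.7, False),
}
for name, stats in pages.items():
    print(name, "->", choose_strategy(stats).value)
```

The point of the sketch is the shape of the decision: one mixed batch, one routing function, no upfront document classification.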
Up and running in minutes.
LM-Kit.NET is a single NuGet package. No microservices, no Docker, no API keys. Load a model, point at a document, and start extracting intelligence.
- Install the NuGet package and load your preferred AI models (chat, embedding, vision)
- Create a PdfChat instance and feed it any document: PDF, Word, Excel, images, HTML
- Ask questions in natural language and get grounded answers with source attribution
The same models power every capability on this page. Switch from document Q&A to data extraction to document splitting by changing one class.
Enterprise-grade document processing.
Built for production workloads that demand accuracy, traceability, and compliance.
Capability
Layout analysis engine
Deep document structure understanding: columns, paragraphs, lines, text regions, reading order. Purpose-built algorithms for real-world document layouts.
Capability
Source attribution
Every answer and extracted value is traced to its source document, page number, and passage. Full audit trail for compliance and verification.
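As a rough illustration of what source attribution means in practice, the sketch below keeps the document name and page number attached to every passage, so whatever is retrieved can always be cited. It is a conceptual example only: `Passage` and `best_passage` are hypothetical names, and simple word overlap stands in for the semantic retrieval a real engine would use.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Passage:
    """A chunk of text that carries its own provenance (hypothetical type)."""
    document: str
    page: int
    text: str


def tokenize(text: str) -> Set[str]:
    # Lowercase words with surrounding punctuation removed.
    words = (w.strip(".,;:?!").lower() for w in text.split())
    return {w for w in words if w}


def best_passage(question: str, passages: List[Passage]) -> Passage:
    # Word overlap is a stand-in for embedding similarity.
    q = tokenize(question)
    return max(passages, key=lambda p: len(q & tokenize(p.text)))


passages = [
    Passage("contract.pdf", 3, "Either party may terminate with 30 days notice."),
    Passage("invoice.pdf", 1, "Payment is due within 45 days of the invoice date."),
]
hit = best_passage("When is payment due?", passages)
print(f"{hit.text} [source: {hit.document}, page {hit.page}]")
```

Because provenance travels with the passage rather than being reconstructed afterwards, the citation cannot drift away from the text that produced the answer.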
Capability
Intelligent caching
Processed document embeddings are cached via IVectorStore. Subsequent loads are instant. Supports filesystem, Qdrant, and custom backends.
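The caching idea can be sketched in a few lines, assuming the cache is keyed by a hash of the content. This is an illustration, not the `IVectorStore` interface: `EmbeddingCache` and `toy_embed` are hypothetical, and an in-memory dictionary stands in for a filesystem or Qdrant backend.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Hypothetical content-addressed cache; a dict stands in for a vector store."""

    def __init__(self, embed: Callable[[str], List[float]]):
        self._embed = embed
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # how many times the embedding function actually ran

    def get(self, text: str) -> List[float]:
        # Identical content always maps to the same key.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed(text)
        return self._store[key]


def toy_embed(text: str) -> List[float]:
    # Placeholder for a real embedding model.
    return [float(len(text)), float(text.count(" "))]


cache = EmbeddingCache(toy_embed)
first = cache.get("Payment is due within 45 days.")
second = cache.get("Payment is due within 45 days.")  # served from the cache
print(first == second, cache.misses)
```

Keying on content rather than file name means a renamed or re-uploaded document still hits the cache, while an edited one is re-embedded.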
Capability
Vision Language Models
VlmOcr uses multimodal AI to transcribe pages as structured markdown. Understands tables, forms, multi-column layouts, and handwritten notes visually.
Capability
100% on-device
All processing runs on your infrastructure. Documents never leave. Supports air-gapped deployments and is ready for HIPAA, GDPR, and SOC 2 compliance out of the box.
Capability
Neuro-symbolic validation
Dynamic Sampling combined with symbolic validation layers eliminates LLM hallucinations. Confidence scores on every extraction for production-grade reliability.
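To make "symbolic validation" concrete, here is a minimal sketch of one kind of rule-based check that can be applied to model output: arithmetic consistency on extracted invoice fields. It is not the SDK's validation layer, and it does not model Dynamic Sampling; the field names and the `validate_invoice` function are assumptions made for the example.

```python
from decimal import Decimal
from typing import Dict, List


def validate_invoice(fields: Dict) -> List[str]:
    """Return a list of rule violations; an empty list means the values agree."""
    errors: List[str] = []
    # Decimal avoids binary floating-point rounding on currency amounts.
    line_total = sum(
        (Decimal(item["amount"]) for item in fields["line_items"]),
        Decimal("0"),
    )
    subtotal = Decimal(fields["subtotal"])
    tax = Decimal(fields["tax"])
    total = Decimal(fields["total"])

    if line_total != subtotal:
        errors.append(f"line items sum to {line_total}, but subtotal is {subtotal}")
    if subtotal + tax != total:
        errors.append(f"subtotal plus tax is {subtotal + tax}, but total is {total}")
    return errors


extracted = {
    "line_items": [{"amount": "120.00"}, {"amount": "30.50"}],
    "subtotal": "150.50",
    "tax": "30.10",
    "total": "181.60",  # a misread digit: the correct value would be 180.60
}
for problem in validate_invoice(extracted):
    print("rejected:", problem)
```

A deterministic rule like this cannot be talked into accepting an inconsistent value, which is why pairing such checks with a generative model raises confidence in the final output.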
Process any document format.
Native support for the most common document types in enterprise workflows.
- .pdf: PDF documents
- .docx: Word documents
- .xlsx: Excel spreadsheets
- .pptx: PowerPoint slides
- .html: HTML pages
- .md: Markdown files
- .png, .jpg: Images & scans
- .txt: Plain text
Built for real-world document workflows.
From mailroom automation to compliance audits, LM-Kit.NET handles the document intelligence that matters.
Use case
Invoice & receipt processing
Extract vendor, amounts, line items, tax, and payment terms from any invoice format. Schema-driven extraction with zero hallucinations.
Use case
Contract analysis
Query legal agreements for clauses, obligations, termination conditions, and payment terms. Multi-document comparison with full source attribution.
Use case
Compliance & audit
Verify regulatory compliance across document collections. Traceable source references create audit trails for HIPAA, GDPR, and SOC 2.
Use case
Mailroom automation
Split multi-document scans into individual files, classify each automatically, and route to the correct workflow. No templates needed.
Use case
Knowledge base & research
Build searchable knowledge bases from technical manuals, research papers, and specifications. Semantic search across thousands of documents.
Use case
Customer support automation
Ingest product documentation and answer customer questions automatically. Grounded responses ensure accuracy with zero fabrication.
See document intelligence in action.
Every capability ships with a complete, runnable console application. Download, build, and explore. Full source code on GitHub.
Featured demo
Chat with PDF
Interactive console application for conversational Q&A over PDF documents. Load one or more files, choose between standard and vision processing modes, and ask questions with real-time streaming, source references, and generation stats.
- Model selection with automatic download
- Standard text extraction or VLM-based understanding
- Multi-document loading with embedding cache
- Interactive commands: /help, /status, /add, /clear
Featured demo
Invoice data extraction
Extract structured fields from invoice documents (PDF and images) using vision language models. Outputs vendor details, line items, totals, tax, and payment terms as clean JSON. Includes sample invoices and a customizable extraction schema.
- Schema-driven extraction with JSON output
- PDF and image support (PNG, JPG, TIFF)
- Optional OCR with auto language detection
- Sample invoices included for immediate testing
More document intelligence demos
Document splitting
Detect logical boundaries in multi-page PDFs and split them into separate files. Vision-based analysis with labels and confidence scores.
Guide & GitHub
Structured data extraction
Define custom schemas and extract typed fields from text documents. Supports invoices, job offers, medical records, and more.
Guide & GitHub
Document to Markdown
Convert PDFs, images, and scans to structured Markdown using VLMs. Preserves tables, formatting, and document structure.
Guide & GitHub
Document processing agent
An AI agent with 9 built-in tools: PDF split, merge, render, inspect, OCR, deskew, crop, resize, and text extraction via natural language.
Guide & GitHub
Document summarizer
Generate titles and concise summaries from PDFs, images, and text files. Customizable summary length and style guidance.
Guide & GitHub
Language detection from document
Detect the language of PDF and image documents using VLMs. Multilingual support with fast processing and performance metrics.
Guide & GitHub
Explore all LM-Kit.NET samples
40+ console demos covering agents, chat, classification, extraction, embeddings, RAG, speech, vision, and more.
Core classes for document intelligence.
The building blocks for every document workflow in your .NET application.
Class
PdfChat
High-level conversational document agent. Load documents, ask questions, get grounded answers with source references. Supports tool calling and MCP.
View docs
Class
DocumentRag
Lower-level RAG engine with full control over processing modes, chunking, vector stores, and document lifecycle management.
View docs
Class
TextExtraction
Schema-driven structured data extraction from text, images, PDFs, and Office documents. JSON output with typed fields.
View docs
Class
DocumentSplitting
VLM-powered boundary detection within multi-page PDFs. Returns page ranges, labels, and confidence scores.
View docs
Class
VlmOcr
Vision-based document parser using multimodal LMs. Transcribes pages as structured markdown preserving layout and structure.
View docs
Interface
IVectorStore
Interface for embedding storage and caching. Built-in filesystem backend. Pluggable: Qdrant, PostgreSQL, or custom implementations.
View docs
Build it. Read it. Try it.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.
Chat with PDF
Console demo: drop a PDF, ask questions, get cited answers.
Open on GitHub →
Demo
Invoice data extraction
Console demo: schema-driven extraction from invoice scans.
Open on GitHub →
Demo
Document to Markdown
Console demo: convert PDFs, DOCX, HTML to clean Markdown.
Open on GitHub →
How-to guide
Build a private document Q&A
End-to-end how-to: load, index, chat with citations.
Read the guide →
Seven pillars, one foundation.
The seven pillars of LM-Kit.NET, plus the local runtime they share. The card marked "You are here" is the current page.
01 · AI Agents
Orchestration patterns
ReAct planning, supervisors, parallel and pipeline orchestrators, persistent memory, MCP clients, custom tools.
AI Agents
02 · Document Intelligence
Parse PDFs, images, EML
PDF text and table extraction, on-device OCR reaching SOTA benchmark scores, structured field extraction with grammar-constrained generation.
You are here
03 · Vision & Multimodal
VLMs, image classification, chat with image
Image understanding, classification, labeling, multimodal chat, image embeddings, VLM-OCR, background removal. Same conversation surface as LLMs.
Vision & Multimodal
04 · RAG & Knowledge
Vector search and retrieval
Built-in vector store, Qdrant connector, embeddings, hybrid retrieval, document chunking, source citations.
RAG & Knowledge
05 · Text Analysis
Classification, NER, PII, sentiment
Built-in classifiers and an extractor that emits typed C# objects via grammar-constrained sampling. Sentiment, keywords, language detection.
Text Analysis
06 · Speech & Audio
Audio transcription, STT
A growing local speech-to-text stack: hallucination suppression, Voice Activity Detection, real-time translation, streaming output, 100+ languages.
Speech & Audio
07 · Text Generation
Conversations, rewriting, summaries
Single-turn, multi-turn, and stateless conversation primitives. Translate, correct, rewrite, summarise. Prompt templates, streaming, grammar-constrained outputs.
Text Generation
The foundation
Every capability above runs on this runtime.
Foundation
Local Inference
The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR and classifiers, accelerated on CPU, AVX2, CUDA 12/13, Vulkan or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.
Ready to build document intelligence?
The most advanced local document processing platform for .NET. From chat to extraction to splitting. 100% on your infrastructure.