Build AI Agents.
Ship with Confidence.
The complete SDK for building production AI applications in .NET. Agent orchestration, document intelligence, RAG pipelines, and 56 built-in tools in a single NuGet package. Zero cloud dependency.
using System;
using LMKit;
using LMKit.Retrieval;

// Load your models
var chatModel = new LM("path/to/chat-model.gguf");
var embedModel = new LM("path/to/embed-model.gguf");

// Create document Q&A with RAG
using var docChat = new PdfChat(chatModel, embedModel);

// Load your documents
await docChat.LoadDocumentAsync("contracts.pdf");
await docChat.LoadDocumentAsync("reports.pdf");

// Ask questions with grounded answers
var result = await docChat.SubmitAsync("What are the payment terms?");
Console.WriteLine(result.Response.Completion);
// Sources: contracts.pdf, p.12
A Complete AI Stack with No Moving Parts
LM-Kit.NET is a unique full-stack AI framework for .NET that unifies everything you need to build and deploy AI agents with zero cloud dependency and zero external dependencies. It combines the fastest .NET inference engine, production-ready trained models, agent orchestration, RAG pipelines, and MCP-compatible tool calling in a single in-process SDK for C# and VB.NET.
Built by experts, updated continuously. Our team ships the latest advances in generative AI, symbolic AI, and NLP research directly into the SDK. Not every problem requires a massive LLM. Dedicated task agents deliver faster execution, lower costs, and higher accuracy for specific workflows.
Category of one: LM-Kit.NET is the only .NET SDK that unifies a complete inference engine, production-ready trained models, agent orchestration, RAG pipelines, and MCP-compatible tool calling. Your AI. Your data. On your device.
AI Agents and Orchestration
Build autonomous AI agents that reason, plan, and execute complex workflows within your applications.
Agent Framework
Complete infrastructure with Agent, AgentBuilder, AgentExecutor, and AgentRegistry for building production-ready AI agents.
Multi-Agent Orchestration
Coordinate multiple agents with PipelineOrchestrator, ParallelOrchestrator, RouterOrchestrator, and SupervisorOrchestrator.
Planning Strategies
Multiple reasoning approaches: ReAct, Chain-of-Thought, Tree-of-Thought, Plan-and-Execute, and Reflection handlers.
56 Built-in Tools
Ready-to-use tools: Data (JSON, XML, CSV), Text (Diff, Regex), Numeric (Calculator, Stats), Security (Hash, JWT), IO, Network.
Agent-to-Agent Delegation
Enable agents to delegate tasks to specialized sub-agents with DelegationManager and DelegateTool.
18 Agent Templates
Pre-built templates: Chat, Assistant, Code, Research, Analyst, Planner, Writer, Reviewer, and more for rapid development.
Resilience Policies
Production-grade reliability with Retry, Circuit Breaker, Timeout, Rate Limit, Bulkhead, and Fallback policies.
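To make the Retry policy concrete, here is a minimal, generic sketch of retry with exponential backoff in plain C#. It does not use the LM-Kit.NET API: `RetrySketch` and `RetryAsync` are hypothetical names chosen for illustration, and the SDK's own policy types may look quite different.

```csharp
using System;
using System.Threading.Tasks;

public static class RetrySketch
{
    // Hypothetical helper, not part of LM-Kit.NET: retries an async operation
    // up to maxAttempts times, doubling the delay after each failed attempt.
    public static async Task<T> RetryAsync<T>(
        Func<Task<T>> operation, int maxAttempts, TimeSpan initialDelay)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        var delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Back off before the next attempt; the final failure is rethrown.
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }

    public static async Task Main()
    {
        int calls = 0;

        // Simulated flaky call: fails twice, then succeeds.
        string result = await RetryAsync(async () =>
        {
            calls++;
            await Task.Yield();
            if (calls < 3) throw new InvalidOperationException("transient failure");
            return "ok";
        }, maxAttempts: 5, initialDelay: TimeSpan.FromMilliseconds(10));

        Console.WriteLine($"{result} after {calls} calls"); // prints "ok after 3 calls"
    }
}
```

In a real agent, the wrapped operation would be a model or tool call; a circuit breaker or fallback would typically sit around this loop rather than inside it.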
Streaming Support
Real-time response streaming with buffered, multicast, and delegate handlers for responsive UIs.
Agent Observability
Full tracing and metrics with AgentTracer, AgentMetrics, and JSON export capabilities for debugging and monitoring.
MCP Client Support
Connect to Model Context Protocol servers for extended capabilities including resources, prompts, and tool discovery.
Agent Memory
Persistent memory that survives across conversation sessions with RAG-based recall for context-aware responses.
Function Calling
Let models dynamically invoke your application's methods with structured parameters and type-safe contracts.
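As a rough illustration of the idea behind function calling, the sketch below dispatches a JSON tool call to a registered application method. It is a generic pattern built on `System.Text.Json`, not the LM-Kit.NET API; the `ToolDispatchSketch` class, the tool names, and the `{"name": ..., "arguments": ...}` message shape are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ToolDispatchSketch
{
    // Hypothetical tool registry: maps a tool name to a handler that reads
    // its structured arguments from JSON and returns a string result.
    private static readonly Dictionary<string, Func<JsonElement, string>> Tools =
        new Dictionary<string, Func<JsonElement, string>>
        {
            ["add"] = args =>
                (args.GetProperty("a").GetInt32() + args.GetProperty("b").GetInt32()).ToString(),
            ["upper"] = args =>
                (args.GetProperty("text").GetString() ?? "").ToUpperInvariant(),
        };

    // Parses a tool call (assumed shape: {"name": "...", "arguments": {...}})
    // and invokes the matching handler.
    public static string Dispatch(string toolCallJson)
    {
        using var doc = JsonDocument.Parse(toolCallJson);
        JsonElement root = doc.RootElement;

        string name = root.GetProperty("name").GetString()
            ?? throw new ArgumentException("Missing tool name.");

        if (!Tools.TryGetValue(name, out var tool))
            throw new ArgumentException($"Unknown tool: {name}");

        return tool(root.GetProperty("arguments"));
    }

    public static void Main()
    {
        // In practice the JSON would come from the model's output.
        string call = "{\"name\":\"add\",\"arguments\":{\"a\":2,\"b\":3}}";
        Console.WriteLine(Dispatch(call)); // prints "5"
    }
}
```

Rejecting unknown tool names and malformed arguments before invoking anything is the part worth keeping in any real implementation, since the model's output is untrusted input.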
Run Any Model, Anywhere
100+ pre-configured model cards plus support for any GGUF model from Hugging Face.
Text Models
- LLaMA 3.1 / 4
- Mistral / Mixtral
- Qwen 2.5 / 3
- Phi 3 / 4
- Gemma 2 / 3
- Granite
- DeepSeek R1
- Falcon
Vision Models
- Qwen-VL / Qwen2-VL
- MiniCPM-V
- Pixtral
- Gemma Vision
- LightOnOCR
Embedding Models
- BGE-M3
- Nomic Embed
- Qwen Embedding
- Gemma Embedding
- Nomic Vision
Speech Models
- Whisper Tiny
- Whisper Base
- Whisper Large V3
- Whisper Large Turbo
Browse the Full Model Catalog
Explore production-ready models optimized for different tasks and hardware configurations. Load models directly from our catalog or any Hugging Face repository.
Complete Data Sovereignty by Design
Running inference locally provides inherent advantages that cloud solutions cannot match.
Complete Data Sovereignty
Sensitive information stays within your infrastructure. No data transmission, no third-party access, full audit trail.
Zero Network Latency
Responses as fast as your hardware allows. No round trips to cloud servers, no network dependencies.
No Per-Token Costs
Unlimited inference once deployed. Predictable costs regardless of usage volume or scale.
Offline Operation
Works without internet connectivity. Deploy in air-gapped and disconnected environments.
Regulatory Compliance
Supports GDPR, HIPAA, SOC 2, and data residency requirements by keeping all processing inside your infrastructure. Audit-friendly operations.
Simple Deployment
Single NuGet package. No Python runtime, no containers, no external services to manage.
Document Intelligence & RAG Built In
Turn documents into knowledge. Ask questions. Get grounded answers with source attribution.
Document Intelligence
Complete document processing pipeline. Chat with PDFs, extract structured data from invoices and contracts, and understand layouts with adaptive OCR/VLM processing.
- PDF Chat and Document Q&A with retrieval, reranking, and grounded generation
- Schema-based structured data extraction with JSON output
- OCR and extraction pipelines for invoices, forms, IDs, emails
- Native PDF, DOCX, XLSX, PPTX, HTML, and image support
- Layout-aware processing with paragraph and line detection
RAG & Knowledge
Ground AI responses in your organization's knowledge with a flexible, extensible RAG framework. Modular architecture for custom retrieval strategies.
- Modular RAG architecture: use built-in pipelines or custom strategies
- Built-in vector database without external dependencies
- Multimodal RAG: retrieve from both text and images
- Advanced chunking: Markdown-aware, semantic, layout-based
- Reranking for precision, Qdrant integration for scale
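The retrieval step at the heart of a RAG pipeline can be sketched as a cosine-similarity search over embedded chunks. The example below is a generic illustration in plain C#, not the SDK's built-in vector database; `VectorSearchSketch`, `CosineSimilarity`, and `TopK` are hypothetical names, and the two-dimensional vectors are made-up stand-ins for real embeddings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VectorSearchSketch
{
    // Cosine similarity between two embedding vectors of equal length.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0) return 0; // zero vector: treat as unrelated
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns the k chunks most similar to the query, best match first.
    public static List<(string Text, double Score)> TopK(
        float[] query, IEnumerable<(string Text, float[] Vector)> chunks, int k)
    {
        return chunks
            .Select(c => (Text: c.Text, Score: CosineSimilarity(query, c.Vector)))
            .OrderByDescending(r => r.Score)
            .Take(k)
            .ToList();
    }

    public static void Main()
    {
        // Toy embeddings; a real pipeline would get these from an embedding model.
        var chunks = new List<(string Text, float[] Vector)>
        {
            ("payment terms", new[] { 0.9f, 0.1f }),
            ("shipping",      new[] { 0.1f, 0.9f }),
            ("invoicing",     new[] { 0.7f, 0.7f }),
        };

        foreach (var hit in TopK(new[] { 1f, 0f }, chunks, 2))
            Console.WriteLine(hit.Text); // prints "payment terms", then "invoicing"
    }
}
```

A reranker would then rescore these top-k candidates with a more expensive model before they are passed to generation.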
Multimodal Intelligence
Process and understand content across text, images, documents, and audio. Build voice-driven assistants, visual inspection systems, and document understanding pipelines.
Vision Language Models (VLM)
Analyze images, extract information, and answer questions about visual content with state-of-the-art VLMs.
VLM-Based OCR
High-accuracy text extraction from images and scanned content using vision language models.
Speech-to-Text
Transcribe audio with voice activity detection, multi-language support, and hallucination suppression.
Document Processing
Native support for PDF, DOCX, XLSX, PPTX, HTML, and image formats with layout understanding.
Image Embeddings
Generate semantic representations of images for similarity search and multimodal retrieval.
Language Detection
Identify languages from text, images, or audio for automatic routing and processing.
Content Intelligence & Text Analysis
Analyze and understand text and visual content. Compliance-focused text intelligence for PII extraction, NER, classification, and sentiment analysis.
Sentiment and Emotion Analysis
Detect emotional tone from text and images. Identify sentiment, emotions, and sarcasm.
Custom Classification
Categorize text and images into your defined classes with zero-shot or fine-tuned classifiers.
Keyword Extraction
Identify key terms and phrases from documents for indexing, tagging, and search optimization.
Named Entity Recognition (NER)
Extract people, organizations, locations, dates, and custom entity types from text.
PII Detection & Extraction
Identify and classify personal identifiers for privacy compliance: SSN, email, phone, addresses.
Content Summarization
Condense long content with configurable strategies: extractive, abstractive, and hybrid approaches.
Text Generation and Transformation
Generate and refine content with precise control. Build context-aware chatbots, constrain outputs with schemas, and transform text for your specific needs.
Conversational AI
Build context-aware chatbots with multi-turn memory, function calling, and smart context management.
Constrained Generation
Guide model outputs using JSON schemas, templates, or custom grammar rules for structured outputs.
Translation
Convert text between languages with confidence scoring and quality assessment.
Text Enhancement
Improve clarity, fix grammar and spelling, adapt tone, and enhance writing quality.
Performance and Hardware
The fastest .NET inference engine. LM-Kit.NET automatically leverages the best available acceleration on any hardware with optimized kernels for maximum throughput.
NVIDIA GPUs (CUDA)
Optimized CUDA backends with custom kernels for maximum performance on NVIDIA hardware.
Apple Silicon (Metal)
Metal acceleration for M-series chips with unified memory architecture support.
Cross-Vendor GPUs (Vulkan)
Vulkan backend for AMD, Intel, and other GPU vendors with broad compatibility.
Multi-GPU Support
Distribute models across multiple GPUs for larger models and higher throughput.
CPU Fallback
Optimized CPU inference with SIMD acceleration when GPU is unavailable.
Dual Backend Architecture
Choose llama.cpp for broad compatibility or ONNX Runtime for optimized inference.
Run Anywhere .NET Runs
Full cross-platform support with hardware acceleration on Windows, macOS, and Linux.
Operating Systems
Windows 7 through 11, macOS 11+ (Intel and Apple Silicon), Linux with glibc 2.27+ (x64 and ARM64).
GPU Acceleration
CUDA for NVIDIA, Metal for Apple Silicon, Vulkan for cross-vendor GPU support.
.NET Frameworks
.NET Framework 4.6.2 through latest .NET releases. Full MAUI support for cross-platform apps.
Zero Dependencies. One NuGet Package.
The entire AI stack runs in-process within your .NET application. No Python runtime. No containers. No external services. No native libraries to manage separately.
Get Started in Minutes
Comprehensive documentation, runnable samples, and API reference.
Documentation
Complete guides covering agents, RAG, document processing, and deployment.
API Reference
Complete API documentation for all namespaces, classes, and methods.
Code Samples
50+ runnable demos covering agents, chat, RAG, vision, speech, and more.
Model Catalog
Browse pre-configured models optimized for different tasks and hardware.
Ready to Build Local AI Agents?
From prompts to production with zero cloud dependency. Download the SDK and start building in minutes.