The .NET SDK for local AI

The complete local AI runtime for .NET.

Seven capability pillars on one adaptive inference engine. Agents, document intelligence, vision, RAG, text analysis, speech, generation. One NuGet, zero cloud calls, full control of your data and your latency.

NuGet: LM-Kit.NET
Targets: .NET Standard 2.0 · .NET 8 / 9 / 10
Platforms: Windows · Linux · macOS

The shape of the SDK

A runtime, not a wrapper.

Most "local LLM" tools are inference engines. LM-Kit.NET is the runtime that sits on top: agents that reason and call tools, RAG with page-level citations, OCR that holds its own against commercial engines, structured extraction that emits typed C# objects, multilingual speech-to-text, image understanding, embeddings, and a growing catalog of built-in tools. Every capability ships in the same NuGet, runs on the same model graph, and passes through the same adaptive sampling layer underneath.

What ships in the box

Seven pillars, one foundation.

LM-Kit.NET ships seven pillars and the local runtime they all sit on. Use the parts you need, ignore the rest.

The foundation

Every capability above runs on this runtime.

Foundation

Local Inference

The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR, and classifiers, accelerated on CPU (AVX/AVX2), CUDA 12/13, Vulkan, or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.

Explore the foundation
Core technology

Dynamic Sampling, the symbolic layer.

The reason a 4B local model can match fine-tuned cloud behaviour on extraction, classification, and structured generation. Dynamic Sampling is an adaptive inference engine that sits underneath every LM-Kit call, steering each token with structural awareness, contextual signals, and grammar-aligned validation. Always on, model-agnostic, no retraining.

Pillar A

Constrained output

Dynamic grammar enforcement guarantees that JSON, schema-bound output, and tool-call shapes always parse. A novel hybrid path runs roughly twice as fast as classical grammar sampling.

Pillar C

Model-agnostic

No architecture coupling, no fine-tuning, no per-model adapter. Drop in a new open-weight release and the layer keeps working from day one.

Open the Dynamic Sampling deep dive →
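The constrained-output guarantee above can be pictured with a toy sketch. This is not LM-Kit's API or its Dynamic Sampling internals — just the classical grammar-masking principle it builds on: at each decoding step the model may only emit tokens the grammar allows at that position, so the final string always parses, whatever the model prefers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy illustration (not LM-Kit's implementation): grammar-constrained
// decoding masks the model's choices to tokens the grammar allows at the
// current position. The "grammar" here is a trivial fixed sequence that
// accepts exactly the JSON objects {"ok":true} and {"ok":false}.
public static class GrammarSketch
{
    // Tokens the grammar permits at each decoding step.
    public static readonly string[][] AllowedPerStep =
    {
        new[] { "{" },
        new[] { "\"ok\"" },
        new[] { ":" },
        new[] { "true", "false" }, // the model is free to pick either
        new[] { "}" },
    };

    // `pickToken` stands in for the language model: it chooses one token
    // from the allowed set; the grammar mask does the rest.
    public static string ConstrainedDecode(Func<IReadOnlyList<string>, string> pickToken)
    {
        return string.Concat(AllowedPerStep.Select(allowed => pickToken(allowed)));
    }

    public static void Main()
    {
        // A stand-in "model" that always takes the first allowed token.
        Console.WriteLine(ConstrainedDecode(allowed => allowed[0])); // {"ok":true}
    }
}
```

In real systems the mask is derived on the fly from a JSON Schema or BNF-style grammar rather than a fixed sequence; the guarantee is the same — invalid tokens are never sampled, so the output cannot fail to parse.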

Runs where your code already runs

Same process, same threads, same deploy.

No sidecar service, no special runtime. LM-Kit links into your application, picks up the right native acceleration for the host, and gets out of the way.

Runtime: .NET Standard 2.0 · .NET 8 / 9 / 10
OS: Windows, Linux (x64 & ARM64), macOS
Acceleration: CPU (AVX/AVX2), CUDA 12/13, Vulkan, Metal
Models: Gemma 3, Qwen 3, Llama, Phi-4, GLM 4.7, GPT OSS, Whisper, embeddings
Storage: in-memory, built-in vector DB, Qdrant, bring-your-own
Bridges: Microsoft.Extensions.AI, Semantic Kernel, MCP clients
Licensing

Free for builders. Commercial when you ship.

Run the full SDK on your own hardware at no cost. Buy a commercial license when LM-Kit is part of a product you sell to customers.

Community

Free forever

Full SDK access for any company or individual. Build and deploy non-commercial applications, or evaluate LM-Kit before shipping.

  • Eligibility: any company or individual
  • Deployment: development, internal tools, OSS
  • Platforms: Windows, Linux, macOS
  • Community support on GitHub

Professional

Custom, per project

For products that ship LM-Kit to customers. Pricing is scaled to deployment size and value. Includes dedicated support and roadmap input.

  • Commercial redistribution rights
  • Dedicated technical support
  • Unlimited developers and end users
  • Direct relationship with the team
What customers say

4.9 / 5 on SourceForge

Get started

Ship local AI in your next build.