Why Local AI · Compare · LM-Kit.NET vs Foundry Local

LM-Kit.NET vs Foundry Local Same Vision, Different Scope

Microsoft Foundry Local (formerly Azure AI Foundry Local) is a local inference runtime built on ONNX Runtime. LM-Kit.NET is a complete AI development platform with its own inference engine, agents, RAG, and document intelligence. Both believe in local AI, but they differ in what they deliver.

Try LM-Kit.NET All comparisons

Product Positioning

Foundry Local

ONNX-based local inference runtime with model management and OpenAI-compatible API

LM-Kit.NET

Self-contained AI platform with inference, agents, RAG, documents, NLP, and speech

Quick Comparison

60+
Built-in Models

5
GPU Backends

GA
Production Status

1
NuGet Package

Before we compare

Before We Compare: Different Product Categories.

Microsoft Foundry Local and LM-Kit.NET share the same fundamental belief: AI should run locally on your hardware. But they occupy very different positions in the stack. Foundry Local is an inference runtime. LM-Kit.NET is a complete AI development platform. Understanding this distinction is essential for a fair comparison.

Microsoft Foundry Local

Foundry Local (formerly Azure AI Foundry Local) is a free, on-device AI inference runtime built on ONNX Runtime. It downloads, manages, and serves ONNX models locally through an OpenAI-compatible API or native in-process SDK. Currently in public preview.

ONNX Runtime with auto hardware detection
NPU, CUDA, DirectML, Metal acceleration
OpenAI-compatible REST API
C#, Python, JavaScript, and Rust SDKs

LM-Kit.NET

LM-Kit.NET is an enterprise-grade .NET SDK that bundles a local inference engine with a complete AI application platform: agent orchestration, RAG, document intelligence, text analysis, speech processing, and vision. Production-ready and generally available.

In-process inference with GPU acceleration
Agent orchestration with 4 multi-agent patterns
Built-in RAG, document processing, and NLP
60+ curated models, GGUF format

Think of it this way: Foundry Local is like an engine. It runs models and returns completions. LM-Kit.NET is the entire vehicle: it has the engine (inference), but also the navigation system (RAG), the dashboard instruments (NLP and text analysis), the cargo bay (document processing), the communication system (speech), and the autopilot (agent orchestration). If all you need is an engine, Foundry Local is a solid choice. If you need the whole vehicle, LM-Kit.NET delivers it in one package.

Foundry Local strengths

Where Foundry Local Genuinely Shines

Foundry Local is a well-designed inference runtime backed by Microsoft and the ONNX ecosystem. Here are the areas where it excels.

NPU and Hardware Auto-Detection

First-class support for Qualcomm, Intel, and AMD NPUs alongside CUDA and DirectML. Automatically detects your hardware and downloads the optimal model variant.

Multi-Language SDKs

Available in C#, Python, JavaScript, and Rust. Teams using polyglot stacks can access local inference from their preferred language.

OpenAI-Compatible API

Exposes standard OpenAI endpoints for chat completions, audio transcription, and embeddings. Any OpenAI SDK client can point at the local endpoint with minimal changes.

Microsoft Ecosystem Integration

Part of Windows AI Foundry. Integrates with Semantic Kernel, Microsoft.Extensions.AI, the AI Toolkit for VS Code, and the broader Microsoft Foundry cloud platform.

Free to Use

No cost to install or run. No API keys, no metered billing, no subscription required. The runtime itself is proprietary but free for developers.

Android and Mobile Support

Foundry Local has entered private preview on Android, with a major partner (PhonePe) already integrating on-device AI for their mobile platform.

LM-Kit.NET advantages

Where LM-Kit.NET Goes Further

Foundry Local focuses on one thing: running models locally. LM-Kit.NET starts with local inference and builds an entire AI development platform on top. Here is what that means in practice.

Agent Orchestration

Foundry Local has no agent framework. It supports basic tool calling (one tool per request, limited to Qwen models) but cannot coordinate multi-step workflows. LM-Kit.NET ships a full agent orchestration system.

Pipeline, Parallel, Router, Supervisor patterns
Rich tool catalog with permission policies
ReAct planning with multi-step reasoning
Agent memory and MCP protocol support

Complete RAG Pipeline

Foundry Local has no RAG capabilities. Building RAG requires assembling external components (Semantic Kernel, a vector database, an embedding service). LM-Kit.NET ships the full pipeline in one SDK.

Hybrid retrieval: vector + BM25 with RRF
Built-in vector store, Qdrant and pgvector connectors
Semantic, Markdown, HTML, layout chunking
Multi-query, HyDE, query contextualization

Document Intelligence

Foundry Local has no document processing. PDF parsing, OCR, and format conversion require external Azure services or third-party libraries. LM-Kit.NET handles all of this natively.

PDF text extraction, OCR, table detection
PDF/image to Markdown conversion
HTML, EML, DOCX processing
Document splitting and structured extraction

NLP and Text Analysis

Foundry Local has no dedicated NLP features. Developers must prompt the LLM directly for text analysis tasks. LM-Kit.NET provides purpose-built, high-accuracy NLP APIs.

NER with 102 entity types, PII detection
Sentiment analysis, emotion detection
Custom text and document classification
Language detection and translation

Model Ecosystem (60+ Models, GGUF)

Foundry Local is restricted to ONNX format with a small curated catalog. Custom models require conversion through Microsoft Olive. LM-Kit.NET uses the GGUF format, giving access to the broadest model ecosystem available.

60+ curated models (Gemma 3, Qwen 3, Phi-4, Llama, etc.)
GGUF: thousands of community quantizations available
On-device fine-tuning (LoRA) and quantization
No format conversion required

Speech, Vision, and Fine-Tuning

Foundry Local supports Whisper transcription and Phi-3.5 vision, but has no on-device fine-tuning or text-to-speech. LM-Kit.NET covers all three areas within the same SDK.

Whisper speech-to-text (tiny to large-v3-turbo)
Vision language models (Qwen 3.6, Gemma 4)
On-device LoRA fine-tuning
Model quantization for deployment optimization

Feature comparison

Detailed Comparison.

A thorough, category-by-category comparison. We have marked features honestly, including where Foundry Local has the edge.

Feature	LM-Kit.NET	Foundry Local
Core Architecture
Product type	Complete AI platform	Inference runtime
Inference engine	llama.cpp (built-in)	ONNX Runtime GenAI
Model format	GGUF	ONNX only
In-process inference	Yes	Yes (C# SDK v0.8+)
OpenAI-compatible API	No (SK & MEAI bridges)	Yes (native)
Production status	Generally Available	Public Preview
Model Management
Curated model catalog	60+ models	~15 models
Auto model download	Yes	Yes
Hardware-adaptive variants	Manual selection	Auto-detection
Custom model support	Any GGUF model	ONNX via Olive conversion
Model cache management	Yes	Yes (CLI + SDK)
Hardware Acceleration
CUDA (NVIDIA GPU)	CUDA 12/13	CUDA
Vulkan (cross-platform GPU)	Yes	No
Metal (Apple GPU)	Yes	Yes
DirectML (AMD/Intel GPU)	No	Yes
NPU (Qualcomm, Intel, AMD)	No	Yes (QNN, OpenVINO)
TensorRT (NVIDIA optimized)	No	Yes
AVX/AVX2 (CPU optimized)	Yes	Yes
Agent Orchestration
Agent framework	Built-in (4 patterns)	None
Tool / function calling	Rich catalog + custom	Basic (1 tool/request, Qwen only)
Multi-agent patterns	Pipeline, Parallel, Router, Supervisor	None
Agent memory	Yes	No
MCP protocol	Yes	No
Tool permission policies	Yes (category, risk, approval)	No
RAG & Retrieval
Built-in RAG pipeline	Yes	No
Vector store	Built-in + Qdrant & pgvector	None (external required)
Hybrid retrieval (BM25 + vector)	Yes with RRF	No
Document chunking strategies	Semantic, Markdown, HTML, layout	No
Embeddings generation	Built-in models	API exists, limited catalog models
Document & Vision
PDF processing	Built-in (pdfium)	No
OCR	Built-in (tesseract)	No
Vision language models	Qwen 3.6, Gemma 4	Phi-3.5-Vision
Document format conversion	PDF/HTML/EML/DOCX to Markdown	No
NLP & Text Analysis
Named entity recognition	102 entity types	No
PII detection	Yes	No
Sentiment analysis	Yes	No
Text classification	Custom categories	No
Language detection / translation	Yes	No
Speech
Speech-to-text (Whisper)	Tiny through large-v3-turbo	Whisper-tiny, whisper-medium
Streaming transcription	Yes	Yes (C# SDK)
Model Customization
On-device fine-tuning	LoRA	No (requires Olive + cloud)
On-device quantization	Yes	Pre-quantized only (via Olive)
Grammar-constrained generation	JSON schema, GBNF	No
Platform & Licensing
Windows	x64	x64, ARM
macOS	Universal (Apple Silicon)	Apple Silicon
Linux	x64, ARM64	In development
Android	No	Private Preview
SDK languages	.NET (C#)	C#, Python, JS, Rust
License	Commercial (free trial)	Proprietary (free to use)
Microsoft.Extensions.AI	Yes (bridge)	Yes (via OpenAI compat)
Semantic Kernel integration	Yes (dedicated bridge)	Yes (via OpenAI connector)

Decision

Which One Fits Your Project?

Honest guidance based on what each product actually delivers today.

Choose Foundry Local if...

Best for teams that need a lightweight inference runtime with NPU support and multi-language access.

You only need chat completions and basic inference
You target NPU hardware (Snapdragon, Intel NPU)
You need Python, JavaScript, or Rust SDKs
You want drop-in OpenAI API compatibility
You already use ONNX models in your pipeline
You are prototyping and preview status is acceptable

Choose LM-Kit.NET if...

Best for .NET teams building production AI applications that need more than just inference.

You need agents, RAG, or document processing
You require a production-ready, GA-status SDK
You want access to the broadest model ecosystem (GGUF)
You need NLP features like NER, PII, or sentiment analysis
You want on-device fine-tuning and quantization
You are building a complete AI-powered .NET application

Keep comparing

Other comparisons and capability pages.

The full grid of LM-Kit.NET versus framework and runtime comparisons, plus the capability pages most relevant to this comparison.

Other comparisons

LM-Kit vs LLamaSharp LM-Kit vs Ollama LM-Kit vs Semantic Kernel LM-Kit vs Microsoft Agent Framework LM-Kit vs Microsoft AutoGen LM-Kit vs LangChain LM-Kit vs LlamaIndex

Capabilities mentioned in these comparisons

Quickstart in 5 minutes AI agent orchestration Document RAG On-device OCR Layout understanding Local inference & backends Multi-GPU inference Context hibernation Conversation primitives Model Context Protocol Semantic Kernel bridge Microsoft.Extensions.AI bridge

Build production AI in .NET.

Local inference, agents, RAG, document intelligence, speech, vision. One SDK. 100% on-device.

Download free SDK overview