
LM-Kit.NET vs Foundry Local: Same Vision, Different Scope

Microsoft Foundry Local (formerly Azure AI Foundry Local) is a local inference runtime built on ONNX Runtime. LM-Kit.NET is a complete AI development platform with its own inference engine, agents, RAG, and document intelligence. Both believe in local AI, but they differ in what they deliver.

Complete AI Platform · 60+ Built-in Models · Production Ready (GA) · GGUF Model Ecosystem

Quick Comparison

Capability | LM-Kit.NET | Foundry Local
Agent Orchestration | Yes | No
Built-in RAG Pipeline | Yes | No
Document Intelligence | Yes | No
NPU Acceleration | No | Yes
OpenAI-Compatible API | No (via bridges) | Yes
NLP / Text Analysis | Yes | No
Production Status (GA) | Yes | No (Public Preview)

Product Positioning

Foundry Local
ONNX-based local inference runtime with model management and OpenAI-compatible API
LM-Kit.NET
Self-contained AI platform with inference, agents, RAG, documents, NLP, and speech
60+ Built-in Models · 5 GPU Backends · GA Production Status · 1 NuGet Package
Important Context

Before We Compare: Different Product Categories

Microsoft Foundry Local and LM-Kit.NET share the same fundamental belief: AI should run locally on your hardware. But they occupy very different positions in the stack. Foundry Local is an inference runtime. LM-Kit.NET is a complete AI development platform. Understanding this distinction is essential for a fair comparison.

Microsoft Foundry Local

Local Inference Runtime

Foundry Local (formerly Azure AI Foundry Local) is a free, on-device AI inference runtime built on ONNX Runtime. It downloads, manages, and serves ONNX models locally through an OpenAI-compatible API or native in-process SDK. Currently in public preview.

  • ONNX Runtime with auto hardware detection
  • NPU, CUDA, DirectML, Metal acceleration
  • OpenAI-compatible REST API
  • C#, Python, JavaScript, and Rust SDKs

LM-Kit.NET

Self-Contained AI Platform

LM-Kit.NET is an enterprise-grade .NET SDK that bundles a local inference engine with a complete AI application platform: agent orchestration, RAG, document intelligence, text analysis, speech processing, and vision. Production-ready and generally available.

  • In-process inference with GPU acceleration
  • Agent orchestration with 4 multi-agent patterns
  • Built-in RAG, document processing, and NLP
  • 60+ curated models, GGUF format

Think of it this way: Foundry Local is like an engine. It runs models and returns completions. LM-Kit.NET is the entire vehicle: it has the engine (inference), but also the navigation system (RAG), the dashboard instruments (NLP and text analysis), the cargo bay (document processing), the communication system (speech), and the autopilot (agent orchestration). If all you need is an engine, Foundry Local is a solid choice. If you need the whole vehicle, LM-Kit.NET delivers it in one package.

Credit Where It's Due

Where Foundry Local Genuinely Shines

Foundry Local is a well-designed inference runtime backed by Microsoft and the ONNX ecosystem. Here are the areas where it excels.

NPU and Hardware Auto-Detection

First-class support for Qualcomm, Intel, and AMD NPUs alongside CUDA and DirectML. Automatically detects your hardware and downloads the optimal model variant.

Multi-Language SDKs

Available in C#, Python, JavaScript, and Rust. Teams using polyglot stacks can access local inference from their preferred language.

OpenAI-Compatible API

Exposes standard OpenAI endpoints for chat completions, audio transcription, and embeddings. Any OpenAI SDK client can point at the local endpoint with minimal changes.
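A minimal sketch of what that compatibility means in practice: an OpenAI-style chat completion request is plain JSON, so any HTTP client (or any official OpenAI SDK pointed at the local base URL) can target the local endpoint. The model id and port below are placeholders, not identifiers Foundry Local actually assigns.

```python
import json

# Sketch: the OpenAI-style chat completion request body that an
# OpenAI-compatible local endpoint accepts. The model id is a
# placeholder; a real deployment uses the runtime's own identifiers.
def chat_request(model: str, user_message: str) -> str:
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return json.dumps(body)

# An OpenAI SDK client would POST this body to a URL such as
# http://localhost:<port>/v1/chat/completions (port is assigned locally).
payload = chat_request("local-model", "Summarize local AI in one line.")
```

With the official OpenAI SDKs, the same effect is achieved by setting the client's base URL to the local endpoint; the request and response shapes are unchanged from the cloud API.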

Microsoft Ecosystem Integration

Part of Windows AI Foundry. Integrates with Semantic Kernel, Microsoft.Extensions.AI, the AI Toolkit for VS Code, and the broader Microsoft Foundry cloud platform.

Free to Use

No cost to install or run. No API keys, no metered billing, no subscription required. The runtime itself is proprietary but free for developers.

Android and Mobile Support

Foundry Local has entered private preview on Android, with a major partner (PhonePe) already integrating on-device AI into its mobile platform.

The Complete Platform Advantage

Where LM-Kit.NET Goes Further

Foundry Local focuses on one thing: running models locally. LM-Kit.NET starts with local inference and builds an entire AI development platform on top. Here is what that means in practice.

Agent Orchestration

Foundry Local has no agent framework. It supports basic tool calling (one tool per request, limited to Qwen models) but cannot coordinate multi-step workflows. LM-Kit.NET ships a full agent orchestration system.

  • Pipeline, Parallel, Router, Supervisor patterns
  • Rich tool catalog with permission policies
  • ReAct planning with multi-step reasoning
  • Agent memory and MCP protocol support
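To make the simplest of these patterns concrete, a Pipeline chains agents so each one consumes the previous agent's output. The Python below is an illustrative sketch of the pattern itself, not LM-Kit.NET's actual C# API; the toy callables stand in for LLM-backed agents.

```python
from typing import Callable, List

# Illustrative Pipeline pattern: agents run in sequence, each
# transforming the previous agent's output.
Agent = Callable[[str], str]

def pipeline(agents: List[Agent], task: str) -> str:
    result = task
    for agent in agents:
        result = agent(result)  # output of one agent feeds the next
    return result

# Toy agents standing in for LLM-backed steps:
summarize = lambda text: f"summary({text})"
translate = lambda text: f"translated({text})"

pipeline([summarize, translate], "report")  # → "translated(summary(report))"
```

The other patterns vary the topology rather than the idea: Parallel fans a task out to several agents at once, Router picks one agent per task, and Supervisor lets a coordinating agent delegate and review.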

Complete RAG Pipeline

Foundry Local has no RAG capabilities. Building RAG requires assembling external components (Semantic Kernel, a vector database, an embedding service). LM-Kit.NET ships the full pipeline in one SDK.

  • Hybrid retrieval: vector + BM25 with RRF
  • Built-in vector store and Qdrant connector
  • Semantic, Markdown, HTML, layout chunking
  • Multi-query, HyDE, query contextualization
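Hybrid retrieval merges the BM25 ranking and the vector ranking with Reciprocal Rank Fusion (RRF). The sketch below implements the standard RRF formula (score = Σ 1/(k + rank), commonly with k = 60); LM-Kit.NET's internal implementation is not documented here, so treat this as an illustration of the technique, not the SDK's code.

```python
from collections import defaultdict

def rrf(rankings, k: int = 60):
    """Fuse several ranked lists of doc ids via Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:                    # one ranked list per retriever
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # higher ranks contribute more
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits   = ["d3", "d1", "d7"]  # keyword-based ranking
vector_hits = ["d1", "d9", "d3"]  # embedding-based ranking
fused = rrf([bm25_hits, vector_hits])  # "d1" wins: ranked highly by both
```

The appeal of RRF is that it needs no score normalization: only ranks matter, so a lexical retriever and a dense retriever can be combined without calibrating their unrelated score scales.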

Document Intelligence

Foundry Local has no document processing. PDF parsing, OCR, and format conversion require external Azure services or third-party libraries. LM-Kit.NET handles all of this natively.

  • PDF text extraction, OCR, table detection
  • PDF/image to Markdown conversion
  • HTML, EML, DOCX processing
  • Document splitting and structured extraction

NLP and Text Analysis

Foundry Local has no dedicated NLP features. Developers must prompt the LLM directly for text analysis tasks. LM-Kit.NET provides purpose-built, high-accuracy NLP APIs.

  • NER with 102 entity types, PII detection
  • Sentiment analysis, emotion detection
  • Custom text and document classification
  • Language detection and translation
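For a sense of what the PII-detection task looks like, here is a toy rule-based scan. This is deliberately simplistic: LM-Kit.NET's detection is model-based NER over 102 entity types, and regexes like these are only an illustration of the task's input/output shape.

```python
import re

# Toy rule-based PII scan (illustration only; not LM-Kit.NET's
# model-based approach). Maps entity labels to naive patterns.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def detect_pii(text: str):
    """Return (label, matched_text) pairs for each PII hit."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits

found = detect_pii("Contact jane@example.com, SSN 123-45-6789.")
```

A purpose-built NER API replaces these brittle patterns with a model that also handles names, addresses, dates, and context-dependent entities that no regex can express.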

Model Ecosystem (60+ Models, GGUF)

Foundry Local is restricted to ONNX format with a small curated catalog. Custom models require conversion through Microsoft Olive. LM-Kit.NET uses the GGUF format, giving access to the broadest model ecosystem available.

  • 60+ curated models (Gemma 3, Qwen 3, Phi-4, Llama, etc.)
  • GGUF: thousands of community quantizations available
  • On-device fine-tuning (LoRA) and quantization
  • No format conversion required

Speech, Vision, and Fine-Tuning

Foundry Local supports Whisper transcription and Phi-3.5 vision, but has no on-device fine-tuning or text-to-speech. LM-Kit.NET covers all three areas within the same SDK.

  • Whisper speech-to-text (tiny to large-v3-turbo)
  • Vision language models (Qwen2-VL, Gemma3-VL)
  • On-device LoRA fine-tuning
  • Model quantization for deployment optimization

Feature by Feature

Detailed Comparison

A thorough, category-by-category comparison. We have marked features honestly, including where Foundry Local has the edge.

Feature | LM-Kit.NET | Foundry Local

Core Architecture
Product type | Complete AI platform | Inference runtime
Inference engine | llama.cpp (built-in) | ONNX Runtime GenAI
Model format | GGUF | ONNX only
In-process inference | Yes | Yes (C# SDK v0.8+)
OpenAI-compatible API | No (SK & MEAI bridges) | Yes (native)
Production status | Generally Available | Public Preview

Model Management
Curated model catalog | 60+ models | ~15 models
Auto model download | Yes | Yes
Hardware-adaptive variants | Manual selection | Auto-detection
Custom model support | Any GGUF model | ONNX via Olive conversion
Model cache management | Yes | Yes (CLI + SDK)

Hardware Acceleration
CUDA (NVIDIA GPU) | CUDA 12/13 | CUDA
Vulkan (cross-platform GPU) | Yes | No
Metal (Apple GPU) | Yes | Yes
DirectML (AMD/Intel GPU) | No | Yes
NPU (Qualcomm, Intel, AMD) | No | Yes (QNN, OpenVINO)
TensorRT (NVIDIA optimized) | No | Yes
AVX/AVX2 (CPU optimized) | Yes | Yes

Agent Orchestration
Agent framework | Built-in (4 patterns) | None
Tool / function calling | Rich catalog + custom | Basic (1 tool/request, Qwen only)
Multi-agent patterns | Pipeline, Parallel, Router, Supervisor | None
Agent memory | Yes | No
MCP protocol | Yes | No
Tool permission policies | Yes (category, risk, approval) | No

RAG & Retrieval
Built-in RAG pipeline | Yes | No
Vector store | Built-in + Qdrant connector | None (external required)
Hybrid retrieval (BM25 + vector) | Yes, with RRF | No
Document chunking strategies | Semantic, Markdown, HTML, layout | No
Embeddings generation | Built-in models | API exists; limited catalog models

Document & Vision
PDF processing | Built-in (pdfium) | No
OCR | Built-in (tesseract) | No
Vision language models | Qwen2-VL, Gemma3-VL | Phi-3.5-Vision
Document format conversion | PDF/HTML/EML/DOCX to Markdown | No

NLP & Text Analysis
Named entity recognition | 102 entity types | No
PII detection | Yes | No
Sentiment analysis | Yes | No
Text classification | Custom categories | No
Language detection / translation | Yes | No

Speech
Speech-to-text (Whisper) | tiny through large-v3-turbo | whisper-tiny, whisper-medium
Streaming transcription | Yes | Yes (C# SDK)

Model Customization
On-device fine-tuning | LoRA | No (requires Olive + cloud)
On-device quantization | Yes | Pre-quantized only (via Olive)
Grammar-constrained generation | JSON schema, GBNF | No

Platform & Licensing
Windows | x64 | x64, ARM
macOS | Universal (Apple Silicon) | Apple Silicon
Linux | x64, ARM64 | In development
Android | No | Private Preview
SDK languages | .NET (C#) | C#, Python, JS, Rust
License | Commercial (free trial) | Proprietary (free to use)
Microsoft.Extensions.AI | Yes (bridge) | Yes (via OpenAI compat)
Semantic Kernel integration | Yes (dedicated bridge) | Yes (via OpenAI connector)
Decision Guide

Which One Fits Your Project?

Honest guidance based on what each product actually delivers today.

Choose Foundry Local if...

Best for teams that need a lightweight inference runtime with NPU support and multi-language access.

  • You only need chat completions and basic inference
  • You target NPU hardware (Snapdragon, Intel NPU)
  • You need Python, JavaScript, or Rust SDKs
  • You want drop-in OpenAI API compatibility
  • You already use ONNX models in your pipeline
  • You are prototyping and preview status is acceptable

Choose LM-Kit.NET if...

Best for .NET teams building production AI applications that need more than just inference.

  • You need agents, RAG, or document processing
  • You require a production-ready, GA-status SDK
  • You want access to the broadest model ecosystem (GGUF)
  • You need NLP features like NER, PII, or sentiment analysis
  • You want on-device fine-tuning and quantization
  • You are building a complete AI-powered .NET application

More Than an Inference Runtime.

LM-Kit.NET gives you the inference engine and everything you need to build production AI applications: agents, RAG, documents, NLP, speech, and vision. One NuGet package, zero cloud dependencies.