Chat client
Streaming, multi-turn, function calling, structured output. Identical surface to the Microsoft abstractions.
The LM-Kit.NET.Integrations.ExtensionsAI package
implements IChatClient and
IEmbeddingGenerator<string, Embedding<float>>
on top of LM-Kit. If your codebase already speaks the
Microsoft.Extensions.AI abstractions, swap one line and inference
runs on a local model. Streaming works. Function calling works.
Tools work. Existing middleware (logging, retry, telemetry) keeps
working unchanged.
Drop-in IEmbeddingGenerator for any vector store. Pair with the LM-Kit catalogue's embedding models.
Existing chat-client middleware (logging, caching, function-invocation) works unchanged. Swap the implementation, keep the pipeline.
Microsoft.Extensions.AI is becoming the canonical surface for AI
interactions in .NET applications. Web hosts, background services,
tools and SDKs increasingly accept any IChatClient. A
local-first stack that ignores that abstraction forces every consumer
to write a second integration. The bridge package means LM-Kit speaks
the standard interface natively.
Code that already accepts IChatClient works as-is. Swap the constructor argument; inference now runs on-device.
Use LM-Kit for the local-first path and a cloud provider as a fallback under the same interface. Routing logic lives in the middleware, not in branch statements.
The Microsoft abstraction's tool-invocation flow maps to LM-Kit's [LMFunction] binding. Functions registered for the abstract client run on-device.
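On the abstraction side, a tool is declared once and handed to any `IChatClient`. The sketch below is illustrative, not taken from this page's samples: the `GetCurrentWeather` method and its return values are invented, and `AIFunctionFactory.Create` is the Microsoft.Extensions.AI helper that wraps an ordinary method as a tool the function-invocation middleware can call.

```csharp
using System.ComponentModel;
using Microsoft.Extensions.AI;

// Hypothetical tool method; name, signature, and values are illustrative.
[Description("Gets the current weather for a city.")]
static string GetCurrentWeather(string city) =>
    city == "Toulouse" ? "Sunny, 22 °C" : "Unknown";

// Wrap the method as an AITool; the function-invocation middleware
// invokes it when the model emits a matching tool call.
AITool getCurrentWeatherTool = AIFunctionFactory.Create(GetCurrentWeather);
```

Because the tool is declared against the abstraction, the same declaration works whether the client underneath is LM-Kit or a cloud provider.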
Token-by-token streaming via IAsyncEnumerable<StreamingChatCompletionUpdate>. Same async pattern your app already uses.
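As a consumption sketch (assuming `chat` is an `IChatClient` resolved from DI, and the preview-era method name that pairs with the `StreamingChatCompletionUpdate` type named above):

```csharp
// Stream tokens as they arrive; the same IAsyncEnumerable pattern
// used anywhere else in the app.
await foreach (StreamingChatCompletionUpdate update in
               chat.CompleteStreamingAsync("Summarize this release note."))
{
    Console.Write(update.Text);
}
```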
IEmbeddingGenerator implementation plugs into vector stores that consume the abstraction. RAG pipelines built on the standard interface stay portable.
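A minimal sketch of the embedding path, assuming `embedder` is the `IEmbeddingGenerator<string, Embedding<float>>` registered in DI (the input strings are placeholders):

```csharp
// Generate vectors for indexing; any vector store that consumes the
// abstraction can take these without knowing LM-Kit exists.
GeneratedEmbeddings<Embedding<float>> vectors =
    await embedder.GenerateAsync(["local-first inference", "on-device RAG"]);

foreach (Embedding<float> e in vectors)
    Console.WriteLine($"dimensions: {e.Vector.Length}");
```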
Register the LM-Kit chat client through IServiceCollection like any other implementation. No bespoke configuration surface.
Register an LM-Kit-backed IChatClient and IEmbeddingGenerator in DI with one line each.
```csharp
using Microsoft.Extensions.AI;
using LMKit.Integrations.ExtensionsAI;

var model = LM.LoadFromModelID("qwen3.5:4b");

// Register the LM-Kit-backed IChatClient.
builder.Services.AddSingleton<IChatClient>(_ => model.AsChatClient());

// Embeddings too.
var embedder = LM.LoadFromModelID("qwen3-embedding:0.6b");
builder.Services.AddSingleton<IEmbeddingGenerator<string, Embedding<float>>>(
    _ => embedder.AsEmbeddingGenerator());
```
Consume the chat client from any controller or service. No LM-Kit-specific code reaches the call site.
```csharp
// Existing controller / service / handler. No LM-Kit-specific code.
public class SupportController(IChatClient chat) : Controller
{
    [HttpPost("answer")]
    public async Task<IActionResult> AnswerAsync([FromBody] Question q)
    {
        var response = await chat.CompleteAsync(q.Text);
        return Ok(response.Message.Text);
    }
}
```
Stack UseFunctionInvocation and UseLogging middleware to run a standard function-calling pipeline on-device.
```csharp
// Standard function-calling pipeline. Tools run on-device.
var chat = model
    .AsChatClient()
    .AsBuilder()
    .UseFunctionInvocation()
    .UseLogging()
    .Build();

var response = await chat.CompleteAsync(
    "What is the weather in Toulouse?",
    new() { Tools = [getCurrentWeatherTool] });
```
An existing app builds against IChatClient with a cloud provider. Swap the registration; inference now runs on the box. Customers keep their data.
Route sensitive requests to the local LM-Kit client and bulk requests to a cloud provider. Both implement the same interface; routing is one decorator.
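One way that decorator might look, as a sketch: it assumes the preview-era `DelegatingChatClient` base class from Microsoft.Extensions.AI, and the `IsSensitive` predicate is a stand-in for whatever classification the app actually uses.

```csharp
using Microsoft.Extensions.AI;

// Illustrative routing decorator. The inner (base) client is the local
// LM-Kit one; cloud is the fallback. Both speak IChatClient, so the
// policy lives here rather than in branch statements at call sites.
public sealed class RoutingChatClient(IChatClient local, IChatClient cloud)
    : DelegatingChatClient(local)
{
    // Stand-in predicate; replace with a real sensitivity check.
    static bool IsSensitive(IList<ChatMessage> messages) =>
        messages.Any(m => m.Text?.Contains("confidential",
            StringComparison.OrdinalIgnoreCase) == true);

    public override Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default) =>
        IsSensitive(messages)
            ? base.CompleteAsync(messages, options, cancellationToken)  // local
            : cloud.CompleteAsync(messages, options, cancellationToken);
}
```

Register `RoutingChatClient` as the app's `IChatClient` and every consumer gets the policy for free.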
A NuGet that consumes IChatClient works with LM-Kit out of the box. No need for the library author to know about LM-Kit at all.
Logging, caching, retry middleware written against the abstraction works unchanged. The bridge participates as a regular client.
Vector stores that consume IEmbeddingGenerator get an on-device embedder for free. Local RAG without an API key.
Run end-to-end tests against the abstraction with the LM-Kit implementation in CI. No external API quota, no flaky network.
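An illustrative xUnit test along those lines (the model ID mirrors the setup sample above; the prompt and assertion are invented):

```csharp
using Microsoft.Extensions.AI;
using Xunit;

// The system under test only sees IChatClient, so CI can run it against
// the LM-Kit implementation: no API key, no quota, no network.
public class AnswerPipelineTests
{
    [Fact]
    public async Task Pipeline_returns_non_empty_answer()
    {
        var model = LM.LoadFromModelID("qwen3.5:4b");
        using IChatClient chat = model.AsChatClient();

        var response = await chat.CompleteAsync("Reply with the word ready.");

        Assert.False(string.IsNullOrWhiteSpace(response.Message.Text));
    }
}
```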
The other major .NET AI abstraction. Same idea: drop LM-Kit into existing pipelines.
When you need finer control than the abstraction provides, the native Tools API is one constructor away.
Pair the embedding generator with the multimodal embedding catalogue for image-and-text vector spaces.
When the abstraction is not enough for full-document workflows, the native RAG primitives offer source attribution and adaptive ingestion.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.