Chat service
Implements IChatCompletionService backed by an LM-Kit LM. Drop into any kernel.
The LM-Kit.NET.SemanticKernel package implements
Semantic Kernel's IChatCompletionService and memory
store on top of LM-Kit. Existing kernels, plugins, planners, prompt
functions, and memory connectors continue to work, with inference
running locally instead of against a hosted endpoint. Same kernel
builder, same plugin model, same prompt files.
SK memory backed by LM-Kit embeddings and the built-in vector store. Local recall, semantic queries.
Existing SK plugins, prompt functions, and planners run unchanged. The kernel does not know it is running locally.
Teams building on Semantic Kernel have invested in plugins, planners, and prompt-function libraries. Switching to a local stack should not mean rewriting that investment. The bridge implements the contracts Semantic Kernel expects: chat completion, embeddings, memory. Existing code keeps composing the same way; only the inference backend moves on-device.
IChatCompletionService: register it through the kernel builder. Plugins and planners that depend on it resolve to the local implementation.
SK memory uses LM-Kit embeddings and the built-in vector store under the hood. Queries hit local indexes; nothing leaves the box.
Existing .skprompt files and prompt configs work as written. The bridge consumes them through the same pipeline SK uses.
Action planners, sequential planners, function-calling planners run on the local model. Tool-call signatures stay the same.
Register multiple chat services with different IDs. Route per request: local for sensitive data, cloud for bulk traffic. Both inside the same kernel; a sketch follows this list.
SK's telemetry events emit as they always have. Inference runs locally; the rest of the observability story is untouched.
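A minimal sketch of that routing pattern. It assumes AddLMKitChatCompletion accepts a serviceId parameter, mirroring SK's other connector registrations; model, apiKey, and containsSensitiveData stand in for application state.

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using LMKit.Integrations.SemanticKernel;

// Assumption: AddLMKitChatCompletion takes a serviceId like other SK connectors.
// AddOpenAIChatCompletion comes from Microsoft.SemanticKernel.Connectors.OpenAI.
var kernel = Kernel.CreateBuilder()
    .AddLMKitChatCompletion(model, serviceId: "local")
    .AddOpenAIChatCompletion("gpt-4o-mini", apiKey, serviceId: "cloud")
    .Build();

// Route per request by resolving the chat service by ID.
var chat = containsSensitiveData
    ? kernel.GetRequiredService<IChatCompletionService>("local")
    : kernel.GetRequiredService<IChatCompletionService>("cloud");

var reply = await chat.GetChatMessageContentAsync("Summarize this contract clause.");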
Register LM-Kit chat and embedding services on an existing kernel, then import prompts and plugins as usual.
using Microsoft.SemanticKernel;
using LMKit.Integrations.SemanticKernel;

var model = LM.LoadFromModelID("qwen3.5:4b");
var embedder = LM.LoadFromModelID("embedding-model-id"); // substitute an embedding-capable model ID

var kernel = Kernel.CreateBuilder()
    .AddLMKitChatCompletion(model)              // IChatCompletionService
    .AddLMKitTextEmbeddingGeneration(embedder)  // embedding service
    .Build();

// Existing plugins. No changes.
kernel.ImportPluginFromType<CalendarPlugin>();
kernel.ImportPluginFromPromptDirectory(@"prompts/support");

// Invoke a prompt function. customerName and issueDescription come from the application.
var answer = await kernel.InvokeAsync("Support", "DraftReply", new()
{
    ["customer"] = customerName,
    ["issue"] = issueDescription
});
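The prompt directory above follows SK's standard layout: one subdirectory per function, each holding an skprompt.txt template and a config.json. A hypothetical DraftReply function might look like this (contents illustrative):

prompts/support/DraftReply/skprompt.txt:

Write a courteous support reply to {{$customer}} about the following issue:
{{$issue}}

prompts/support/DraftReply/config.json:

{
  "schema": 1,
  "description": "Drafts a reply to a customer support issue.",
  "input_variables": [
    { "name": "customer", "description": "Customer name", "is_required": true },
    { "name": "issue", "description": "Issue summary", "is_required": true }
  ]
}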
Run an SK function-calling planner against the local model, with tools routed through registered kernel plugins.
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Existing function-calling planner. Tools resolve through the kernel.
var settings = new OpenAIPromptExecutionSettings
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

var result = await kernel.InvokePromptAsync(
    "Schedule a 30-minute review with Loic for tomorrow afternoon.",
    new(settings));

// Inference runs on the local model. Tool calls invoke kernel plugins.
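CalendarPlugin is referenced but not defined on this page. A minimal sketch of what such a plugin could look like, using SK's standard [KernelFunction] attributes (names and parameters are illustrative):

using System.ComponentModel;
using Microsoft.SemanticKernel;

public class CalendarPlugin
{
    [KernelFunction, Description("Creates a calendar event.")]
    public string CreateEvent(
        [Description("Attendee name")] string attendee,
        [Description("Start time, ISO 8601")] string start,
        [Description("Duration in minutes")] int durationMinutes)
    {
        // A real implementation would call a calendar API.
        return $"Booked {durationMinutes} min with {attendee} at {start}.";
    }
}

The planner reads these signatures as tool schemas; with AutoInvokeKernelFunctions, SK calls the function and feeds the result back to the model.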
Back Semantic Kernel memory with LM-Kit embeddings and the built-in vector store, with no separate database.
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Memory;

// SK memory backed by LM-Kit embeddings and the built-in vector store.
var embeddingService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
var memory = new SemanticTextMemory(new LMKitMemoryStore(), embeddingService);

await memory.SaveInformationAsync(
    collection: "docs",
    text: "The Q3 launch slipped two weeks because of a shipping delay.",
    id: "q3-launch");

var hits = memory.SearchAsync("docs", "why did Q3 slip?", limit: 3);
await foreach (var hit in hits)
    Console.WriteLine(hit.Metadata.Text);
An application built on Semantic Kernel with a hosted backend switches to local inference for compliance or cost. Plugins, planners, and prompt files all keep working.
Register two chat services. Route by request sensitivity, by quota, by latency target. Same kernel handles both.
Pre-existing SK plugin libraries (calendar, mail, knowledge base) run locally the moment the kernel's chat service is the LM-Kit bridge.
Run end-to-end SK tests with a local model in CI. No quota, no network flakiness, no API key in pipelines. A sketch follows this list.
Deliver SK-based applications into environments without internet access. The bridge plus a packaged model is enough.
Replace cloud memory connectors with the LM-Kit memory store. Existing memory-backed prompts retrieve from local indexes.
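A sketch of that CI idea as an xUnit test, reusing the registration and prompt directory from the first example. The model ID and assertion are illustrative; a real suite would pin deterministic sampling settings.

using Microsoft.SemanticKernel;
using LMKit.Integrations.SemanticKernel;
using Xunit;

public class SupportPromptTests
{
    [Fact]
    public async Task DraftReply_MentionsCustomerName()
    {
        // Local model: no API key, no network dependency in the pipeline.
        var model = LM.LoadFromModelID("qwen3.5:4b");
        var kernel = Kernel.CreateBuilder()
            .AddLMKitChatCompletion(model)
            .Build();
        kernel.ImportPluginFromPromptDirectory(@"prompts/support");

        var answer = await kernel.InvokeAsync("Support", "DraftReply", new()
        {
            ["customer"] = "Ava",
            ["issue"] = "Login fails after password reset."
        });

        // Loose assertion; model output varies run to run.
        Assert.Contains("Ava", answer.ToString());
    }
}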
The other major .NET AI abstraction. Same idea, different surface. Pick the bridge that matches your existing codebase.
Beyond SK plugins, the native Tools API gives finer control over invocation, permissions, and streaming.
The LM-Kit vector store under the SK memory connector. Same primitive other LM-Kit RAG paths use.
For full-document workflows beyond text snippets, the native RAG primitives offer source attribution and adaptive ingestion.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.