Pre-bundled
Models ship inside the installer. First run never reaches for the network. The catalogue points at a local path.
Cloud AI assumes connectivity. Real workloads often do not. A factory floor with policy-blocked outbound traffic. A research vessel three weeks from port. A hospital with strict data-sovereignty rules. A mobile field engineer in a basement. A defence cleanroom that is air-gapped by construction. LM-Kit ships to all of them with the same .NET SDK: pre-bundle the model, sign the binary, optionally AOT-compile, and run.
No telemetry, no licence callbacks, no model auto-download in restricted modes. Verifiable via firewall rules.
Windows, Linux, macOS, MAUI mobile (iOS, Android). Same NuGet, same code, same model.
Cloud AI is the easy default until the deployment target is not on the internet. Edge devices, regulated environments, mobile installs, and offline tools all share the same constraint: the AI has to ship inside the binary. LM-Kit treats that as the primary case, not a special path.
For workloads where data is classified or regulated, the safest model is the one that never leaves the box. No outbound calls, no third-party processing, no transit risk.
No round-trip. Inference latency is bounded by hardware, not by network. Critical for interactive UIs and voice assistants.
A field engineer in a basement, a researcher on a remote vessel, a kiosk in a low-connectivity store. The application has to work today, not when the link comes back.
A burst of inference at month-end does not hit a paywall. Hardware capacity is the cap, not a vendor's billing dial.
Hardware amortises across years. Per-call cost approaches zero. Scaling means more units, not larger bills.
"Your data never leaves your machine" is a sales line cloud AI cannot make. It closes deals in regulated industries.
The same SDK serves every shape. Same code, different packaging.
Server
A 2U server in a customer rack. Runs the same NuGet as your cloud edition. Models pre-staged on the local SSD; no external dependencies.
Workstation
Developer or analyst workstations with one or two consumer GPUs. Local agents, offline document analysis, data-sensitive workflows.
Kiosk
Touchscreen units in stores, lobbies, factories. Locked-down OS, no inbound network. Pre-bundled model, signed binary.
Mobile
Cross-platform mobile apps with on-device inference. Smaller models, lower-precision quantisation, AOT compilation, encrypted artefacts.
Cleanroom
Air-gapped, verifiable, no outbound telemetry. Encrypted models, signed binaries, hardware-rooted key storage.
Embedded
Linux ARM64 boxes, robotics controllers, instrument panels. Smaller models, AOT native binaries, fixed firmware images.
Point the catalogue at a bundled directory and disable downloads so the runtime never touches the network.
using LMKit.Global;
using LMKit.Model;

// Point the catalogue at the bundled model directory before any load call.
Configuration.ModelStorageDirectory = Path.Combine(
    AppContext.BaseDirectory, "models");

// Disable any auto-download path. In offline mode, the model must be present.
Configuration.AllowModelDownload = false;

// Load by ID. Resolves from the bundled directory; never reaches the network.
var model = LM.LoadFromModelID("qwen3.5:2b");
Ship an encrypted model file with the app and resolve the key from the OS keychain at first launch.
// Edge install with an encrypted model in the bundle.
// Key resolved from OS keychain on first run, cached for the session.
var password = OsKeychain.GetOrPrompt("lmk-model-key");

var model = LM.LoadEncrypted(
    path: Path.Combine(AppContext.BaseDirectory, "models/release.lmk-enc"),
    password: password);
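OsKeychain.GetOrPrompt above is a placeholder for your own key-retrieval helper, not an SDK call. A minimal sketch of one way to back it on Windows, assuming DPAPI via the System.Security.Cryptography.ProtectedData package; macOS or Linux targets would substitute the Keychain or libsecret.

using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper behind OsKeychain.GetOrPrompt. Windows-only sketch:
// DPAPI keeps the key encrypted at rest, scoped to the installing user.
static class OsKeychain
{
    static string StorePath(string name) =>
        Path.Combine(AppContext.BaseDirectory, name + ".bin");

    public static string GetOrPrompt(string name)
    {
        var path = StorePath(name);

        if (File.Exists(path))
        {
            // Subsequent runs: unprotect the cached key material.
            var plain = ProtectedData.Unprotect(
                File.ReadAllBytes(path), null, DataProtectionScope.CurrentUser);
            return Encoding.UTF8.GetString(plain);
        }

        // First run: ask the operator once, then cache the DPAPI-protected key.
        Console.Write($"Key for '{name}': ");
        var secret = Console.ReadLine()
            ?? throw new InvalidOperationException("No key supplied.");

        File.WriteAllBytes(path, ProtectedData.Protect(
            Encoding.UTF8.GetBytes(secret), null, DataProtectionScope.CurrentUser));

        return secret;
    }
}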
Resolve the bundled model from the platform-specific app-data directory in a standard MAUI CreateMauiApp entry point.
// MAUI startup. The same builder you would use on the desktop.
public static MauiApp CreateMauiApp()
{
    var builder = MauiApp.CreateBuilder();

    builder.Services.AddSingleton(_ =>
    {
        // Resolve the bundled model from the platform-specific app-data directory.
        var path = Path.Combine(FileSystem.AppDataDirectory, "models/qwen3-0.6b.lmk");
        return LM.LoadFromFile(path);
    });

    builder.Services.AddSingleton<IChatService, LocalChatService>();

    return builder.Build();
}
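The builder above assumes the model file already sits under FileSystem.AppDataDirectory. A minimal sketch of one way to put it there on first launch, assuming the model ships as a raw asset inside the app package; the helper name and asset name are illustrative, not part of the SDK.

// First-launch copy of the bundled model from the app package into app data.
// Assumes the file is packaged as a MauiAsset / raw resource.
static async Task<string> EnsureModelOnDiskAsync()
{
    var target = Path.Combine(FileSystem.AppDataDirectory, "models/qwen3-0.6b.lmk");

    if (!File.Exists(target))
    {
        Directory.CreateDirectory(Path.GetDirectoryName(target)!);

        // OpenAppPackageFileAsync streams files shipped inside the app package.
        using var source = await FileSystem.OpenAppPackageFileAsync("qwen3-0.6b.lmk");
        using var destination = File.Create(target);
        await source.CopyToAsync(destination);
    }

    return target;
}

Call it once before the first LM.LoadFromFile; later launches reuse the cached copy.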
Air-gapped networks, classified data, no outbound telemetry. Encrypted models, signed binaries, audit trails.
Patient data confined to the clinic. AI-assisted documentation, summarisation, and structured extraction without HIPAA exposure paths.
Rigs, refineries, remote sites with intermittent connectivity. Local copilots that never block on the link.
Ships, aircraft, rail. Long stretches without internet. AI assistants for crew, technicians, and operators that work offline.
Store kiosks, hotel concierge tablets, in-store navigation. Pre-bundled, fixed-firmware deployments.
Inspectors, technicians, surveyors. iOS and Android apps with local AI built on MAUI; works in basements, tunnels, and rural sites.
A short list of things that matter when the deployment target is not in your data centre.
Smaller models for mobile and embedded, larger models for workstations. The catalogue exposes parameter count, file size, and required precision per variant; a variant-selection sketch follows this list.
Q4 or Q5 for laptops and phones; Q6 or Q8 for workstations. The Quantizer trims file size and VRAM with minimal quality loss.
Set Configuration.ModelStorageDirectory to a folder inside the installer. Disable network model loading. Verify with a firewall rule.
Use LM.LoadEncrypted for fine-tuned or commercial models. Stream-decrypt at runtime; never write plaintext to disk.
.NET 8+ AOT works with LM-Kit on most targets. Smaller binaries, faster startup, no JIT.
Code-signed binaries, SHA-256 model checksums, firmware-signed installers. Tamper-evidence is part of the threat model in many edge deployments; a checksum sketch follows this list.
For proprietary or restricted models on customer hardware. Stream decryption, no plaintext on disk.
Trim file size and VRAM for the device. Same model, lower precision, smaller install.
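A minimal sketch of the first two checklist items, picking a bundled variant per device class before loading it. LM.LoadFromFile comes from the MAUI example above; the DeviceClass enum, the variant file names, the quantisation levels, and the LM return type are illustrative assumptions, not catalogue entries.

// Pick a bundled model variant per device class, then load it locally.
// File names and precision choices are illustrative placeholders.
enum DeviceClass { Mobile, Laptop, Workstation }

static LM LoadBundledVariant(DeviceClass device)
{
    var fileName = device switch
    {
        // Smaller model at Q4 for phones and embedded boards.
        DeviceClass.Mobile      => "qwen3-0.6b-q4.lmk",
        // Mid-size model at Q5 for laptops with limited VRAM.
        DeviceClass.Laptop      => "qwen3.5-2b-q5.lmk",
        // Larger model at Q8 when a workstation GPU is available.
        DeviceClass.Workstation => "qwen3.5-2b-q8.lmk",
        _ => throw new ArgumentOutOfRangeException(nameof(device)),
    };

    var path = Path.Combine(AppContext.BaseDirectory, "models", fileName);
    return LM.LoadFromFile(path);
}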
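For the integrity item, a minimal sketch of verifying a bundled model against a SHA-256 digest recorded at build time, using only the .NET base class library; the helper name and the convention of shipping the expected hash with the installer are assumptions, not SDK features.

using System;
using System.IO;
using System.Security.Cryptography;

// Verify the bundled model against a SHA-256 digest recorded at build time.
// A mismatch means the artefact was corrupted or tampered with; refuse to load.
static void VerifyModelChecksum(string modelPath, string expectedSha256Hex)
{
    using var stream = File.OpenRead(modelPath);
    var actual = Convert.ToHexString(SHA256.HashData(stream));

    if (!actual.Equals(expectedSha256Hex, StringComparison.OrdinalIgnoreCase))
        throw new InvalidOperationException($"Checksum mismatch for {modelPath}.");
}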
Why local AI is the only viable option for many regulated workloads.
The economic story behind moving inference off the API and onto the box.