LLM fine-tuning

Specialise any model on your data.

The LoraFinetuning engine trains task-specific LoRA adapters against a frozen base model, with per-tensor rank control, AdamW with cosine decay, gradient accumulation, automatic early stopping, checkpoint save/resume, and ShareGPT dataset export. The result is a complete fine-tuning loop that runs on your hardware, with no cloud uploads of your training data.

LoRA adapters · Per-tensor ranks · Checkpointing

LoraFinetuning

Trainer with iteration loop, progress events, checkpointing.

LoraTrainingParameters

25+ knobs: ranks, AdamW, cosine decay, gradient clipping, RoPE.

TrainingDataset

Build datasets from chat history, plain text, or ShareGPT.

ShareGptExporter

Export production conversations as a ShareGPT-format dataset.

Why LoRA?

A small adapter beats a big retrain.

Full fine-tuning of a 7B model means updating 14+ GB of weights, a multi-GPU setup, and days of training. LoRA (Low-Rank Adaptation) instead trains a pair of small rank-r matrices injected alongside each frozen weight matrix. The base model stays untouched. The resulting adapter is typically 10 to 100 MB and can be hot-swapped at inference time, mixed with other adapters, or merged back into the base permanently with LoraMerger.
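
In the notation of the original LoRA paper, each frozen weight matrix $W \in \mathbb{R}^{d \times k}$ is augmented with a trainable low-rank product (the SDK's LoraRank and LoraAlpha presumably correspond to $r$ and $\alpha$ here):

$W' = W + \tfrac{\alpha}{r}\,B A, \quad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)$

Only $A$ and $B$ receive gradients, which is why the adapter stays small and the base weights never change.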

Tiny adapters

A typical adapter is 10 to 100 MB. Versionable, distributable, swappable. The base model never moves.

Reasonable hardware

Train rank-16 LoRAs on consumer GPUs (16 to 24 GB VRAM) for 4B-7B models. No multi-node setup required.

Per-task specialisation

One base model, many adapters. Sentiment, code, legal, medical: load the right adapter per request.

Privacy by design

Training data never leaves the machine. Your customer interactions, internal docs, and proprietary logs stay where they belong.

Training loop

A full training run in about 40 lines.

FineTune.cs
using LMKit.Model;
using LMKit.Finetuning;

// 1. Load the base model (it stays frozen during training).
var model = new LM("path/to/base-model.gguf");

// 2. Configure training hyperparameters.
var parameters = new LoraTrainingParameters
{
    LoraRank             = 16,
    LoraAlpha            = 32,
    AdamAlpha            = 1e-4f,
    AdamBeta1            = 0.9f,
    AdamBeta2            = 0.999f,
    AdamDecay            = 0.01f,
    AdamGradientClipping = 1.0f,
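    // Accumulate gradients over 4 micro-batches before each optimizer step,
    // raising the effective batch size without extra VRAM.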
    GradientAccumulation = 4,
    CosineDecaySteps     = 2000,
    CosineDecayMin       = 0.1f,
    MaxNoImprovement     = 100,

    // Per-tensor rank control: spend more capacity where it matters.
    RankWQ = 16, RankWK = 16, RankWV = 16, RankWO = 8
};

// 3. Wire up the trainer with progress events.
var trainer = new LoraFinetuning(model, parameters)
{
    Iterations          = 2000,
    BatchSize           = 8,
    ContextSize         = 2048,
    UseGradientCheckpointing = true,
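    // Periodic checkpoint file, so an interrupted run can be resumed (checkpoint save/resume).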
    TrainingCheckpoint  = "checkpoints/run-2026-q1.bin"
};

trainer.FinetuningProgress += (s, e) =>
{
    Console.WriteLine($"Iter {e.Iteration}/{e.MaxIterations}  loss={e.Loss:F4}  lr={e.LearningRate:E2}");
};

// 4. Load training data from a ChatHistory, text file, or both.
int samples = trainer.LoadTrainingDataFromText("corpus/customer-support.jsonl");
Console.WriteLine($"{samples} training samples loaded");

// 5. Train. The output is a single .lora file.
trainer.Finetune2Lora("adapters/customer-support.lora");
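
With BatchSize = 8 and GradientAccumulation = 4, each optimizer step covers an effective batch of 32 samples. MaxNoImprovement = 100 ends the run early once the loss stops improving, and the resulting .lora file can be hot-swapped at inference time or merged into the base model with LoraMerger.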

Dataset tools

Build training data from production.

Most real fine-tuning failures stem from poor data, not poor hyperparameters. The SDK ships first-class tools for building, filtering, exporting, and versioning training datasets directly from running applications. A sketch of a typical workflow follows the list below.

LoadTrainingDataFromChatHistory

Convert a ChatHistory from a live MultiTurnConversation into training samples in one call. Capture the best customer interactions and feed them back as supervision.

LoadTrainingDataFromText

Load JSONL or plain-text corpora. Multiple overloads cover ShareGPT, Alpaca, and custom formats.

FilterSamplesBySize

Drop samples outside (minSize, maxSize) token bounds in a single pass. Common cleanup step before training.

ShareGptExporter

Export collected samples as a standard ShareGPT-format JSON file. Share with team members or version-control alongside your code.

SampleAvgLength / SampleMinLength / SampleMaxLength

Inspect the loaded dataset's distribution before training. Catch outliers that would skew your loss curve.

Sample manipulation

GetSample(int), RemoveSample(int), ClearTrainingData(), SaveTrainingData: full control over the loaded corpus.
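
Putting those tools together, the sketch below shows one plausible dataset-building pass before a run. The method and property names come from the list above; the exact signatures (the argument order of FilterSamplesBySize, the path overload of SaveTrainingData, the namespace of ChatHistory) are assumptions to verify against the API reference, and ShareGptExporter can then turn the collected samples into a ShareGPT JSON file for versioning.

BuildDataset.cs
using System;
using LMKit.Model;
using LMKit.Finetuning;
using LMKit.TextGeneration.Chat;   // assumed location of ChatHistory

static void BuildDataset(LM model, ChatHistory capturedConversation)
{
    var trainer = new LoraFinetuning(model, new LoraTrainingParameters());

    // Pull supervision straight from production, plus an offline JSONL corpus.
    trainer.LoadTrainingDataFromChatHistory(capturedConversation);
    trainer.LoadTrainingDataFromText("corpus/customer-support.jsonl");

    // Drop outliers that would not fit the training context window.
    trainer.FilterSamplesBySize(32, 2048);   // (minSize, maxSize) token bounds; order assumed

    // Inspect the distribution before committing GPU hours.
    Console.WriteLine($"avg={trainer.SampleAvgLength}  min={trainer.SampleMinLength}  max={trainer.SampleMaxLength}");

    // Persist the curated corpus so the run is reproducible and shareable.
    trainer.SaveTrainingData("datasets/customer-support-v3.dat");   // path overload assumed
}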

Hyperparameters

More than two dozen knobs, sane defaults.

Every important training parameter is exposed for advanced users; defaults are calibrated for common 4B to 13B fine-tunes.

Optimizer (AdamW)

AdamAlpha, AdamBeta1, AdamBeta2, AdamDecay, AdamDecayMinNDim, AdamGradientClipping.

Cosine schedule

CosineDecayMin, CosineDecayRestart, CosineDecaySteps. Anneal the learning rate cleanly with optional warm restarts; the standard formula is sketched after this list.

LoRA structure

LoraRank, LoraAlpha, plus per-tensor ranks RankWQ, RankWK, RankWV, RankWO, attention-norm and feed-forward ranks.

Gradient handling

GradientAccumulation for larger effective batches; UseGradientCheckpointing for memory savings on big models.

Early stopping

MaxNoImprovement: stop training after N iterations without loss improvement. No more babysitting runs.

RoPE control

RopeFreqBase, RopeFreqScale: tune positional encoding for long-context fine-tuning experiments.
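
For orientation, cosine decay anneals the learning rate along a half-cosine from the initial rate down to a floor, in the standard formulation below. How CosineDecayMin maps onto the floor (absolute rate versus fraction of AdamAlpha) and how CosineDecayRestart schedules warm restarts are assumptions to confirm in the LoraTrainingParameters reference.

$\eta(t) = \eta_{\min} + \tfrac{1}{2}\,(\eta_{\max} - \eta_{\min})\,\bigl(1 + \cos(\pi\, t / T)\bigr)$

where $\eta_{\max}$ is the initial learning rate (AdamAlpha), $T$ is CosineDecaySteps, and $t$ is the current step; with warm restarts, $t$ periodically resets to zero.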

Applications

Where local fine-tuning pays off.

Domain language

Adapt a base model to your domain's vocabulary: legal, medical, financial, scientific. Improve named-entity accuracy and instruction following without prompt engineering.

House style

Train an adapter that emits writing in your brand voice, with your editorial conventions, terminology, and tone.

Customer-facing chat

Fine-tune on your support transcripts for grounded, on-brand replies. Hot-swap a freshly trained adapter daily without redeploying the base model.

Tool-call accuracy

Train against a corpus of (intent, function-call) pairs so the model rarely hallucinates tool names or arguments. Pair with grammar-constrained decoding for ironclad output.

Compliance training

Train on your team's redaction policies, privacy rules, and disclosure templates. Run entirely on premises so the policies themselves never leak.

Multi-tenant SaaS

Train a per-customer adapter on their data. Load adapters dynamically per request so each tenant gets a model that knows their patterns.

Developer Resources

API reference.

LoraFinetuning

Main trainer. Configure iteration count, batch size, context window, checkpointing. Subscribe to FinetuningProgress for live metrics.

View documentation

LoraTrainingParameters

25+ tunable hyperparameters. AdamW, cosine decay, gradient clipping, per-tensor ranks, RoPE controls.

View documentation

TrainingDataset / TrainingSample

Container types for the loaded corpus. Inspect, filter, save, reload during a training run.

View documentation

ShareGptExporter

Convert collected ChatTrainingSamples into a ShareGPT-format JSON file for sharing or archival.

View documentation

After training, see LoRA Integration for runtime hot-swap and Model Quantization to compress for deployment.

Train where the data lives.

No cloud uploads. No GPU rental. Your data, your model, your hardware.

Download the Community Edition