LoraFinetuning
Trainer with iteration loop, progress events, checkpointing.
The LoraFinetuning engine trains task-specific LoRA adapters
against a frozen base model. It provides per-tensor rank control, AdamW with
cosine decay, gradient accumulation, automatic early stopping, checkpoint
save/resume, and ShareGPT dataset export: a complete fine-tuning loop
that runs on your hardware, with no cloud uploads of your training data.
LoraFinetuning: Trainer with iteration loop, progress events, checkpointing.
LoraTrainingParameters: 25+ knobs: ranks, AdamW, cosine decay, gradient clipping, RoPE.
TrainingDataset: Build datasets from chat history, plain text, or ShareGPT.
ShareGptExporter: Export production conversations as a ShareGPT-format dataset.
Full fine-tuning of a 7B model needs 14+ GB of weights, a multi-GPU setup,
and days of training. LoRA (Low-Rank Adaptation) freezes those weights and
instead trains a pair of small rank-r matrices injected alongside each targeted
weight matrix; their product is the update that adapts the layer, as the sketch
below illustrates. The base model stays untouched. The resulting adapter is
typically 10 to 100 MB and can be hot-swapped at inference time, mixed with
other adapters, or merged back into the base permanently with LoraMerger.
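The mechanism is small enough to sketch in a few lines. The fragment below is a conceptual illustration in plain C#, not the LM-Kit API: the frozen weight W is never modified, and only the two rank-r factors A and B are trained; their scaled product is the adapter that ends up in the .lora file.

// Conceptual sketch only (plain C#, not the LM-Kit API).
// W is the frozen d x k weight; A (r x k) and B (d x r) are the trained factors.
// The adapted layer computes y = W x + (alpha / r) * B (A x).
static float[] LoraForward(float[,] W, float[,] B, float[,] A, float[] x, float alpha)
{
    int d = W.GetLength(0), k = W.GetLength(1), r = A.GetLength(0);
    float scale = alpha / r;

    // Frozen path: y = W x. No gradients flow into W during training.
    var y = new float[d];
    for (int i = 0; i < d; i++)
        for (int j = 0; j < k; j++)
            y[i] += W[i, j] * x[j];

    // Adapter path: y += scale * B (A x). A and B are the only trained tensors,
    // so the saved artifact is just these small factors per targeted weight.
    var ax = new float[r];
    for (int i = 0; i < r; i++)
        for (int j = 0; j < k; j++)
            ax[i] += A[i, j] * x[j];
    for (int i = 0; i < d; i++)
        for (int j = 0; j < r; j++)
            y[i] += scale * B[i, j] * ax[j];

    return y;
}

For a single 4096 × 4096 attention projection, rank 16 replaces roughly 16.8 million trainable values with about 131 thousand, which is why whole adapters fit in tens of megabytes.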
A typical adapter is 10 to 100 MB. Versionable, distributable, swappable. The base model never moves.
Train rank-16 LoRAs on consumer GPUs (16 to 24 GB VRAM) for 4B-7B models. No multi-node setup required.
One base model, many adapters. Sentiment, code, legal, medical: load the right adapter per request.
Training data never leaves the machine. Your customer interactions, internal docs, and proprietary logs stay where they belong.
using LMKit.Model;
using LMKit.Finetuning;

// 1. Load the base model (it stays frozen during training).
var model = new LM("path/to/base-model.gguf");

// 2. Configure training hyperparameters.
var parameters = new LoraTrainingParameters
{
    LoraRank = 16,
    LoraAlpha = 32,
    AdamAlpha = 1e-4f,
    AdamBeta1 = 0.9f,
    AdamBeta2 = 0.999f,
    AdamDecay = 0.01f,
    AdamGradientClipping = 1.0f,
    GradientAccumulation = 4,
    CosineDecaySteps = 2000,
    CosineDecayMin = 0.1f,
    MaxNoImprovement = 100,
    // Per-tensor rank control: spend more capacity where it matters.
    RankWQ = 16,
    RankWK = 16,
    RankWV = 16,
    RankWO = 8
};

// 3. Wire up the trainer with progress events.
var trainer = new LoraFinetuning(model, parameters)
{
    Iterations = 2000,
    BatchSize = 8,
    ContextSize = 2048,
    UseGradientCheckpointing = true,
    TrainingCheckpoint = "checkpoints/run-2026-q1.bin"
};

trainer.FinetuningProgress += (s, e) =>
{
    Console.WriteLine($"Iter {e.Iteration}/{e.MaxIterations} loss={e.Loss:F4} lr={e.LearningRate:E2}");
};

// 4. Load training data from a ChatHistory, text file, or both.
int samples = trainer.LoadTrainingDataFromText("corpus/customer-support.jsonl");
Console.WriteLine($"{samples} training samples loaded");

// 5. Train. The output is a single .lora file.
trainer.Finetune2Lora("adapters/customer-support.lora");
Most real fine-tuning failures stem from poor data, not poor hyperparameters. The SDK ships first-class tools for building, filtering, exporting, and versioning training datasets directly from running applications.
LoadTrainingDataFromChatHistory: Convert a ChatHistory from a live MultiTurnConversation into training samples in one call. Capture the best customer interactions and feed them back as supervision.
LoadTrainingDataFromText: Load JSONL or plain-text corpora. Multiple overloads cover ShareGPT, Alpaca, and custom formats.
FilterSamplesBySize: Drop samples outside (minSize, maxSize) token bounds in a single pass. Common cleanup step before training.
ShareGptExporter: Export collected samples as a standard ShareGPT-format JSON file. Share with team members or version-control alongside your code.
SampleAvgLength / SampleMinLength / SampleMaxLength: Inspect the loaded dataset's distribution before training. Catch outliers that would skew your loss curve.
GetSample(int), RemoveSample(int), ClearTrainingData(), SaveTrainingData: full control over the loaded corpus. A usage sketch follows this list.
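Put together, a data-prep pass might look like the sketch below, reusing the model and parameters from the quick-start above. Method signatures, overload choices, the token bounds, and the chatHistory variable are illustrative assumptions; the API reference has the exact shapes.

// Hedged sketch: signatures, overloads, paths, and bounds are illustrative.
var trainer = new LoraFinetuning(model, parameters);

// Mix sources: captured conversations plus a JSONL corpus.
trainer.LoadTrainingDataFromChatHistory(chatHistory);    // chatHistory: a ChatHistory from a MultiTurnConversation
int loaded = trainer.LoadTrainingDataFromText("corpus/support.jsonl");

// Inspect the distribution before committing GPU hours.
Console.WriteLine($"samples={loaded} avg={trainer.SampleAvgLength} " +
                  $"min={trainer.SampleMinLength} max={trainer.SampleMaxLength}");

// Drop outliers, then snapshot the cleaned corpus for versioning.
trainer.FilterSamplesBySize(32, 2048);                   // token bounds are illustrative
trainer.SaveTrainingData("datasets/support-v3.dat");     // path and format assumed for illustration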
Every important training parameter is exposed for advanced users; defaults are calibrated for common 4B to 13B fine-tunes. A configuration sketch follows the parameter list below.
AdamW optimizer: AdamAlpha, AdamBeta1, AdamBeta2, AdamDecay, AdamDecayMinNDim, AdamGradientClipping.
CosineDecayMin, CosineDecayRestart, CosineDecaySteps. Anneal learning rate cleanly with optional warm restarts.
LoraRank, LoraAlpha, plus per-tensor ranks RankWQ, RankWK, RankWV, RankWO, attention-norm and feed-forward ranks.
GradientAccumulation for larger effective batches; UseGradientCheckpointing for memory savings on big models.
MaxNoImprovement: stop training after N iterations without loss improvement. No more babysitting runs.
RopeFreqBase, RopeFreqScale: tune positional encoding for long-context fine-tuning experiments.
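As a rough sketch, the scheduler, early-stopping, and RoPE knobs that the quick-start above leaves mostly at their defaults could be combined as shown below. The values, and the exact type of CosineDecayRestart, are illustrative assumptions rather than recommended settings.

// Hedged sketch: values are illustrative, not recommended defaults.
var parameters = new LoraTrainingParameters
{
    AdamAlpha = 1e-4f,            // peak learning rate
    CosineDecaySteps = 2000,      // anneal over 2000 optimizer steps...
    CosineDecayMin = 0.1f,        // ...down to 10% of the peak rate
    CosineDecayRestart = 1.5f,    // assumption: cycle-length growth factor for warm restarts

    GradientAccumulation = 4,     // effective batch = BatchSize x 4 (32 samples at BatchSize = 8)
    MaxNoImprovement = 100,       // stop after 100 iterations without a better loss

    RopeFreqBase = 10000f,        // RoPE overrides for long-context experiments (illustrative)
    RopeFreqScale = 1.0f
};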
Adapt a base model to your domain's vocabulary: legal, medical, financial, scientific. Improve named-entity accuracy and instruction following without prompt engineering.
Train an adapter that emits writing in your brand voice, with your editorial conventions, terminology, and tone.
Fine-tune on your support transcripts for grounded, on-brand replies. Hot-swap a freshly-trained adapter daily without redeploying the base model.
Train against a corpus of (intent, function-call) pairs so the model rarely hallucinates tool names or arguments. Pair with grammar-constrained decoding for ironclad output.
Train on your team's redaction policies, privacy rules, and disclosure templates. Run entirely on premises so the policies themselves never leak.
Train a per-customer adapter on their data. Load adapters dynamically per request so each tenant gets a model that knows their patterns.
LoraFinetuning: Main trainer. Configure iteration count, batch size, context window, checkpointing. Subscribe to FinetuningProgress for live metrics.
LoraTrainingParameters: 25+ tunable hyperparameters. AdamW, cosine decay, gradient clipping, per-tensor ranks, RoPE controls.
TrainingDataset / TrainingSample: Container types for the loaded corpus. Inspect, filter, save, reload during a training run.
ShareGptExporter: Convert collected ChatTrainingSamples into a ShareGPT-format JSON file for sharing or archival.
After training, see LoRA Integration for runtime hot-swap and Model Quantization to compress for deployment.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.
How-to guide: Data prep, validation, and quality controls for LoRA training. Read the guide →
How-to guide: Bootstrap a training set from extraction outputs. Read the guide →
How-to guide: Hot-swap adapters at runtime without reloading the base model. Read the guide →
No cloud uploads. No GPU rental. Your data, your model, your hardware.