Content Classification Engine

Custom Content Classification for .NET

Classify text, images, PDFs, and Office documents into any categories you define. Get the single best match or a ranked list with confidence scores. Add category descriptions for precision. Handle unknown inputs gracefully. Runs 100% on-device with Dynamic Sampling for maximum accuracy on any hardware.

Multimodal · Unlimited Categories · Confidence Scores · Batch Processing
Live Classification

Input: "My payment didn't go through and I've been charged twice on my credit card statement."

Classification Results
  • Billing Issue: 94.2%
  • Technical Support: 3.8%
  • Account Access: 1.4%
  • Feature Request: 0.6%

Unlimited Custom Categories
6+ File Formats
100% On-Device

Stop Routing Content Manually

Manual sorting wastes time, introduces errors, and breaks down as content volume grows. Automate classification directly inside your .NET applications.

Without LM-Kit
  • Manual tagging that does not scale with volume
  • Rigid keyword rules that miss intent and context
  • Cloud API costs and latency that add up fast
  • Sending sensitive data to third-party servers
  • Separate pipelines for text, images, and documents
With LM-Kit
  • AI-powered classification that understands meaning
  • Define unlimited custom categories in a single call
  • Zero API costs with 100% on-device processing
  • Your data never leaves the machine
  • One API for text, images, PDFs, DOCX, XLSX, and PPTX

Everything You Need to Classify Content

The Categorization engine delivers production-grade classification with a clean, flexible API designed for real-world workloads.

Multimodal Input

Classify plain text, images, PDFs, DOCX, XLSX, PPTX, and HTML through a single unified API. Automatic OCR extracts content from scanned documents and screenshots.

Confidence Scoring

Every classification returns a Confidence score so you can set thresholds, escalate uncertain cases to human review, and audit decisions at scale.
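
As an illustration of the thresholding idea, here is a minimal sketch of routing on a confidence value. It does not call LM-Kit; the `Decide` helper and both threshold values are hypothetical, and the confidence is assumed to be a fraction between 0 and 1.

```csharp
using System;

// Hypothetical thresholds; tune them against your own labeled data.
const double AutoRouteThreshold = 0.85;
const double ReviewThreshold = 0.50;

// Decide what to do with one classification result.
// `confidence` is assumed to be a value between 0 and 1.
string Decide(string category, double confidence)
{
    if (confidence >= AutoRouteThreshold) return $"auto-route:{category}";
    if (confidence >= ReviewThreshold) return $"human-review:{category}";
    return "unclassified";
}

Console.WriteLine(Decide("Billing Issue", 0.942));    // auto-route:Billing Issue
Console.WriteLine(Decide("Technical Support", 0.61)); // human-review:Technical Support
Console.WriteLine(Decide("Feature Request", 0.20));   // unclassified
```

Two thresholds rather than one give you a middle band for human review instead of forcing a binary accept or reject.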

Dynamic Sampling

LM-Kit's innovative sampling technology delivers up to 75% error reduction and 2x faster processing, even with smaller models on CPU-only hardware.

Embedding Classifier Mode

Enable UseEmbeddingClassifier for ultra-fast, lightweight classification using vector similarity instead of generative inference. Ideal for high-throughput pipelines.
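
The general idea behind similarity-based classification can be sketched as follows. This is not LM-Kit's implementation: the `Cosine` and `NearestCategory` helpers are hypothetical, and the three-dimensional vectors are toy values standing in for real embeddings.

```csharp
using System;

// Cosine similarity between two vectors of equal length.
double Cosine(double[] a, double[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Return the index of the category vector most similar to the query vector.
int NearestCategory(double[] q, double[][] candidates)
{
    int best = 0;
    for (int i = 1; i < candidates.Length; i++)
        if (Cosine(q, candidates[i]) > Cosine(q, candidates[best]))
            best = i;
    return best;
}

// Toy "embeddings", invented for illustration only.
var names = new[] { "Billing Issue", "Technical Support" };
var vectors = new[]
{
    new[] { 0.9, 0.1, 0.0 },
    new[] { 0.1, 0.9, 0.2 },
};
var query = new[] { 0.8, 0.2, 0.1 };

Console.WriteLine(names[NearestCategory(query, vectors)]); // Billing Issue
```

Because each comparison is a handful of multiplications rather than a generative pass, this style of classifier suits high-throughput pipelines.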

Custom Guidance

The Guidance property lets you inject domain-specific instructions to steer classification behavior, fine-tuning how categories are interpreted without retraining.

Unknown Category Handling

When AllowUnknownCategory is enabled, classification returns -1 if the content does not match any predefined category, preventing forced misclassification of out-of-scope inputs.
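
Since -1 is not a valid list index, calling code needs a guard before looking up the category name. A minimal sketch, using a hypothetical `LabelFor` helper that is not part of LM-Kit:

```csharp
using System;
using System.Collections.Generic;

// Map a category index to a label, treating -1 (or any out-of-range index) as "Unknown".
string LabelFor(IReadOnlyList<string> categoryNames, int index)
{
    return index >= 0 && index < categoryNames.Count
        ? categoryNames[index]
        : "Unknown";
}

var labels = new List<string> { "Billing Issue", "Technical Support" };
Console.WriteLine(LabelFor(labels, 0));  // Billing Issue
Console.WriteLine(LabelFor(labels, -1)); // Unknown
```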


Single-Category Classification

Use GetBestCategory to assign content to the single most relevant category from your predefined list. Provide optional category descriptions to improve accuracy when category names are ambiguous. Access the Confidence property after each call to verify classification reliability.

  • Pass categories as a simple list of strings
  • Optional descriptions list for each category improves precision
  • Returns index of the winning category (-1 if unknown)
  • Async overload: GetBestCategoryAsync for non-blocking workflows
  • Works with text, Attachment, and ImageBuffer inputs
GetBestCategory.cs
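using System;
using System.Collections.Generic;
// Namespaces assumed from the LM-Kit.NET API reference;
// verify against your installed version.
using LMKit.Model;
using LMKit.TextAnalysis;

var model = LM.LoadFromModelID(
    "lmkit-tasks:4b-preview");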

var classifier = new Categorization(model);

// Define categories with descriptions
var categories = new List<string> {
    "Billing Issue",
    "Technical Support",
    "Feature Request",
    "General Feedback"
};

var descriptions = new List<string> {
    "Payment problems, refunds, charges",
    "Bugs, errors, troubleshooting",
    "New features or improvements",
    "General comments or suggestions"
};

int best = classifier.GetBestCategory(
    categories, descriptions,
    "My card was charged twice");

Console.WriteLine(
    $"Category: {categories[best]}");
Console.WriteLine(
    $"Confidence: {classifier.Confidence:P1}");
// Example output (percent formatting is culture-dependent):
// Category: Billing Issue
// Confidence: 94.2%

Multi-Category Classification

Use GetTopCategories to retrieve a ranked list of the most relevant categories, each with its confidence score. Perfect for content that spans multiple topics, for building recommendation systems, or for routing content to multiple downstream processes simultaneously.

  • Specify maximum number of categories to return
  • Results ranked by confidence from highest to lowest
  • Combine with AllowUnknownCategory for safety filtering
  • Async overload: GetTopCategoriesAsync
  • Apply to text, images, and document attachments
GetTopCategories.cs
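using System;
using System.Collections.Generic;
// Namespaces assumed from the LM-Kit.NET API reference;
// verify against your installed version.
using LMKit.Model;
using LMKit.TextAnalysis;

var model = LM.LoadFromModelID(
    "lmkit-tasks:4b-preview");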

var classifier = new Categorization(model)
{
    AllowUnknownCategory = true
};

var categories = new List<string> {
    "Technology", "Business",
    "Science", "Politics",
    "Entertainment", "Sports"
};

string article = "Apple unveils new AI chip " +
    "that could reshape the industry.";

// Get top 3 matching categories
var results = classifier.GetTopCategories(
    categories, article, 3);

foreach (var r in results)
{
    Console.WriteLine(
        $"{categories[r.Index]}: {r.Confidence:P1}");
}
// Technology: 88.1%
// Business: 9.3%
// Science: 2.6%
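
To route content to several downstream processes, you would typically keep every ranked result above a minimum confidence rather than only the first. A minimal sketch of that filtering step, independent of LM-Kit: the `SelectLabels` helper is hypothetical, and the tuples mimic the `Index` and `Confidence` members used in the loop above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Keep every ranked result at or above a minimum confidence,
// skipping unknown (-1) or out-of-range indexes.
List<string> SelectLabels(
    IReadOnlyList<string> names,
    IEnumerable<(int Index, double Confidence)> ranked,
    double minConfidence)
{
    return ranked
        .Where(r => r.Index >= 0 && r.Index < names.Count
                    && r.Confidence >= minConfidence)
        .Select(r => names[r.Index])
        .ToList();
}

var topics = new List<string> { "Technology", "Business", "Science" };
// Stand-in scores copied from the example output above.
var scores = new[] { (0, 0.881), (1, 0.093), (2, 0.026) };

Console.WriteLine(string.Join(", ", SelectLabels(topics, scores, 0.05)));
// Technology, Business
```
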

Classify Any Content Type

The Categorization engine accepts text, images, and full document files through the Attachment class. Pass PDFs, Word documents, Excel spreadsheets, PowerPoint presentations, HTML files, and images. LM-Kit automatically extracts text and visual content, then classifies against your categories. Use a vision-capable model for image inputs.

  • PreferredInferenceModality controls text, image, or multimodal processing
  • Automatic OCR for scanned documents and screenshots
  • MaxInputTokens controls how much content is analyzed
  • ImageBuffer overload for in-memory image classification
  • Combine with batch processing for high-volume document pipelines
Plain Text · PNG / JPG · PDF · DOCX · XLSX · PPTX · HTML
MultimodalClassification.cs
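using System;
using System.Collections.Generic;
// Namespaces assumed from the LM-Kit.NET API reference;
// verify against your installed version.
using LMKit.Data;
using LMKit.Model;
using LMKit.TextAnalysis;

var model = LM.LoadFromModelID(
    "lmkit-tasks:4b-preview");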

var classifier = new Categorization(model);

var docTypes = new List<string> {
    "Invoice", "Contract",
    "Resume", "Report"
};

// Classify a PDF document
var pdf = new Attachment("document.pdf");
int pdfResult = classifier.GetBestCategory(
    docTypes, pdf);

Console.WriteLine(
    $"PDF: {docTypes[pdfResult]}");

// Classify an image (vision model)
var image = new Attachment("scan.png");
int imgResult = classifier.GetBestCategory(
    docTypes, image);

// Classify a Word document
var docx = new Attachment("report.docx");
int docResult = await classifier
    .GetBestCategoryAsync(docTypes, docx);

// Confidence reflects the most recent call (the DOCX here)
Console.WriteLine(
    $"Confidence: {classifier.Confidence:P1}");
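
For high-volume pipelines, the same per-file call can be wrapped in a loop that tallies results per category. The sketch below is one possible shape, not an LM-Kit API: `Tally` and `FakeClassify` are hypothetical, and the `classify` delegate stands in for whatever call you use to obtain a category index (or -1) for a file.

```csharp
using System;
using System.Collections.Generic;

// Classify each path and count documents per label.
// `classify` returns a category index, or -1 when nothing matches.
Dictionary<string, int> Tally(
    IEnumerable<string> paths,
    IReadOnlyList<string> names,
    Func<string, int> classify)
{
    var counts = new Dictionary<string, int>();
    foreach (var path in paths)
    {
        int index = classify(path);
        string label = index >= 0 && index < names.Count
            ? names[index]
            : "Unknown";
        counts.TryGetValue(label, out int current);
        counts[label] = current + 1;
    }
    return counts;
}

var docLabels = new List<string> { "Invoice", "Contract", "Resume", "Report" };
var files = new[] { "a-invoice.pdf", "b-contract.docx", "c-notes.txt" };

// Fake classifier for the demo: matches on the file name only.
int FakeClassify(string fileName) =>
    fileName.Contains("invoice") ? 0 : fileName.Contains("contract") ? 1 : -1;

foreach (var pair in Tally(files, docLabels, FakeClassify))
    Console.WriteLine($"{pair.Key}: {pair.Value}");
```

Passing the classifier in as a delegate keeps the batching logic testable without loading a model.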

Up and Running in 4 Steps

From NuGet install to production classification in minutes. No cloud accounts, API keys, or external services required.

1. Install LM-Kit.NET

Add the LM-Kit.NET NuGet package to your C# or VB.NET project. Zero external dependencies.

2. Load a Model

Call LM.LoadFromModelID to download and load a task-optimized model like lmkit-tasks:4b-preview.

3. Define Categories

Create a list of category names, optionally paired with descriptions for higher accuracy.

4. Classify Content

Call GetBestCategory or GetTopCategories with your content. Check the Confidence score for reliability.

Real-World Classification Use Cases

Organizations across industries use LM-Kit content classification to automate workflows, improve routing, and extract structured labels from unstructured content.

Support Ticket Routing

Automatically classify incoming tickets by topic, priority, and product area. Route to the right team instantly, cutting first-response time dramatically.

Document Classification

Sort invoices, contracts, reports, and correspondence. Classify PDFs and scans into document types for intelligent document processing pipelines.

Content Moderation

Classify user-generated content by topic, flag policy violations, and detect spam. Process text and images through one unified pipeline.

News & Media Categorization

Tag articles by topic, region, and relevance. Build automated news feeds, media monitoring dashboards, and content recommendation engines.

Email Triage

Classify incoming emails by intent, urgency, and department. Automate responses for common categories and escalate high-priority items.

Compliance Screening

Classify documents and communications against regulatory categories. Identify sensitive content types and route for review before release.

LM-Kit vs. Cloud Classification APIs

See how LM-Kit's on-device approach compares to cloud-based classification services.

Feature                           | LM-Kit.NET                 | Cloud APIs
Custom categories                 | ✓ Unlimited, no retraining | Often requires retraining or fine-tuning
Data privacy                      | ✓ 100% on-device           | Data sent to third-party servers
Per-request cost                  | ✓ Zero                     | Per-token or per-request pricing
Offline support                   | ✓ Full offline             | ✗ Requires internet
Multimodal (text + images + docs) | ✓ Native                   | Varies by provider
Embedding classifier mode         | ✓ Built-in                 | ✗ Not available
Latency                           | ✓ Milliseconds (local)     | Network-dependent

API Reference & Demos

Complete documentation, code samples, and live demos for the Categorization engine.

  • Categorization (Core Class): The main classification engine. Define categories, classify content, read confidence scores.
  • GetBestCategory (Single-Category): Classify text, images, or documents into the single most relevant category from your list.
  • GetTopCategories (Multi-Category): Get a ranked list of the top matching categories with individual confidence scores.
  • Custom Classification (CLI demo): Console sample showing how to define custom categories and classify text with confidence scoring.
  • Document Classification (CLI demo): Classify entire documents (PDF, DOCX) into categories using the Attachment input method.
  • Batch Classification (CLI demo): Process multiple documents in batch, classifying each against your predefined category set.

Ready to Classify Your Content?

Unlimited custom categories. Multimodal input. Confidence scoring. 100% on-device. Start building intelligent .NET applications today.