U2-Net
Salient-object segmentation. Strong on general subjects.
Strip backgrounds with U2-Net or ModNet. Deskew scanned pages, crop to bounding boxes, resize for downstream models, measure skew angles. Each preprocessing step is a class and a built-in agent tool, runnable from code or from a function-calling agent.
Portrait matting. Hair-level edges, fast.
Built-in agent tools wrap each step.
U2-Net
Identifies and isolates the foreground subject. Good general-purpose model for products, objects, animals, and complex scenes.
ModNet
Specialised for people. Hair-level edge accuracy, fast enough for real-time webcam pipelines and video conferencing.
ONNX
Both engines run via the ONNX backend with CUDA, DirectML, or CPU. Same accelerator stack as the rest of LM-Kit.NET.
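Both the classes and the agent tools sit over these same engines, so swapping U2-Net for ModNet is an engine-selection change, not a pipeline rewrite. The sketch below shows the rough shape of a background-removal call only: `BackgroundRemover`, `SegmentationModel`, and `RemoveBackground` are illustrative placeholder names, not the verified LM-Kit.NET API (only `ImageBuffer` is a type named on this page) — consult the API reference for the exact signatures.

```
// Sketch only: BackgroundRemover, SegmentationModel and RemoveBackground
// are hypothetical names standing in for the real API surface.
// The point is the call shape: load, pick an engine, strip, save.
var image  = ImageBuffer.Load("product.jpg");
var engine = new BackgroundRemover(SegmentationModel.U2Net);  // SegmentationModel.ModNet for portraits

var cutout = engine.RemoveBackground(image);  // subject kept, background made transparent
cutout.Save("product-cutout.png");
```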
These preprocessing operations ship as both .NET classes (call from code) and built-in agent tools (call from a function-calling agent). Pair them with OCR, VLMs, or any vision pipeline.
image_deskew
Detect and correct rotation in scanned pages. Critical for downstream OCR accuracy on phone-scanned documents.
image_measure_skew
Compute the skew angle without rotating. Useful when you want to flag oblique scans without modifying them.
image_crop & image_resize_box
Region-of-interest extraction by pixel coordinates or by detected region. Feed only the relevant patch to a VLM.
image_resize
Aspect-aware resizing with quality interpolation. Standardise inputs before feeding downstream models.
image_info
Inspect resolution, color space, EXIF, MIME. Use it as the first step in an agent-driven pipeline.
ocr_recognize
Run OCR as part of a preprocessing pipeline. Routes to the configured OCR engine (LMKit OCR, PaddleOCR-VL, GLM-OCR).
Remove backgrounds at upload time. Consistent catalog look without manual editing.
Real-time portrait matting on the user's machine. No cloud relay, no privacy compromise.
Deskew phone-scanned pages, crop to the page boundary, resize to the OCR engine's preferred resolution. Single agent loop chains the tools.
Background-strip a photo, redact a region, blur a face. Every operation runs on the device that holds the original.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.
Deskew, crop, resize, denoise. Built-in tools wrap each step.
Read the guide →
How-to guide
Catalog of agent-callable tools, including image preprocessing.
Read the guide →
API reference
Drawing primitives, canvas, brush, pen, and the ImageBuffer type.
Open the reference →
The seven pillars of LM-Kit.NET, plus the local runtime they share. The highlighted card is where you are now.
01 · AI Agents
ReAct planning, supervisors, parallel and pipeline orchestrators, persistent memory, MCP clients, custom tools.
02 · Document Intelligence
PDF text and table extraction, on-device OCR reaching SOTA benchmark scores, structured field extraction with grammar-constrained generation.
03 · Vision & Multimodal
Image understanding, classification, labeling, multimodal chat, image embeddings, VLM-OCR, background removal. Same conversation surface as LLMs.
04 · RAG & Knowledge
Built-in vector store, Qdrant connector, embeddings, hybrid retrieval, document chunking, source citations.
05 · Text Analysis
Built-in classifiers and an extractor that emits typed C# objects via grammar-constrained sampling. Sentiment, keywords, language detection.
06 · Speech & Audio
A growing local speech-to-text stack: hallucination suppression, Voice Activity Detection, real-time translation, streaming output, 100+ languages.
07 · Text Generation
Single-turn, multi-turn, and stateless conversation primitives. Translate, correct, rewrite, summarise. Prompt templates, streaming, grammar-constrained outputs.
The foundation
Every capability above runs on this runtime.
Foundation
The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR and classifiers, accelerated on CPU (AVX2), CUDA 12/13, Vulkan, or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.