
Background removal and the full preprocessing toolkit.

Strip backgrounds with U2-Net or ModNet. Deskew scanned pages, measure skew angles, crop to bounding boxes, and resize for downstream models. Each preprocessing step is available both as a class and as a built-in agent tool, so it can be called from code or from a function-calling agent.

U2-Net · ModNet · Deskew · Crop · Resize · Built-in agent tools
Engine

U2-Net

Salient-object segmentation. Strong on general subjects.

Engine

ModNet

Portrait matting. Hair-level edges, fast.

Tools

Deskew, Crop, Resize

Built-in agent tools wrap each step.

Background removal

Strip the background without sending the image away.

U2-Net

Salient object segmentation

Identifies and isolates the foreground subject. Good general-purpose model for products, objects, animals, and complex scenes.
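To make the idea concrete, here is a minimal sketch of what happens after a salient-object model has produced its mask: the mask becomes the alpha channel of the output image. This is generic, standard-library Python and not the LM-Kit.NET API; the function name `apply_mask` and the nested-list image layout are assumptions chosen for illustration.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def apply_mask(pixels: List[List[RGB]], mask: List[List[float]]) -> List[List[RGBA]]:
    """Attach a saliency mask (0.0 = background, 1.0 = subject) as alpha.

    Hypothetical helper: `pixels` and `mask` are row-major nested lists
    with identical dimensions.
    """
    if len(pixels) != len(mask):
        raise ValueError("mask and image must have the same height")
    out: List[List[RGBA]] = []
    for row_px, row_mask in zip(pixels, mask):
        if len(row_px) != len(row_mask):
            raise ValueError("mask and image must have the same width")
        out.append([
            # Clamp the mask value to [0, 1], then scale to an 8-bit alpha.
            (r, g, b, round(255 * min(1.0, max(0.0, m))))
            for (r, g, b), m in zip(row_px, row_mask)
        ])
    return out
```

Pixels the model scores as background end up fully transparent, while soft mask values along edges give partially transparent pixels.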

ModNet

Portrait matting

Specialised for people. Hair-level edge accuracy, fast enough for real-time webcam pipelines and video conferencing.

ONNX

Hardware-accelerated

Both engines run via the ONNX backend with CUDA, DirectML, or CPU. Same accelerator stack as the rest of LM-Kit.NET.
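As an illustration of accelerator fallback, the sketch below picks execution providers in a preferred order from whatever is available on the machine. It is not LM-Kit.NET code: the provider strings are the names ONNX Runtime uses, while the preference order and the `pick_providers` helper are assumptions.

```python
from typing import Iterable, List

# Provider names as used by ONNX Runtime; the preference order is an assumption.
PREFERRED = ["CUDAExecutionProvider", "DmlExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available: Iterable[str],
                   preferred: List[str] = PREFERRED) -> List[str]:
    """Return the preferred providers that are actually available, in order."""
    avail = set(available)
    chosen = [p for p in preferred if p in avail]
    if not chosen:
        raise RuntimeError("no supported execution provider available")
    return chosen
```

With the ONNX Runtime Python package, `available` could come from `onnxruntime.get_available_providers()`, and the returned list could be passed as the `providers` argument of `onnxruntime.InferenceSession`.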

Preprocessing toolkit

Every image-prep step has a tool.

These preprocessing operations ship as both .NET classes (call from code) and built-in agent tools (call from a function-calling agent). Pair them with OCR, VLMs, or any vision pipeline.

Tool

image_deskew

Detect and correct rotation in scanned pages. Critical for downstream OCR accuracy on phone-scanned documents.

Tool

image_measure_skew

Compute the skew angle without rotating. Useful when you want to flag oblique scans without modifying them.
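One simple way to measure skew without touching the image is to fit a straight line to points sampled along a text baseline and read off its angle. The sketch below does that with a least-squares fit. It illustrates the general technique, not how image_measure_skew is implemented; the function names and the 1-degree default tolerance are assumptions.

```python
import math
from typing import Sequence, Tuple


def estimate_skew_degrees(points: Sequence[Tuple[float, float]]) -> float:
    """Least-squares angle of a text baseline, in degrees.

    `points` are (x, y) pixel positions along one baseline. In image
    coordinates y grows downward, so a positive result means the line
    drops toward the right.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two baseline points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("baseline points must span more than one x position")
    # atan2(sxy, sxx) is the arctangent of the fitted slope sxy / sxx.
    return math.degrees(math.atan2(sxy, sxx))


def needs_review(angle_degrees: float, tolerance_degrees: float = 1.0) -> bool:
    """Flag a scan as oblique without modifying it (tolerance is an assumption)."""
    return abs(angle_degrees) > tolerance_degrees
```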

Tool

image_crop & image_resize_box

Region-of-interest extraction by pixel coordinates or by detected region. Feed only the relevant patch to a VLM.
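Cropping by pixel coordinates usually starts by clamping the requested box to the image bounds, since detected regions can spill past the edge. A minimal sketch under that assumption follows; `clamp_box`, `crop`, and the (left, top, right, bottom) box convention are illustrative choices, not the signatures of the LM-Kit.NET tools.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")
# (left, top, right, bottom); right and bottom are exclusive (assumed convention).
Box = Tuple[int, int, int, int]


def clamp_box(box: Box, width: int, height: int) -> Box:
    """Clip a box to the image bounds; reject boxes with no area left."""
    left, top, right, bottom = box
    left = max(0, min(left, width))
    right = max(0, min(right, width))
    top = max(0, min(top, height))
    bottom = max(0, min(bottom, height))
    if right <= left or bottom <= top:
        raise ValueError("box does not overlap the image")
    return (left, top, right, bottom)


def crop(pixels: Sequence[Sequence[T]], box: Box) -> List[List[T]]:
    """Extract the region of interest from a row-major pixel grid."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    left, top, right, bottom = clamp_box(box, width, height)
    return [list(row[left:right]) for row in pixels[top:bottom]]
```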

Tool

image_resize

Aspect-aware resizing with quality interpolation. Standardise inputs before feeding downstream models.
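The arithmetic behind aspect-aware resizing is small enough to show: scale both sides by the same factor so the image fits inside a target box. The sketch below is generic Python, not the image_resize signature; `fit_within` and its `allow_upscale` flag are hypothetical names.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_width: int, max_height: int,
               allow_upscale: bool = False) -> Tuple[int, int]:
    """Largest size that fits the box while keeping the aspect ratio."""
    if min(width, height, max_width, max_height) <= 0:
        raise ValueError("all dimensions must be positive")
    # The tighter of the two constraints decides the scale factor.
    scale = min(max_width / width, max_height / height)
    if not allow_upscale:
        scale = min(scale, 1.0)
    # Never let a side collapse to zero pixels on extreme aspect ratios.
    return (max(1, round(width * scale)), max(1, round(height * scale)))
```

For example, a 4000x3000 photo constrained to a 1024x1024 box comes out at 1024x768, and an 800x600 image is left at its original size unless upscaling is allowed.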

Tool

image_info

Inspect resolution, color space, EXIF, MIME. Use it as the first step in an agent-driven pipeline.

Tool

ocr_recognize

Run OCR as part of a preprocessing pipeline. Routes to the configured OCR engine (LMKit OCR, PaddleOCR-VL, GLM-OCR).

Use cases

Where preprocessing changes the outcome.

E-commerce product photos

Remove backgrounds at upload time. Consistent catalog look without manual editing.

Video calls & streaming

Real-time portrait matting on the user's machine. No cloud relay, no privacy compromise.

Document OCR prep

Deskew phone-scanned pages, crop to the page boundary, and resize to the OCR engine's preferred resolution. A single agent loop can chain the tools.
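The chaining described here can be sketched as a small orchestration routine. This is a fixed sequence standing in for the decisions a function-calling agent would make, with stub callables in place of the real tools. The tool names come from this page; the state keys (`true_skew`, `skew_degrees`, `text`), the `run_ocr_prep` function, and the 0.5-degree tolerance are all assumptions for illustration.

```python
from typing import Callable, Dict, List

Tool = Callable[[dict], dict]


def run_ocr_prep(page: dict, tools: Dict[str, Tool],
                 tolerance_degrees: float = 0.5) -> dict:
    """Measure skew, deskew only if needed, then crop, resize, and OCR."""
    state = dict(page)
    trace: List[str] = []

    def call(name: str) -> None:
        nonlocal state
        state = tools[name](state)
        trace.append(name)

    call("image_measure_skew")
    if abs(state["skew_degrees"]) > tolerance_degrees:
        call("image_deskew")
    for name in ("image_crop", "image_resize", "ocr_recognize"):
        call(name)
    state["trace"] = trace
    return state


# Stub tools: placeholders that only mimic the shape of each step.
stub_tools: Dict[str, Tool] = {
    "image_measure_skew": lambda s: {**s, "skew_degrees": s.get("true_skew", 0.0)},
    "image_deskew": lambda s: {**s, "skew_degrees": 0.0},
    "image_crop": lambda s: s,
    "image_resize": lambda s: s,
    "ocr_recognize": lambda s: {**s, "text": "<recognized text>"},
}
```

Measuring before rotating means straight pages skip the deskew step entirely, which avoids an unnecessary resampling pass.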

Privacy-respecting editing

Background-strip a photo, redact a region, blur a face. Every operation runs on the device that holds the original.

LM-Kit.NET pillars

Seven pillars, one foundation.

The seven pillars of LM-Kit.NET, plus the local runtime they share.

The foundation

Every capability above runs on this runtime.

Foundation

Local Inference

The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR and classifiers, accelerated on CPU, AVX2, CUDA 12/13, Vulkan or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.


Strip, deskew, crop, on-device.
