Extract structured data from any content.

Identify and classify people, organizations, locations, dates, monetary values, and custom entity types from text and images. Get precise character offsets for every mention. Define unlimited custom entities. Run 100% on-device with zero cloud dependencies.

8 built-in types · Unlimited custom · Position tracking

Example extraction

Sarah Chen, CEO of Quantum Dynamics Inc., announced a $2.5 billion investment in Singapore on March 15, 2025. The QD-X1 Processor will power the new facility.

Problem

Unstructured data is everywhere.

Contracts, emails, documents, and web content contain critical information buried in free-form text. Manual extraction is slow, error-prone, and impossible to scale.

Traditional approaches fall short

  • Regex patterns break with format variations
  • Rule-based systems require constant maintenance
  • Cloud APIs expose sensitive documents to third parties
  • Fixed entity types cannot adapt to domain-specific needs
  • Images and scanned documents require separate pipelines
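The first point is easy to reproduce. The sketch below is illustrative only: the pattern and the sample strings are made up for this example. A regular expression written for one date format misses two common variations of the same date.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical pattern written for dates like "March 15, 2025".
var monthDayYear = new Regex(@"\b[A-Z][a-z]+ \d{1,2}, \d{4}\b");

string[] samples =
{
    "Signed on March 15, 2025.",   // the format the pattern was written for
    "Signed on 15 March 2025.",    // day-first variation
    "Signed on 2025-03-15."        // ISO 8601 variation
};

foreach (var sample in samples)
{
    // Only the first sample matches; the other two are silently missed.
    bool found = monthDayYear.IsMatch(sample);
    Console.WriteLine($"{(found ? "match" : "miss")}: {sample}");
}
```

Each new format needs another pattern, which is the maintenance burden the list above describes.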
Capabilities

Enterprise-grade entity extraction.

Extract structured information from any content with precision, flexibility, and complete data privacy.

Multimodal

Multimodal processing

Extract entities from text, images, and documents with a unified API. Built-in OCR handles scanned documents, screenshots, and photos automatically.

Custom

Custom entity types

Beyond built-in types, define unlimited custom entities with descriptions. Extract product codes, patient IDs, invoice numbers, or any domain-specific identifier.

Tracking

Position tracking

Every extracted entity includes precise character offsets for all occurrences. Enable highlighting, redaction, and targeted document manipulation.

Built-in types

8 standard entity types out of the box.

Start extracting immediately with production-ready entity definitions. Add your own custom types as needed.

  • Person: Names of individuals, executives, authors, contacts
  • Organization: Companies, institutions, agencies, government bodies
  • Location: Places, addresses, cities, countries, regions
  • Date: Calendar dates, time references, deadlines
  • Money: Currency amounts, financial values, prices
  • Percent: Percentage values, ratios, growth rates
  • Product: Product names, brands, model numbers
  • Event: Named events, such as conferences, launches, and meetings

Quick start

Extract entities in 5 lines of code.

The NamedEntityRecognition class provides a simple, intuitive API for extracting entities from any content. Load a model, create the recognizer, and call Recognize(). All built-in entity types are enabled by default.

  • Works with any compatible LM-Kit model
  • Synchronous and asynchronous methods available
  • Returns entity value, type, and all occurrences
  • Confidence scores for quality assessment
  • Multilingual support out of the box

BasicNER.cs
using System;
using LMKit.Model;
using LMKit.TextAnalysis;

// Load the language model
var model = LM.LoadFromModelID("lmkit-tasks:4b-preview");

// Create the NER engine with default entity types
var ner = new NamedEntityRecognition(model);

// Extract entities from text
string text = "Apple Inc. announced that CEO Tim Cook " +
    "will visit Paris on January 15, 2025.";

var entities = ner.Recognize(text);

foreach (var entity in entities)
{
    Console.WriteLine($"{entity.Type}: {entity.Value}");
}
Customization

Define your own entity types.

Extract domain-specific information by defining custom entity types with EntityDefinition. Provide a name and an optional description to guide extraction. Mix built-in and custom types freely.

  • Define unlimited custom entity types
  • Optional descriptions improve extraction accuracy
  • Mix with built-in types as needed
  • Well suited to product codes, IDs, and domain terms
  • Use the Guidance property for complex extraction rules
CustomEntities.cs
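using System.Collections.Generic;
using LMKit.Model;
using LMKit.TextAnalysis;

// Load the language model
var model = LM.LoadFromModelID("lmkit-tasks:4b-preview");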

// Define custom entity types with descriptions
var definitions = new List<EntityDefinition>
{
    new EntityDefinition(NamedEntityType.Person),
    new EntityDefinition(NamedEntityType.Organization),
    new EntityDefinition(NamedEntityType.Custom,
        "ProductCode",
        "Product codes like SKU-12345 or ITEM-ABC"),
    new EntityDefinition(NamedEntityType.Custom,
        "PatientID",
        "Medical record numbers")
};

// Create the recognizer and register each definition with it
var ner = new NamedEntityRecognition(model);
foreach (var def in definitions)
{
    ner.EntityDefinitions.Add(def);
}
Multimodal

Extract entities from images.

Process scanned documents, screenshots, business cards, and photos with the same API. Built-in OCR extracts text automatically, then NER identifies entities. Use OcrEngine for traditional OCR or rely on vision language models for intelligent extraction.

  • Same API for text and image inputs
  • Built-in OCR integration via the OcrEngine property
  • Vision language model support for complex layouts
  • PreferredInferenceModality for control over processing
  • Process PDFs, JPEGs, PNGs, and more
NerFromImage.cs
using LMKit.Model;
using LMKit.TextAnalysis;
using LMKit.Data;

// Load a vision-capable model
var model = LM.LoadFromModelID("qwen3-vl:8b");
var ner = new NamedEntityRecognition(model)
{
    // Control inference modality
    PreferredInferenceModality = InferenceModality.Multimodal
};

// Extract entities from a scanned business card
var attachment = new Attachment("business_card.jpg");
var entities = ner.Recognize(attachment);
Position tracking

Track every entity mention.

Each extracted entity includes an Occurrences collection with precise character offsets for every mention in the source text. Enable precise highlighting, redaction, entity linking, and document manipulation workflows.

  • Start and end character positions for each occurrence
  • Multiple mentions tracked for the same entity
  • Enable precise document redaction
  • Build entity highlighting and annotation features
  • Power entity linking and knowledge graph construction
TrackOccurrences.cs
using System;
using LMKit.Model;
using LMKit.TextAnalysis;

// Load the language model
var model = LM.LoadFromModelID("lmkit-tasks:4b-preview");
var ner = new NamedEntityRecognition(model);

string text = "Microsoft CEO Satya Nadella announced " +
    "that Microsoft will invest in AI. " +
    "Nadella emphasized Microsoft's commitment.";

var entities = ner.Recognize(text);

foreach (var entity in entities)
{
    Console.WriteLine($"{entity.Type}: {entity.Value}");
    foreach (var occ in entity.Occurrences)
        Console.WriteLine($"  [{occ.Start}-{occ.End}]");
}
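As one way to put these offsets to work, the sketch below redacts a set of character spans. It is self-contained and does not call LM-Kit: the (Start, End) tuples stand in for occurrence offsets, and treating End as exclusive is an assumption to check against the API reference. Spans are applied from right to left so that earlier offsets stay valid while the string changes length.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Replace each [Start, End) span with a fixed mask.
// Assumptions: End is exclusive and the spans do not overlap.
string Redact(string source, IEnumerable<(int Start, int End)> spans, string mask = "[REDACTED]")
{
    var result = new StringBuilder(source);
    // Work from the last span to the first so earlier offsets remain valid.
    foreach (var (start, end) in spans.OrderByDescending(s => s.Start))
    {
        result.Remove(start, end - start);
        result.Insert(start, mask);
    }
    return result.ToString();
}

string sample = "Microsoft CEO Satya Nadella announced that Microsoft will invest in AI.";

// Hand-computed offsets for "Satya Nadella"; in practice they would come
// from the Occurrences collection of an extracted entity.
int nameStart = sample.IndexOf("Satya Nadella", StringComparison.Ordinal);
var nameSpans = new List<(int Start, int End)>
{
    (nameStart, nameStart + "Satya Nadella".Length)
};

Console.WriteLine(Redact(sample, nameSpans));
// Prints: Microsoft CEO [REDACTED] announced that Microsoft will invest in AI.
```

The same loop supports highlighting: insert opening and closing markers around each span instead of replacing it.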
Workflow

Four steps to structured data.

From raw content to structured entities in milliseconds, all running locally on your hardware.

Step 1

Load model

Choose from LM-Kit's optimized task models or bring your own. Models run entirely on-device.

Step 2

Configure entities

Use default entity types or define custom ones for your domain. Add guidance for complex extraction.

Step 3

Process content

Feed text or images through the recognizer. Built-in OCR handles visual content automatically.

Step 4

Get results

Receive structured entities with types, values, confidence scores, and position offsets.
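To make the idea of a structured result concrete, here is a minimal sketch that serializes one entity to JSON with System.Text.Json. The shape is hypothetical: the property names and the confidence value are placeholders chosen for this example, not LM-Kit's actual result types.

```csharp
using System;
using System.Text.Json;

// Hypothetical result shape; property names are placeholders, not LM-Kit's.
var result = new
{
    Confidence = 0.92, // made-up score for illustration
    Entities = new[]
    {
        new
        {
            Type = "Person",
            Value = "Sarah Chen",
            // "Sarah Chen" occupies characters 0 to 9, so End is exclusive here.
            Occurrences = new[] { new { Start = 0, End = 10 } }
        }
    }
};

// Serialize with indentation so the structure is easy to read.
string json = JsonSerializer.Serialize(result, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(json);
```

A downstream system can store or index this payload without touching the original text.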

Applications

Real-world use cases.

Organizations across industries use LM-Kit NER to automate document processing, enhance search, and ensure compliance.

Legal

Contract analysis

Extract parties, dates, amounts, and obligations from legal documents. Automate contract review and clause identification.

Finance

Invoice processing

Parse vendor names, amounts, dates, and line items from invoices and receipts. Enable touchless AP automation.

Compliance

Compliance monitoring

Identify regulated entities in communications. Monitor for mentions of competitors, products, or restricted topics.

Health

Medical records

Extract patient names, medications, diagnoses, and providers from clinical notes. Enable structured EHR data entry.

Search

Enhanced search

Index documents by extracted entities. Enable faceted search by person, organization, location, or date.
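A faceted index of this kind can be sketched in a few lines. The example below is an illustration rather than LM-Kit code: the document IDs and the (Type, Value) pairs are hypothetical stand-ins for extraction results, and the index is a plain in-memory dictionary.

```csharp
using System;
using System.Collections.Generic;

// Facet index: (entity type, entity value) -> IDs of documents that mention it.
var index = new Dictionary<(string Type, string Value), SortedSet<string>>();

void AddDocument(string docId, IEnumerable<(string Type, string Value)> entities)
{
    foreach (var key in entities)
    {
        if (!index.TryGetValue(key, out var docs))
        {
            docs = new SortedSet<string>();
            index[key] = docs;
        }
        docs.Add(docId);
    }
}

// Hypothetical extraction results for two documents.
AddDocument("doc-1", new[] { ("Person", "Tim Cook"), ("Location", "Paris") });
AddDocument("doc-2", new[] { ("Organization", "Apple Inc."), ("Location", "Paris") });

// Facet query: every document that mentions the location "Paris".
Console.WriteLine(string.Join(", ", index[("Location", "Paris")]));
// Prints: doc-1, doc-2
```

Keys are compared exactly, so a production index would likely normalize case and spelling variants before insertion.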

News

News intelligence

Track mentions of companies, executives, and competitors across news feeds. Build real-time market intelligence.

Graphs

Knowledge graphs

Extract entities and relationships to build knowledge graphs. Connect people, organizations, and events automatically.
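One simple way to connect entities is co-occurrence counting, which is a substitute technique and not relationship extraction itself: two entities are linked whenever they appear in the same document. The sketch below assumes each document's entity values are already available as a list of distinct strings; the names come from the example sentence at the top of this page.

```csharp
using System;
using System.Collections.Generic;

// Edge weights: number of documents in which two entities appear together.
var edges = new Dictionary<(string A, string B), int>();

void AddCoOccurrences(IReadOnlyList<string> entities)
{
    for (int i = 0; i < entities.Count; i++)
    {
        for (int j = i + 1; j < entities.Count; j++)
        {
            // Order each pair so (A, B) and (B, A) share one key.
            var key = string.CompareOrdinal(entities[i], entities[j]) <= 0
                ? (entities[i], entities[j])
                : (entities[j], entities[i]);
            edges.TryGetValue(key, out int count);
            edges[key] = count + 1;
        }
    }
}

// Hypothetical entity lists for two documents.
AddCoOccurrences(new[] { "Sarah Chen", "Quantum Dynamics Inc.", "Singapore" });
AddCoOccurrences(new[] { "Sarah Chen", "Quantum Dynamics Inc." });

foreach (var edge in edges)
{
    Console.WriteLine($"{edge.Key.A} <-> {edge.Key.B}: {edge.Value}");
}
```

Edges with higher weights are stronger candidates for a real relationship, which a later step can confirm or label.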

Support

Support ticket routing

Extract products, issues, and customer details from support tickets. Enable intelligent routing and prioritization.

HR

Resume parsing

Extract candidate names, employers, skills, and education from resumes. Automate applicant tracking workflows.

Comparison

LM-Kit vs. alternatives.

See how LM-Kit NER compares to cloud APIs and traditional NLP libraries.

| Feature | LM-Kit NER | Cloud NER APIs | Traditional NLP |
|---|---|---|---|
| Data privacy | 100% on-device | Data sent to cloud | Local processing |
| Custom entity types | Unlimited | Limited or none | Requires retraining |
| Multimodal (images) | Built-in | Separate service | Not supported |
| Position tracking | Yes | Varies | Yes |
| Contextual understanding | LLM-powered | LLM-powered | Rule-based |
| Multilingual | Yes | Yes | Per-language models |
| Offline operation | Yes | No | Yes |
| Per-request cost | None | Per API call | None |

Developer Resources

API reference.

Complete documentation for the NamedEntityRecognition class and related types.

NamedEntityRecognition

Main class for extracting named entities from text and images with configurable entity types.

EntityDefinition

Define built-in or custom entity types with optional descriptions for guided extraction.

NamedEntityType

Enumeration of entity types: the eight built-in types (Person, Organization, Location, Date, Money, Percent, Product, Event) plus Custom.

Recognize

Synchronously extract entities from text or image attachments with optional cancellation.

RecognizeAsync

Asynchronously extract entities for non-blocking operation in UI and server applications.

Guidance

Property to provide semantic guidance for domain-specific extraction rules and context.

Confidence

Get the confidence score of the last recognition operation for quality assessment.

OcrEngine

Optional OCR engine for traditional text extraction from raster content during recognition.

MaximumContextLength

Configure maximum context length in tokens for processing large documents.

Ready to extract structured data?

8+ entity types. Custom definitions. Multimodal. Position tracking. 100% on-device. Start building intelligent .NET applications today.
