Extract Structured Data From Any Content
Identify and classify people, organizations, locations, dates, monetary values, and custom entity types from text and images. Get precise character offsets for every mention. Define unlimited custom entities. Run 100% on-device with zero cloud dependencies.
Sarah Chen, CEO of Quantum Dynamics Inc., announced a $2.5 billion investment in Singapore on March 15, 2025. The QD-X1 Processor will power the new facility.
Unstructured Data is Everywhere
Contracts, emails, documents, and web content contain critical information buried in free-form text. Manual extraction is slow, error-prone, and impossible to scale.
Traditional Approaches Fall Short
- Regex patterns break with format variations
- Rule-based systems require constant maintenance
- Cloud APIs expose sensitive documents to third parties
- Fixed entity types cannot adapt to domain-specific needs
- Images and scanned documents require separate pipelines
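To make the first point concrete, here is a minimal sketch using only the .NET standard library. The pattern and sample sentences are hypothetical: a regex written for ISO-style dates matches one spelling of a date and misses two others that a human reader would treat as equivalent.

```csharp
using System;
using System.Text.RegularExpressions;

// A pattern written for ISO-style dates only (hypothetical example).
var isoDate = new Regex(@"\b\d{4}-\d{2}-\d{2}\b");

string[] samples =
{
    "Signed on 2025-03-15.",
    "Signed on March 15, 2025.",
    "Signed on 15/03/2025.",
};

foreach (var sample in samples)
{
    // Only the first sample matches; the other two express the same date
    // in formats the pattern was never written for.
    Console.WriteLine($"{isoDate.IsMatch(sample),-5} {sample}");
}
```

Covering each new format means another pattern to write and maintain, which is the maintenance burden the list above describes.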
LM-Kit NER Delivers
- LLM-powered extraction understands context and meaning
- Zero-maintenance with built-in intelligence
- 100% on-device processing protects your data
- Define unlimited custom entity types for your domain
- Unified API for text and images with built-in OCR
Enterprise-Grade Entity Extraction
Extract structured information from any content with precision, flexibility, and complete data privacy.
Multimodal Processing
Extract entities from text, images, and documents with a unified API. Built-in OCR handles scanned documents, screenshots, and photos automatically.
Custom Entity Types
Beyond built-in types, define unlimited custom entities with descriptions. Extract product codes, patient IDs, invoice numbers, or any domain-specific identifier.
Position Tracking
Every extracted entity includes precise character offsets for all occurrences. Enable highlighting, redaction, and targeted document manipulation.
8 Standard Entity Types Out of the Box
Start extracting immediately with production-ready entity definitions. Add your own custom types as needed.
Person
Names of individuals, executives, authors, contacts
Organization
Companies, institutions, agencies, government bodies
Location
Places, addresses, cities, countries, regions
Date
Calendar dates, time references, deadlines
Money
Currency amounts, financial values, prices
Percent
Percentage values, ratios, growth rates
Product
Product names, brands, model numbers
Custom
Define any entity type for your domain
Extract Entities in 5 Lines of Code
The NamedEntityRecognition class provides a simple, intuitive API for extracting entities from any content.
Load a model, create the recognizer, and call Recognize(). All built-in entity types are enabled by default.
- Works with any compatible LM-Kit model
- Synchronous and asynchronous methods available
- Returns entity value, type, and all occurrences
- Confidence scores for quality assessment
- Multilingual support out of the box
```csharp
using LMKit.Model;
using LMKit.TextAnalysis;

// Load the language model
var model = LM.LoadFromModelID("lmkit-tasks:4b-preview");

// Create the NER engine with default entity types
var ner = new NamedEntityRecognition(model);

// Extract entities from text
string text = "Apple Inc. announced that CEO Tim Cook " +
              "will visit Paris on January 15, 2025.";

var entities = ner.Recognize(text);

foreach (var entity in entities)
{
    Console.WriteLine($"[{entity.Type}] {entity.Value}");
}

// Output:
// [Organization] Apple Inc.
// [Person] Tim Cook
// [Location] Paris
// [Date] January 15, 2025
```
Define Your Own Entity Types
Extract domain-specific information by defining custom entity types with EntityDefinition. Provide a name and optional description to guide extraction. Mix built-in and custom types freely.
- Define unlimited custom entity types
- Optional descriptions improve extraction accuracy
- Mix with built-in types as needed
- Perfect for product codes, IDs, domain terms
- Use Guidance property for complex extraction rules
```csharp
using System.Collections.Generic;
using LMKit.Model;
using LMKit.TextAnalysis;

var model = LM.LoadFromModelID("lmkit-tasks:4b-preview");

// Define custom entity types with descriptions
var definitions = new List<EntityDefinition>
{
    new EntityDefinition(NamedEntityType.Person),
    new EntityDefinition(NamedEntityType.Organization),
    new EntityDefinition(
        NamedEntityType.Custom,
        "ProductCode",
        "Product codes like SKU-12345 or ITEM-ABC"
    ),
    new EntityDefinition(
        NamedEntityType.Custom,
        "OrderID",
        "Order identifiers starting with ORD-"
    )
};

// Create recognizer with custom definitions
var ner = new NamedEntityRecognition(model, definitions);

string invoice = "Contact John Smith at Acme Corp " +
                 "regarding SKU-78901 for order ORD-2025-001.";

var entities = ner.Recognize(invoice);

foreach (var entity in entities)
{
    Console.WriteLine($"[{entity.Type}] {entity.Value}");
}

// Output:
// [Person] John Smith
// [Organization] Acme Corp
// [ProductCode] SKU-78901
// [OrderID] ORD-2025-001
```
Extract Entities from Images
Process scanned documents, screenshots, business cards, and photos with the same API. Built-in OCR extracts text automatically, then NER identifies entities. Use OcrEngine for traditional OCR or rely on vision language models for intelligent extraction.
- Same API for text and image inputs
- Built-in OCR integration via OcrEngine property
- Vision language model support for complex layouts
- PreferredInferenceModality for control over processing
- Process PDFs, JPEGs, PNGs, and more
```csharp
using LMKit.Model;
using LMKit.TextAnalysis;
using LMKit.Data;

// Load a vision-capable model
var model = LM.LoadFromModelID("qwen2.5-vl-7b-instruct");

var ner = new NamedEntityRecognition(model)
{
    // Control inference modality
    PreferredInferenceModality = InferenceModality.Multimodal
};

// Extract entities from a scanned business card
var attachment = new Attachment("business_card.jpg");
var entities = ner.Recognize(attachment);

foreach (var entity in entities)
{
    Console.WriteLine($"[{entity.Type}] {entity.Value}");
}

// Output:
// [Person] Jane Doe
// [Organization] TechCorp International
// [Location] San Francisco, CA

// Or extract from a scanned invoice PDF
var invoice = new Attachment("invoice_scan.pdf");
var invoiceEntities = await ner.RecognizeAsync(invoice);
```
Track Every Entity Mention
Each extracted entity includes an Occurrences collection with precise character offsets for every mention in the source text. Enable precise highlighting, redaction, entity linking, and document manipulation workflows.
- Start and end character positions for each occurrence
- Multiple mentions tracked for the same entity
- Enable precise document redaction
- Build entity highlighting and annotation features
- Power entity linking and knowledge graph construction
```csharp
using LMKit.Model;
using LMKit.TextAnalysis;

var model = LM.LoadFromModelID("lmkit-tasks:4b-preview");
var ner = new NamedEntityRecognition(model);

string text = "Microsoft CEO Satya Nadella announced " +
              "that Microsoft will invest in AI. " +
              "Nadella emphasized Microsoft's commitment.";

var entities = ner.Recognize(text);

foreach (var entity in entities)
{
    Console.WriteLine($"{entity.Type}: {entity.Value}");
    Console.WriteLine($"  Found {entity.Occurrences.Count} times:");

    foreach (var occ in entity.Occurrences)
    {
        Console.WriteLine(
            $"    Position [{occ.Start}-{occ.End}]");
    }
}

// Output:
// Organization: Microsoft
//   Found 3 times:
//     Position [0-9]
//     Position [43-52]
//     Position [91-100]
// Person: Satya Nadella
//   Found 2 times:
//     Position [14-27]
//     Position [72-79]
```
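As an illustration of the redaction workflow mentioned above, the following sketch masks spans in a string given their character offsets. It uses only the .NET standard library. The `Redact` helper is hypothetical, not part of LM-Kit, and it assumes that `Start` is inclusive and `End` is exclusive, which is what the `[14-27]` span for "Satya Nadella" in the sample output suggests. The span is written by hand here; in practice it would come from `occ.Start` and `occ.End`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper: replace each [Start, End) span with a fixed mask.
// Spans are applied from right to left so that earlier offsets stay valid
// after each replacement. Overlapping spans are not handled.
string Redact(string source, IEnumerable<(int Start, int End)> spans, string mask = "[REDACTED]")
{
    var sb = new StringBuilder(source);
    foreach (var (start, end) in spans.OrderByDescending(s => s.Start))
    {
        sb.Remove(start, end - start).Insert(start, mask);
    }
    return sb.ToString();
}

string sample = "Microsoft CEO Satya Nadella announced " +
                "that Microsoft will invest in AI.";

// Hand-written span covering "Satya Nadella" (assumed end-exclusive).
var personSpans = new List<(int Start, int End)> { (14, 27) };

Console.WriteLine(Redact(sample, personSpans));
// Microsoft CEO [REDACTED] announced that Microsoft will invest in AI.
```

Applying replacements from the end of the string backwards avoids recomputing offsets when the mask is longer or shorter than the text it replaces.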
Four Steps to Structured Data
From raw content to structured entities in milliseconds, all running locally on your hardware.
Load Model
Choose from LM-Kit's optimized task models or bring your own. Models run entirely on-device.
Configure Entities
Use default entity types or define custom ones for your domain. Add guidance for complex extraction.
Process Content
Feed text or images through the recognizer. Built-in OCR handles visual content automatically.
Get Results
Receive structured entities with types, values, confidence scores, and position offsets.
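The four steps can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it reuses the model ID and calls from the examples on this page, and it assumes that `Guidance` is a settable free-form string and that `Confidence` is a numeric score between 0 and 1. The guidance wording and the 0.7 threshold are hypothetical.

```csharp
using System;
using LMKit.Model;
using LMKit.TextAnalysis;

// 1. Load Model: runs on-device.
var model = LM.LoadFromModelID("lmkit-tasks:4b-preview");

// 2. Configure Entities: default types, plus optional guidance.
var ner = new NamedEntityRecognition(model);
// Assumption: Guidance accepts free-form text; this wording is illustrative.
ner.Guidance = "The text is a press release. Treat processor model names as Product.";

// 3. Process Content.
var entities = ner.Recognize(
    "Sarah Chen, CEO of Quantum Dynamics Inc., announced a $2.5 billion " +
    "investment in Singapore on March 15, 2025.");

// 4. Get Results.
foreach (var entity in entities)
{
    Console.WriteLine($"[{entity.Type}] {entity.Value}");
}

// Assumption: Confidence reports a 0..1 score for the last Recognize() call.
const double reviewThreshold = 0.7; // hypothetical cut-off
if (ner.Confidence < reviewThreshold)
{
    Console.WriteLine("Low confidence: route this document to manual review.");
}
```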
Real-World Use Cases
Organizations across industries use LM-Kit NER to automate document processing, enhance search, and ensure compliance.
Contract Analysis
Extract parties, dates, amounts, and obligations from legal documents. Automate contract review and clause identification.
Invoice Processing
Parse vendor names, amounts, dates, and line items from invoices and receipts. Enable touchless AP automation.
Compliance Monitoring
Identify regulated entities in communications. Monitor for mentions of competitors, products, or restricted topics.
Medical Records
Extract patient names, medications, diagnoses, and providers from clinical notes. Enable structured EHR data entry.
Enhanced Search
Index documents by extracted entities. Enable faceted search by person, organization, location, or date.
News Intelligence
Track mentions of companies, executives, and competitors across news feeds. Build real-time market intelligence.
Knowledge Graphs
Extract entities and relationships to build knowledge graphs. Connect people, organizations, and events automatically.
Support Ticket Routing
Extract products, issues, and customer details from support tickets. Enable intelligent routing and prioritization.
Resume Parsing
Extract candidate names, employers, skills, and education from resumes. Automate applicant tracking workflows.
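For the Enhanced Search use case above, one possible shape for an entity index is a dictionary keyed by entity type and value. The sketch below uses only the .NET standard library. The `AddToIndex` helper, the document IDs, and the pre-extracted entities are all hypothetical; in a real pipeline the pairs would come from `entity.Type` and `entity.Value`.

```csharp
using System;
using System.Collections.Generic;

// Index: (entity type, entity value) -> IDs of documents that mention it.
var index = new Dictionary<(string Type, string Value), SortedSet<string>>();

// Hypothetical helper: record every entity of a document in the index.
void AddToIndex(string docId, IEnumerable<(string Type, string Value)> entities)
{
    foreach (var key in entities)
    {
        if (!index.TryGetValue(key, out var docs))
        {
            docs = new SortedSet<string>();
            index[key] = docs;
        }
        docs.Add(docId);
    }
}

// Hand-written entities standing in for recognizer output.
AddToIndex("doc-1", new[]
{
    ("Organization", "Apple Inc."), ("Person", "Tim Cook"), ("Location", "Paris")
});
AddToIndex("doc-2", new[]
{
    ("Organization", "Apple Inc."), ("Location", "Singapore")
});

// Facet lookup: which documents mention Apple Inc. as an organization?
var hits = index[("Organization", "Apple Inc.")];
Console.WriteLine(string.Join(", ", hits));
// doc-1, doc-2
```

Keys are compared exactly, so a production index would likely normalize casing and spelling variants before insertion.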
LM-Kit vs. Alternatives
See how LM-Kit NER compares to cloud APIs and traditional NLP libraries.
| Feature | LM-Kit NER | Cloud NER APIs | Traditional NLP |
|---|---|---|---|
| Data Privacy | 100% On-Device | Data sent to cloud | Local processing |
| Custom Entity Types | Unlimited | Limited or none | Requires retraining |
| Multimodal (Images) | Built-in | Separate service | Not supported |
| Position Tracking | Yes | Varies | Yes |
| Contextual Understanding | LLM-powered | LLM-powered | Rule-based |
| Multilingual | Yes | Yes | Per-language models |
| Offline Operation | Yes | No | Yes |
| Per-Request Cost | None | Per API call | None |
API Reference
Complete documentation for the NamedEntityRecognition class and related types.
NamedEntityRecognition
Main class for extracting named entities from text and images with configurable entity types.
EntityDefinition
Define built-in or custom entity types with optional descriptions for guided extraction.
NamedEntityType
Enumeration of built-in entity types: Person, Organization, Location, Date, Money, Percent, Product, Event, Custom.
Recognize()
Synchronously extract entities from text or image attachments with optional cancellation.
RecognizeAsync()
Asynchronously extract entities for non-blocking operation in UI and server applications.
Guidance
Property to provide semantic guidance for domain-specific extraction rules and context.
Confidence
Get the confidence score of the last recognition operation for quality assessment.
OcrEngine
Optional OCR engine for traditional text extraction from raster content during recognition.
MaxContextLength
Configure maximum context length in tokens for processing large documents.
Ready to Extract Structured Data?
8+ entity types. Custom definitions. Multimodal. Position tracking. 100% on-device. Start building intelligent .NET applications today.