Extract the most relevant keywords and key phrases from text and images with
the KeywordExtraction engine. Configure keyword count, n-gram
size, and target language. Handle large documents with intelligent shrinking
strategies. Run 100% on-device with Dynamic Sampling for fast, accurate
results on any hardware.
KeywordCount
How many keywords to extract per document, capped by model capacity.
MaxNgramSize
Maximum phrase length in words. Set to 1 for single-word keywords.
TargetLanguage
Output language for generated keywords. Auto-detected when undefined.
TextShrinkingStrategy
How the engine reduces oversized documents to fit the context window.
Every document, article, and customer message contains critical terms buried in noise. Extracting them manually is slow, inconsistent, and impossible at scale.
Human taggers produce inconsistent results, miss key terms, and cannot keep up with content volume.
Sending sensitive documents to third-party endpoints exposes intellectual property and violates compliance requirements.
Rule-based approaches cannot capture semantic meaning, multi-word phrases, or context-dependent importance.
Processing thousands of documents daily through cloud APIs creates unpredictable and growing costs.
01
Unlike rule-based methods, the KeywordExtraction engine uses language models to understand context, capturing multi-word phrases and semantically important terms.
02
LM-Kit's proprietary sampling technology delivers high accuracy even with smaller models running on CPU. Enterprise results without GPU requirements.
03
All processing stays on your infrastructure. No API calls, no data exposure. Process sensitive legal, medical, or financial documents with confidence.
04
Process millions of documents with no per-call fees. Fixed licensing cost regardless of volume. Predictable budgets, unlimited throughput.
KeywordExtraction class
The foundation of topic discovery in your .NET applications. KeywordExtraction provides a high-level API to extract the most important keywords and phrases from any content. Configure the number of keywords, control n-gram size, set a target language, and handle documents that exceed model context limits with automatic text shrinking strategies. Works with both text and image inputs.
KeywordCount property sets how many keywords to extract (default: 5)
MaxNgramSize controls maximum phrase length (default: 3 words)
TargetLanguage for multilingual extraction with auto-detection
TextShrinkingStrategy handles oversized documents automatically
Guidance property steers extraction toward specific themes
ExtractKeywords / ExtractKeywordsAsync

    using LMKit.TextAnalysis;

    var model = LM.LoadFromModelID("qwen3.5:4b");

    var extractor = new KeywordExtraction(model)
    {
        KeywordCount = 8,
        MaxNgramSize = 3,
        TargetLanguage = Language.English
    };

    string article = File.ReadAllText("report.txt");
    var keywords = extractor.ExtractKeywords(article);

    Console.WriteLine($"Confidence: {extractor.Confidence:P1}");
    foreach (var kw in keywords)
    {
        Console.WriteLine($"- {kw.Value}");
    }
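The list above names ExtractKeywordsAsync, but the sample only shows the synchronous call. The sketch below is a guess at how the asynchronous variant might be used. It assumes ExtractKeywordsAsync accepts the same string input as ExtractKeywords and returns an awaitable collection of the same keyword items, and that LM lives in the LMKit.Model namespace; neither detail is shown on this page, so check the API reference before relying on it.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using LMKit.Model;        // assumption: namespace that declares LM
using LMKit.TextAnalysis;

internal static class AsyncKeywordDemo
{
    private static async Task Main()
    {
        var model = LM.LoadFromModelID("qwen3.5:4b");

        var extractor = new KeywordExtraction(model)
        {
            KeywordCount = 8,
            MaxNgramSize = 3
        };

        // Read the file without blocking the calling thread.
        string article = await File.ReadAllTextAsync("report.txt");

        // Assumption: same argument as ExtractKeywords, awaitable result.
        var keywords = await extractor.ExtractKeywordsAsync(article);

        foreach (var kw in keywords)
        {
            Console.WriteLine($"- {kw.Value}");
        }
    }
}
```

An asynchronous call of this kind is mainly useful in UI or server code, where blocking a thread for the duration of inference would hurt responsiveness.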
The KeywordExtraction engine works with both text content and image attachments through a unified API. Pass an Attachment object containing an image and the engine automatically applies OCR to extract visible text before identifying the most relevant keywords. Process scanned documents, screenshots, infographics, and photographs alongside plain text content.
ExtractKeywords method for both text and image inputs

    using LMKit.TextAnalysis;
    using LMKit.Data;

    // Use a vision-capable model for images
    var model = LM.LoadFromModelID("qwen3.5:4b");

    var extractor = new KeywordExtraction(model)
    {
        KeywordCount = 6,
        MaxNgramSize = 3
    };

    // Extract from an image (OCR is automatic)
    var attachment = new Attachment("infographic.png");
    var imgKeywords = extractor.ExtractKeywords(attachment);

    foreach (var kw in imgKeywords)
        Console.WriteLine($"- {kw.Value}");
Fine-tune extraction behavior with a comprehensive set of properties. Control how many keywords to extract, the maximum phrase length, target language, context window management, and how the engine handles documents that exceed the model's capacity. Use the Guidance property to steer extraction toward specific themes or terminology.
KeywordCount
Sets the desired number of keywords. The actual count depends on model capacity and input data, but will never exceed this value.
MaxNgramSize
Controls the maximum n-gram size. Set to 1 for single words, or higher for multi-word phrases like "machine learning" or "interest rate adjustment".
TargetLanguage
Specifies the language for generated keywords. When set to Undefined, the engine auto-detects the input language.
TextShrinkingStrategy
Determines how content is reduced when it exceeds the MaximumContextLength. Different strategies trade semantic integrity for length reduction.
Guidance
Optional text that steers the extraction process toward specific themes, constraints, or terminology. Useful for domain-specific applications.
MaximumContextLength
Limits the token count for model input. Reducing this value increases inference speed on CPU at the cost of some quality.
    var model = LM.LoadFromModelID("qwen3.5:4b");

    var extractor = new KeywordExtraction(model)
    {
        // Extract up to 10 keywords
        KeywordCount = 10,

        // Allow phrases up to 4 words
        MaxNgramSize = 4,

        // Generate keywords in French
        TargetLanguage = Language.French,

        // Steer toward financial terms
        Guidance = "Focus on financial and" +
                   " economic terminology",

        // Limit context for faster CPU inference
        MaximumContextLength = 2048
    };
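The configuration sample does not set TextShrinkingStrategy, so here is a minimal sketch of how it might be combined with MaximumContextLength. It assumes the enum member is spelled Truncation, as the API reference summary on this page suggests, and that LM lives in LMKit.Model. The namespace of the enum itself is not shown here, so an extra using directive may be needed.

```csharp
using LMKit.Model;        // assumption: namespace that declares LM
using LMKit.TextAnalysis;

var model = LM.LoadFromModelID("qwen3.5:4b");

var extractor = new KeywordExtraction(model)
{
    KeywordCount = 10,

    // Cap the number of input tokens; smaller values run faster on CPU.
    MaximumContextLength = 2048,

    // Assumption: member name taken from this page ("Auto, Truncation, and more").
    // Applied when the input exceeds MaximumContextLength.
    TextShrinkingStrategy = TextShrinkingStrategy.Truncation
};
```

Which strategy suits a given workload depends on where the important terms sit in the document, so it may be worth comparing results from Auto and Truncation on representative inputs.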
Extract keywords from content in any language supported by the underlying model. The TargetLanguage property lets you explicitly set the output language or leave it as Undefined for automatic detection. Process multilingual content and generate keywords in a target language different from the source, enabling cross-language content analysis and indexing.
TargetLanguage is Undefined

    var model = LM.LoadFromModelID("qwen3.5:4b");

    // Auto-detect language
    var autoExtractor = new KeywordExtraction(model)
    {
        KeywordCount = 5
    };

    string germanText =
        "Die Europäische Zentralbank hat " +
        "neue Maßnahmen zur Inflationsbekämpfung " +
        "und Zinspolitik angekündigt.";

    var deKeywords = autoExtractor.ExtractKeywords(germanText);
    // Output: Zentralbank, Inflationsbekämpfung...

    // Cross-language: input German, output English
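The sample above ends at the cross-language comment without showing that case. One way it might look, using only members that appear elsewhere on this page (TargetLanguage, Language.English, ExtractKeywords), is sketched below. The LMKit.Model namespace is an assumption, and the namespace of Language is not shown on this page.

```csharp
using System;
using LMKit.Model;        // assumption: namespace that declares LM
using LMKit.TextAnalysis;

var model = LM.LoadFromModelID("qwen3.5:4b");

// German input, keywords requested in English.
var crossExtractor = new KeywordExtraction(model)
{
    KeywordCount = 5,
    TargetLanguage = Language.English
};

string germanText =
    "Die Europäische Zentralbank hat " +
    "neue Maßnahmen zur Inflationsbekämpfung " +
    "und Zinspolitik angekündigt.";

var enKeywords = crossExtractor.ExtractKeywords(germanText);

foreach (var kw in enKeywords)
{
    Console.WriteLine($"- {kw.Value}");
}
```

The only difference from the auto-detect case is the explicit TargetLanguage; the input text is passed unchanged.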
Clone the sample, run it, and see keyword extraction in action on your own data in minutes.
Console demo
The Keyword Extraction Demo is a standalone console application that lets you point the engine at any text file and instantly surface the most relevant keywords. Choose from multiple pre-trained models, configure extraction parameters, and view results with timing and confidence metrics.
.csproj file directly, no extra installations needed
Organizations across industries leverage LM-Kit's keyword extraction to power search, automate tagging, and unlock insights from unstructured content.
SEO
Automatically identify the most relevant terms from web pages, articles, and product descriptions. Power SEO analysis, meta tag generation, and search relevance scoring.
CMS
Auto-tag articles, reports, and knowledge base entries with relevant keywords. Improve content discoverability and enable faceted search across document repositories.
Topics
Surface the main themes from large document sets. Identify trending topics in customer feedback, survey responses, and social media streams.
Insights
Extract key terms from support tickets, reviews, and survey answers. Identify the language your customers use to describe products, features, and issues.
i18n
Process documents in 50+ languages and generate keywords in a unified target language. Build cross-language search indexes and content catalogs.
Pipelines
Feed extracted keywords into downstream systems: classification engines, RAG pipelines, recommendation algorithms, and analytics dashboards.
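As one illustration of a downstream use, the sketch below builds a small inverted index from keywords to the documents that produced them, the kind of structure a tagging or search feature could start from. It takes plain strings (for example, the Value of each extracted keyword), so it does not depend on LM-Kit. The KeywordIndex class and its Build method are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;

public static class KeywordIndex
{
    // Maps each keyword (compared case-insensitively) to the sorted set of
    // document IDs whose keyword list contained it.
    public static Dictionary<string, SortedSet<string>> Build(
        IReadOnlyDictionary<string, string[]> keywordsByDocument)
    {
        var index = new Dictionary<string, SortedSet<string>>(
            StringComparer.OrdinalIgnoreCase);

        foreach (var (docId, keywords) in keywordsByDocument)
        {
            foreach (var keyword in keywords)
            {
                var key = keyword.Trim();
                if (key.Length == 0)
                {
                    continue; // skip blank entries
                }

                if (!index.TryGetValue(key, out var docs))
                {
                    docs = new SortedSet<string>(StringComparer.Ordinal);
                    index[key] = docs;
                }

                docs.Add(docId);
            }
        }

        return index;
    }
}
```

Calling KeywordIndex.Build with a dictionary of document IDs and their keyword arrays returns the index; looking up a keyword then yields every document tagged with it, regardless of letter case.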
From NuGet install to keyword extraction in production in under 10 minutes. No cloud keys, no API limits, no surprises.
dotnet add package LMKit.NET
LM.LoadFromModelID("qwen3.5:4b")
new KeywordExtraction(model)
extractor.ExtractKeywords(text)
MaximumContextLength for optimal speed vs. quality tradeoff on your hardware
TextShrinkingStrategy for documents longer than the model context window
Guidance to focus extraction on your domain's terminology and priorities
Complete documentation for the KeywordExtraction class, supporting types, and related APIs.
KeywordExtraction
Core class for extracting keywords from text and images. Includes all configuration properties and extraction methods.
KeywordItem
Read-only value container representing a single extracted keyword. Returned as a collection by ExtractKeywords methods.
TextShrinkingStrategy
Enum defining strategies for handling content that exceeds the model context window. Options include Auto, Truncation, and more.
Categorization
Combine keyword extraction with classification. Use extracted keywords to inform custom content categorization workflows.
Embedder
Generate embeddings from extracted keywords for semantic search, clustering, and RAG applications.
Attachment
Data class for passing image content to the extraction engine. Used for multimodal keyword extraction from images.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.
Console demo: surface what matters from any text or image.
How-to guide: configure n-gram size, target language, shrinking strategy.
API reference: the KeywordExtraction class.
Powerful keyword extraction. Multimodal support. Multilingual. 100% on-device. Start building intelligent .NET applications today.