LayoutSearchEngine
The single entry point. One new, all modes.
A foundational engine that turns any page (parsed PDF, OCR'd scan, layout tree from a VLM) into a searchable coordinate-aware index. Exact, regex, and fuzzy matching. Region, proximity, and between-anchor queries. Single or multi-page. Designed to be extremely fast and to slot into any workflow or agent tool. Always on-device.
LayoutSearchEngine: the single entry point. One new, all modes.
TextMatch: text, snippet, score, page, and bounding box. Ready for redaction or highlighting.
PageElement: the same layout tree produced by PDF parsing, OCR, and VLM layout analysis.
SearchHighlightEngine: render matches as annotated images for review UIs and audit trails.
string.Contains tells you a match exists. A real
document workflow needs to know where: which page,
which paragraph, what bounding box, with what confidence, in
what surrounding context. Document Search answers those
questions in microseconds, across millions of pages, without
leaving the process.
01 · Layout-aware
Every match returns the page index plus the bounding box of the matched text. Drop the box straight into a redaction step, a viewer highlight, or an audit annotation.
02 · Multi-modal input
The engine reads PageElement trees. They come from native PDF parsing, traditional OCR, vision-language model OCR, layout extraction, or any custom source. One search API for all input modalities.
03 · Engineered for speed
Tokenisation, normalisation, and matching are tuned to run inline inside agent tools and chat loops. No background indexer to manage, no I/O round-trip, no warm-up.
04 · Multi-page native
Every method has a single-page and an IEnumerable<PageElement> overload. Search across thousands of pages, get back ordered matches with cross-page locations.
05 · Snippet + score
Every TextMatch carries the matched text, a configurable context window snippet, and a normalised relevance score. Sort, filter, surface to the user, with no extra plumbing.
06 · 100% local
No cloud search index, no per-call cost, no quota. The engine works inside an in-process method call. Air-gapped, regulated, and offline-first workloads are first-class.
The right mode depends on the question. Same engine, same result type, same coordinate guarantees. Pick a tab for the signature you need.
FindText performs substring matching with optional
case-insensitivity and whole-word boundaries. The default is
OrdinalIgnoreCase; flip to Ordinal for
case-sensitive matching. Use it for invoice numbers, contract
keywords, SKUs.
using LMKit.Document.Search;

var engine = new LayoutSearchEngine();

// Default: case-insensitive substring search.
List<TextMatch> hits = engine.FindText(page, "Invoice");

// Whole-word, case-sensitive search.
hits = engine.FindText(page, "NDA", new TextSearchOptions
{
    Comparison = StringComparison.Ordinal,
    WholeWord = true,
    MaxResults = 50,
    ContextChars = 60,
});

foreach (var m in hits)
{
    Console.WriteLine($"page {m.PageIndex} {m.Bounds} {m.Snippet}");
}
FindRegex accepts any .NET regex. Useful for
structured patterns: dates, monetary amounts, identifiers,
case numbers, IBANs. The engine resolves every match against
the layout tree so even multi-character regex matches return
a single contiguous bounding box.
// Find every monetary amount on every page in the document.
var money = engine.FindRegex(
    pages,
    @"\$\s?\d{1,3}(?:,\d{3})*(?:\.\d{2})?");

// Find ISO dates (YYYY-MM-DD) anywhere in the contract.
var dates = engine.FindRegex(
    contract,
    @"\b\d{4}-\d{2}-\d{2}\b",
    new RegexSearchOptions { MaxResults = 1000 });
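Because these are plain .NET regexes, the two patterns can be sanity-checked with System.Text.RegularExpressions alone before being handed to the engine. A self-contained sketch (no LMKit types involved; the sample strings are made up):

```csharp
using System;
using System.Text.RegularExpressions;

// The same two patterns, sanity-checked against sample strings.
var money   = new Regex(@"\$\s?\d{1,3}(?:,\d{3})*(?:\.\d{2})?");
var isoDate = new Regex(@"\b\d{4}-\d{2}-\d{2}\b");

Console.WriteLine(money.Match("Total due: $1,234.56").Value);    // $1,234.56
Console.WriteLine(isoDate.Match("Signed on 2024-03-15.").Value); // 2024-03-15
```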
FindFuzzy tolerates OCR errors, typos, and
layout-induced whitespace drift. Set MinScore
to control strictness and MaxEditDistance to
bound the search. Returns a normalised similarity score
with every match so you can sort and threshold.
// "Pharmaceuticals" with up to 2 OCR-style typos.
var hits = engine.FindFuzzy(page, "Pharmaceuticals", new FuzzySearchOptions
{
    MaxEditDistance = 2,
    MinScore = 0.80,
    TokenAware = true,
    MaxResults = 100,
});

foreach (var m in hits.OrderByDescending(m => m.Score))
{
    Console.WriteLine($"score {m.Score:F2} text \"{m.Text}\"");
}
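To build intuition for what a MinScore of 0.80 roughly demands, here is one common normalisation of edit distance into a [0, 1] score: 1 minus the distance divided by the longer string's length. This is a hypothetical sketch of the idea, not LM-Kit's documented scoring formula:

```csharp
using System;

// Plain Levenshtein distance plus one common normalisation:
//   score = 1 - distance / length of the longer string.
// Hypothetical sketch - LM-Kit's exact scoring formula is not documented here.
static int EditDistance(string a, string b)
{
    var d = new int[a.Length + 1, b.Length + 1];
    for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
    for (int j = 0; j <= b.Length; j++) d[0, j] = j;
    for (int i = 1; i <= a.Length; i++)
        for (int j = 1; j <= b.Length; j++)
            d[i, j] = Math.Min(
                Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                d[i - 1, j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1));
    return d[a.Length, b.Length];
}

static double Score(string query, string candidate) =>
    1.0 - (double)EditDistance(query, candidate)
        / Math.Max(query.Length, candidate.Length);

// "m" misread as "rn" is a classic OCR confusion: distance 2, score 0.875.
Console.WriteLine(Score("Pharmaceuticals", "Pharrnaceuticals")); // 0.875
```

With this normalisation, MinScore = 0.80 on a 15-character query tolerates up to three single-character edits.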
Beyond the three base modes, the engine exposes three structural operators that turn search into a layout query language. Pick a tab.
FindInRegion returns every text element inside a
pixel-coordinate rectangle. Useful for invoice line items,
form fields with known layout, table-cell extraction, and
bounding-box-driven redaction.
// Pull every text element from the top-right corner of every page
// (where the page number lives on this corpus).
var pageNumberBox = new Rectangle(x: 540, y: 20, width: 60, height: 30);

var hits = engine.FindInRegion(pages, pageNumberBox, new RegionSearchOptions
{
    MaxResults = 2000,
});
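The core of any region query is a rectangle hit-test. The sketch below shows that logic with System.Drawing.Rectangle and made-up element boxes; whether FindInRegion requires full containment or any overlap is an engine detail not specified on this page:

```csharp
using System;
using System.Drawing;

// Region hit-testing sketch: keep elements whose bounding box overlaps
// the query rectangle. (Containment vs. overlap is an engine detail
// not shown here; this sketch uses simple overlap.)
var query = new Rectangle(x: 540, y: 20, width: 60, height: 30);

var candidates = new[]
{
    new Rectangle(550, 25, 30, 15),   // page number - inside the box
    new Rectangle(100, 400, 200, 40), // body paragraph - far away
};

foreach (var b in candidates)
    Console.WriteLine($"{b} overlaps query: {query.IntersectsWith(b)}");
```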
FindNear finds matches within a proximity radius
of an anchor query. The radius is expressed as a fraction of
page size, so the same call works across A4, US Letter, and
scanned variants without rewriting coordinates.
// Find every dollar amount within 5% of the page diagonal from the
// word "Total" - across every page in the document.
var hits = engine.FindNear(pages, "Total", new ProximityOptions
{
    Radius = 0.05, // 5% of page size
    MatchPattern = @"\$[\d,\.]+",
    MaxResults = 200,
});
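To see why a fractional radius survives changes of paper size and scan resolution, here is the conversion to pixels for two concrete pages, assuming the fraction applies to the page diagonal (as the comment above suggests; the 150 DPI page sizes are illustrative):

```csharp
using System;

// Convert a fractional Radius into pixels for a concrete page, assuming
// the fraction applies to the page diagonal. Page sizes are examples:
// A4 (1240 x 1754 px) and US Letter (1275 x 1650 px) scanned at 150 DPI.
static double RadiusInPixels(double fraction, double widthPx, double heightPx) =>
    fraction * Math.Sqrt(widthPx * widthPx + heightPx * heightPx);

Console.WriteLine($"A4:     {RadiusInPixels(0.05, 1240, 1754):F0} px"); // 107 px
Console.WriteLine($"Letter: {RadiusInPixels(0.05, 1275, 1650):F0} px"); // 104 px
```

The same Radius = 0.05 lands within a few pixels on both page formats, which is exactly why the call needs no per-format coordinate rewriting.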
FindBetween returns the text that sits between
two anchors. Perfect for extracting the body of a section,
the contents of a labeled box, or any "from this heading to
the next" pattern, with layout preserved.
// Pull the body of the "Indemnification" clause out of a contract.
var clause = engine.FindBetween(
    contract,
    startQuery: "Indemnification",
    endQuery: "Termination",
    new BetweenOptions { IncludeAnchors = false });

Console.WriteLine(clause[0].Text);
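Conceptually, between-anchor extraction is a slice from the end of the start anchor to the beginning of the end anchor. The real engine resolves both anchors against the layout tree and keeps coordinates; this flat-string sketch (plain C#, no LMKit types, made-up contract text) only illustrates the slicing logic behind IncludeAnchors = false:

```csharp
using System;

// Flat-string sketch of between-anchor extraction (IncludeAnchors = false).
// The real engine works on the layout tree and returns coordinates;
// this only shows the slicing idea on plain text.
static string Between(string text, string start, string end)
{
    int s = text.IndexOf(start, StringComparison.OrdinalIgnoreCase);
    if (s < 0) return "";
    s += start.Length;
    int e = text.IndexOf(end, s, StringComparison.OrdinalIgnoreCase);
    string body = e < 0 ? text[s..] : text[s..e];
    return body.Trim(' ', '.', ':');
}

var contract = "Indemnification. Each party shall indemnify the other. " +
               "Termination. Either party may terminate on 30 days notice.";
Console.WriteLine(Between(contract, "Indemnification", "Termination"));
// -> Each party shall indemnify the other
```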
Every match carries pixel-precise coordinates. The companion
SearchHighlightEngine turns a list of matches into
a rendered annotated image, ready for a review UI, an audit
trail, or a customer-facing explanation of where an answer came
from.
using LMKit.Document.Search;

// Find PII in a scanned page, render highlights, save the annotated image.
var hits = engine.FindRegex(page, @"\b\d{3}-\d{2}-\d{4}\b"); // SSN-like

var result = await SearchHighlightEngine.HighlightAsync(
    page, hits,
    new SearchHighlightOptions
    {
        Appearance = new HighlightAppearance
        {
            FillColor = Color.FromArgb(80, Color.Yellow),
            StrokeColor = Color.OrangeRed,
            StrokeWidth = 2,
        },
    });

await result.SaveAsync(@"D:\out\redacted-preview.png");
Exact, regex, and fuzzy answer the question "where does this string appear?". They do not answer "where does this concept appear?". For paraphrases, synonyms, multilingual queries, and intent-driven retrieval, the same SDK exposes an embedding + vector-search layer that composes directly with everything above. Document Search is where most workflows start; RAG & Knowledge is where they grow.
Layer 1
Turn any text passage (or image) into a high-dimensional vector with the Embedder class. Text and image vectors share a space via Nomic Embed Vision so cross-modal queries work out of the box.
Layer 2
Store vectors in the built-in FileSystemVectorStore, an in-memory store, or a Qdrant connector via the IVectorStore contract. Cosine, dot, and Euclidean similarity are first-class.
Layer 3
HybridRetrievalStrategy fuses dense embeddings with lexical BM25 (Bm25RetrievalStrategy) and reranks via RagReranker. Recall from BM25, precision from vectors, ranking from a dedicated reranker model.
Layer 4
When the user's wording is far from the corpus, HydeRetriever (Hypothetical Document Embeddings) and MultiQueryRetriever generate alternative queries with the LLM, retrieve for each, and merge results.
Layer 5
DocumentRag wraps chunking (TextChunking, HtmlChunking, MarkdownChunking), embedding, retrieval, and source attribution into a single high-level engine. Hand it a folder, get cited answers.
Layer 6
PdfChat and RagChat compose every layer above into a multi-turn conversation primitive. Streaming answers, multi-turn memory, source-attributed citations, all in one class.
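To make "recall from BM25, precision from vectors" concrete, here is reciprocal-rank fusion (RRF), a standard scheme for merging a lexical ranking with a dense-vector ranking, in plain C#. This is illustrative only and is not claimed to be HybridRetrievalStrategy's actual formula; the document IDs are made up:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reciprocal-rank fusion (RRF): merge two rankings with
//   score(doc) = sum over rankings of 1 / (k + rank),  k = 60 by convention.
// Illustrative only - not claimed to be HybridRetrievalStrategy's formula.
static Dictionary<string, double> Fuse(List<string> lexical, List<string> dense, int k = 60)
{
    var scores = new Dictionary<string, double>();
    foreach (var ranking in new[] { lexical, dense })
        for (int rank = 1; rank <= ranking.Count; rank++)
            scores[ranking[rank - 1]] =
                scores.GetValueOrDefault(ranking[rank - 1]) + 1.0 / (k + rank);
    return scores;
}

var fused = Fuse(
    lexical: new List<string> { "docA", "docB", "docC" },  // BM25 order
    dense:   new List<string> { "docB", "docD", "docA" }); // vector order

// docB wins: it ranks high in both lists.
foreach (var (id, score) in fused.OrderByDescending(p => p.Value))
    Console.WriteLine($"{id} {score:F4}");
```

A document that appears near the top of both lists beats one that tops only a single list, which is the behaviour a hybrid strategy is after.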
Mix freely.
Layout-aware search and semantic retrieval are not alternatives,
they are complementary. A common pattern: use
LayoutSearchEngine to locate anchors with bounding
boxes, then expand to DocumentRag to retrieve
conceptually similar passages, then ground the answer with both
coordinate and citation provenance. Every layer above can run
inside the same process, on-device, with no external service.
The same engine is wired across the SDK. Wherever
PageElement shows up, search shows up next to it.
OCR & VLM-OCR
Both LMKitOcr and VlmOcr return PageElement trees. Feed them directly into LayoutSearchEngine; coordinates from OCR become coordinates in the match.
Layout Understanding
Layout analysis tags paragraphs, headings, tables, footnotes. Combine with region or between-anchor search to query by structural intent ("find the price inside the first table").
Document RAG
RAG retrieves a passage. LayoutSearchEngine locates that passage in the source page with a bounding box. The result: every answer can show you exactly where it came from.
Agents
Search runs in microseconds and is safe to expose as an agent tool. An agent can locate clauses, scan for PII, find numbers near labels, all as in-process tool calls with no external service.
PII & redaction
Regex and fuzzy modes find sensitive content; bounding boxes drive the redaction step. SearchHighlightEngine can render before-and-after evidence for compliance review.
Extraction
Find an anchor ("Invoice number"), search near it ("alphanumeric token within 5% of the page"), feed the result back into a schema-constrained extractor. Robust against template drift.
Scan thousands of pages for sensitive terms, get back every hit with page number and bounding box, produce an audit-ready PDF with highlighted evidence. Works on air-gapped legal review machines.
Find labels by anchor query, find values by proximity. No fragile template needed; the same code handles invoices from a hundred vendors with different layouts.
Pull the body of a clause with FindBetween, scan for prohibited language with FindRegex, render highlighted versions for legal review. Works offline on a workstation.
When the RAG model cites a passage, search locates it on the original page. The viewer renders an annotated image so users can see the provenance.
Expose search as a tool. The agent can find clauses, count occurrences across pages, locate anchors, all in microseconds, all without sending document content over the network.
Run thousands of pages through OCR + search + highlight to produce dashboards of detected entities, dates, and amounts. Per-page latency stays in the millisecond range.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.
How-to guide: load a document, run text / regex / fuzzy queries, and get coordinates, end to end. Read the guide →
How-to guide: combine layout extraction with structural queries (find inside a region, between anchors). Read the guide →
How-to guide: OCR a scanned page into a PageElement tree, then run any search query on the result. Read the guide →
API reference: the search engine's full surface (FindText, FindRegex, FindFuzzy, FindInRegion, FindNear, FindBetween). Open the reference →
The seven pillars of LM-Kit.NET, plus the local runtime they share. The highlighted card is where you are now.
01 · AI Agents
ReAct planning, supervisors, parallel and pipeline orchestrators, persistent memory, MCP clients, custom tools.
02 · Document Intelligence
PDF text and table extraction, on-device OCR reaching SOTA benchmark scores, structured field extraction with grammar-constrained generation.
03 · Vision & Multimodal
Image understanding, classification, labeling, multimodal chat, image embeddings, VLM-OCR, background removal. Same conversation surface as LLMs.
04 · RAG & Knowledge
Built-in vector store, Qdrant connector, embeddings, hybrid retrieval, document chunking, source citations.
05 · Text Analysis
Built-in classifiers and an extractor that emits typed C# objects via grammar-constrained sampling. Sentiment, keywords, language detection.
06 · Speech & Audio
A growing local speech-to-text stack: hallucination suppression, Voice Activity Detection, real-time translation, streaming output, 100+ languages.
07 · Text Generation
Single-turn, multi-turn, and stateless conversation primitives. Translate, correct, rewrite, summarise. Prompt templates, streaming, grammar-constrained outputs.
The foundation
Every capability above runs on this runtime.
Foundation
The runtime all seven pillars sit on. The LM-Kit.NET NuGet ships the complete inference system: open-weight LLMs, vision-language models, embeddings, on-device speech-to-text, OCR, and classifiers, accelerated on CPU (AVX2), CUDA 12/13, Vulkan, or Metal. One package, zero cloud calls, predictable latency, full data and technology sovereignty.
Three modes, six operators, bounding-box coordinates, microsecond latency, zero cloud calls. Embed it in the agent tool, the chat loop, the redaction pipeline, the RAG explainer.