Multi-turn conversation
Full conversation history with automatic context carry-over. Follow-up questions resolve pronouns and references seamlessly.
Single-shot retrieval misses context. Follow-up questions fail because the
retriever forgets what was asked before. LM-Kit's RagChat
combines multi-turn conversation, advanced query reformulation, and grounded
retrieval in a single class that runs entirely on your hardware.
Four retrieval strategies: Original, Contextual, MultiQuery, and HyDE. Each one optimized for different query complexity levels.
Register tools, built-in functions, and Agent Skills. Extend Q&A with web search, calculations, or any custom operation.
RagChat is LM-Kit.NET's turnkey solution for multi-turn conversational
question-answering over custom knowledge bases. Unlike document-centric RAG,
which manages the full document lifecycle, RagChat operates on a
pre-populated RagEngine that you own and manage. This makes it ideal
for custom corpora, multi-source knowledge bases, and enterprise data that comes
from heterogeneous systems.
A single SubmitAsync() call orchestrates the full pipeline: query
reformulation, semantic retrieval, prompt construction, and grounded response
generation, all with full conversation history preserved across turns.
Complete ownership of your data. RagChat runs 100% on-device. No
API calls, no cloud dependencies, no data leaving your infrastructure. Suited
by design to GDPR and HIPAA requirements and to air-gapped environments.
```csharp
// Build your knowledge base
var ragEngine = new RagEngine(embeddingModel);
ragEngine.ImportText(corporateKnowledge);
ragEngine.ImportText(productDocs);

// Start a multi-turn conversation
using var chat = new RagChat(ragEngine, chatModel);
chat.QueryGenerationMode = QueryGenerationMode.Contextual;

// Ask questions naturally
var r1 = await chat.SubmitAsync("What is our refund policy?");

// Follow-up: context is preserved
var r2 = await chat.SubmitAsync("Does that apply to digital products?");

// Access retrieved partitions
foreach (var p in r2.RetrievedPartitions)
    Console.WriteLine($"Source: {p.DataSourceIdentifier}");
```
Choose the optimal retrieval strategy for your query complexity. Each mode trades off between speed, recall, and precision.
Mode 01: Original
The user's question is sent directly to the retrieval engine. No reformulation, no overhead. The fastest path from question to answer.
Best for: Direct, self-contained queries
Mode 02: Contextual
Rewrites the user's follow-up question into a self-contained query using conversation history. Resolves pronouns, ellipsis, and implicit references automatically.
Best for: Multi-turn conversations
Mode 03: MultiQuery
Generates multiple query variants and searches independently, then fuses results using Reciprocal Rank Fusion (RRF). Maximizes recall for complex or ambiguous questions.
Best for: Complex, multi-faceted queries
Mode 04: HyDE (HypotheticalAnswer)
Generates a hypothetical answer first, then retrieves passages similar to that answer. Bridges the vocabulary gap between questions and documents.
Best for: Technical, domain-specific questions
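The four modes above correspond to the QueryGenerationMode enum. A minimal sketch of switching between them, reusing the ragEngine and chatModel instances from the earlier example (the enum member names follow the values listed in this page's API reference):

```csharp
// Sketch: choosing a retrieval strategy per conversation style.
using var chat = new RagChat(ragEngine, chatModel);

// Direct, self-contained questions: no reformulation, lowest latency.
chat.QueryGenerationMode = QueryGenerationMode.Original;

// Multi-turn chat: rewrite follow-ups into standalone queries.
chat.QueryGenerationMode = QueryGenerationMode.Contextual;

// Complex or ambiguous questions: fan out variants, fuse with RRF.
chat.QueryGenerationMode = QueryGenerationMode.MultiQuery;

// Technical, domain-specific questions: retrieve against a hypothetical answer (HyDE).
chat.QueryGenerationMode = QueryGenerationMode.HypotheticalAnswer;
```

The mode can be changed between turns, so an application could start with Original for the opening question and switch to Contextual once follow-ups begin.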
RagChat orchestrates a five-stage pipeline for every question. Each
stage is observable via events and fully configurable.
Stage 01: Query reformulation
Follow-up questions are rewritten into self-contained queries using conversation history.
Stage 02: Retrieval
Semantic search across your RagEngine's data sources. Respects MinRelevanceScore and MaxRetrievedPartitions.
Stage 03: Ranking
Results ordered by source, section, and partition index. Optional reranking refines relevance scores.
Stage 04: Prompt construction
Retrieved context injected into your PromptTemplate via @context and @query placeholders.
Stage 05: Generation
Grounded response generation with streaming via AfterTextCompletion. Supports tool calls, skills, and memory.
RagChat implements IMultiTurnConversation, giving it
the full power of an AI agent combined with grounded retrieval.
Tool calling & skills
Register custom tools, built-in tools, and Agent Skills that the model can invoke during conversation. Combine knowledge retrieval with live computations, web searches, database lookups, or any external operation.
Tools
Register any ITool implementation. The model decides when and how to invoke tools during RAG conversations.
Skills
Define complex capabilities via SKILL.md files. Skills combine system prompts, tools, and behavioral rules into reusable packages.
Approval
ToolApprovalRequired event for human-in-the-loop control. Approve or deny tool invocations before they execute.
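The approval hook above can gate every tool invocation behind a human decision. A sketch of the pattern, where the event-args shape (ToolName, Approved) is an illustrative assumption rather than the documented signature:

```csharp
// Sketch: human-in-the-loop control over tool calls.
using var chat = new RagChat(ragEngine, chatModel);

// Register your ITool implementations here (see the tools
// documentation for the exact registration API).

// Hypothetical event-args members shown for illustration.
chat.ToolApprovalRequired += (sender, e) =>
{
    Console.Write($"Allow tool '{e.ToolName}'? (y/n) ");
    e.Approved = Console.ReadLine()?.Trim().ToLowerInvariant() == "y";
};
```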
Connect AgentMemory for long-term knowledge that persists across
sessions. Every stage of the pipeline emits events for complete observability.
Memory
AgentMemory: RAG-backed persistent memory that survives across sessions. Recall relevant facts from past conversations automatically.
Events
RetrievalCompleted event: Fires after partition retrieval with full details: query used, partitions found, count requested, and elapsed time.
Streaming
AfterTextCompletion event streams tokens in real time for responsive user experiences. Build interactive chat UIs effortlessly.
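Streaming via AfterTextCompletion can be sketched as follows; the event-args Text property is an illustrative assumption, so check the API reference for the actual member name:

```csharp
// Sketch: real-time token streaming during response generation.
using var chat = new RagChat(ragEngine, chatModel);

// Hypothetical event-args member shown for illustration.
chat.AfterTextCompletion += (sender, e) =>
{
    Console.Write(e.Text); // print each chunk as it arrives
};

await chat.SubmitAsync("Summarize our escalation policy.");
```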
RagChat vs PdfChat. Both provide conversational RAG, but they serve different use cases. Choose the one that matches your data ownership model.
Custom corpora
RagChat: Operates on a pre-populated RagEngine you manage. Full control over data sources, chunking, and the retrieval pipeline. The caller owns the engine lifecycle.
Best for: Custom corpora, enterprise knowledge, multi-source data
Document Q&A
PdfChat: Manages the full document lifecycle: import, chunk, embed, cache, and query. Optimized for quick document Q&A with automatic context management.
Best for: Document Q&A, PDF chat, single-document focus
Fine-grained control over every aspect of the retrieval and generation pipeline.
Reranking
Cross-encoder rerankers refine initial retrieval for significantly higher precision. Access raw and reranked scores on every partition.
Filtering
Set MinRelevanceScore (0.0 to 1.0) and MaxRetrievedPartitions to control quality and quantity of retrieved context.
Templates
Configure how context reaches the model with @context and @query placeholders. Optimize prompts for your domain.
Reasoning
Adjust ReasoningLevel (None, Medium, High) for models that support extended thinking. Balance quality with latency for your use case.
MMR
Configure Maximal Marginal Relevance on the underlying RagEngine to balance relevance with diversity in retrieved results.
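The knobs described above can be combined on a single instance. A sketch using only the property names stated on this page (MinRelevanceScore, MaxRetrievedPartitions, ReasoningLevel); the chosen values are illustrative, and defaults and exact types may differ:

```csharp
// Sketch: tuning retrieval quality and reasoning depth.
using var chat = new RagChat(ragEngine, chatModel);

chat.MinRelevanceScore = 0.55f;   // drop weakly related partitions (range 0.0 to 1.0)
chat.MaxRetrievedPartitions = 6;  // cap how much context reaches the prompt
chat.ReasoningLevel = ReasoningLevel.Medium; // trade answer quality against latency
```

A higher MinRelevanceScore with a lower MaxRetrievedPartitions favors precision; relaxing both favors recall, at the cost of a longer prompt.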
Privacy
Zero cloud dependencies. Your knowledge base, embeddings, and conversations stay on your hardware. Supports GDPR and HIPAA compliance by design.
Storage
Works with any IVectorStore backend: in-memory, built-in SQLite, Qdrant, or your own custom implementation.
APIs
Both SubmitAsync and Submit methods are available. Use async to build responsive UIs, or the synchronous API for batch processing and console applications.
Sessions
Full ChatHistory access. ClearHistory() resets conversation state without affecting the knowledge base. Start fresh conversations at any time.
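The two entry points and session reset described above can be sketched as follows, reusing the ragEngine and chatModel instances from the opening example:

```csharp
// Sketch: sync and async entry points plus session reset.
using var chat = new RagChat(ragEngine, chatModel);

// Async for responsive UIs...
var answer = await chat.SubmitAsync("What changed in the latest release?");

// ...or synchronous for batch jobs and console tools.
var batchAnswer = chat.Submit("List the supported storage backends.");

// Start a fresh conversation; the knowledge base is untouched.
chat.ClearHistory();
```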
Complete API documentation for building conversational RAG applications.
RagChat: Multi-turn conversational Q&A over a user-managed RagEngine with tool calling and memory.
RagEngine: Core retrieval engine managing data sources, embeddings, and similarity search.
RagQueryResult: Response container with generated text and retrieved partition references.
QueryGenerationMode: Enum for retrieval strategies: Original, Contextual, MultiQuery, HypotheticalAnswer.
DataSource: Content repository for text partitions and sections with metadata support.
PartitionSimilarity: Retrieved partition with similarity scores, source metadata, and embeddings.
Embedder: Generates text and image embeddings for semantic similarity search.
AgentMemory: RAG-backed persistent memory for long-term context across conversation sessions.
IVectorStore: Interface for pluggable vector storage backends (in-memory, SQLite, Qdrant, custom).
Clone working examples from our GitHub repository and start building conversational RAG applications.
Multi-turn RAG conversation using RagChat with query contextualization and streaming responses.
Compare query generation modes, relevance thresholds, and reranking strategies to optimize retrieval quality.
Build a help desk assistant with RagChat, multi-source FAQ data, and real-time response streaming.
Conversational Q&A over PDF documents with PdfChat for comparison with RagChat's approach.
Agent with RAG-backed long-term memory that remembers context across conversation sessions.
All samples on GitHub
Browse the complete collection of RAG and conversational AI samples.
Explore the techniques that power RagChat's retrieval and generation pipeline.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.
Turn your knowledge base into an intelligent, multi-turn conversation. One NuGet package, zero cloud dependencies, complete data ownership.