In-memory store
Fast prototyping, zero setup, instant feedback.
Store and retrieve embeddings at any scale. Four pluggable storage backends with a unified API. From in-memory prototyping to production-ready deployments with Qdrant or custom backends. Swap storage strategies without rewriting code. 100% local processing.
File-based, handles millions of vectors locally.
FileSystemVectorStore: directory-based IVectorStore with caching.
HNSW indexing, distributed, cloud-ready.
LM-Kit provides a unified embedding storage architecture that scales from quick
prototypes to production deployments. At its core is the DataSource
abstraction, which manages embeddings, metadata, and retrieval through a
consistent API regardless of where your vectors are stored.
Start with in-memory storage for rapid iteration, graduate to the built-in file-based vector database for local applications, or connect to Qdrant for distributed workloads. The same code works across all backends with zero modifications.
Think of it as SQLite for vectors: a self-contained, file-based engine that handles millions of embeddings without external infrastructure, while remaining fully compatible with cloud-scale solutions when you need them.
DataSource hierarchy
Choose the storage that fits your application's lifecycle. Switch between them seamlessly without rewriting code.
In-memory
DataSource.CreateInMemoryDataSource()
Embeddings are computed and stored in RAM, with optional serialization to disk via the Serialize() method. Zero setup required. Ideal for fast prototyping, testing, and live classification tasks.
Serialize() and Deserialize() for reusability
Recommended
DataSource.CreateFileDataSource()
Self-contained, file-based engine optimized for embedding workloads. Think of it as SQLite for dense vectors. No server, no configuration, just a file path.
New
FileSystemVectorStore
new FileSystemVectorStore(path)
An IVectorStore implementation that persists collections as individual files on disk. Each collection is stored as a separate .ds file, with in-memory caching for performance.
IVectorStore interface
Production
QdrantEmbeddingStore + DataSource.LoadFromStore()
High-performance, open-source vector database with HNSW indexing. Ideal for production workloads requiring distributed access and advanced filtering.
Custom
IVectorStore
Implement the IVectorStore interface
Full control over vector storage logic. Integrate with proprietary databases, internal APIs, or hybrid storage systems using the standardized contract.
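A custom backend can be sketched as below. Note the hedges: only the interface name and the members CollectionExistsAsync() and CreateCollectionAsync() appear on this page; all signatures, and the remaining members of the contract, are assumptions to be checked against the API reference.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;
using LMKit.Data.Storage;

// Hypothetical skeleton of a custom backend. Only IVectorStore,
// CollectionExistsAsync and CreateCollectionAsync are named on this page;
// the parameter shapes and the remaining members are assumptions.
public sealed class MyCompanyVectorStore : IVectorStore
{
    // Stand-in for a proprietary database or internal API.
    private readonly ConcurrentDictionary<string, object> _collections = new();

    public Task<bool> CollectionExistsAsync(string collectionName)
        => Task.FromResult(_collections.ContainsKey(collectionName));

    public Task CreateCollectionAsync(string collectionName)
    {
        // Map collection creation onto your storage system.
        _collections.TryAdd(collectionName, new object());
        return Task.CompletedTask;
    }

    // ... implement the remaining IVectorStore members (upserts, similarity
    // search, deletions, metadata) against your storage system.
}
```

Once the contract is implemented, the store plugs into the same pattern shown for Qdrant below: DataSource.CreateVectorStoreDataSourceAsync() to create a collection and DataSource.LoadFromStore() to reopen it.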
Everything you need to build production-grade embedding storage and retrieval.
Hierarchy
Organize embeddings into sections and partitions with optional metadata at each level. Manage multi-modal inputs within a single collection.
Metadata
Attach metadata to sections and partitions for filtering, tagging, and advanced retrieval scenarios across any vector backend.
Portability
Serialize DataSource instances to disk and reload anywhere. Enable checkpointing, debugging, and deployment without external services.
Updates
Efficient insertions, deletions, and metadata edits without rebuilding the entire dataset. Works with both built-in and external stores.
Search
SearchSimilar returns ranked results by vector similarity. Configure top-K, minimum scores, and metadata filters for precise retrieval.
Privacy
Local-only and on-prem options keep data secure and compliant. No external dependencies required for complete vector management.
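The Search feature above can be sketched as follows. SearchSimilar and the PartitionSimilarity members (SectionIdentifier, Similarity) are named on this page, but where SearchSimilar lives and its parameter names (topK, minScore) are assumptions; treat this as a hedged sketch, not the definitive signature.

```csharp
using System;
using LMKit.Data;
using LMKit.Model;

// Load the embedding model and reopen a previously built DataSource.
var model = LM.LoadFromModelID("embeddinggemma-300m");
var dataSource = DataSource.LoadFromFile("Ebooks.dat", readOnly: true);

// Rank stored partitions against the query; topK/minScore names are assumed.
var results = dataSource.SearchSimilar("star-crossed lovers", model,
                                       topK: 5, minScore: 0.5f);

// Each hit carries its section identifier and similarity score.
foreach (var hit in results)
    Console.WriteLine($"{hit.SectionIdentifier}: {hit.Similarity:F3}");
```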
Same API, different backends. Switch storage strategies without rewriting your application logic.
using LMKit.Model;
using LMKit.Data;
using LMKit.Retrieval;

// Load embedding model
var embedModel = LM.LoadFromModelID("embeddinggemma-300m");

// Create in-memory DataSource
var dataSource = DataSource.CreateInMemoryDataSource("my-collection", embedModel);

// Use RagEngine to import content
var ragEngine = new RagEngine(embedModel);
ragEngine.AddDataSource(dataSource);

// Import text with automatic chunking
ragEngine.ImportText(
    "Your document content here...",
    new TextChunking() { MaxChunkSize = 500 },
    "my-collection",
    "document-section");

// Optional: Serialize to disk for later reuse
dataSource.Serialize("./cache/my-collection.bin");

// Later: Deserialize from disk
var restored = DataSource.Deserialize("./cache/my-collection.bin", embedModel);
using LMKit.Model;
using LMKit.Data;
using LMKit.Retrieval;

// Load embedding model
var embedModel = LM.LoadFromModelID("embeddinggemma-300m");

const string DATA_SOURCE_PATH = "Ebooks.dat";
const string COLLECTION_NAME = "Ebooks";

DataSource dataSource;
if (File.Exists(DATA_SOURCE_PATH))
{
    // Load existing file-based DataSource
    dataSource = DataSource.LoadFromFile(DATA_SOURCE_PATH, readOnly: false);
}
else
{
    // Create new file-based DataSource
    dataSource = DataSource.CreateFileDataSource(DATA_SOURCE_PATH, COLLECTION_NAME, embedModel);
}

// Use RagEngine to import and query
var ragEngine = new RagEngine(embedModel);
ragEngine.AddDataSource(dataSource);

// Check if section already exists
if (!dataSource.HasSection("Romeo and Juliet"))
{
    string content = File.ReadAllText("romeo_and_juliet.txt");
    ragEngine.ImportText(content,
        new TextChunking() { MaxChunkSize = 500 },
        COLLECTION_NAME,
        "Romeo and Juliet");
}
using LMKit.Model;
using LMKit.Data;
using LMKit.Data.Storage.Qdrant;
using LMKit.Retrieval;

// Load embedding model
var embedModel = LM.LoadFromModelID("embeddinggemma-300m");

// Connect to Qdrant (docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant)
var store = new QdrantEmbeddingStore(new Uri("http://localhost:6334"));

const string COLLECTION = "Ebooks";

DataSource dataSource;
if (await store.CollectionExistsAsync(COLLECTION))
{
    // Load existing collection from Qdrant
    dataSource = DataSource.LoadFromStore(store, COLLECTION);
}
else
{
    // Create new collection in Qdrant
    dataSource = await DataSource.CreateVectorStoreDataSourceAsync(store, COLLECTION, embedModel);
}

// Use RagEngine with Qdrant-backed DataSource
var ragEngine = new RagEngine(embedModel, vectorStore: store);
ragEngine.AddDataSource(dataSource);

// Import content (automatically stored in Qdrant)
string content = await new HttpClient().GetStringAsync(
    "https://gutenberg.org/cache/epub/1513/pg1513.txt");
ragEngine.ImportText(content,
    new TextChunking() { MaxChunkSize = 500 },
    COLLECTION,
    "Romeo and Juliet");
using LMKit.Model;
using LMKit.Data;
using LMKit.Data.Storage;
using LMKit.Retrieval;

// Load embedding model
var embedModel = LM.LoadFromModelID("embeddinggemma-300m");

// Create FileSystemVectorStore (NEW in 2026.1.1)
// Each collection is stored as a separate .ds file in the directory
var fsStore = new FileSystemVectorStore("./vector-collections");

const string COLLECTION = "Ebooks";

DataSource dataSource;
if (await fsStore.CollectionExistsAsync(COLLECTION))
{
    // Load existing collection (auto-cached in memory)
    dataSource = DataSource.LoadFromStore(fsStore, COLLECTION);
}
else
{
    // Create new collection
    dataSource = await DataSource.CreateVectorStoreDataSourceAsync(fsStore, COLLECTION, embedModel);
}

// FileSystemVectorStore implements the IVectorStore interface,
// so it works with RagEngine just like Qdrant
var ragEngine = new RagEngine(embedModel);
ragEngine.AddDataSource(dataSource);

// Directory structure: ./vector-collections/Ebooks.ds
Console.WriteLine($"Store path: {fsStore.DirectoryPath}");
From desktop tools to enterprise RAG systems, LM-Kit's vector storage adapts to your needs.
Search
Build intelligent search that understands meaning, not just keywords. Index documents, products, or knowledge bases for natural language queries.
Chatbot
Ground LLM responses with relevant context from your corpus. Use RagEngine with FindMatchingPartitions() and QueryPartitions() for accurate answers.
Memory
Give AI agents persistent memory with AgentMemory class. Store facts via SaveInformationAsync() and recall them automatically in conversations.
Documents
Index and retrieve from large document collections with DocumentRag. Support legal discovery, research assistants, and enterprise knowledge management.
Recommend
Find similar items, content, or users based on embedding similarity. Power product recommendations, content discovery, and personalization.
Offline
Ship portable AI modules with embedded vectors using FileSystemVectorStore. Support air-gapped environments and compliance-sensitive scenarios.
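The agent-memory use case can be sketched as below. AgentMemory, SaveInformationAsync(), and the MultiTurnConversation Memory property are named on this page; the constructor arguments, parameter names, and the chat-model ID are assumptions for illustration only.

```csharp
using LMKit.Model;

// Load an embedding model for the memory and a chat model for the agent.
// "your-chat-model-id" is a hypothetical placeholder, not a real model ID.
var embedModel = LM.LoadFromModelID("embeddinggemma-300m");
var chatModel = LM.LoadFromModelID("your-chat-model-id");

// Constructor arguments are assumed; check the API reference.
var memory = new AgentMemory(embedModel);

// Store a fact; it is embedded and indexed for semantic recall.
await memory.SaveInformationAsync("preferences",
    "The user prefers concise answers.");

// Attach the memory so stored facts are recalled automatically in conversation.
var chat = new MultiTurnConversation(chatModel) { Memory = memory };
```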
Core components for building vector storage solutions.
DataSource: Central container for embedding storage. Manages sections, partitions, and metadata. Create with CreateFileDataSource(), CreateInMemoryDataSource(), or LoadFromFile().
RagEngine: Orchestrates RAG workflows. Import text with automatic chunking via ImportText(). Query with FindMatchingPartitions() and QueryPartitions().
IVectorStore: Interface for custom vector storage backends. Implement it for proprietary databases. Methods include CollectionExistsAsync() and CreateCollectionAsync().
FileSystemVectorStore: File-system-based IVectorStore implementation. Persists collections as .ds files in a directory, with automatic caching.
QdrantEmbeddingStore: Qdrant connector implementing IVectorStore. Bridges LM-Kit.NET with Qdrant's high-performance vector database via gRPC.
PartitionSimilarity: Result of a similarity search. Contains the SectionIdentifier, Similarity score, Metadata, and partition content for retrieval workflows.
AgentMemory: Semantic memory for AI agents. SaveInformationAsync() stores facts with embeddings. Integrates with MultiTurnConversation via the Memory property.
TextChunking: Configures text splitting for embeddings. Set MaxChunkSize to control partition size. Used with RagEngine.ImportText() for automatic chunking.
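The RagEngine members listed above combine into a retrieve-then-answer flow, sketched below. The method names come from this page, but the exact signatures of FindMatchingPartitions() and QueryPartitions() are assumptions, and "your-chat-model-id" is a hypothetical placeholder.

```csharp
using System;
using LMKit.Data;
using LMKit.Model;
using LMKit.Retrieval;

// Reopen a previously built file-based DataSource.
var embedModel = LM.LoadFromModelID("embeddinggemma-300m");
var dataSource = DataSource.LoadFromFile("Ebooks.dat", readOnly: true);

var ragEngine = new RagEngine(embedModel);
ragEngine.AddDataSource(dataSource);

// Stage 1: retrieve the partitions most similar to the question
// (the topK parameter name is assumed).
var partitions = ragEngine.FindMatchingPartitions("Who does Juliet love?", topK: 3);

// Stage 2: answer the question grounded on those partitions using a chat model.
var chatModel = LM.LoadFromModelID("your-chat-model-id"); // hypothetical model ID
string answer = ragEngine.QueryPartitions("Who does Juliet love?", partitions, chatModel);
Console.WriteLine(answer);
```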
The right storage strategy is critical to performance, scalability, and developer productivity.
01
Same code works across all storage types. Just change the backend configuration.
02
Local-only and on-prem solutions keep data secure and compliant.
03
From desktop experiments to high-scale RAG systems with millions of vectors.
04
Clean APIs, comprehensive documentation, and consistent patterns across all backends.
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.
Console demo: built-in vector store driving a full RAG pipeline.
Open on GitHub →
How-to guide: Index, embed, query, and rerank with built-in or external stores.
Read the guide →
API reference: The vector-store contract (filesystem, in-memory, Qdrant, custom).
Open the reference →
From in-memory experiments to durable local databases and scalable remote setups, LM-Kit makes switching storage backends effortless.