Solutions · AI agents · Streaming

Tokens flow as they are produced.

Modern chat UIs render words the moment the model generates them. Agent UIs need more: a separation between visible content and internal reasoning, hooks for tool calls, and signals when one agent delegates to another. LM-Kit streams them all over a non-blocking channel, with typed token kinds and multi-handler aggregation.

Channel-based · Token kinds · Multi-handler

Content tokens

User-visible response text.

Thinking tokens

Internal reasoning, separable from content.

ToolCall + Delegation

Structured signals for live progress UIs.

Why streaming matters

Latency perception is half the product.

A response that takes seven seconds feels slow. A response that starts showing words after 200 ms feels alive. Streaming flips the perception of speed without making the model any faster. For agents, streaming also surfaces the work being done: which tool the agent is calling, which worker it is delegating to, when reasoning ends and the answer begins.

Non-blocking channels

Tokens flow through a System.Threading.Channels writer. Inference runs at full speed; readers see tokens immediately.
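The pattern can be sketched with nothing but the standard library. This is an illustration of the channel idea using System.Threading.Channels, not LM-Kit's actual internals: the producer writes tokens without ever blocking, and the reader sees each one the moment it lands.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Unbounded channel: writes never block, so inference runs at full speed.
var channel = Channel.CreateUnbounded<string>();

var producer = Task.Run(async () =>
{
    foreach (var token in new[] { "Hel", "lo, ", "world" })
        await channel.Writer.WriteAsync(token);   // hand each token off immediately
    channel.Writer.Complete();                    // signal end of stream
});

await foreach (var token in channel.Reader.ReadAllAsync())
    Console.Write(token);                         // render tokens as they arrive

await producer;
```

The same decoupling holds at any speed: a slow reader backs up the channel, not the model.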

Typed token kinds

AgentStreamTokenType distinguishes Content, Thinking, ToolCall, and ToolResult tokens. UIs can render or hide each kind independently.

Orchestration-aware

Orchestrator streams emit Delegation tokens with from/to/task metadata. Multi-agent UIs show which worker is talking.
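A multi-agent UI consumes those signals by branching on the token kind. The sketch below is illustrative only: the local record stands in for the real stream token, and the From/To/Task fields mirror the "from/to/task metadata" described above rather than the SDK's verified property names.

```csharp
using System;
using System.Collections.Generic;

var stream = new List<Token>
{
    new("Delegation", From: "planner", To: "researcher", Task: "gather sources"),
    new("Content", Text: "Here is what I found."),
};

foreach (var t in stream)
{
    if (t.Kind == "Delegation")
        Console.WriteLine($"[{t.From} -> {t.To}] {t.Task}");   // show which worker takes over
    else if (t.Kind == "Content")
        Console.WriteLine(t.Text);                             // normal visible output
}

// Stand-in for the real stream token type (illustrative shape only).
record Token(string Kind, string? Text = null,
             string? From = null, string? To = null, string? Task = null);
```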

Multi-handler

MulticastStreamHandler fans tokens out to the UI, a log file, and analytics simultaneously, without iterating the stream twice.
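The fan-out idea is a single pass that feeds every sink. This stdlib sketch illustrates the pattern behind MulticastStreamHandler; it is not the actual class.

```csharp
using System;
using System.Collections.Generic;

var ui = new List<string>();     // sink 1: what the user sees
var log = new List<string>();    // sink 2: persisted transcript
var handlers = new List<Action<string>> { ui.Add, log.Add };

foreach (var token in new[] { "one ", "two ", "three" })   // single iteration of the stream
    foreach (var handle in handlers)                       // fan out to every sink
        handle(token);

Console.WriteLine(string.Concat(ui));
Console.WriteLine(log.Count);
```

Because the source is consumed exactly once, this works even for streams that cannot be replayed.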

Built-in adapters

TextWriterStreamHandler writes to console or file. DelegateStreamHandler wraps any callback. Plug straight into SignalR or Server-Sent Events.
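Bridging to Server-Sent Events amounts to framing each token as one SSE message. A minimal framing sketch, under the assumption that a real endpoint writes these lines to the HTTP response as tokens arrive:

```csharp
using System;

// SSE wire format: a "data:" field terminated by a blank line.
static string ToSse(string token) => $"data: {token}\n\n";

foreach (var token in new[] { "Hello", "world" })
    Console.Write(ToSse(token));
```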

Final result

AgentStreamResult aggregates the run. After the stream completes, you still have a final Content, Thinking, and ToolCalls snapshot.
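The aggregation itself is simple to picture: accumulate each kind of token while rendering live, and the full snapshot is ready when the stream completes. AgentStreamResult does this for you; the tuple shape below is illustrative.

```csharp
using System;
using System.Text;

var content = new StringBuilder();
var thinking = new StringBuilder();

var stream = new (string Kind, string Text)[]
{
    ("Thinking", "outline the answer"),
    ("Content", "HNSW builds a layered "),
    ("Content", "proximity graph."),
};

foreach (var (kind, text) in stream)
{
    if (kind == "Content") content.Append(text);        // visible answer
    else if (kind == "Thinking") thinking.Append(text); // reasoning trace
}

Console.WriteLine(content);    // full answer, available after the stream ends
Console.WriteLine(thinking);   // full reasoning snapshot
```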

Stream an agent

A few lines, live UI.

Receive content, reasoning, and tool-call tokens from a single agent as they emerge.

StreamAgent.cs
using LMKit.Agents;
using LMKit.Agents.Streaming;

var agent = Agent.CreateBuilder(model).Build();

await foreach (var token in agent.StreamAsync("Explain HNSW indexing in two paragraphs."))
{
    switch (token.Type)
    {
        case AgentStreamTokenType.Content:
            Console.Write(token.Text);                // render in UI
            break;
        case AgentStreamTokenType.Thinking:
            // optionally show in a collapsible "reasoning" panel
            break;
        case AgentStreamTokenType.ToolCall:
            Console.WriteLine($"[calling {token.ToolName}]");
            break;
    }
}
Related capabilities

Streams across the stack.

Reasoning

Thinking tokens connect to the reasoning level controls and Chain-of-Thought handlers.

Reasoning page

Multi-agent workflows

Every orchestrator supports streaming. Delegation tokens identify which worker is producing output.

Multi-agent page

Observability

Token-level events flow into OpenTelemetry spans for forensic analysis.

Observability page

Stream agent responses guide

Step-by-step recipe for SignalR, Blazor, and Server-Sent Events.

How-to guide

Demos & docs

Build it. Read it. Try it.

Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.

Real-time. On-device.

Get Community Edition · Download