Solutions · AI agents · Streaming

Tokens flow as they are produced.

Modern chat UIs render words the moment the model generates them. Agent UIs need more: a separation between visible content and internal reasoning, hooks for tool calls, and signals when one agent delegates to another. LM-Kit streams them all over a non-blocking channel, with typed token kinds and multi-handler aggregation.

Channel-based · Token kinds · Multi-handler

Content tokens

User-visible response text.

Thinking tokens

Internal reasoning, separable from content.

ToolCall + Delegation

Structured signals for live progress UIs.

Why streaming matters

Latency perception is half the product.

A response that takes seven seconds feels slow. A response that starts showing words after 200 ms feels alive. Streaming flips the perception of speed without making the model any faster. For agents, streaming also surfaces the work being done: which tool the agent is calling, which worker it is delegating to, when reasoning ends and the answer begins.

Non-blocking channels

Tokens flow through a System.Threading.Channels writer. Inference runs at full speed; readers see tokens immediately.
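The pattern can be sketched with nothing but the standard library. This is an illustration of the channel idea using System.Threading.Channels, not LM-Kit's actual internals: the producer writes tokens without ever blocking, and the reader sees each one the moment it lands.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Unbounded channel: writes never block, so inference runs at full speed.
var channel = Channel.CreateUnbounded<string>();

var producer = Task.Run(async () =>
{
    foreach (var token in new[] { "Hel", "lo, ", "world" })
        await channel.Writer.WriteAsync(token);   // hand each token off immediately
    channel.Writer.Complete();                    // signal end of stream
});

await foreach (var token in channel.Reader.ReadAllAsync())
    Console.Write(token);                         // render tokens as they arrive

await producer;
```

The same decoupling holds at any speed: a slow reader backs up the channel, not the model.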

Typed token kinds

AgentStreamTokenType distinguishes Content, Thinking, ToolCall, and ToolResult tokens. UIs can render or hide each kind independently.

Orchestration-aware

Orchestrator streams emit Delegation tokens with from/to/task metadata. Multi-agent UIs show which worker is talking.
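A multi-agent UI consumes those signals by branching on the token kind. The sketch below is illustrative only: the local record stands in for the real stream token, and the From/To/Task fields mirror the "from/to/task metadata" described above rather than the SDK's verified property names.

```csharp
using System;
using System.Collections.Generic;

var stream = new List<Token>
{
    new("Delegation", From: "planner", To: "researcher", Task: "gather sources"),
    new("Content", Text: "Here is what I found."),
};

foreach (var t in stream)
{
    if (t.Kind == "Delegation")
        Console.WriteLine($"[{t.From} -> {t.To}] {t.Task}");   // show which worker takes over
    else if (t.Kind == "Content")
        Console.WriteLine(t.Text);                             // normal visible output
}

// Stand-in for the real stream token type (illustrative shape only).
record Token(string Kind, string? Text = null,
             string? From = null, string? To = null, string? Task = null);
```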

Multi-handler

MulticastStreamHandler fans tokens out to the UI, a log file, and analytics simultaneously, without iterating the stream twice.
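The fan-out idea is a single pass that feeds every sink. This stdlib sketch illustrates the pattern behind MulticastStreamHandler; it is not the actual class.

```csharp
using System;
using System.Collections.Generic;

var ui = new List<string>();     // sink 1: what the user sees
var log = new List<string>();    // sink 2: persisted transcript
var handlers = new List<Action<string>> { ui.Add, log.Add };

foreach (var token in new[] { "one ", "two ", "three" })   // single iteration of the stream
    foreach (var handle in handlers)                       // fan out to every sink
        handle(token);

Console.WriteLine(string.Concat(ui));
Console.WriteLine(log.Count);
```

Because the source is consumed exactly once, this works even for streams that cannot be replayed.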

Built-in adapters

TextWriterStreamHandler writes to console or file. DelegateStreamHandler wraps any callback. Plug straight into SignalR or Server-Sent Events.
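Bridging to Server-Sent Events amounts to framing each token as one SSE message. A minimal framing sketch, under the assumption that a real endpoint writes these lines to the HTTP response as tokens arrive:

```csharp
using System;

// SSE wire format: a "data:" field terminated by a blank line.
static string ToSse(string token) => $"data: {token}\n\n";

foreach (var token in new[] { "Hello", "world" })
    Console.Write(ToSse(token));
```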

Final result

AgentStreamResult aggregates the run. After the stream completes, you still have a final Content, Thinking, and ToolCalls snapshot.
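The aggregation itself is simple to picture: accumulate each kind of token while rendering live, and the full snapshot is ready when the stream completes. AgentStreamResult does this for you; the tuple shape below is illustrative.

```csharp
using System;
using System.Text;

var content = new StringBuilder();
var thinking = new StringBuilder();

var stream = new (string Kind, string Text)[]
{
    ("Thinking", "outline the answer"),
    ("Content", "HNSW builds a layered "),
    ("Content", "proximity graph."),
};

foreach (var (kind, text) in stream)
{
    if (kind == "Content") content.Append(text);        // visible answer
    else if (kind == "Thinking") thinking.Append(text); // reasoning trace
}

Console.WriteLine(content);    // full answer, available after the stream ends
Console.WriteLine(thinking);   // full reasoning snapshot
```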

Stream an agent

A few lines, live UI.

Receive content, reasoning, and tool-call tokens from a single agent as they emerge.

StreamAgent.cs
using LMKit.Agents;
using LMKit.Agents.Streaming;

var agent = Agent.CreateBuilder(model).Build();

await foreach (var token in agent.StreamAsync("Explain HNSW indexing in two paragraphs."))
{
    switch (token.Type)
    {
        case AgentStreamTokenType.Content:
            Console.Write(token.Text);                // render in UI
            break;
        case AgentStreamTokenType.Thinking:
            // optionally show in a collapsible "reasoning" panel
            break;
        case AgentStreamTokenType.ToolCall:
            Console.WriteLine($"[calling {token.ToolName}]");
            break;
    }
}
Related capabilities

Streams across the stack.

Reasoning

Thinking tokens connect to the reasoning level controls and Chain-of-Thought handlers.

Reasoning page

Multi-agent workflows

Every orchestrator supports streaming. Delegation tokens identify which worker is producing output.

Multi-agent page

Observability

Token-level events flow into OpenTelemetry spans for forensic analysis.

Observability page

Stream agent responses guide

Step-by-step recipe for SignalR, Blazor, and Server-Sent Events.

How-to guide

Demos & docs

Build it. Read it. Try it.

Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.

Real-time. On-device.

Get Community Edition · Download