Modern chat UIs render words the moment the model generates them. Agent UIs need more: a separation between visible content and internal reasoning, hooks for tool calls, and signals when one agent delegates to another. LM-Kit streams them all over a non-blocking channel, with typed token kinds and multi-handler aggregation.
Content tokens: User-visible response text.
Thinking tokens: Internal reasoning, separable from content.
ToolCall + Delegation: Structured signals for live progress UIs.
A response that takes seven seconds feels slow. A response that starts showing words after 200 ms feels alive. Streaming flips the perception of speed without making the model any faster. For agents, streaming also surfaces the work being done: which tool the agent is calling, which worker it is delegating to, when reasoning ends and the answer begins.
Tokens flow through a System.Threading.Channels writer. Inference runs at full speed; readers see tokens immediately.
AgentStreamTokenType distinguishes Content, Thinking, ToolCall, and ToolResult. UIs can render or hide each kind independently.
Orchestrator streams emit Delegation tokens with from/to/task metadata. Multi-agent UIs show which worker is talking.
MulticastStreamHandler fans tokens out to the UI, a log file, and analytics simultaneously, with no need to iterate the stream twice.
TextWriterStreamHandler writes to console or file. DelegateStreamHandler wraps any callback. Plug straight into SignalR or Server-Sent Events.
AgentStreamResult aggregates the run. After the stream completes, you still have a final Content, Thinking, and ToolCalls snapshot; the sketch below shows the handlers and this aggregation together.
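Putting the handler pieces together: the sketch below fans tokens out to the console, a log file, and a custom callback, then reads the aggregated snapshot once the run finishes. The class names come from this page, but the constructor shapes and the RunAsync overload that accepts a handler are assumptions for illustration; check the API reference for the exact signatures.

using LMKit.Agents;
using LMKit.Agents.Streaming;

var agent = Agent.CreateBuilder(model).Build();

// Fan every token out to three sinks at once: the console, a log file,
// and an arbitrary callback (e.g. a SignalR hub or an analytics pipeline).
using var logWriter = File.CreateText("agent-run.log");
var handler = new MulticastStreamHandler(
    new TextWriterStreamHandler(Console.Out),
    new TextWriterStreamHandler(logWriter),
    new DelegateStreamHandler(token =>
    {
        // custom sink: push token.Text wherever it is needed
    }));

// Hypothetical overload that runs the agent while the handler receives tokens live.
AgentStreamResult result = await agent.RunAsync("Summarize the quarterly report.", handler);

// After the stream completes, the aggregated snapshot is still available.
Console.WriteLine(result.Content);
Console.WriteLine($"Tool calls recorded: {result.ToolCalls.Count}");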
Receive content, reasoning, and tool-call tokens from a single agent as they emerge.
using LMKit.Agents;
using LMKit.Agents.Streaming;

var agent = Agent.CreateBuilder(model).Build();

await foreach (var token in agent.StreamAsync("Explain HNSW indexing in two paragraphs."))
{
    switch (token.Type)
    {
        case AgentStreamTokenType.Content:
            Console.Write(token.Text);   // render in UI
            break;
        case AgentStreamTokenType.Thinking:
            // optionally show in a collapsible "reasoning" panel
            break;
        case AgentStreamTokenType.ToolCall:
            Console.WriteLine($"[calling {token.ToolName}]");
            break;
    }
}
Multi-agent orchestrators stream delegation events plus content, so the UI can surface who is doing what.
using LMKit.Agents.Orchestration;

var supervisor = new SupervisorOrchestrator(boss, classifier, drafter, reviewer);

// Orchestration streams expose delegation tokens.
await foreach (var token in supervisor.StreamAsync("Process this incoming email."))
{
    if (token.Type == OrchestrationStreamTokenType.Delegation)
    {
        Console.WriteLine($"-> delegating to {token.ToAgent}: {token.Task}");
    }
    else if (token.Type == OrchestrationStreamTokenType.Content)
    {
        Console.Write(token.Text);
    }
}
Thinking tokens connect to the reasoning level controls and Chain-of-Thought handlers.
Every orchestrator supports streaming. Delegation tokens identify which worker is producing output.
Token-level events flow into OpenTelemetry spans for forensic analysis.
Step-by-step recipes for SignalR, Blazor, and Server-Sent Events; a minimal SignalR hub sketch follows below.
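As a taste of those recipes, here is a minimal SignalR sketch that forwards content tokens to a connected client. The agent API (Agent, StreamAsync, AgentStreamTokenType) is taken from the example above; the hub class, method name, and dependency-injection wiring are illustrative assumptions rather than the documented recipe.

using System.Runtime.CompilerServices;
using Microsoft.AspNetCore.SignalR;
using LMKit.Agents;
using LMKit.Agents.Streaming;

public class AgentHub : Hub
{
    private readonly Agent _agent;

    public AgentHub(Agent agent) => _agent = agent;   // assumes the agent is registered in DI

    // SignalR streaming hub method: the connected client receives each
    // content token as soon as the model produces it.
    public async IAsyncEnumerable<string> Ask(
        string prompt,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        await foreach (var token in _agent.StreamAsync(prompt).WithCancellation(cancellationToken))
        {
            if (token.Type == AgentStreamTokenType.Content)
                yield return token.Text;
        }
    }
}

A JavaScript client consumes this with connection.stream("Ask", prompt).subscribe(...), rendering each string the moment it arrives.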
Working console demos on GitHub, step-by-step how-to guides on the docs site, and the API reference for the classes used on this page.