Speech-to-Text Support in LM-Kit.NET | Powerful Local Audio Processing for .NET

Introduction

LM-Kit already lets .NET developers integrate advanced text and vision processing into their applications. Today, we’re excited to extend these capabilities with on-device speech recognition and audio analysis.

Whether you’re building transcription tools for customer support, multilingual accessibility features, or local voice-controlled interfaces, LM-Kit.NET’s new audio capabilities provide the flexibility and performance to get started quickly, without compromising on privacy.

A New Dimension: Audio Processing Made Easy

LM-Kit.NET now enables .NET developers to leverage cutting-edge audio processing features without relying on cloud providers. Your data stays local, secure, and under your complete control.

Key audio capabilities include:

On-Device AI Transcription: Fast, secure, and accurate speech-to-text directly on your device.
Voice Activity Detection (VAD): Precisely detect speech segments with SileroVAD 5, customizable for specialized use cases.
Automatic Language Detection: Quickly identify the language from your audio input.
Real-Time Speech Translation: Instantly translate speech into English from over 100 supported languages.
Universal WAV Compatibility: Effortlessly process audio with standard WAV file support.
Batch Processing: Efficiently handle multiple audio streams simultaneously with built-in multithreading.

Powering Speech Recognition with Whisper

LM-Kit integrates Whisper v3 models, offering a balance of accuracy and speed optimized for local execution. We’ve included quantized versions of Whisper models in our catalog to maximize performance on any hardware setup.

Quick Start: Transcribe Audio with Ease

With LM-Kit, converting audio to text is straightforward. Here are some quick examples:

Identify the spoken language from any WAV file before deciding how to handle it.

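    // Load a Whisper model from the LM-Kit catalog by its model ID.
    var model = LM.LoadFromModelID("whisper-large-turbo3");
    var wavFile = new WaveFile(@"d:\discussion.wav");
    var engine = new SpeechToText(model);

    // Detect the spoken language without running a full transcription.
    var language = engine.DetectLanguage(wavFile);
    Console.WriteLine($"Detected language: {language}");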

Stream transcribed segments as they emerge from the model, no buffering required.

    var model = LM.LoadFromModelID("whisper-large-turbo3");
    var wavFile = new WaveFile(@"d:\discussion.wav");
    var engine = new SpeechToText(model);

    // Print each segment as soon as the engine raises it.
    engine.OnNewSegment += (sender, e) => Console.WriteLine(e.Segment);

    // Transcribe the file; the return value holds the transcription result.
    var transcription = engine.Transcribe(wavFile);

Flip the engine into translation mode to convert any of 100+ source languages into English in one pass.

    var model = LM.LoadFromModelID("whisper-large3");
    var wavFile = new WaveFile(@"d:\discussion.wav");

    // Translation mode outputs English text instead of the source language.
    SpeechToText engine = new(model)
    {
        Mode = SpeechToText.SpeechToTextMode.Translation
    };

    // Print each translated segment as soon as the engine raises it.
    engine.OnNewSegment += (sender, e) => Console.WriteLine(e.Segment);
    var transcription = engine.Transcribe(wavFile);
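
The batch processing capability listed earlier can be approximated with the same calls used in the examples above. The following is only a sketch under several assumptions: the `using LMKit.*` namespaces, the `d:\recordings` folder, the `BatchTranscription` class name, and the degree of parallelism are all illustrative, and it assumes that one loaded model can safely back several `SpeechToText` engines running on different threads. Check the SpeechToText documentation before relying on that last point.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Text;
using System.Threading.Tasks;
// Assumed namespaces: adjust to wherever LM, WaveFile and SpeechToText
// live in the LM-Kit.NET version you have installed.
using LMKit.Model;
using LMKit.Media.Audio;
using LMKit.Speech;

class BatchTranscription
{
    static void Main()
    {
        // Hypothetical input folder, for illustration only.
        string[] wavPaths = Directory.GetFiles(@"d:\recordings", "*.wav");

        // Load the model once and reuse it for every file
        // (assumes the model can be shared across engines).
        var model = LM.LoadFromModelID("whisper-large-turbo3");
        var results = new ConcurrentDictionary<string, string>();

        Parallel.ForEach(
            wavPaths,
            new ParallelOptions { MaxDegreeOfParallelism = 2 },
            path =>
            {
                // One engine and one buffer per file, so no engine
                // state is shared between worker threads.
                var engine = new SpeechToText(model);
                var text = new StringBuilder();
                engine.OnNewSegment += (sender, e) => text.AppendLine($"{e.Segment}");

                engine.Transcribe(new WaveFile(path));
                results[path] = text.ToString();
            });

        foreach (var pair in results)
        {
            Console.WriteLine($"--- {pair.Key} ---");
            Console.WriteLine(pair.Value);
        }
    }
}
```

Collecting text through `OnNewSegment` keeps the sketch limited to members that appear in the examples above. If the library already parallelizes work internally, a plain sequential loop over the files may perform just as well.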

Try It Yourself: No Registration Required

Curious to see how it works? Check out our demo repository:

👉 LM-Kit.NET Speech-to-Text Demo Repository

Download, run the example, and experience powerful speech-to-text locally, completely registration-free.

What's Next?

We’re continuously enhancing our speech processing capabilities. Upcoming features include:

Real-Time Recognition: Continuous streaming with minimal delay.
Speaker Diarization: Distinguish between multiple speakers in a single recording.
Expanded VAD Logic: Improved speech detection in noisy environments.
More Models: Broader support for open and commercial STT models.

Unleash the Power of Local Speech AI

LM-Kit.NET empowers your applications with state-of-the-art audio processing while preserving your data privacy and control. Upgrade your apps today, and transform audio data into actionable insights effortlessly.

Get Started Today!
