Lightweight, on-device streaming speech-to-text that transcribes naturally spoken language, with the speed, privacy, and control real-time applications demand.
Cheetah Streaming Speech-to-Text is on-device transcription software that transcribes voice data in real time, without network delay and without compromising accuracy. It processes voice data locally, enabling live transcription on-device, on mobile, in web browsers, on-premises, or in the cloud.
# Python
o = pvcheetah.create(access_key)

partial_transcript, is_endpoint = o.process(get_next_audio_frame())
// Node.js
const o = new Cheetah(accessKey);

const [partialTranscript, isEndpoint] = o.process(audioFrame);
// Android (Java)
Cheetah o = new Cheetah.Builder()
    .setAccessKey(accessKey)
    .setModelPath(modelPath)
    .build(appContext);

CheetahTranscript partialResult =
    o.process(getNextAudioFrame());
// iOS (Swift)
let cheetah = try Cheetah(
    accessKey: accessKey,
    modelPath: modelPath)

// The result is destructured as a tuple
let (partialTranscript, isEndpoint) = try cheetah.process(
    getNextAudioFrame())
// Java
Cheetah o = new Cheetah.Builder()
    .setAccessKey(accessKey)
    .build();

CheetahTranscript r =
    o.process(getNextAudioFrame());
// .NET (C#)
Cheetah o =
    Cheetah.Create(accessKey);

CheetahTranscript partialResult =
    o.Process(GetNextAudioFrame());
// React
const {
  result,
  isLoaded,
  isListening,
  error,
  init,
  start,
  stop,
  release,
} = useCheetah();

await init(
  accessKey,
  model
);

await start();
await stop();

useEffect(() => {
  if (result !== null) {
    // Handle transcript
  }
}, [result])
// Flutter (Dart)
_cheetah = await Cheetah.create(
    accessKey,
    modelPath);

CheetahTranscript partialResult =
    await _cheetah.process(
        getAudioFrame());
// React Native
const cheetah = await Cheetah.create(
  accessKey,
  modelPath)

const partialResult =
  await cheetah.process(
    getAudioFrame())
// C
pv_cheetah_t *cheetah = NULL;
pv_cheetah_init(
    access_key,
    model_file_path,
    endpoint_duration_sec,
    enable_automatic_punctuation,
    &cheetah);

const int16_t *pcm = get_next_audio_frame();

char *partial_transcript = NULL;
bool is_endpoint = false;
const pv_status_t status = pv_cheetah_process(
    cheetah,
    pcm,
    &partial_transcript,
    &is_endpoint);
// Web (JavaScript)
const cheetah =
  await CheetahWorker.create(
    accessKey,
    (cheetahTranscript) => {
      // callback
    },
    {
      base64: cheetahParams,
      // or
      publicPath: modelPath,
    }
  );

WebVoiceProcessor.subscribe(cheetah);
Cloud-based transcription APIs introduce latency by sending audio to external servers, making them vulnerable to network delays, throttling, and outages. On-device engines with large model sizes or without true streaming architectures can also lag, adding compute-related latency.
Cheetah Streaming Speech-to-Text avoids both pitfalls: it’s a lightweight, on-device streaming engine that processes audio instantly at the point of capture, delivering real-time transcription with guaranteed response time.
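One rough way to check for the compute-related latency described above is to measure a real-time factor: processing time divided by the duration of the audio processed. The sketch below is illustrative only. `real_time_factor` is a hypothetical helper, not part of any Picovoice SDK, and the 512-sample frame length and 16 kHz sample rate in the usage example are assumed values that would normally be read from the engine itself.

```python
import time
from typing import Callable, Iterable, Sequence


def real_time_factor(
    process: Callable[[Sequence[int]], object],
    frames: Iterable[Sequence[int]],
    frame_length: int,
    sample_rate: int,
) -> float:
    """Return processing time divided by audio duration.

    A value below 1.0 means the engine keeps up with live audio.
    `process` is any callable that consumes one frame of PCM samples.
    """
    frame_count = 0
    start = time.perf_counter()
    for frame in frames:
        process(frame)
        frame_count += 1
    elapsed = time.perf_counter() - start

    audio_seconds = frame_count * frame_length / sample_rate
    if audio_seconds == 0:
        raise ValueError("no audio frames were processed")
    return elapsed / audio_seconds


if __name__ == "__main__":
    # Assumed values for illustration: 100 silent frames of 512 samples at 16 kHz.
    silent_frames = [[0] * 512 for _ in range(100)]
    # A no-op stands in for the engine's process call here.
    rtf = real_time_factor(lambda frame: None, silent_frames, 512, 16000)
    print(f"real-time factor: {rtf:.6f}")
```

With a real engine, `process` would be the engine's own frame-processing call. A factor well below 1.0 leaves headroom for the rest of the application on the same device.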
Real-time transcription, also known as real-time speech-to-text, streaming transcription, streaming speech-to-text, live transcription, or live speech-to-text, refers to the technology and tools that convert audio streams to text synchronously with audio generation.
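To make "synchronously with audio generation" concrete, here is a minimal sketch of a streaming loop in Python. It assumes an engine object with a `process(frame)` method returning a partial transcript and an endpoint flag, as in the Python snippet earlier on this page, plus a `flush()` method that returns any buffered text. `transcribe_stream` and `FakeEngine` are hypothetical names introduced only for this illustration; the fake engine replays scripted results so the example runs without a microphone or an access key.

```python
from typing import Iterable, Iterator, List, Protocol, Sequence, Tuple


class StreamingEngine(Protocol):
    """The minimal interface this sketch relies on (an assumption)."""

    def process(self, pcm: Sequence[int]) -> Tuple[str, bool]: ...

    def flush(self) -> str: ...


def transcribe_stream(
    engine: StreamingEngine, frames: Iterable[Sequence[int]]
) -> List[str]:
    """Feed frames as they arrive and return one string per utterance."""
    utterances: List[str] = []
    current = ""
    for frame in frames:
        partial, is_endpoint = engine.process(frame)
        current += partial  # partial text is usable as soon as it is returned
        if is_endpoint:
            current += engine.flush()  # collect any buffered remainder
            if current:
                utterances.append(current)
            current = ""
    # End of stream: flush once more so trailing text is not lost.
    current += engine.flush()
    if current:
        utterances.append(current)
    return utterances


class FakeEngine:
    """Hypothetical stand-in that replays scripted (text, endpoint) pairs."""

    def __init__(self, script: Sequence[Tuple[str, bool]]) -> None:
        self._script: Iterator[Tuple[str, bool]] = iter(script)

    def process(self, pcm: Sequence[int]) -> Tuple[str, bool]:
        return next(self._script)

    def flush(self) -> str:
        return ""


if __name__ == "__main__":
    engine = FakeEngine([("hello ", False), ("world", True), ("bye", False)])
    frames = [[0] * 512 for _ in range(3)]  # assumed frame size, silent audio
    print(transcribe_stream(engine, frames))
```

In a live application the partial text would typically be displayed as it arrives rather than collected into a list; the list is used here only to keep the example easy to inspect.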
Cloud-based real-time transcription APIs record voice data and send it to the vendor's servers, where the transcription engine converts it to text. On-device real-time transcription brings the transcription engine to where the voice data is, offering a guaranteed real-time experience by eliminating unpredictable network delays.
Cloud-based real-time transcription converts voice data into text with a delay caused by network latency and connectivity issues. On-device real-time transcription eliminates these inherent latency and reliability limitations by processing voice data on the device without sending it to a third-party cloud. For time-sensitive applications, such as agent assistance, medical dictation, or meeting transcription, delays hurt both the experience and productivity. A recent study on delays in virtual communication depicts internet lag as a wrench in mental gears.
Yes. You can run Cheetah Streaming Speech-to-Text in the cloud, whether private, public, or hybrid. Picovoice on-device voice recognition technology allows enterprises to decide where to run the transcription engine instead of making the Picovoice cloud mandatory for voice processing.
Key metrics for evaluating real-time transcription engines are latency, reliability and resiliency, accuracy, feature availability, total cost of ownership, and data privacy and governance. The weight given to each metric can differ from project to project, even within the same company.
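As a small worked example of weighting these metrics differently per project, the sketch below computes a weighted average score. Everything in it is an assumption made for illustration: the `weighted_score` helper, the 1 to 5 scoring scale, and all the numbers, which are invented and do not describe any real engine.

```python
from typing import Dict


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-metric scores, all on the same scale."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score for metrics: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[m] * w for m, w in weights.items()) / total_weight


if __name__ == "__main__":
    # Hypothetical scores (1 = poor, 5 = excellent) for one candidate engine.
    candidate = {
        "latency": 5, "reliability": 4, "accuracy": 4,
        "features": 3, "cost": 4, "privacy": 5,
    }
    # Two hypothetical projects that weight the same metrics differently.
    live_captioning = {
        "latency": 5, "reliability": 3, "accuracy": 3,
        "features": 1, "cost": 2, "privacy": 2,
    }
    medical_dictation = {
        "latency": 2, "reliability": 3, "accuracy": 5,
        "features": 2, "cost": 1, "privacy": 5,
    }
    print("live captioning:", round(weighted_score(candidate, live_captioning), 2))
    print("medical dictation:", round(weighted_score(candidate, medical_dictation), 2))
```

The same candidate can rank differently across projects once the weights change, which is why a single overall score is rarely meaningful on its own.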
Yes. Cheetah Streaming Speech-to-Text is trained on diverse audio conditions including background noise, multiple speakers, and various accents. For specialized environments or specific accent patterns, custom training is available for Enterprise Plan customers via Picovoice Consulting to optimize performance for specific use cases.
Cheetah Streaming Speech-to-Text runs entirely within your infrastructure, eliminating external dependencies that could cause outages. You can deploy across multiple instances, regions, or availability zones using standard load balancing and failover strategies.
Yes, you can train custom speech-to-text models on Picovoice Console to optimize Cheetah Streaming Speech-to-Text for specific industries, terminologies, or use cases. This includes medical terminology, legal language, technical jargon, or company-specific vocabulary.
Cheetah Streaming Speech-to-Text can handle multiple languages through separate instances, each loaded with a language-specific model. Enterprise Plan customers can work with Picovoice Consulting to have custom models trained for multilingual transcription or automatic language detection.
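The "separate instances" approach can be sketched as a small pool that creates one engine per language on first use. This is an illustration under stated assumptions, not a Picovoice API: `EnginePool` is a hypothetical class, the model file names are placeholders, and the factory is injected so the example runs without the SDK installed.

```python
from typing import Callable, Dict, Generic, TypeVar

E = TypeVar("E")


class EnginePool(Generic[E]):
    """Lazily creates and caches one engine instance per language code."""

    def __init__(self, model_paths: Dict[str, str], factory: Callable[[str], E]) -> None:
        self._model_paths = dict(model_paths)
        self._factory = factory
        self._engines: Dict[str, E] = {}

    def get(self, language: str) -> E:
        if language not in self._model_paths:
            raise KeyError(f"no model configured for language {language!r}")
        if language not in self._engines:
            # First request for this language: build the engine from its model file.
            self._engines[language] = self._factory(self._model_paths[language])
        return self._engines[language]


if __name__ == "__main__":
    # Placeholder model file names, for illustration only.
    paths = {"en": "cheetah_params_en.pv", "fr": "cheetah_params_fr.pv"}
    # The factory here just labels the path; a real one would create an engine.
    pool = EnginePool(paths, factory=lambda path: f"engine({path})")
    print(pool.get("fr"))
    print(pool.get("fr") is pool.get("fr"))  # the cached instance is reused
```

With the Python SDK, the factory could be something like `lambda path: pvcheetah.create(access_key=access_key, model_path=path)`. The language code must be known up front, since this sketch does no automatic language detection.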
Cheetah Streaming Speech-to-Text currently supports English, French, German, Italian, Portuguese, and Spanish.
Reach out to Picovoice Sales to tell us about your commercial endeavor.
Picovoice docs, blog, Medium posts, and GitHub are great resources to learn about voice AI, Picovoice technology, and how to start building transcription products. Enterprise customers get dedicated, application-specific support from the Picovoice Product & Engineering teams. Existing customers can reach out to their Picovoice contacts, and prospects can purchase Enterprise Support before committing to any paid plan.