Leopard Speech-to-Text

On-device voice to text transcription built for regulated industries

Lightweight, enterprise-grade on-device STT that converts voice to text with cloud-level accuracy while meeting the strictest compliance requirements.

What is Leopard Speech-to-Text?

Leopard Speech-to-Text is software that converts audio and video recordings into text with cloud-level accuracy without sending them to the cloud. Because it processes voice data offline and never transmits it to third-party platforms, Leopard Speech-to-Text supports compliance with regulations such as GDPR and HIPAA.

Get started with just a few lines of code

Python:

    import pvleopard

    o = pvleopard.create(access_key)

    transcript, words = o.process_file(path)

Node.js:

    const o = new Leopard(accessKey)

    const { transcript, words } =
      o.processFile(path)

Android:

    Leopard o = new Leopard.Builder()
      .setAccessKey(accessKey)
      .setModelPath(modelPath)
      .build(appContext);

    LeopardTranscript r =
      o.processFile(path);

iOS:

    let o = Leopard(
      accessKey: accessKey,
      modelPath: modelPath)

    let r = o.processFile(path)

Java:

    Leopard o = new Leopard.Builder()
      .setAccessKey(accessKey)
      .build();

    LeopardTranscript r =
      o.processFile(path);

.NET:

    Leopard o =
      Leopard.Create(accessKey);

    LeopardTranscript result =
      o.ProcessFile(path);

React:

    const {
      result,
      isLoaded,
      error,
      init,
      processFile,
      startRecording,
      stopRecording,
      isRecording,
      recordingElapsedSec,
      release,
    } = useLeopard();

    await init(
      accessKey,
      model
    );

    await processFile(audioFile);

    useEffect(() => {
      if (result !== null) {
        // Handle transcript
      }
    }, [result])

Flutter:

    Leopard o = await Leopard.create(
      accessKey,
      modelPath);

    LeopardTranscript result =
      await o.processFile(path);

React Native:

    const o = await Leopard.create(
      accessKey,
      modelPath)

    const {transcript, words} =
      await o.processFile(path)

C:

    pv_leopard_t *leopard = NULL;
    pv_leopard_init(
      access_key,
      model_path,
      enable_automatic_punctuation,
      &leopard);

    char *transcript = NULL;
    int32_t num_words = 0;
    pv_word_t *words = NULL;
    pv_leopard_process_file(
      leopard,
      path,
      &transcript,
      &num_words,
      &words);

Web:

    const leopard =
      await LeopardWorker.
        fromPublicDirectory(
          accessKey,
          modelPath
        );

    const {
      transcript,
      words
    } =
      await leopard.process(pcm);

Why Leopard Speech-to-Text is the best transcription solution for regulated industries

Enterprises in regulated industries, such as healthcare, finance, and defense, must meet strict data privacy and retention policies. Cloud-dependent speech-to-text APIs require enterprises to send their data to a third-party cloud, giving up control over their data and products.

Leopard converts voice to text while keeping all voice data local to the device, supporting compliance with HIPAA, GDPR, SOC 2, and other regulations.

Why choose Leopard Speech-to-Text over other Transcription Engines?

Get started with Leopard Speech-to-Text
The best way to learn about Leopard Speech-to-Text is to use it!
  • Pre-trained models
  • Custom vocabulary
  • Keyword boosting
  • Intuitive SDKs
  • Speaker Diarization
  • Truecasing and Punctuation
  • Word-level Confidence Scores
  • Word-level Timestamps
  • English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish

Frequently asked questions

What are the use cases and applications of Speech-to-Text?
  1. Call Center QA & Compliance: Transcribe customer service recordings in bulk to assess agent performance, ensure compliance (HIPAA, PCI, GDPR), and flag risky interactions securely and at scale.
  2. Legal & Law Enforcement Recordings: Process police interviews, depositions, and court recordings offline to ensure evidence integrity. On-device transcription avoids chain-of-custody issues tied to cloud uploads.
  3. Enterprise Knowledge Management: Automate transcription of internal training sessions, workshops, and engineering reviews into searchable knowledge bases or wiki entries.
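
The bulk-transcription use cases above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `transcribe_directory`, `run`, and the extension list are hypothetical names chosen for this example, and the only Leopard behavior assumed is the `process_file(path)` call returning a transcript and a word list, as in the Python snippet on this page.

```python
from pathlib import Path

# Illustrative subset of audio extensions; check the Leopard documentation
# for the formats your SDK version actually accepts.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3", ".ogg", ".m4a"}


def transcribe_directory(engine, directory):
    """Transcribe every audio file in `directory`.

    `engine` is any object exposing process_file(path) -> (transcript, words).
    Returns a dict mapping file name to transcript, in sorted file order.
    """
    results = {}
    for path in sorted(Path(directory).iterdir()):
        if path.suffix.lower() not in AUDIO_EXTENSIONS:
            continue  # skip non-audio files such as notes or metadata
        transcript, _words = engine.process_file(str(path))
        results[path.name] = transcript
    return results


def run(access_key, directory):
    """Hypothetical entry point: needs `pip install pvleopard` and an AccessKey."""
    import pvleopard  # imported lazily so the helper above has no hard dependency

    leopard = pvleopard.create(access_key=access_key)
    try:
        return transcribe_directory(leopard, directory)
    finally:
        leopard.delete()  # release native resources held by the engine
```

Because all processing happens inside `process_file`, the recordings never leave the machine that runs this script.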

What is speech-to-text?

Speech-to-text (STT), also known as Automatic Speech Recognition (ASR) and Open-Domain Large Vocabulary Speech Recognition (LVSR), refers to the technology and methodologies that convert voice data into text.

How does on-device speech-to-text differ from cloud-based speech-to-text?

Cloud-based speech-to-text APIs send voice data to vendors’ servers, where the transcription engine resides. On-device voice processing brings voice recognition to where the voice data resides, eliminating the upload, network round trip, and third-party handling that cloud processing requires.

What are the benefits of on-device speech-to-text over cloud speech-to-text APIs?

On-device speech-to-text empowers enterprises to retain ownership and control over their data and products. Sending voice data to the cloud has privacy, latency, reliability, and cost implications. On-device speech-to-text overcomes these challenges, bringing control back to enterprises.

How is Leopard Speech-to-Text faster and better than other on-device speech-to-text models like Whisper?

Most on-device speech-to-text solutions rely on pre-trained models or third-party frameworks like PyTorch, ONNX, or TensorFlow for runtime. This reliance limits fine-tuned optimizations and adds unnecessary overhead, restricting performance and adaptability.

In contrast, Leopard Speech-to-Text is built end-to-end by Picovoice’s team, which develops its own proprietary training frameworks and inference engines. This complete control allows for deep optimization, enabling Leopard to deliver cloud-level accuracy directly on-device, without the typical latency, power, or memory costs of deep learning models.

As a result, Leopard Speech-to-Text is:

  • Fully customizable
  • Highly efficient
  • Exceptionally accurate

Does Leopard Speech-to-Text support real-time transcription?

Leopard Speech-to-Text doesn’t, but Cheetah Streaming Speech-to-Text does. Cheetah is Picovoice’s on-device streaming speech-to-text engine that provides text output in real time.

Can I use Leopard Speech-to-Text in the cloud?

Yes. You can run Leopard Speech-to-Text in the cloud, whether private, public, or hybrid. Because Picovoice’s voice recognition technology runs wherever it is deployed, the cloud is an option rather than a requirement, and data doesn’t have to leave the enterprise’s premises regardless of the platform. Picovoice also offers tutorials on serverless speech-to-text with AWS Lambda and on building a transcription microservice with gRPC.

Does Leopard Speech-to-Text support Speaker Diarization?

Yes. Leopard Speech-to-Text embeds an optimized version of Falcon Speaker Diarization to simplify development. Please check the Leopard Speech-to-Text documentation for more information.
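
As a rough sketch of what diarized output can look like in practice, the Python below merges consecutive words from the same speaker into turns. It assumes each word carries a `speaker_tag` field and that diarization is switched on with an `enable_diarization=True` argument to `pvleopard.create`. Both reflect the Python SDK as we understand it, so verify them against the documentation for your version. `Word` is a local stand-in and `group_by_speaker` is a hypothetical helper.

```python
from collections import namedtuple

# Local stand-in for the word objects Leopard returns (field names assumed).
Word = namedtuple("Word", ["word", "start_sec", "end_sec", "confidence", "speaker_tag"])


def group_by_speaker(words):
    """Merge consecutive words that share a speaker tag into speaker turns."""
    turns = []
    for w in words:
        if turns and turns[-1]["speaker_tag"] == w.speaker_tag:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += " " + w.word
            turns[-1]["end_sec"] = w.end_sec
        else:
            # Speaker changed (or first word): open a new turn.
            turns.append({
                "speaker_tag": w.speaker_tag,
                "start_sec": w.start_sec,
                "end_sec": w.end_sec,
                "text": w.word,
            })
    return turns


def transcribe_with_speakers(access_key, path):
    """Hypothetical usage; requires the pvleopard package and an AccessKey."""
    import pvleopard

    # enable_diarization is an assumption about the SDK's keyword argument.
    leopard = pvleopard.create(access_key=access_key, enable_diarization=True)
    try:
        _transcript, words = leopard.process_file(path)
        return group_by_speaker(words)
    finally:
        leopard.delete()
```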

Does Leopard Speech-to-Text perform Truecasing and Punctuation?

Leopard Speech-to-Text performs Truecasing and Punctuation. Please refer to the Leopard Speech-to-Text documentation to enable or disable automatic punctuation.

Does Leopard Speech-to-Text return Word-level Confidence Scores?

Leopard Speech-to-Text returns Word-level Confidence Scores. Please refer to the Leopard Speech-to-Text documentation for more information.
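
One way to use confidence scores is to mark uncertain words for human review. The sketch below assumes each word exposes a `confidence` value between 0 and 1 alongside its text. The `Word` tuple is a local stand-in, and `annotate_low_confidence` and its default threshold are hypothetical choices for illustration.

```python
from collections import namedtuple

# Local stand-in for the word objects Leopard returns (field names assumed).
Word = namedtuple("Word", ["word", "start_sec", "end_sec", "confidence", "speaker_tag"])


def annotate_low_confidence(words, threshold=0.5):
    """Return the transcript with words below `threshold` wrapped as [word?]."""
    marked = []
    for w in words:
        if w.confidence < threshold:
            marked.append(f"[{w.word}?]")  # flag for a human reviewer
        else:
            marked.append(w.word)
    return " ".join(marked)
```

A reviewer then only needs to check the bracketed words rather than listen to the whole recording. The right threshold depends on the audio quality and the cost of an error in your application.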

Does Leopard Speech-to-Text generate Word-level Timestamps?

Leopard Speech-to-Text generates Word-level Timestamps. Please refer to the Leopard Speech-to-Text documentation for more information.
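
Word-level timestamps make it straightforward to produce captions. The following is a minimal sketch that turns a word list into SubRip (SRT) text, assuming each word exposes `start_sec` and `end_sec` in seconds. `Word` is a local stand-in, and `format_timestamp`, `words_to_srt`, and the fixed words-per-cue rule are hypothetical simplifications.

```python
from collections import namedtuple

# Local stand-in for the word objects Leopard returns (field names assumed).
Word = namedtuple("Word", ["word", "start_sec", "end_sec", "confidence", "speaker_tag"])


def format_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group words into cues of at most `max_words` and render them as SRT."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = format_timestamp(chunk[0].start_sec)   # cue starts with first word
        end = format_timestamp(chunk[-1].end_sec)      # and ends with the last word
        text = " ".join(w.word for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

A production captioner would usually also break cues at pauses or punctuation. The fixed word count here just keeps the example short.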

How do I choose the best speech-to-text for my project?

“Best” is a subjective term. Every use case has different business requirements. Several factors, such as accuracy, availability of features, the total cost of ownership, and data privacy and governance, have different weights in different use cases.

Which platforms does Leopard Speech-to-Text support?

Which languages does Leopard Speech-to-Text support?

Leopard Speech-to-Text supports English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish.

What should I do if I need support for other languages?

Reach out to Picovoice Sales to tell us about your commercial endeavor.

How do I get technical support for Leopard Speech-to-Text?

Picovoice docs, blog, Medium posts, and GitHub are great resources to learn about voice AI, Picovoice technology, and how to start building transcription products. Enterprise customers get dedicated support specific to their applications from Picovoice Product & Engineering teams. Existing Picovoice customers can reach out to their contacts, and prospects can purchase Enterprise Support before committing to any paid plan.

How can I get informed about updates and upgrades?

Version changes appear in the changelog and on LinkedIn. Subscribing to GitHub is the best way to get notified of patch releases. If you enjoy building with Leopard Speech-to-Text, show it by giving a GitHub star!