Cheetah Speech-to-Text
Python API

API Reference for the Python Cheetah SDK (PyPI).

pvcheetah.`create()`

def create(
        access_key: str,
        model_path: Optional[str] = None,
        device: Optional[str] = None,
        library_path: Optional[str] = None,
        endpoint_duration_sec: Optional[float] = None,
        enable_automatic_punctuation: bool = False) -> Cheetah

Factory method for Cheetah Speech-to-Text engine.

Parameters

access_key str : AccessKey obtained from Picovoice Console.
model_path Optional[str] : Absolute path to the file containing model parameters.
device Optional[str] : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine runs on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
library_path Optional[str] : Absolute path to Cheetah's dynamic library.
endpoint_duration_sec Optional[float] : Duration of endpoint in seconds. A speech endpoint is detected when there is a chunk of audio (with a duration specified herein) after an utterance without any speech in it. Set to None to disable endpoint detection.
enable_automatic_punctuation bool : Set to True to enable automatic punctuation insertion.

Returns

Cheetah: An instance of Cheetah Speech-to-Text engine.

Throws

CheetahError
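
The device string formats described for the device parameter can be assembled programmatically. The following is a minimal sketch, not part of the SDK: build_device_string is a hypothetical helper name, and ${ACCESS_KEY} in the commented call is a placeholder.

```python
from typing import Optional


def build_device_string(kind: str, value: Optional[int] = None) -> str:
    """Return a device string in one of the forms described for `device`.

    kind  -- "best", "gpu" or "cpu"
    value -- GPU index for "gpu", thread count for "cpu"; ignored for "best"
    """
    kind = kind.lower()
    if kind not in ("best", "gpu", "cpu"):
        raise ValueError(f"unsupported device kind: {kind!r}")
    if kind == "best" or value is None:
        return kind
    # A GPU index may be 0, but a thread count must be at least 1.
    if value < 0 or (kind == "cpu" and value == 0):
        raise ValueError(f"invalid value for {kind}: {value}")
    return f"{kind}:{value}"


# Hypothetical usage with the SDK installed:
#   import pvcheetah
#   cheetah = pvcheetah.create(
#       access_key="${ACCESS_KEY}",
#       device=build_device_string("cpu", 4),
#       enable_automatic_punctuation=True)
```

Building the string in one place keeps the gpu:${GPU_INDEX} and cpu:${NUM_THREADS} formats out of calling code and rejects unsupported values before they reach create().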

pvcheetah.`available_devices()`

def available_devices(library_path: Optional[str] = None) -> Sequence[str]

Lists all available devices that Cheetah can use for inference. Each entry in the list can be passed as the device argument of the create() factory method or the Cheetah constructor.

Parameters

library_path Optional[str] : Absolute path to Cheetah's dynamic library. If not set, the default location is used.

Returns

Sequence[str]: List of all available devices that Cheetah can use for inference.

Throws

CheetahError
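
One way to use the returned list is to choose a device before creating the engine. The sketch below assumes that entries start with gpu or cpu, mirroring the device string formats accepted by create(); pick_device is a hypothetical helper, not an SDK function.

```python
from typing import Sequence


def pick_device(devices: Sequence[str], prefer_gpu: bool = True) -> str:
    """Pick one entry from a device list, falling back to "best".

    Assumes entries start with "gpu" or "cpu" (an assumption, see lead-in).
    """
    order = ("gpu", "cpu") if prefer_gpu else ("cpu", "gpu")
    for prefix in order:
        for device in devices:
            if device.startswith(prefix):
                return device
    # Nothing matched: let the engine choose.
    return "best"


# Hypothetical usage with the SDK installed:
#   import pvcheetah
#   device = pick_device(pvcheetah.available_devices())
```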

pvcheetah.Cheetah

class Cheetah(object)

Class for the Cheetah Speech-to-Text engine. Cheetah can be initialized either using the module-level create() function or directly using the class __init__() method. When you are done with the engine, release its resources by calling the delete() method.

pvcheetah.Cheetah.`version`

self.version: str

The version string of the Cheetah library.

pvcheetah.Cheetah.`frame_length`

self.frame_length: int

The number of audio samples per frame that Cheetah accepts.

pvcheetah.Cheetah.`sample_rate`

self.sample_rate: int

The audio sample rate that Cheetah accepts.
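
As described for process(), input audio must be single-channel, 16-bit, and sampled at .sample_rate. A WAV file can be checked against these requirements before any frames are processed. This is a sketch using only the standard library wave module; check_wav_format is a hypothetical helper, and the expected rate would normally be read from the engine's .sample_rate.

```python
import wave


def check_wav_format(path: str, expected_sample_rate: int) -> int:
    """Validate a WAV file against the input requirements stated in this reference.

    Returns the number of samples in the file; raises ValueError on a mismatch.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("audio must be single-channel")
        if wav.getsampwidth() != 2:  # sample width in bytes; 2 bytes = 16 bits
            raise ValueError("audio must be 16-bit")
        if wav.getframerate() != expected_sample_rate:
            raise ValueError(
                f"sample rate is {wav.getframerate()}, "
                f"expected {expected_sample_rate}")
        return wav.getnframes()


# Hypothetical usage with an existing engine instance:
#   num_samples = check_wav_format("${AUDIO_PATH}", cheetah.sample_rate)
```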

pvcheetah.Cheetah.`__init__()`

def __init__(
        self,
        access_key: str,
        model_path: str,
        device: str,
        library_path: str,
        endpoint_duration_sec: Optional[float] = 1.0,
        enable_automatic_punctuation: bool = False) -> Cheetah

__init__ method for Cheetah Speech-to-Text engine.

Parameters

access_key str : AccessKey obtained from Picovoice Console.
model_path str : Absolute path to the file containing model parameters.
device str : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine runs on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
library_path str : Absolute path to Cheetah's dynamic library.
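endpoint_duration_sec Optional[float] : Duration of endpoint in seconds. A speech endpoint is detected when there is a chunk of audio (with a duration specified herein) after an utterance without any speech in it. Set to None to disable endpoint detection.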
enable_automatic_punctuation bool : Set to True to enable automatic punctuation insertion.

Returns

Cheetah: An instance of Cheetah Speech-to-Text engine.

Throws

CheetahError

pvcheetah.Cheetah.`delete()`

def delete(self)

Releases resources acquired by Cheetah.
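
Because delete() must be called explicitly, it is easy to skip when an exception interrupts processing. One possible pattern is a small context manager that always calls delete() on exit. This is a sketch: released_on_exit is a hypothetical helper, not part of the SDK, and it works with any object that has a delete() method.

```python
from contextlib import contextmanager


@contextmanager
def released_on_exit(engine):
    """Yield `engine` and call its delete() when the block exits, even on error."""
    try:
        yield engine
    finally:
        engine.delete()


# Hypothetical usage with the SDK installed:
#   import pvcheetah
#   with released_on_exit(pvcheetah.create(access_key="${ACCESS_KEY}")) as cheetah:
#       print(cheetah.version)
```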

pvcheetah.Cheetah.`process()`

def process(self, pcm: Sequence[int]) -> Tuple[str, bool]

Processes a frame of audio and returns newly-transcribed text and a flag indicating if an endpoint has been detected. Upon detection of an endpoint, the client may invoke .flush() to retrieve any remaining transcription.

The number of samples per frame is available from the .frame_length property. The incoming audio must have a sample rate equal to .sample_rate and be 16-bit linearly encoded. Furthermore, Cheetah operates on single-channel audio.

Parameters

pcm Sequence[int] : A frame of audio samples.

Returns

Tuple[str, bool] : Any newly-transcribed speech (if none is available then an empty string is returned) and a flag indicating if an endpoint has been detected.

Throws

CheetahError
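
Since process() accepts exactly one frame at a time, a longer buffer of samples has to be split into frames of .frame_length first. The sketch below shows one way to do this. iter_frames and transcribe are hypothetical helpers, and zero-padding the final partial frame is a design choice made here, not documented SDK behavior.

```python
from typing import Iterator, List, Sequence


def iter_frames(pcm: Sequence[int], frame_length: int) -> Iterator[List[int]]:
    """Split samples into frames of exactly `frame_length`, zero-padding the last."""
    for start in range(0, len(pcm), frame_length):
        frame = list(pcm[start:start + frame_length])
        # Pad a short final frame with silence so every frame has full length.
        frame.extend([0] * (frame_length - len(frame)))
        yield frame


def transcribe(engine, pcm: Sequence[int]) -> str:
    """Feed every frame to engine.process(), then append the text from flush()."""
    parts = []
    for frame in iter_frames(pcm, engine.frame_length):
        partial, _is_endpoint = engine.process(frame)
        parts.append(partial)
    parts.append(engine.flush())
    return "".join(parts)
```

Here engine is any object that exposes frame_length, process(), and flush() as described in this reference, such as an instance returned by create().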

pvcheetah.Cheetah.`flush()`

def flush(self) -> str

Marks the end of the audio stream, flushes internal state of the object, and returns any remaining transcribed text.

Returns

str : Any remaining transcribed text. If none is available then an empty string is returned.

Throws

CheetahError
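
The endpoint flag returned by process() and the text returned by flush() can be combined to split a stream into utterances. The following sketch flushes whenever an endpoint is reported and once more at the end of the stream. iter_utterances is a hypothetical helper, not part of the SDK.

```python
from typing import Iterable, Iterator, Sequence


def iter_utterances(engine, frames: Iterable[Sequence[int]]) -> Iterator[str]:
    """Yield one transcript per detected endpoint, plus any trailing text."""
    text = ""
    for frame in frames:
        partial, is_endpoint = engine.process(frame)
        text += partial
        if is_endpoint:
            # Retrieve whatever the engine still holds for this utterance.
            text += engine.flush()
            if text:
                yield text
            text = ""
    # End of stream: collect any remaining transcription.
    text += engine.flush()
    if text:
        yield text
```

Each frame in frames is expected to contain .frame_length samples, as required by process().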

pvcheetah.CheetahError

class CheetahError(Exception)

Error thrown when a failure occurs within the Cheetah Speech-to-Text engine. It is the base class of the exceptions listed below.

Exceptions

class CheetahActivationError(CheetahError)
class CheetahActivationLimitError(CheetahError)
class CheetahActivationRefusedError(CheetahError)
class CheetahActivationThrottledError(CheetahError)
class CheetahIOError(CheetahError)
class CheetahInvalidArgumentError(CheetahError)
class CheetahInvalidStateError(CheetahError)
class CheetahKeyError(CheetahError)
class CheetahMemoryError(CheetahError)
class CheetahRuntimeError(CheetahError)
class CheetahStopIterationError(CheetahError)
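
Because every exception above derives from CheetahError, specific subclasses can be handled first and the base class used as a catch-all. The sketch below defines stand-in classes that mirror part of the hierarchy so that it runs without the SDK; in real code the classes provided by pvcheetah would be used instead. run_guarded is a hypothetical helper.

```python
# Stand-in classes mirroring part of the hierarchy listed above, so the sketch
# runs without the SDK. In real code, use the classes provided by pvcheetah.
class CheetahError(Exception):
    pass


class CheetahActivationError(CheetahError):
    pass


class CheetahIOError(CheetahError):
    pass


def run_guarded(action):
    """Call `action()` and report which handler caught a failure, if any."""
    try:
        return "ok", action()
    except CheetahActivationError as e:
        return "activation", str(e)
    except CheetahIOError as e:
        return "io", str(e)
    except CheetahError as e:
        # The base class catches every remaining subclass.
        return "other", str(e)
```

The order of the except clauses matters: the base class must come last, otherwise it would shadow the more specific handlers.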
