Eagle Speaker Recognition
C API

API Reference for the Eagle C SDK.

pv_eagle_profiler_t

typedef struct pv_eagle_profiler pv_eagle_profiler_t;

Struct representing the profiler component of the Eagle Speaker Recognition engine.

pv_eagle_profiler_init()

pv_status_t pv_eagle_profiler_init(
        const char *access_key,
        const char *model_path,
        const char *device,
        int32_t min_enrollment_chunks,
        float voice_threshold,
        pv_eagle_profiler_t **object);

Creates an instance of the profiler component of the Eagle Speaker Recognition engine. When you are done with the instance, release its resources by calling pv_eagle_profiler_delete().

Parameters

access_key const char * : AccessKey obtained from Picovoice Console.
model_path const char * : Absolute path to the file containing model parameters (.pv).
device const char * : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
min_enrollment_chunks int32_t : Minimum number of chunks to be processed before pv_eagle_profiler_enroll() reports 100%. The value should be a number greater than or equal to 1. A higher number results in more accurate profiles at the cost of needing more data to create the profile.
voice_threshold float : Sensitivity threshold for detecting voice. The value should be a number within [0, 1]. A higher threshold increases detection confidence values at the cost of potentially missing frames of voice.
object pv_eagle_profiler_t * * : Constructed instance of the Eagle profiler.

Returns

pv_status_t : Status code.
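
The device argument follows a small string format (best, cpu, gpu, cpu:${NUM_THREADS}, gpu:${GPU_INDEX}). As a minimal sketch, the helpers below build the two parameterized forms. make_cpu_device() and make_gpu_device() are hypothetical names that are not part of the Eagle SDK, and the range checks are assumptions rather than documented limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of the Eagle SDK): writes "cpu:<num_threads>"
 * into out. Returns 0 on success, -1 if num_threads < 1 or out is too small. */
int make_cpu_device(int num_threads, char *out, size_t out_size) {
    assert(out != NULL && out_size > 0);
    if (num_threads < 1) {
        return -1;
    }
    int written = snprintf(out, out_size, "cpu:%d", num_threads);
    return (written > 0 && (size_t) written < out_size) ? 0 : -1;
}

/* Hypothetical helper (not part of the Eagle SDK): writes "gpu:<gpu_index>"
 * into out. Returns 0 on success, -1 if gpu_index < 0 or out is too small. */
int make_gpu_device(int gpu_index, char *out, size_t out_size) {
    assert(out != NULL && out_size > 0);
    if (gpu_index < 0) {
        return -1;
    }
    int written = snprintf(out, out_size, "gpu:%d", gpu_index);
    return (written > 0 && (size_t) written < out_size) ? 0 : -1;
}
```

The resulting string can then be passed as the device argument of pv_eagle_profiler_init().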

pv_eagle_profiler_delete()

void pv_eagle_profiler_delete(pv_eagle_profiler_t *object);

Releases resources acquired by the Eagle profiler.

Parameters

object pv_eagle_profiler_t * : Eagle profiler object.

pv_eagle_profiler_enroll()

pv_status_t pv_eagle_profiler_enroll(
        pv_eagle_profiler_t *object,
        const int16_t *pcm,
        float *percentage);

Enrolls a speaker. This function should be called multiple times with different utterances of the same speaker until percentage reaches 100.0. Any further enrollment can be used to improve the speaker voice profile. The required frame length of audio samples can be obtained by calling pv_eagle_profiler_frame_length(). The audio data used for enrollment should satisfy the following requirements:

only one speaker should be present in the audio
the speaker should be speaking in a normal voice (i.e. not whispering or shouting)
the audio should contain no speech from other speakers and no other sounds (e.g. music)
it should be captured in a quiet environment with no background noise

Parameters

object pv_eagle_profiler_t * : Eagle profiler object.
pcm const int16_t * : A frame of audio samples. The number of samples per frame can be obtained by calling pv_eagle_profiler_frame_length(). The incoming audio must have a sample rate equal to pv_sample_rate() and be 16-bit linearly-encoded. Eagle operates on single-channel audio.
percentage float * : Percentage of enrollment progress. When this value reaches 100.0, the enrollment process is complete.

Returns

pv_status_t : Status code.
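
The frame-by-frame enrollment described above can be sketched as a loop over a PCM buffer. To keep the sketch runnable without the SDK, the per-frame call is passed in as a callback: enroll_frame_fn, enroll_buffer() and fake_enroll() are hypothetical names introduced here, and fake_enroll() is only a stand-in that pretends each frame adds 25% progress. In real use the callback would wrap pv_eagle_profiler_enroll() and frame_length would come from pv_eagle_profiler_frame_length().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: performs one enrollment call on one frame.
 * Returns 0 on success and writes the progress to *percentage. */
typedef int (*enroll_frame_fn)(void *context, const int16_t *frame, float *percentage);

/* Feeds pcm to `enroll` one full frame at a time and stops early once the
 * reported progress reaches 100.0. Trailing samples that do not fill a
 * frame are not consumed. Returns the number of frames consumed, or -1 if
 * a call fails. */
int32_t enroll_buffer(
        enroll_frame_fn enroll,
        void *context,
        const int16_t *pcm,
        int32_t num_samples,
        int32_t frame_length,
        float *percentage) {
    assert(enroll != NULL && pcm != NULL && percentage != NULL);
    assert(frame_length > 0);

    int32_t frames = 0;
    for (int32_t start = 0; num_samples - start >= frame_length; start += frame_length) {
        if (enroll(context, pcm + start, percentage) != 0) {
            return -1;
        }
        frames++;
        if (*percentage >= 100.0f) {
            break;
        }
    }
    return frames;
}

/* Stand-in for the SDK call so the sketch runs without Eagle: context points
 * to a float progress counter, and every frame adds 25%, capped at 100%. */
int fake_enroll(void *context, const int16_t *frame, float *percentage) {
    (void) frame;
    float *progress = (float *) context;
    *progress += 25.0f;
    if (*progress > 100.0f) {
        *progress = 100.0f;
    }
    *percentage = *progress;
    return 0;
}
```

Stopping at 100.0 is a design choice of this sketch. Since further enrollment can improve the profile, a caller may prefer to keep feeding frames after that point.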

pv_eagle_profiler_flush()

pv_status_t pv_eagle_profiler_flush(
        pv_eagle_profiler_t *object,
        float *percentage);

Marks the end of the audio stream, flushes internal state of the object, and returns the percentage of enrollment completed.

Parameters

object pv_eagle_profiler_t * : Eagle profiler object.
percentage float * : Percentage of enrollment progress. When this value reaches 100.0, the enrollment process is complete.

Returns

pv_status_t : Status code.

pv_eagle_profiler_frame_length()

int32_t pv_eagle_profiler_frame_length(void);

Getter for number of audio samples per frame.

Returns

int32_t : Frame length.

pv_eagle_profiler_export()

pv_status_t pv_eagle_profiler_export(
        const pv_eagle_profiler_t *object,
        void *speaker_profile);

Exports the speaker profile to a buffer. The exported profile can be used in pv_eagle_init() or stored for later use.

Parameters

object const pv_eagle_profiler_t * : Eagle profiler object.
speaker_profile void * : Buffer where the speaker profile will be stored. Must be pre-allocated with a size obtained by calling pv_eagle_profiler_export_size().

Returns

pv_status_t : Status code.
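
Because an exported profile can be stored for later use, one option is to write the buffer to disk as opaque bytes and read it back before calling pv_eagle_init() or pv_eagle_process(). The sketch below assumes the exported buffer is self-contained and that its size in bytes is the value reported by pv_eagle_profiler_export_size(). save_profile() and load_profile() are hypothetical helpers, not SDK functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes size_bytes bytes of an exported profile to
 * path. Returns 0 on success, -1 on any I/O failure. */
int save_profile(const char *path, const void *profile, int32_t size_bytes) {
    assert(path != NULL && profile != NULL && size_bytes > 0);
    FILE *file = fopen(path, "wb");
    if (file == NULL) {
        return -1;
    }
    size_t written = fwrite(profile, 1, (size_t) size_bytes, file);
    int close_status = fclose(file);
    return (written == (size_t) size_bytes && close_status == 0) ? 0 : -1;
}

/* Hypothetical helper: reads a whole profile file into a malloc'd buffer
 * and stores its length in *size_bytes. The caller frees the buffer.
 * Returns NULL on failure. */
void *load_profile(const char *path, int32_t *size_bytes) {
    assert(path != NULL && size_bytes != NULL);
    FILE *file = fopen(path, "rb");
    if (file == NULL) {
        return NULL;
    }
    void *buffer = NULL;
    if (fseek(file, 0, SEEK_END) == 0) {
        long length = ftell(file);
        if (length > 0 && length <= INT32_MAX && fseek(file, 0, SEEK_SET) == 0) {
            buffer = malloc((size_t) length);
            if (buffer != NULL && fread(buffer, 1, (size_t) length, file) != (size_t) length) {
                free(buffer);
                buffer = NULL;
            }
            if (buffer != NULL) {
                *size_bytes = (int32_t) length;
            }
        }
    }
    fclose(file);
    return buffer;
}
```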

pv_eagle_profiler_reset()

pv_status_t pv_eagle_profiler_reset(pv_eagle_profiler_t *object);

Resets the Eagle profiler object and removes all enrollment data. It must be called before enrolling a new speaker.

Parameters

object pv_eagle_profiler_t * : Eagle profiler object.

Returns

pv_status_t : Status code.

pv_eagle_t

typedef struct pv_eagle pv_eagle_t;

Struct representing the recognizer component of the Eagle Speaker Recognition engine.

pv_eagle_init()

pv_status_t pv_eagle_init(
        const char *access_key,
        const char *model_path,
        const char *device,
        float voice_threshold,
        pv_eagle_t **object);

Creates an instance of the recognizer component of the Eagle Speaker Recognition engine. When you are done with the instance, release its resources by calling pv_eagle_delete().

Parameters

access_key const char * : AccessKey obtained from Picovoice Console.
model_path const char * : Absolute path to the file containing model parameters (.pv).
device const char * : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
voice_threshold float : Sensitivity threshold for detecting voice. The value should be a number within [0, 1]. A higher threshold increases detection confidence values at the cost of potentially missing frames of voice.
object pv_eagle_t * * : Constructed instance of the Eagle engine.

Returns

pv_status_t : Status code.

pv_eagle_delete()

void pv_eagle_delete(pv_eagle_t *object);

Releases the resources acquired by the Eagle engine.

Parameters

object pv_eagle_t * : Eagle object.

pv_eagle_process()

pv_status_t pv_eagle_process(
        pv_eagle_t *object,
        const int16_t *pcm,
        int32_t num_samples,
        void **speaker_profiles,
        int32_t num_speakers,
        float **scores);

Processes a frame of the incoming audio stream. The minimum number of required samples can be obtained by calling pv_eagle_process_min_audio_length_samples().

Parameters

object pv_eagle_t * : Eagle object.
pcm const int16_t * : Audio data. The required sample rate can be obtained by calling pv_sample_rate(). The required audio format is 16-bit linearly-encoded single-channel PCM. The minimum audio length required for processing can be obtained by calling pv_eagle_process_min_audio_length_samples().
num_samples int32_t : Number of audio samples in pcm.
speaker_profiles void ** : Speaker profiles. These can be created using the Eagle profiler object and its related functions.
num_speakers int32_t : Number of enrolled speakers, i.e. the number of entries in speaker_profiles.
scores float ** : Similarity scores for each enrolled speaker. Must be pre-allocated with a size equal to the number of enrolled speakers. The scores are in the range [0, 1] with 1 being a perfect match.

Returns

pv_status_t : Status code.
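
Since each score lies in [0, 1] with 1 being a perfect match, a caller typically needs to turn the score array into a decision. The sketch below picks the highest-scoring speaker that clears a minimum score. best_speaker() is a hypothetical helper, it takes a flat array of scores, and the min_score cutoff is an application-level assumption rather than something the SDK defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Eagle SDK): returns the index of the
 * highest score that is >= min_score, or -1 if no score qualifies. On ties
 * the lowest index wins. */
int32_t best_speaker(const float *scores, int32_t num_speakers, float min_score) {
    assert(scores != NULL || num_speakers <= 0);
    int32_t best = -1;
    for (int32_t i = 0; i < num_speakers; i++) {
        if (scores[i] >= min_score && (best < 0 || scores[i] > scores[best])) {
            best = i;
        }
    }
    return best;
}
```

A return value of -1 can be treated as "no enrolled speaker recognized in this audio".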

pv_eagle_process_min_audio_length_samples()

pv_status_t pv_eagle_process_min_audio_length_samples(
        const pv_eagle_t *object,
        int32_t *num_samples);

Gets the minimum length of the input pcm required by pv_eagle_process().

Parameters

object const pv_eagle_t * : Eagle object.
num_samples int32_t * : Minimum number of audio samples required by pv_eagle_process().

Returns

pv_status_t : Status code.

pv_eagle_version()

const char *pv_eagle_version(void);

Getter for version.

Returns

const char * : Eagle version.

pv_eagle_list_hardware_devices()

pv_status_t pv_eagle_list_hardware_devices(
        char ***hardware_devices,
        int32_t *num_hardware_devices);

Gets a list of hardware devices that can be specified when calling pv_eagle_init().

Parameters

hardware_devices char * * * : Array of available hardware devices. Devices are NULL-terminated strings. The array must be freed using pv_eagle_free_hardware_devices().
num_hardware_devices int32_t * : The number of devices in the hardware_devices array.

Returns

pv_status_t : Returned status code.
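
Once the list has been obtained, an application may want to pick an entry from it, for example the first GPU. The sketch below is a hypothetical helper, not an SDK function, and it assumes the returned strings follow the same conventions as the device argument (such as gpu:${GPU_INDEX}), which this reference does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the Eagle SDK): returns the first entry
 * of devices that starts with prefix, or NULL if there is none. The result
 * points into the array, so use or copy it before the array is freed. */
const char *find_device(char **devices, int32_t num_devices, const char *prefix) {
    assert(prefix != NULL);
    assert(devices != NULL || num_devices <= 0);
    size_t prefix_length = strlen(prefix);
    for (int32_t i = 0; i < num_devices; i++) {
        if (devices[i] != NULL && strncmp(devices[i], prefix, prefix_length) == 0) {
            return devices[i];
        }
    }
    return NULL;
}
```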

pv_eagle_free_hardware_devices()

void pv_eagle_free_hardware_devices(
        char ***hardware_devices,
        int32_t *num_hardware_devices);

This function frees the memory allocated by pv_eagle_list_hardware_devices().

Parameters

hardware_devices char * * * : Array of available hardware devices allocated by pv_eagle_list_hardware_devices().
num_hardware_devices int32_t * : The number of devices in the hardware_devices array.

pv_sample_rate()

int32_t pv_sample_rate(void);

Audio sample rate accepted by Eagle.

Returns

int32_t : Sample rate.

pv_status_t

typedef enum {
    PV_STATUS_SUCCESS = 0,
    PV_STATUS_OUT_OF_MEMORY,
    PV_STATUS_IO_ERROR,
    PV_STATUS_INVALID_ARGUMENT,
    PV_STATUS_STOP_ITERATION,
    PV_STATUS_KEY_ERROR,
    PV_STATUS_INVALID_STATE,
    PV_STATUS_RUNTIME_ERROR,
    PV_STATUS_ACTIVATION_ERROR,
    PV_STATUS_ACTIVATION_LIMIT_REACHED,
    PV_STATUS_ACTIVATION_THROTTLED,
    PV_STATUS_ACTIVATION_REFUSED
} pv_status_t;

Status code enum.

pv_status_to_string()

const char *pv_status_to_string(pv_status_t status);

Parameters

status pv_status_t : Status code.

Returns

const char * : String representation of status code.

pv_get_error_stack()

pv_status_t pv_get_error_stack(
        char ***message_stack,
        int32_t *message_stack_depth);

If a function returns a failure (any pv_status_t other than PV_STATUS_SUCCESS), this function can be called to get a series of error messages related to the failure. It can be called only once per failure status returned by another function. The memory for message_stack must be freed using pv_free_error_stack().

Regardless of the return status of this function, if message_stack is not NULL, then message_stack contains valid memory. However, a failure status on this function indicates that future error messages may not be reported.

Parameters

message_stack char * * * : Array of messages relating to the failure. Messages are NULL-terminated strings. The array and messages must be freed using pv_free_error_stack().
message_stack_depth int32_t * : The number of messages in the message_stack array.

Returns

pv_status_t : Returned status code.
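
A common way to consume the message stack is to print one message per line before freeing it. The sketch below covers only the formatting step. print_error_stack() is a hypothetical helper, and the "[index] message" layout is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of the Eagle SDK): writes each non-NULL
 * message as "  [i] message" on its own line. Returns the number of
 * messages written. */
int32_t print_error_stack(FILE *out, char **message_stack, int32_t message_stack_depth) {
    assert(out != NULL);
    if (message_stack == NULL) {
        return 0;
    }
    int32_t printed = 0;
    for (int32_t i = 0; i < message_stack_depth; i++) {
        if (message_stack[i] != NULL) {
            fprintf(out, "  [%d] %s\n", (int) i, message_stack[i]);
            printed++;
        }
    }
    return printed;
}
```

In real use, the arguments would be the values filled in by pv_get_error_stack(), and the stack would be released with pv_free_error_stack() after printing.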

pv_free_error_stack()

void pv_free_error_stack(char **message_stack);

This function frees the memory used by error messages allocated by pv_get_error_stack().

Parameters

message_stack char * * : Array of messages relating to the failure.
