Speaker Identification Across Meetings

Know who is speaking by recognizing their voices across recordings

Diarize a recording first, then match each voice against a roster of known speakers to find out who spoke when. Recordings and voiceprints are never sent to a third-party cloud.

Products used

Falcon Speaker Diarization, Eagle Speaker Recognition

Platforms supported

Android, iOS, Linux, macOS, Windows, Chrome, Edge, Firefox, Safari, Raspberry Pi

How speaker identification across meetings is built

Diarize first, recognize second, to identify speakers in meetings

On-device speaker identification across meetings combines Falcon Speaker Diarization and Eagle Speaker Recognition. Speaker diarization and speaker recognition solve different halves of the same problem. Falcon segments a recording into speaker turns without knowing who anyone is. Eagle scores each turn against stored voiceprints to attach a name. Running them in sequence turns anonymous recordings into named and timestamped transcripts.
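The two-stage flow described above can be sketched in a few lines. This is a minimal illustration, not the recipe's code: `diarize` and `score_turn` are hypothetical callables standing in for Falcon and Eagle, the tuple shapes are assumptions, and the 0.5 default mirrors the similarity threshold mentioned later in this page.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical shapes for illustration; these are not the SDKs' own types.
Turn = Tuple[float, float, int]        # (start_sec, end_sec, anonymous speaker tag)
NamedTurn = Tuple[float, float, str]   # (start_sec, end_sec, display name)


def identify_speakers(
    pcm: Sequence[int],
    sample_rate: int,
    diarize: Callable[[Sequence[int]], List[Turn]],
    score_turn: Callable[[Sequence[int]], Dict[str, float]],
    threshold: float = 0.5,
) -> List[NamedTurn]:
    """Diarize first, then score each turn against the roster."""
    named: List[NamedTurn] = []
    for start, end, tag in diarize(pcm):
        # Cut out the samples that belong to this speaker turn.
        clip = pcm[int(start * sample_rate):int(end * sample_rate)]
        scores = score_turn(clip)  # {known name: similarity score}
        best = max(scores, key=scores.get, default=None)
        if best is not None and scores[best] >= threshold:
            name = best
        else:
            name = f"Speaker {tag}"  # no confident match: keep the anonymous label
        named.append((start, end, name))
    return named
```

Passing the two stages in as callables keeps the sketch independent of any SDK version; in a real integration they would wrap the diarization and recognition engines.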

Why Falcon Speaker Diarization?

Diarization that runs anywhere, decoupled from transcription

221x less compute than pyannote at comparable accuracy (0.02x vs 4.42x core-hour; 10% vs 9% Diarization Error Rate (DER), 20% vs 27% Jaccard Error Rate (JER))

15x less memory than pyannote (0.1 GB vs 1.5 GB)

20% Jaccard Error Rate, lower than every cloud API compared (Amazon 30%, Azure 30%, Google 58%)

Falcon Speaker Diarization takes a meeting recording and returns timestamped segments, each tagged with an anonymous speaker label, answering who spoke when before any name is attached. It is decoupled from transcription, so it slots in alongside any speech-to-text engine you already run, and it runs on CPU without requiring a GPU or cloud processing.
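Because diarization is decoupled from transcription, the two outputs have to be joined by time. A minimal sketch of one way to do that, assuming your speech-to-text engine returns word-level timestamps: each word takes the tag of the segment it overlaps most. The `Segment` and `Word` types below are hypothetical, not SDK classes.

```python
from typing import List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    start_sec: float
    end_sec: float
    speaker_tag: int


class Word(NamedTuple):
    text: str
    start_sec: float
    end_sec: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def tag_words(words: List[Word], segments: List[Segment]) -> List[Tuple[Optional[int], str]]:
    """Attach to each word the tag of the segment it overlaps most."""
    tagged = []
    for word in words:
        best_tag, best_overlap = None, 0.0
        for seg in segments:
            shared = overlap(word.start_sec, word.end_sec, seg.start_sec, seg.end_sec)
            if shared > best_overlap:
                best_tag, best_overlap = seg.speaker_tag, shared
        tagged.append((best_tag, word.text))  # tag is None if no segment overlaps
    return tagged
```

Words that fall outside every segment keep a tag of `None`, so the caller can decide whether to drop them or attach them to the nearest turn.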

Jaccard Error Rate (lower is better)

Falcon: 20%
pyannote: 27%
Amazon Transcribe: 30%
Azure Speech: 30%
Google Enhanced: 58%

* Jaccard Error Rate assigns equal weight to each speaker's contribution, regardless of their speech duration.
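To make the equal-weighting point concrete, here is a simplified, frame-based sketch of a Jaccard-style error computation. It is not the benchmark's code: it assumes no overlapping speech, one label per frame, and a reference-to-hypothesis speaker mapping that is already given (evaluation tools normally search for the best mapping). All names are illustrative.

```python
from typing import Dict, List, Optional


def jaccard_error_rate(
    reference: List[Optional[str]],
    hypothesis: List[Optional[str]],
    mapping: Dict[str, str],
) -> float:
    """Frame-level JER; None marks a non-speech frame.

    `mapping` pairs each reference speaker with a hypothesis label.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("frame sequences must have equal length")
    speakers = sorted({label for label in reference if label is not None})
    if not speakers:
        raise ValueError("reference contains no speech")
    errors = []
    for speaker in speakers:
        mapped = mapping.get(speaker)
        ref_frames = {i for i, label in enumerate(reference) if label == speaker}
        hyp_frames = set()
        if mapped is not None:
            hyp_frames = {i for i, label in enumerate(hypothesis) if label == mapped}
        union = ref_frames | hyp_frames
        # Per-speaker error is 1 minus intersection-over-union of the frames.
        errors.append(1.0 - len(ref_frames & hyp_frames) / len(union))
    # Every speaker counts once, however long they spoke.
    return sum(errors) / len(errors)
```

For example, with reference frames `["A"] * 8 + ["B"] * 2`, hypothesis frames `["1"] * 6 + ["2"] * 4`, and the mapping `{"A": "1", "B": "2"}`, speaker A scores 0.25 and speaker B scores 0.5, so the result is 0.375. B's two frames carry as much weight as A's eight.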

Diarization Memory Usage (lower is better)

Falcon: 116.8 MB
pyannote: 1.5 GB

* Measured on AMD Ryzen 7 5700X @ 3.4 GHz, 64 GB RAM

Why Eagle Speaker Recognition?

Recognize the same voice across every recording, no passphrase

26x smaller than SpeechBrain (4.5 MB vs 117.5 MB model size)

2.7x lower error rate than SpeechBrain (0.18% vs 0.49% EER)

3.9x lower error rate than pyannote (0.18% vs 0.70% EER)

Eagle Speaker Recognition enrolls a speaker from a few seconds of natural speech and exports a voiceprint to a file, with no language or passphrase restriction. Load those voiceprints into later runs and the same people are recognized across meetings, calls, and interviews. Speakers not yet in the roster can be profiled and saved, so the next recording recognizes them automatically. Everything runs on-device, so Picovoice never accesses end-user audio.
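Keeping a roster across recordings comes down to storing one voiceprint file per person and loading them all at the start of the next run. A minimal sketch, treating each exported voiceprint as opaque bytes; the `.profile` extension and both function names are assumptions for illustration, not part of the SDK or the recipe.

```python
from pathlib import Path
from typing import Dict


def save_roster(roster: Dict[str, bytes], directory: Path) -> None:
    """Write one voiceprint file per speaker, named after the speaker."""
    directory.mkdir(parents=True, exist_ok=True)
    for name, voiceprint in roster.items():
        (directory / f"{name}.profile").write_bytes(voiceprint)


def load_roster(directory: Path) -> Dict[str, bytes]:
    """Read every voiceprint file back into a {speaker name: bytes} mapping."""
    return {
        path.stem: path.read_bytes()
        for path in sorted(directory.glob("*.profile"))
    }
```

Using the file name as the speaker name keeps the roster easy to inspect and edit by hand; names containing path separators would need sanitizing first.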

Equal Error Rate (EER), lower is better

Eagle Speaker Recognition: 0.18%
SpeechBrain Speaker Recognition: 0.49%
pyannote Speaker Recognition: 0.70%

* Benchmarked on VoxConverse, a widely used multi-speaker dataset containing real conversations across multiple languages.

Model Size (MB to initialize), lower is better

Eagle Speaker Recognition: 4.5 MB
pyannote Speaker Recognition: 46.5 MB
SpeechBrain Speaker Recognition: 117.5 MB

Where speaker identification across meetings ships

On-device speaker identification from board meetings to call centers

Meetings

Meeting intelligence and named meeting recaps

Automatically identify and label every speaker in recorded standups, interviews, and board meetings. On-device speaker identification keeps audio private while improving meeting transcripts, notes, action items, and AI-generated recaps.

Legal

Speaker identification for legal depositions and interviews

Use speaker identification to label attorneys, witnesses, investigators, and interview subjects by name across recordings. Create searchable legal transcripts, improve attribution accuracy, and reduce manual review time.

Governance

Speaker attribution for corporate governance and board meetings

Track who said what across board meetings, executive reviews, and governance sessions. Named speaker attribution supports accurate meeting minutes, decision tracking, compliance workflows, and audit-ready records.

Call Center

Call center speaker identification and customer support analytics

Identify agents and recurring callers across support conversations using on-device speaker recognition. Improve call center transcription, quality assurance, customer history tracking, and support analytics while keeping audio private.

Get started

Build an app to identify speakers across meetings: Code example

A complete working recipe in Python. Open-source on GitHub. Runs 100% on-device.

recipe · speaker-identification-across-meetings

Difficulty: Beginner
Runtime: 100% on-device
Language: Python
Platforms supported: Android, iOS, Linux, macOS, Windows, Chrome, Edge, Firefox, Safari, Raspberry Pi

Prerequisites

A Picovoice AccessKey from Picovoice Console and a local clone of the pico-cookbook GitHub repository.

Usage

These instructions assume your current working directory is recipes/speaker-identification-across-meetings/python.

1. Create a virtual environment

Isolate the recipe's dependencies from your system Python by creating a virtual environment in .venv, for example with python3 -m venv .venv.

2. Activate the virtual environment

Once the environment is active, pip installs into .venv instead of the system Python.

Linux, macOS, or Raspberry Pi: source .venv/bin/activate

Windows: .venv\Scripts\activate

3. Install dependencies

Install the Falcon Speaker Diarization and Eagle Speaker Recognition Python SDKs (published on PyPI as pvfalcon and pveagle), plus soundfile for reading audio.

4. Create speaker profiles from a meeting

Run the demo with --profile_unknown_speakers. It first diarizes the meeting with Falcon, then creates one Eagle speaker profile per diarized speaker.
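Creating one profile per diarized speaker requires collecting all of that speaker's audio first. The grouping step might look like the sketch below; the `Turn` tuple shape and the function name are assumptions, and the actual demo may organize this differently.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Hypothetical shape: (start_sec, end_sec, speaker_tag)
Turn = Tuple[float, float, int]


def audio_per_speaker(
    pcm: Sequence[int], sample_rate: int, turns: List[Turn]
) -> Dict[int, List[int]]:
    """Concatenate the samples of every turn that shares a speaker tag."""
    grouped: Dict[int, List[int]] = defaultdict(list)
    for start_sec, end_sec, tag in turns:
        first = int(start_sec * sample_rate)
        last = int(end_sec * sample_rate)
        grouped[tag].extend(pcm[first:last])
    return dict(grouped)
```

Each resulting sample list can then be fed to an enrollment step for that speaker, so a person who spoke in several short turns still contributes enough speech for a profile.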

5. Identify known speakers in another meeting

Pass your AccessKey, the audio path, and the known-speaker profiles. The recipe diarizes with Falcon, matches each turn with Eagle, and prints a named, timestamped transcript.

If a diarized speaker matches a known speaker above the similarity threshold, the output uses the known speaker's name.
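One way to turn per-turn similarity scores into a single name per diarized speaker is to average the scores over all of that speaker's turns, weighted by turn length, and apply the threshold to the average. This is a sketch under that assumption; the recipe's own aggregation may differ, and the `ScoredTurn` shape and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical shape: (start_sec, end_sec, speaker_tag, {known name: similarity})
ScoredTurn = Tuple[float, float, int, Dict[str, float]]


def name_speakers(turns: List[ScoredTurn], threshold: float = 0.5) -> Dict[int, str]:
    """Map each diarized tag to a known name, or keep an anonymous label."""
    totals: Dict[int, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    durations: Dict[int, float] = defaultdict(float)
    for start, end, tag, scores in turns:
        length = end - start
        durations[tag] += length
        for name, score in scores.items():
            totals[tag][name] += score * length  # duration-weighted sum
    names: Dict[int, str] = {}
    for tag, duration in durations.items():
        averages: Dict[str, float] = {}
        if duration > 0:
            averages = {n: t / duration for n, t in totals[tag].items()}
        best = max(averages, key=averages.get, default=None)
        matched = best is not None and averages[best] >= threshold
        names[tag] = best if matched else f"Speaker {tag}"
    return names


def format_transcript(turns: List[ScoredTurn], names: Dict[int, str]) -> List[str]:
    """Render one '[start - end] name' line per turn."""
    return [f"[{start:.2f} - {end:.2f}] {names[tag]}" for start, end, tag, _ in turns]
```

Deciding per speaker rather than per turn means a single noisy turn does not flip a name mid-meeting.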

6. Optional: tune the similarity threshold

The similarity threshold defaults to 0.5. Increase it to make identification stricter.
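The trade-off behind a stricter threshold can be seen by counting errors on a handful of labeled scores. The sketch below uses made-up similarity values purely for illustration, and it assumes a score equal to the threshold counts as a match.

```python
from typing import List, Tuple


def error_counts(
    same_speaker_scores: List[float],
    different_speaker_scores: List[float],
    threshold: float,
) -> Tuple[int, int]:
    """Return (false_rejects, false_accepts) at the given threshold."""
    # A true match scoring below the threshold is wrongly rejected.
    false_rejects = sum(1 for s in same_speaker_scores if s < threshold)
    # A non-match scoring at or above the threshold is wrongly accepted.
    false_accepts = sum(1 for s in different_speaker_scores if s >= threshold)
    return false_rejects, false_accepts


# Made-up scores for illustration only.
same = [0.92, 0.81, 0.64, 0.55]
different = [0.58, 0.41, 0.22, 0.10]

for threshold in (0.5, 0.6, 0.7):
    print(threshold, error_counts(same, different, threshold))
```

On this toy data, raising the threshold from 0.5 to 0.6 removes the one false accept but rejects one genuine match, and 0.7 rejects two. Running the same count on a few of your own labeled recordings is a reasonable way to pick a value.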

Have questions or looking for implementations in other languages? Visit the GitHub pico-cookbook Speaker Identification Across Meetings recipe, where you can find the open-source demo code and create an issue for demo-related technical questions.

Frequently asked questions

FAQ

What is the difference between speaker diarization and speaker recognition?

Speaker diarization finds who spoke when and labels speakers anonymously, such as Speaker 1 and Speaker 2. Speaker recognition matches a voice to a known identity; the one-to-many case that names a voice from a roster of enrolled speakers is speaker identification. This recipe runs diarization with Falcon Speaker Diarization, then recognition with Eagle Speaker Recognition, to turn anonymous turns into named ones.

Can the same speaker be identified across separate recordings?

Yes. Eagle Speaker Recognition exports each speaker as a compact voiceprint file. Loading the same files into later runs identifies the same people across different meetings, calls, and interviews.

How long does enrollment take?

Eagle Speaker Recognition enrolls a speaker from a few seconds of natural speech, with no passphrase or scripted phrase required. Enrollment time varies with how much clear speech is available.

Does meeting speaker identification work in any language?

Yes. Both Falcon Speaker Diarization and Eagle Speaker Recognition are language-independent, so diarization timestamps and voiceprints enrolled once still identify who spoke when, regardless of the language, accent, or words spoken in a later recording.

How many speakers can be identified in a single recording?

Falcon Speaker Diarization groups a recording into distinct speakers automatically, without being told the count in advance, and Eagle Speaker Recognition matches each one against however many known voiceprints you load, so there is no fixed limit.

Is a GPU required to identify speakers across meetings?

No. Both engines run on CPU across mobile, desktop, embedded, and single-board computers such as Raspberry Pi.

How can I get technical support?

Visit the GitHub pico-cookbook Speaker Identification Across Meetings recipe, where you can find the code and create an issue for demo-related technical questions.