Intro
When comparing three or more speech synthesis models, a ranking evaluation is an effective method for determining the relative quality of each model. Rather than comparing pairs individually, evaluators listen to a set of audio samples generated from the same script and rank them from best to worst. Ranking evaluation is flexible in its evaluation criteria: you can rank models based on naturalness, overall preference, clarity, expressiveness, or any other quality dimension that matters to your use case.

- Objective: Determine the relative ordering of multiple models by having evaluators rank them.
- Use Case: Ideal for comparing TTS providers, model versions, or synthesis configurations side by side.
- Type: `RANKING` in the SDK.
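To illustrate how ranked judgments produce a relative ordering, here is a small self-contained sketch in plain Python, independent of any SDK. The model names, the `rankings` data, and the `mean_rank` helper are all illustrative assumptions, not part of the evaluation API; the aggregation shown (mean rank across evaluators) is one common choice among several.

```python
from statistics import mean

# Each evaluator ranks the models from best to worst.
# Model names and rankings below are placeholders for illustration.
rankings = [
    ["model_a", "model_b", "model_c"],
    ["model_b", "model_a", "model_c"],
    ["model_a", "model_c", "model_b"],
]

def mean_rank(rankings):
    """Average each model's 1-based rank across evaluators (lower is better)."""
    models = rankings[0]
    return {m: mean(r.index(m) + 1 for r in rankings) for m in models}

scores = mean_rank(rankings)
# Sort models from best (lowest mean rank) to worst.
ordering = sorted(scores, key=scores.get)
print(ordering)  # → ['model_a', 'model_b', 'model_c']
```

Because every evaluator ranks the same set of samples, the mean rank is directly comparable across models, which is what makes a single relative ordering possible.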
Example
In this example, we compare three different TTS providers by generating speech from the same scripts and submitting the results for a ranking evaluation.
Generate speech and add ranking sets
For each script, generate speech from all providers and add a ranking set. Each ranking set contains one audio file per provider.
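The loop described above can be sketched as follows. This is a minimal, self-contained outline of the structure, not the SDK's actual API: the `synthesize` function is a hypothetical stand-in for a real TTS call, and the provider names and scripts are placeholders.

```python
# Placeholder scripts and providers for illustration only.
scripts = ["Hello, world.", "How are you today?"]
providers = ["provider_a", "provider_b", "provider_c"]

def synthesize(provider, script_index):
    # Hypothetical stand-in: a real implementation would call the
    # provider's TTS API and return the path of the generated audio file.
    return f"{provider}_script{script_index}.wav"

ranking_sets = []
for i, script in enumerate(scripts):
    # One ranking set per script, containing one audio file per provider.
    audio_files = {p: synthesize(p, i) for p in providers}
    ranking_sets.append({"script": script, "files": audio_files})

# Every ranking set holds exactly one sample from each provider,
# so evaluators always rank like-for-like renditions of the same script.
assert all(len(s["files"]) == len(providers) for s in ranking_sets)
```

Keeping one audio file per provider in each set is what lets evaluators rank the same utterance across all providers at once.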

