Intro

The Comparative Similarity (CSMOS) evaluation is designed to assess which of two audio samples is more similar to a reference audio. This evaluation is particularly useful in scenarios where the goal is to match or mimic a reference audio, such as in voice cloning or audio restoration tasks.
  • Objective: Determine which of two audio samples is more similar to a reference audio.
  • Use Case: Ideal for applications that require audio matching or quality assessment against a standard.
  • Type: Pass type="CSMOS" when creating the evaluator in the SDK.

Example

1. Initialize the Client

Begin by initializing the Podonos client with your API key.
import podonos

client = podonos.init("<API_KEY>")
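
Hardcoding the key in source makes it easy to leak. As a minimal sketch, the helper below reads the key from an environment variable instead. The variable name PODONOS_API_KEY and the function load_api_key are assumptions made for this example, not names the SDK is documented to use.

```python
import os


def load_api_key(env_var: str = "PODONOS_API_KEY") -> str:
    """Return the API key from the environment, failing early if it is missing.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Podonos API key"
        )
    return key


# Usage (requires the podonos package):
#   import podonos
#   client = podonos.init(load_api_key())
```

Failing before the client is created keeps a missing key from surfacing later as a less obvious authentication error.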
2. Create the Evaluator

Set up the evaluator for a CSMOS evaluation.
evaluator = client.create_evaluator(
    name="Comparative Similarity Test",
    desc="Evaluate similarity of audio samples to a reference",
    type="CSMOS"
)
Note: CSMOS evaluations cannot be created with the create_evaluator_from_template_json method; use create_evaluator as shown above.
3. Add Files for Evaluation

Add two audio samples and one reference audio. The reference file must be specified with is_ref=True.
from podonos import File

evaluator.add_files(
    file0=File(path="audio_sample1.wav", model_tag="Sample 1", tags=["test"], is_ref=False),
    file1=File(path="audio_sample2.wav", model_tag="Sample 2", tags=["test"], is_ref=False),
    file2=File(path="reference_audio.wav", model_tag="Reference", tags=["reference"], is_ref=True)
)
  • File Order: Pass the reference file as the third argument (file2) of add_files.
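
The ordering and is_ref rules above can be checked locally before anything is sent to the SDK. This is a sketch only: FileSpec and check_csmos_files are hypothetical helpers written for this example, and FileSpec merely mirrors the arguments that would later be passed to podonos.File.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileSpec:
    """Hypothetical stand-in for the arguments later passed to podonos.File."""
    path: str
    model_tag: str
    is_ref: bool = False


def check_csmos_files(file0: FileSpec, file1: FileSpec, file2: FileSpec,
                      require_exists: bool = True) -> None:
    """Raise ValueError unless file0/file1 are samples and file2 is the reference."""
    if file0.is_ref or file1.is_ref:
        raise ValueError("file0 and file1 must be samples (is_ref=False)")
    if not file2.is_ref:
        raise ValueError("file2 must be the reference (is_ref=True)")
    if require_exists:
        # Collect every path that is not an existing regular file.
        missing = [f.path for f in (file0, file1, file2)
                   if not Path(f.path).is_file()]
        if missing:
            raise ValueError(f"audio files not found: {missing}")
```

Running the check before add_files catches a swapped reference or a mistyped path while it is still cheap to fix.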
4. Finalize the Evaluation

Close the evaluator to complete the setup.
evaluator.close()

Key Considerations

  • File Configuration: The reference file must be marked with is_ref=True and passed as the last file (file2) in the add_files call.
  • Evaluation Logic: CSMOS compares each of the two audio samples against the reference to determine which one is more similar.
  • Applications: Useful for tasks like voice cloning, audio restoration, and quality assurance where matching a reference is critical.

Use Case

Consider a scenario where you are developing a new speech synthesis model and want to evaluate how closely the generated audio matches a reference recording. Using CSMOS, you can compare two versions of your model and assess which one produces audio that is more similar to the desired reference.
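
A scenario like this usually covers several utterances, each with an output from both model versions and a matching reference. The sketch below adds one comparison per utterance. It assumes that add_files may be called repeatedly on the same evaluator, which this page does not confirm. The helper add_triplets, the tag names, and the file paths are hypothetical.

```python
from typing import Any, Callable, Iterable, Tuple

Triplet = Tuple[str, str, str]  # (model A output, model B output, reference)


def add_triplets(evaluator: Any, make_file: Callable[..., Any],
                 triplets: Iterable[Triplet],
                 tag_a: str = "Model A", tag_b: str = "Model B") -> int:
    """Add one CSMOS comparison per triplet and return how many were added.

    `make_file` is expected to be podonos.File (or a test double).
    """
    count = 0
    for path_a, path_b, path_ref in triplets:
        evaluator.add_files(
            file0=make_file(path=path_a, model_tag=tag_a, tags=[tag_a], is_ref=False),
            file1=make_file(path=path_b, model_tag=tag_b, tags=[tag_b], is_ref=False),
            # The reference goes last and is the only file with is_ref=True.
            file2=make_file(path=path_ref, model_tag="Reference",
                            tags=["reference"], is_ref=True),
        )
        count += 1
    return count


# Usage (requires the podonos package and an evaluator created with type="CSMOS"):
#   from podonos import File
#   add_triplets(evaluator, File, [("a/utt1.wav", "b/utt1.wav", "ref/utt1.wav")])
#   evaluator.close()
```

Passing the file constructor in as an argument keeps the helper testable without network access or real audio files.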