Intro

Suppose you are building a new AI model that can speak like Elon Musk or Taylor Swift, and you want to know how similar the generated output is to the real person's voice. Here, similarity covers tone, prosody, and word articulation.

Example

In this example, we assume you have your own AI model that generates speech resembling a given person's voice. The code below pairs each original recording with its generated counterpart and adds the pairs to a similarity evaluation.
import podonos
from podonos import *

client = podonos.init()
etor = client.create_evaluator(
    name='Taylor Swift voice similarity',
    desc='How similar to Taylor Swift is the voice my AI model generates?',
    type='SMOS', num_eval=10)

original_speech_path = ['ts0.wav', 'ts1.wav', 'ts2.wav']
generated_speech_path = ['ts0_gen.wav', 'ts1_gen.wav', 'ts2_gen.wav']

# Pair each original recording with the generated file at the same index.
for org, syn in zip(original_speech_path, generated_speech_path):
    org_file = File(path=org, model_tag='real', tags=['female'])
    syn_file = File(path=syn, model_tag='model1', tags=['female', 'Taylor Swift'])
    etor.add_files(file0=org_file, file1=syn_file)

etor.close()
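
The loop above relies on `zip`, which silently stops at the shorter list, so a missing generated file would drop a pair without any warning. The following is a minimal sketch of a pre-flight check you could run before adding files. The helper name `pair_speech_files` and its `check_exists` parameter are hypothetical and not part of the podonos package; it uses only the Python standard library.

```python
from pathlib import Path


def pair_speech_files(original_paths, generated_paths, check_exists=True):
    """Return (original, generated) path pairs, failing loudly on mismatches.

    Hypothetical helper for illustration; not part of the podonos SDK.
    """
    # zip() would silently truncate, so compare the lengths explicitly.
    if len(original_paths) != len(generated_paths):
        raise ValueError(
            f"{len(original_paths)} original files but "
            f"{len(generated_paths)} generated files"
        )

    pairs = []
    for org, syn in zip(original_paths, generated_paths):
        if check_exists:
            # Fail before uploading anything if a file is missing on disk.
            for path in (org, syn):
                if not Path(path).is_file():
                    raise FileNotFoundError(path)
        pairs.append((org, syn))
    return pairs


# Example with the file names used above; check_exists=False skips the
# on-disk check so the snippet runs even without the audio files present.
pairs = pair_speech_files(
    ['ts0.wav', 'ts1.wav', 'ts2.wav'],
    ['ts0_gen.wav', 'ts1_gen.wav', 'ts2_gen.wav'],
    check_exists=False,
)
```

The returned `pairs` list can then be iterated in place of the `zip(...)` call in the example, leaving the `File` and `add_files` calls unchanged.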