Now you know how to get overall rating statistics per file or per evaluation type. You may wonder why the evaluators rated the files the way they did.
With additional configuration, you can ask the evaluators to explain their ratings. For example, if a rating is low, you can ask for details about what led to it.
```python
import podonos
from podonos import *
import boto3

# Generate a sample speech file.
script = "Hello, how is your day going?"
polly_client = boto3.Session().client('polly')
response = polly_client.synthesize_speech(
    VoiceId='Brian',
    OutputFormat='mp3',
    Text=script,
    Engine='neural'
)
filename = '/path/to/1.mp3'
file = open(filename, 'wb')
file.write(response['AudioStream'].read())
file.close()

client = podonos.init()
etor = client.create_evaluator(
    name="Speech AI Preferences Test",
    desc="Preference test between speech synthesis models",
    type="NMOS",
    lan="en-us",
    num_eval=10,
    use_annotation=True  # You set this to True
)
etor.add_file(
    File(path='/path/to/1.mp3', model_tag='Polly', tags=["Brian", "neural"],
         script=script)
)
etor.close()
```
For this feature, you need to set use_annotation=True when creating an
Evaluator object. Also, you need to provide the script when adding files.
This feature is only available for single stimulus evaluations.
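The three requirements above can be checked locally before you create the evaluator. The following is a minimal sketch in plain Python, not part of the podonos SDK: `PlannedFile`, `check_annotation_setup`, and `SINGLE_STIMULUS_TYPES` are hypothetical names introduced here for illustration, and the set of single-stimulus types contains only `"NMOS"` because that is the only type the example above confirms.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: only "NMOS" is confirmed by the example above as a
# single-stimulus type; extend this set to match the types you use.
SINGLE_STIMULUS_TYPES = {"NMOS"}


@dataclass
class PlannedFile:
    """Hypothetical record of a file you intend to pass to add_file()."""
    path: str
    script: Optional[str] = None


def check_annotation_setup(eval_type: str, use_annotation: bool,
                           files: List[PlannedFile]) -> List[str]:
    """Return a list of problems; an empty list means the setup looks fine."""
    problems = []
    if not use_annotation:
        problems.append("use_annotation must be True")
    if eval_type not in SINGLE_STIMULUS_TYPES:
        problems.append(f"{eval_type} is not a known single-stimulus type")
    for f in files:
        # Annotation needs the script text for every file.
        if not f.script or not f.script.strip():
            problems.append(f"{f.path} has no script")
    return problems


problems = check_annotation_setup(
    "NMOS", True,
    [PlannedFile("/path/to/1.mp3", "Hello, how is your day going?"),
     PlannedFile("/path/to/2.mp3")],
)
print(problems)  # ['/path/to/2.mp3 has no script']
```

If the returned list is empty, the configuration satisfies the conditions described above and you can go on to call `create_evaluator` and `add_file` as in the earlier example.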
Once the evaluation finishes, click the analysis tab. In the evaluation, the files are listed at the top.
Click one of the files. Then, you will see the original text and
marked words or phrases where the evaluators left reasoning behind each rating with detailed descriptions like