[Case Study] How Resemble AI Used Podonos to Benchmark Chatterbox

AI model evaluation isn’t just a final step. It’s the launchpad.

That’s the approach Resemble AI took when preparing to open-source Chatterbox, their newest text-to-speech (TTS) model. Determined to release not only a high-performance model but also one backed by transparent benchmarks, the Resemble AI team used Podonos’ evaluation solution to put Chatterbox in the ring. See the evaluation report.

[ElevenLabs vs. Chatterbox by Resemble AI]


The Challenge: Proving Readiness Before Open Source

Many voice AI players say, “Our model sounds great. Just listen to our samples.” But without accurate measurements, such claims remain subjective and rely heavily on guesswork. In practice, the quality of most models is “guesstimated” rather than measured, which introduces significant bias. By conducting in-depth analysis and publishing evaluation reports, teams can provide a clear, data-driven picture of how well their models truly perform.

For Resemble AI, it wasn’t enough to say that Chatterbox performed well compared to others. Internally, they believed the model could compete with leading alternatives like ElevenLabs. But without third-party evaluation, it would be hard to establish trust with the broader AI community.


The Solution: Fast & Automated, Human-Centric Evaluation on Podonos

Podonos provided the ideal platform to benchmark Chatterbox. With our evaluation service, Resemble AI was able to:

  • Compare Chatterbox head-to-head against ElevenLabs in a controlled A/B test

  • Evaluate on real-world use cases and nuanced prompts

  • Receive detailed feedback from diverse, trusted evaluators

  • Get results in less than 12 hours

Podonos' workflow made it easy to set up, customize, and launch the evaluation process without the usual headaches of contractor management, pipeline setup, and manual analysis.
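To make the A/B comparison above concrete, here is a minimal Python sketch of how pairwise preference votes could be summarized. It is an illustration only, not Podonos' actual pipeline or API: the function names, the vote labels, and the example vote counts are all hypothetical, and the Wilson score interval is one common choice for a binomial proportion rather than the method used in the report.

```python
import math
from collections import Counter


def wilson_interval(wins: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("need at least one decided vote")
    p = wins / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


def summarize_ab(votes: list[str], model_a: str, model_b: str) -> dict:
    """Tally A/B preference votes; any other label (e.g. a tie) is ignored."""
    counts = Counter(votes)
    a_wins, b_wins = counts[model_a], counts[model_b]
    decided = a_wins + b_wins
    low, high = wilson_interval(a_wins, decided)
    return {
        "a_wins": a_wins,
        "b_wins": b_wins,
        "a_share": a_wins / decided,
        "a_share_ci": (low, high),
    }


if __name__ == "__main__":
    # Made-up votes for illustration; these are NOT results from the report.
    example_votes = ["model_a"] * 60 + ["model_b"] * 40 + ["tie"] * 5
    print(summarize_ab(example_votes, "model_a", "model_b"))
```

If the whole interval sits above 0.5, listeners preferred the first model more often than chance would explain; an interval that straddles 0.5 means the test cannot separate the two models at that sample size.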


[Click image to see the full report]


The Outcome: Data-Backed Confidence to Go Open

The results spoke for themselves. With clear strengths in naturalness, Chatterbox earned competitive marks that validated the model’s release.

Armed with this data, Resemble AI confidently open-sourced Chatterbox on both GitHub and Hugging Face, inviting the global AI community to explore, adopt, and build upon their work.


Launch with Confidence

Evaluating AI models is one of the most critical steps in improving their performance. In Resemble AI’s case, fast and accurate evaluation not only helped enhance their model’s quality, but also boosted trust by leveraging Podonos’ transparent and reliable benchmarking. They didn’t just release a TTS model. They released it with credibility.
