[Case Study] How Resemble AI Used Podonos to Benchmark Chatterbox
AI model evaluation isn’t just a final step. It’s the launchpad.
That’s the approach Resemble AI took when preparing to open-source Chatterbox, their newest text-to-speech (TTS) model. Determined to release not only a high-performance model but also one backed by transparent benchmarks, the Resemble AI team used Podonos’ evaluation solution to put Chatterbox in the ring. See the evaluation report.
[Eleven Labs VS Chatterbox by Resemble AI]
The Challenge: Proving Readiness Before Open Source
Many voice AI players say, “Our model sounds great. Just listen to our samples.” But without accurate measurements, such claims remain subjective and rely heavily on guesswork. In reality, most models are judged by informal listening, or “guesstimated,” which introduces significant bias. By conducting in-depth analysis and publishing evaluation reports, teams can provide a clear, data-driven picture of how well their models truly perform.
For Resemble AI, it wasn’t enough to say that Chatterbox performed well compared to others. Internally, they believed the model could compete with leading alternatives like Eleven Labs. But without third-party evaluation, it would be hard to establish trust with the broader AI community.
The Solution: Fast & Automated, Human-Centric Evaluation on Podonos
Podonos provided the ideal platform to benchmark Chatterbox. With our evaluation service, Resemble AI was able to:
Compare Chatterbox head-to-head against Eleven Labs in a controlled A/B test
Evaluate on real-world use cases and nuanced prompts
Receive detailed feedback from diverse, trusted evaluators
Get results in less than 12 hours
Podonos' workflow made it easy to set up, customize, and launch the evaluation process without the usual headaches of contractor management, pipeline setup, and manual analysis.
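To make the idea of a head-to-head A/B test concrete, here is a minimal sketch of how pairwise preference votes could be summarized once they are collected. It is an illustration only: the function names and the vote counts are hypothetical, and nothing here is taken from the Podonos API or from the published report.

```python
import math


def preference_summary(votes_a: int, votes_b: int, z: float = 1.96):
    """Return (share preferring A, lower bound, upper bound).

    The bounds are a Wilson score interval; z = 1.96 gives roughly 95%.
    """
    n = votes_a + votes_b
    if n == 0:
        raise ValueError("need at least one vote")
    p = votes_a / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, center - half, center + half


def sign_test_p_value(votes_a: int, votes_b: int) -> float:
    """Two-sided exact binomial (sign) test against a 50/50 split."""
    n = votes_a + votes_b
    k = max(votes_a, votes_b)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts, NOT results from the actual evaluation.
    share, low, high = preference_summary(votes_a=130, votes_b=70)
    print(f"Model A preferred in {share:.1%} of comparisons "
          f"(approx. 95% interval {low:.1%} to {high:.1%})")
    print(f"sign test p-value: {sign_test_p_value(130, 70):.4g}")
```

Reporting an interval and a test alongside the raw preference share is one way to show whether a gap between two models is larger than what listener-to-listener variation alone would produce.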

[Click image to see the full report]
The Outcome: Data-Backed Confidence to Go Open
The results spoke for themselves. With clear strengths in naturalness, Chatterbox earned competitive marks that validated the model’s release.
Armed with this data, Resemble AI confidently open-sourced Chatterbox on both GitHub and Hugging Face, inviting the global AI community to explore, adopt, and build upon their work.
Launch with Confidence
Evaluating AI models is one of the most critical steps in improving their performance. In Resemble AI’s case, fast and accurate evaluation not only helped enhance their model’s quality, but also boosted trust by leveraging Podonos’ transparent and reliable benchmarking. They didn’t just release a TTS model. They released it with credibility.