Reliability

Why this section exists

Podonos looks simple from the outside: upload audio, get a report. Behind that interface, every evaluation runs through layers of sourcing, screening, quality control, bias minimization, and statistical analysis that we have engineered over years. This section documents what we actually do, so you can trust the numbers and design better experiments on top of them.

The pillars

Evaluators

How we source, qualify, and filter the humans who rate your audio.

In-Session Quality Control

Acoustic environment, fatigue, attention, reliability, and automatic audio sanity checks.

Bias Minimization

Order shuffling, anchoring, and loudness normalization to keep results clean.

Evaluation Design & Review

Science-backed templates and human review of your custom evaluations.

Recommendations

Best practices for audio length, instructions, and anchors.

How a Podonos evaluation flows

Sourcing & qualification

Evaluators come from vetted partner pools. Each candidate is pre-screened for hearing capability, language proficiency, and instruction-following ability.

Per-session checks

Every time an evaluator joins a session, we measure their acoustic environment and re-verify their setup. No headphones? No quiet room? They are rejected before they rate a single file.

Smart assignment

Our algorithm splits your evaluation into subsessions sized to fit within the 45–60 minute fatigue limit, then assigns evaluators per subsession to hit your requested votes-per-query.
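As a rough illustration of the splitting idea, the sketch below greedily packs queries into subsessions under a time budget. It is not Podonos's actual algorithm: the function names, the 50-minute budget (picked from inside the 45–60 minute range), the 10-second per-query overhead, and the one-evaluator-per-subsession rule are all assumptions made for the example.

```python
def split_into_subsessions(durations_sec, budget_min=50.0, overhead_sec=10.0):
    """Greedily pack queries into subsessions that stay under a time budget.

    durations_sec: audio length of each query, in seconds.
    budget_min:    assumed per-subsession budget (hypothetical value).
    overhead_sec:  assumed time spent rating a query beyond listening to it.
    Returns a list of subsessions, each a list of query indices.
    """
    budget_sec = budget_min * 60.0
    subsessions, current, used = [], [], 0.0
    for idx, dur in enumerate(durations_sec):
        cost = dur + overhead_sec
        # Start a new subsession when this query would exceed the budget.
        if current and used + cost > budget_sec:
            subsessions.append(current)
            current, used = [], 0.0
        current.append(idx)
        used += cost
    if current:
        subsessions.append(current)
    return subsessions


def evaluators_needed(num_subsessions, votes_per_query):
    """Assumes one evaluator completes exactly one subsession, so every
    subsession needs votes_per_query distinct evaluators."""
    return num_subsessions * votes_per_query
```

Under these assumptions, 250 clips of 20 seconds each cost 30 seconds per query, so a 50-minute budget holds 100 queries and the evaluation splits into three subsessions.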

During the session

Attention tests are embedded throughout. Audio order is shuffled. Anchors are pinned next to the rating scale. A mid-session break is mandatory.
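The shuffling and embedded attention tests can be pictured with a short sketch. Everything here is hypothetical: the function name, the per-evaluator seed, and the even spacing of attention tests are illustrative choices, since the text does not say how tests are placed.

```python
import random


def build_session_order(query_ids, attention_ids, seed):
    """Shuffle queries for one evaluator and spread attention tests evenly.

    The seed stands in for whatever per-evaluator randomness a real
    system would use (assumption).
    """
    rng = random.Random(seed)      # independent generator per evaluator
    order = list(query_ids)
    rng.shuffle(order)             # each evaluator gets a different order
    n = len(order)
    for k, att in enumerate(attention_ids):
        # Evenly spaced slot in the shuffled order, shifted by the k
        # attention tests already inserted ahead of it.
        pos = (k + 1) * n // (len(attention_ids) + 1) + k
        order.insert(pos, att)
    return order
```

Shuffling per evaluator means any position effect (for example, harsher ratings late in a session) is spread across files rather than landing on the same ones every time.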

Post-session reliability

We score each evaluator’s reliability against the cohort and automatically drop unreliable evaluators. New evaluators are recruited to backfill until your votes-per-query target is hit.
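One common way to score an evaluator against the cohort is to correlate their ratings with the average of everyone else's ratings on the same queries. The sketch below uses that approach as an assumption; the text does not state which reliability metric Podonos uses, and the 0.3 threshold and all names are illustrative.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either side has no variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def unreliable_evaluators(ratings, min_corr=0.3):
    """ratings: {evaluator_id: {query_id: score}}.

    Flags evaluators whose scores correlate poorly with the leave-one-out
    cohort mean. min_corr is a hypothetical threshold.
    """
    flagged = []
    for ev, own in ratings.items():
        xs, ys = [], []
        for query, score in own.items():
            others = [r[query] for other, r in ratings.items()
                      if other != ev and query in r]
            if others:
                xs.append(score)
                ys.append(mean(others))
        # Too few shared queries is treated as unreliable in this sketch.
        if len(xs) < 2 or pearson(xs, ys) < min_corr:
            flagged.append(ev)
    return flagged
```

Leaving the evaluator out of the cohort mean keeps their own votes from inflating their score.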

Statistical analysis

Results are aggregated, anchored, normalized, and ready to read in your Workspace.
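For intuition about the aggregation step only, here is a textbook mean-with-confidence-interval calculation over a set of votes. It is a generic sketch, not a description of the statistics Podonos reports; the function name and the normal approximation are assumptions, and the anchoring and normalization steps are not modeled.

```python
from math import sqrt
from statistics import mean, stdev


def mean_score_with_ci(scores, z=1.96):
    """Return (mean, half_width) for a list of votes.

    Uses a normal approximation; z=1.96 corresponds to a 95% interval.
    This is a generic aggregate chosen for illustration (assumption).
    """
    if len(scores) < 2:
        raise ValueError("need at least two votes to estimate spread")
    m = mean(scores)
    half_width = z * stdev(scores) / sqrt(len(scores))
    return m, half_width
```

The half-width shrinks with the square root of the vote count, which is one reason the votes-per-query setting matters when you need to separate systems with similar scores.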

The customer-facing knobs are simple: language, evaluator count, votes per query, evaluation type. Everything else on this page is automatic.
