Listening interfaces from the ongoing research. Pick a session below.
→Blind A/B — 21 voices
All current candidates rendering the same JP passage. Engines · blind reveal · pipeline tabs.
→Emotion sweep — instruct mode
Qwen3-TTS vs VoxCPM2 across 5 emotions (calm / sad / happy / angry / anxious).
voxcpm_vd_* variants) → Irodori-TTS v3 (caption, clone, both) → Qwen3-TTS (custom/clone) → Google Chirp3-HD (cloud reference).
Round-3 consensus says Irodori-TTS v3 fixes the fluent-foreigner accent that bit earlier rounds.