ElevenLabs Dubbing v2 Review: Demon Slayer Comparison (v1 vs v2)

A hands-on review of ElevenLabs Dubbing v2 using a scene from 'Demon Slayer: Mugen Train.' Unlike v1, which required manual per-clip cloning, v2 handles tone, intonation, and acting nuances automatically, with no studio work. We compare v1 and v2, test the Speaker similarity setting at 0, 7, and 10, and flag translation pitfalls for specific terms (e.g., 'Oni' should become 'Demon').

⚡ 3-Minute Summary

No more manual studio editing: Tone, intonation, and performance are captured automatically—a significant leap beyond v1.

Hands-on test: We auto-dubbed a scene from Demon Slayer: Mugen Train into English to put the tech to the test.

Creator revolution: The era where you can scale your content globally while retaining your original voice is officially here.

Welcome back to Sonetho! ⚡

When we announced the release of Dubbing v2, we promised to put it to the test ourselves. We kept our word. We took a clip from Demon Slayer: Mugen Train and ran it through the Dubbing v2 engine.
To put it bluntly: the performance is stellar. That missing 2% that previously prevented a truly "human" feel? It's not just filled; v2 goes past it.

▲ ElevenLabs Dubbing v2 · Auto-dubbing (Japanese to English) · Speaker similarity set to 7


The "v1 Struggles"

As we discussed in our previous post on Anime Dubbing Cloning (Clip vs. Track vs. IVC), the workflow in the v1 era was quite labor-intensive:

  • Auto-dubbing often suffered from sync issues and questionable translations, making post-production essential.

  • We had to manually slice the original audio to create specific clones for every character.

  • Inconsistent tone: Because input clips were limited, the AI occasionally struggled to maintain a consistent character voice across different scenes.

  • Trial-and-error: We had to regenerate clips repeatedly to capture the right take, and even then, the performance could feel slightly mechanical.

In short, it wasn't quite "AI dubbing for you"; it was more "you dubbing with AI assistance."


v2: Studio-grade results without the studio

Honestly? v2 blew us away.

Without any manual adjustments in a professional booth, and with just a single pass:

  • In our test clip, it closely captured the original speaker's tone and inflection.

  • It delivered a nuanced emotional performance that matched the scene.

  • The tedious manual slicing and cloning process is now a relic of the past.

Compared to v1, this is a massive generational leap.
Give the video above a listen—the "uncanny" AI quality has virtually vanished.


The Secret Sauce: Speaker Similarity

In the Advanced settings, you’ll find a slider called 'Speaker similarity.'
This lets you control the balance between "how much the output sounds like the original speaker" and "how naturally it flows in the target language."

For our Demon Slayer test, we used the default setting of 7 (on a scale of 0 to 10).

▲ Set to 7 — The ideal balance between natural target-language flow and the original speaker's vocal profile.

You can push this slider to either extreme. We tested the same scene at 0 and 10—see the results below.

Value                   | Result
------------------------|-------
0 (Natural Focus)       | Greater emotional range; feels more like a seasoned voice actor’s performance. Slightly less resemblance to the original timbre.
7 (Recommended)         | The "sweet spot" for combining natural cadence with character identity (Recommended: 4–7).
10 (Similarity Focus)   | Maximum similarity to the original speaker. Performance may sound flatter or more robotic in the target language.

🔊 Speaker similarity 0 — Maximizing Natural Flow

▲ Setting to 0 — Produces the most natural flow for the target language.

Surprisingly, 0 performed exceptionally well.
The emotional delivery felt broader and more theatrical, reminiscent of a professional dubbing actor. While it mimics the original voice slightly less, the overall quality of the English dub felt superior.

🔊 Speaker similarity 10 — Maximizing Original Identity

▲ Setting to 10 — Closest to the original tone, but can sacrifice emotional range.

Conversely, 10 felt a bit stiffer.
Because this setting forces the AI to adhere strictly to the original cadence, the output lost some of the natural melodic qualities of English. We recommend using 10 only when maintaining the exact vocal profile is non-negotiable; otherwise, aim for 4–7 for a high-quality production.
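
If you keep notes on which setting you used for each clip, the guidance from our tests can be written down as a small lookup. This is only a sketch of this article's recommendations, not an ElevenLabs API: the function name is hypothetical, and the wording for values we did not test (1–3 and 8–9) is our assumption.

```python
def describe_speaker_similarity(value: int) -> str:
    """Return this article's guidance for a Speaker similarity value (0-10)."""
    if not 0 <= value <= 10:
        raise ValueError("Speaker similarity must be between 0 and 10")
    if value == 0:
        return "natural focus: widest emotional range, least like the original timbre"
    if value == 10:
        return "similarity focus: closest to the original voice, may sound flatter"
    if 4 <= value <= 7:
        return "recommended range: balances natural cadence with character identity"
    # Values 1-3 and 8-9 were not tested in this review; the wording is an assumption.
    if value < 4:
        return "leans natural (untested in this review)"
    return "leans similar (untested in this review)"


# The three settings compared in this review.
for setting in (0, 7, 10):
    print(setting, "->", describe_speaker_similarity(setting))
```

Treat the bands as a starting point: the best value depends on the voice and the scene, so it is worth listening to at least two settings before committing.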

🎬 0 vs 10 — Hear the difference

▲ Compare the same scene with 0 (Natural) vs 10 (Similarity) to hear how it affects the acting.


A quick note: Check your proper nouns

The translation quality is impressive, but you should always double-check your proper nouns.

For instance, in Demon Slayer, the term 'Oni' is officially localized as 'Demons.' Occasionally, the AI may render it as 'Goblins' or 'Ogres,' which is a defensible literal translation but misses the established terminology fans expect. 😅

Human oversight is still a superpower. We recommend a quick script review to ensure your terminology remains canon-accurate.
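
A script review like this can be partly automated with a simple glossary check. The sketch below is our own illustration in plain Python and does not use any ElevenLabs feature: the glossary, the function name, and the sample lines (made up for the example, not actual dialogue) are all hypothetical. It flags off-canon renderings in an exported English script so a human can decide what to fix.

```python
import re

# Hypothetical glossary: canonical term -> off-canon renderings to flag.
GLOSSARY = {
    "Demon": ["Goblin", "Ogre"],
}


def find_off_canon_terms(script, glossary):
    """Return (line_number, found_term, canonical_term) for each flagged word."""
    findings = []
    for line_number, line in enumerate(script.splitlines(), start=1):
        for canonical, variants in glossary.items():
            for variant in variants:
                # Whole-word match, optional plural "s", case-insensitive.
                pattern = rf"\b{re.escape(variant)}s?\b"
                for match in re.finditer(pattern, line, flags=re.IGNORECASE):
                    findings.append((line_number, match.group(0), canonical))
    return findings


# Made-up sample lines for illustration only.
sample_script = """The ogres come out at night.
I will protect everyone from the demons!"""

for line_number, found, canonical in find_off_canon_terms(sample_script, GLOSSARY):
    print(f"line {line_number}: '{found}' -> consider '{canonical}'")
```

The check only reports candidates. Whether a flagged word is really a mistranslation is still a human call, since a story can contain actual goblins or ogres too.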

💡 Note: Dubbing v2 is currently in its stable launch phase, and the Dubbing Studio interface is still being refined. Expect more control over customizing translations and correcting terminology directly within the dashboard.


What this means for the industry

When this level of quality is just a few clicks away, it’s more than just an update—it’s a shift in the media landscape.

  • Dubbing market: AI's advantage over professional human dubbing in speed and cost-efficiency is widening rapidly.

  • Creators: This is a massive win for scalability.

  • Global expansion: You can now launch your content in 90+ languages without hiring a massive international cast, all while maintaining your original vocal identity.

If you're serious about taking your YouTube channel or SaaS content global, this is no longer just an "experiment"—it's a requirement.


Want to try it yourself?

Get started with 30 minutes of free dubbing on Creator plans and above (15 mins for Starter, 1 min for Free). That’s more than enough to test your own clips using the same quality we demonstrated today.
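
To gauge how far those minutes go, a quick back-of-the-envelope calculation helps. The sketch below uses the per-plan figures quoted above and assumes, for illustration, that usage is counted by the length of the source clip; check your plan's actual billing rules. The names are our own.

```python
# Free dubbing minutes per plan, as quoted in this article.
FREE_DUBBING_MINUTES = {"free": 1, "starter": 15, "creator": 30}


def clips_that_fit(plan: str, clip_seconds: float) -> int:
    """Return how many clips of the given length fit in the plan's free minutes."""
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    # Assumption: usage is measured by source clip duration.
    total_seconds = FREE_DUBBING_MINUTES[plan.lower()] * 60
    return int(total_seconds // clip_seconds)


# Example: how many 90-second scenes fit on each plan.
for plan in FREE_DUBBING_MINUTES:
    print(plan, clips_that_fit(plan, clip_seconds=90))
```

Keep in mind that comparing several Speaker similarity settings on the same scene multiplies the minutes you use.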

🎬 Test Dubbing v2 for Free

※ The link above is an official Sonetho partner link (no extra cost to you).

📚 More Reads

  • ElevenLabs Dubbing v2: 90+ Languages, Original Performance (Official Announcement · Full v2 Feature List)

  • The 99% Sync Rate Secret: 3 Methods for Anime Dubbing (Pro Tips · Comparing v1 Cloning vs. v2 Dubbing)

  • Complete Guide to ElevenLabs Dubbing (ElevenLabs Tips · Dubbing Workflow)

🚀 Final Thoughts

The "robotic" era is officially behind us. While you’ll still want to provide a human touch for final quality control, the starting point has fundamentally changed. Your content is ready to go global—with your own voice leading the way.

Happy creating!
Sonetho ⚡