Hi, this is Sonetho. ⚡
Quick recap of the January 13 ElevenLabs webinar. ElevenLabs is no longer just "an AI voice service" — they're positioning as an all-in-one AI creative platform.
This deep dive covers everything they announced, from Studio 3.0 (with top-tier video models like Sora 2 and Veo 3) to Scribe v2 (pitched as more accurate than the human ear).

1. Studio 3.0 — All-in-One Creation
Headline drop: Studio 3.0. Core idea: workflow consolidation.
🎥 Studio 3.0's 3 big innovations
- Top-tier video models integrated: Google Veo 3, OpenAI Sora 2, Kling, Ideogram — the best video/image generators are now inside ElevenLabs Studio. No separate subscriptions needed.
- One-stop timeline: enter text → [TTS + SFX + BGM + captions + video] all auto-generate on one timeline.
- Inline editing: don't like one section? Drag and edit just that part, no full re-generation.
This isn't an incremental feature — it's the output of strategic partnerships with Disney, NVIDIA, and Adobe.
2. Scribe v2 — #1 STT Accuracy Globally
Scribe v2 stomps the existing STT field. The announced WER (Word Error Rate) data:
| Model | WER | Note |
|---|---|---|
| ElevenLabs Scribe v2 | 2.2% | #1 globally |
| GPT-4o Transcribe | 2.7% | - |
| Gemini 1.5 Pro | 3.0% | - |
| Deepgram Nova 3 | 6.9% | - |
Note: lower is better. Figures are averaged across English, French, and Spanish.
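For anyone new to the metric: WER counts the word substitutions, deletions, and insertions needed to turn a transcript into the reference text, then divides by the number of reference words. Here is a minimal Python sketch of that standard definition. It is not ElevenLabs' evaluation code; the sample sentences and the simple lowercase-and-split normalization are assumptions for illustration, and real benchmarks normalize punctuation and numbers far more carefully.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra word in transcript)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("jumps" -> "jumped") and one dropped "the".
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")  # WER: 22.2%
```

Read the table the same way: 2.2% WER is roughly 2 wrong words per 100 spoken, while 6.9% is about 7 per 100.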
Scribe v2 killer features:
- Audio Event Tagging: tag laughter, applause, footsteps as text events
- Smart Diarization: identifies speakers even on overlapping speech
- Word-level Timestamps: per-word timing for perfect caption sync
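To see why word-level timestamps matter for captions, here is a rough Python sketch that groups timed words into SRT cues. The `Word` fields (`text`, `start`, `end`, in seconds), the sample timings, and the `max_words` grouping rule are all hypothetical stand-ins, not Scribe v2's actual response format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical shape of one transcribed word; times are in seconds.
    text: str
    start: float
    end: float


def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"


def words_to_srt(words: list[Word], max_words: int = 4) -> str:
    """Group words into fixed-size cues; each cue spans its first to last word."""
    cues = []
    for number, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        text = " ".join(w.text for w in chunk)
        timing = f"{srt_time(chunk[0].start)} --> {srt_time(chunk[-1].end)}"
        cues.append(f"{number}\n{timing}\n{text}\n")
    return "\n".join(cues)


# Made-up timings for illustration only.
sample_words = [
    Word("Just", 0.0, 0.32),
    Word("imagine.", 0.32, 0.9),
    Word("AI", 1.1, 1.4),
    Word("handles", 1.4, 1.85),
    Word("production.", 1.85, 2.6),
]
print(words_to_srt(sample_words, max_words=3))
```

Because every word carries its own start and end time, each cue can begin and end exactly on a word boundary instead of on an estimated sentence duration.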
3. Enterprise security & scaling
🔒 Security & Compliance
- SOC 2 / ISO 27001 — top-tier infosec certs
- Zero Retention — optional, no data persists server-side
- GDPR-compliant — full EU data protection alignment
🤝 Collaboration
- Project sharing + approval workflows across teams
- Granular access control for internal teams and agencies
4. [Q&A] Live questions from the session
Q. When does V3 ship?
A. Final stages. Late January, or mid-February at the latest.
Q. Can we control breath / pitch after generation?
A. Yes. It's heavily requested. Fine-tuning parameters for post-generation tweaks are in research, and an update is coming.
Q. Localized UI roadmap?
A. Yes — multi-language UI landing this year. No more juggling translators.
Bottom line: imagination → reality, faster
Webinar's core message: "Just imagine. AI handles production." One text prompt now generates video + voice + sound simultaneously.
If you want to ride this wave first, try Studio 3.0 now.
Sonetho. ⚡