2026 Roadmap

  1. Frontier Speech AI models

    Universal 3.1 and 3.2 Pro ship in Q2 and Q3 across async and realtime, followed by Universal 4 Pro Realtime in Q4 targeting the lowest end-to-end turn latency and best-in-class voice-agent audio handling. Universal TTS 1 ships in Q3, completing the in-house voice AI stack. Universal 1 Duplex (Preview), our native speech-to-speech model, follows in Q1 2027. Native-language coverage grows from 6 languages to 30+, alongside noise cancellation, streaming PII redaction, and self-hosted realtime.

  2. Voice AI infrastructure platform

    One API for every voice AI model — ours and the best open and community ones. Voice Agent API ships in Q2 with Twilio integration, and the Edge Voice Agent Platform in Q3 adds agent management, session storage, and edge functions. Community Models span Async STT, Streaming STT, LLM Gateway, and TTS, so customers get the right model for every language, domain, and price point without ever leaving the platform.

Async

Upcoming

Recently shipped

  • Universal 3 Pro Async Timestamp Improvements: Major improvement to Universal-3 Pro’s timestamp calculation, delivering median precision gains of 15.3% for English and 8.6% for non-English, with P99 improvements of 15.0% and 58.4% respectively.
  • Hebrew & Swedish: Major accuracy gains in Hebrew and Swedish via community-model integrations. Word error rates dropped 37% and 47%, respectively.
  • Medical Mode: An LLM-powered correction pass for medical terminology (drug names, procedures, clinical entities). On our medical benchmark, it achieves a 4.97% error rate versus 7.32% for the next-best vendor. Available as an add-on to Universal-3 Pro in English, Spanish, German, French, Portuguese, and Italian.
  • PII Audio Redaction using Silence: Redact PII with silence instead of a beep. Reduces listener fatigue when redacted audio is replayed at scale in call-center and compliance workflows.
  • Universal 3 Pro Async: Promptable speech-to-text with natural-language and custom-vocabulary prompts, mid-sentence language switching across six core languages, and audio tagging.
  • Improved Short-Audio Diarization: 19% better speaker-count accuracy and 6% lower speaker-attributed word error rate on audio under two minutes.
  • Multichannel Diarization: Per-channel speaker labels for multi-microphone recordings. Eliminates crosstalk ambiguity in call-center and meeting audio.
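
To make the idea behind silence-based redaction concrete, here is a minimal sketch of replacing flagged spans of audio with silence. It is an illustration only, not the service's implementation: the function name `redact_with_silence`, the millisecond span format, and the in-memory list of PCM samples are all assumptions made for this example.

```python
from typing import List, Tuple


def redact_with_silence(
    samples: List[int],
    sample_rate: int,
    spans_ms: List[Tuple[int, int]],
) -> List[int]:
    """Return a copy of `samples` with each [start_ms, end_ms) span zeroed.

    `samples` is assumed to be mono PCM; a zero sample is digital silence.
    """
    out = list(samples)  # leave the caller's audio untouched
    for start_ms, end_ms in spans_ms:
        # Convert millisecond offsets to sample indices, clamped to the clip.
        start = max(0, start_ms * sample_rate // 1000)
        end = min(len(out), end_ms * sample_rate // 1000)
        for i in range(start, end):
            out[i] = 0
    return out


# Hypothetical 8 kHz clip: 16 samples is 2 ms of audio.
clip = [100] * 16
# Pretend a PII detector flagged the second millisecond.
redacted = redact_with_silence(clip, 8000, [(1, 2)])
```

In this toy clip the first 8 samples are kept and the last 8 are silenced. A beep-based variant would write a tone into the same index range instead of zeros.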

Realtime

Upcoming

Recently shipped

  • Medical Mode: An LLM-powered correction pass for medical terminology. 4.97% error rate versus 7.32% for the next-best vendor. Available in both async and streaming on universal-realtime-3-pro.
  • Streaming Diarization v1.5: Speaker-aware sentence splitting for cleaner segmentation. 4–5% lower word error rate, 56% fewer phantom speakers, and clear gains on the CallHome and AMI speaker-labeling benchmarks.
  • Universal 3 Pro Realtime: Realtime speech-to-text with inline streaming speaker labeling, custom vocabulary prompts up to 1,000 words, audio tagging, filler-word control, mid-sentence language switching, and 99+ language support via Whisper routing for long-tail languages. EU region support.
  • Whisper Streaming: The first community model in our streaming API, shipped alongside Universal 3 Pro Realtime.
  • Edge Routing and Data Zone Endpoints: Global low-latency routing with US/EU data-residency endpoints. No additional charge.
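
Several items above quote word error rate (WER) and percentage improvements. For readers unfamiliar with the metric, the sketch below shows the textbook definition (word-level edit distance divided by reference length) and how a relative change between two error rates is computed. The helper names are hypothetical, and the benchmarks quoted in this roadmap may use different text normalization or scoring rules than this simplified version.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref: List[str] = reference.lower().split()
    hyp: List[str] = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra hypothesis word
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after`, e.g. 0.10 -> 0.05 is 0.5."""
    return (before - after) / before


# One substituted word out of four reference words.
wer = word_error_rate("the patient took ibuprofen", "the patient took advil")
```

Here `wer` is 0.25. Applying `relative_reduction` to the two Medical Mode figures quoted above (7.32% and 4.97%) gives a relative difference of roughly 32%.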

Voice Agents

Upcoming

Recently shipped

  • Voice Agent Preview: First public release of end-to-end voice AI. Combines universal-realtime-3-pro, LLM Gateway, and text-to-speech on LiveKit.

TTS

Upcoming

Speech Understanding

Upcoming

LLM Gateway

Upcoming

Recently shipped

  • Claude Opus 4.7: Anthropic’s most capable model, available through the Gateway on day one.
  • Automatic Model Fallbacks: The Gateway retries failed requests against a configurable fallback model. Single-provider outages no longer surface as customer-facing failures.
  • Qwen3, Qwen3 Next, Kimi K2.5: Three new high-capability models added to the catalog.
  • Claude Sonnet 4.6: Anthropic’s best price-performance frontier model at release.
  • Claude Opus 4.5 and 4.6: Anthropic’s most capable models, available through the Gateway on day one.
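
The fallback behavior described above can be pictured with a small client-side sketch. This is not the Gateway's code or API; the Gateway performs the retry on the server. The names `complete_with_fallback`, `call_model`, `AllModelsFailed`, and the model identifiers are hypothetical and exist only to illustrate the retry-against-a-fallback pattern.

```python
from typing import Callable, List, Sequence


class AllModelsFailed(Exception):
    """Raised when the primary model and every fallback have failed."""


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> str:
    """Try each model in order and return the first successful response.

    `call_model(model, prompt)` stands in for whatever sends the request;
    any exception it raises is treated as a failed attempt.
    """
    errors: List[str] = []
    for model in models:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def fake_call(model: str, prompt: str) -> str:
    """Stub transport that simulates an outage on the primary provider."""
    if model == "primary-model":
        raise TimeoutError("provider unavailable")
    return f"[{model}] {prompt}"


result = complete_with_fallback(
    "Summarize this call.",
    ["primary-model", "fallback-model"],
    fake_call,
)
```

With the simulated outage, `result` comes from `fallback-model`, so the caller never sees the primary provider's failure. An error reaches the caller only if every model in the list fails.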

Open Benchmarks

Upcoming

Developer Experience

Upcoming

Recently shipped

  • AssemblyAI Skill for AI Coding Agents: Claude Code, Cursor, and Codex now ship with a native AssemblyAI skill. It gives them accurate knowledge of our API out of the box and cuts hallucinated API usage in agent-generated code.
  • Shareable Playground Transcripts: One-click shareable links to Playground output. Makes it easy to show off a transcript or hand one off for internal review.