Understanding synchronous vs asynchronous text-to-speech
| Method | Best For | Response Time | How It Works |
|---|---|---|---|
| Sync `/text-to-speech` | Direct playback, small texts | ~500ms-2s total | Returns complete audio in the response |
| Streaming `/text-to-speech/stream` | Real-time playback through your server | ~300ms to first chunk | Streams audio chunks through your server |
| Async `/text-to-speech-async` | Bots, webhooks, CDN delivery | Instant (returns URL) | Returns a URL; complete audio is available in 1-2s |
| Async Streaming `/text-to-speech/stream-async` | Frontend streaming without a proxy | Instant (returns URL) | Returns a URL; ~300ms to first chunk once retrieved |
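
The four methods differ along two axes: whether the response carries audio or a URL, and whether the audio arrives whole or in chunks. The following is a minimal sketch of that decision in Python. The `Endpoint` type and `choose_endpoint` helper are hypothetical names introduced for illustration and are not part of the API; only the four paths come from the table above.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    path: str
    returns_url: bool  # True: the response is a URL rather than audio data
    streams: bool      # True: audio is delivered in chunks


# The four endpoints from the comparison table (illustrative structure only).
ENDPOINTS = (
    Endpoint("/text-to-speech", returns_url=False, streams=False),
    Endpoint("/text-to-speech/stream", returns_url=False, streams=True),
    Endpoint("/text-to-speech-async", returns_url=True, streams=False),
    Endpoint("/text-to-speech/stream-async", returns_url=True, streams=True),
)


def choose_endpoint(returns_url: bool, streams: bool) -> str:
    """Return the endpoint path matching the two requirements."""
    for endpoint in ENDPOINTS:
        if endpoint.returns_url == returns_url and endpoint.streams == streams:
            return endpoint.path
    raise ValueError("no endpoint matches the requested combination")


if __name__ == "__main__":
    # A bot that forwards a link to the finished audio wants a URL, no streaming.
    print(choose_endpoint(returns_url=True, streams=False))
```

A browser client that should start playback early without routing audio through your own server would, under the same reading of the table, ask for `returns_url=True, streams=True`.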
- `/text-to-speech-async`
- `/text-to-speech/stream-async`
- `/text-to-speech/stream`
- `/text-to-speech`

Voice IDs:

- `"v_meklc281"` - Urdu female voice
- `"v_8eelc901"` - Info/Education
- `"v_30s70t3a"` - Nostalgic News
- `"v_yypgzenx"` - Dada Jee (storytelling)

Output formats:

- `MP3_22050_64` - Best for messaging apps (smaller files)
- `MP3_22050_128` - Best quality/size balance
- `WAV_22050_32` - When you need lossless audio
- `ULAW_8000_8` - For telephony systems
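
The format names appear to follow a `CODEC_SAMPLERATE_VALUE` pattern. The sketch below splits a name into those three parts. The interpretation is an assumption rather than documented behavior: the middle field is read as the sample rate in Hz, and the last field probably means bitrate in kbps for MP3 and bit depth for WAV and ULAW. `AudioFormat` and `parse_format` are hypothetical helpers, not part of the API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFormat:
    codec: str            # e.g. "MP3", "WAV", "ULAW"
    sample_rate_hz: int   # assumed meaning of the middle field
    depth_or_bitrate: int  # assumed: kbps for MP3, bit depth otherwise


def parse_format(name: str) -> AudioFormat:
    """Split a format name such as 'MP3_22050_64' into its three fields."""
    parts = name.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected format name: {name!r}")
    codec, rate, last = parts
    return AudioFormat(codec, int(rate), int(last))


if __name__ == "__main__":
    for fmt in ("MP3_22050_64", "MP3_22050_128", "WAV_22050_32", "ULAW_8000_8"):
        print(parse_format(fmt))
```

Parsing the name this way can help a client pick a matching decoder or playback sample rate, for example 8000 Hz for the telephony format.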