Real-time text-to-speech streaming for conversational AI
type field.

- requestId: Unique ID for tracking this request
- text: Text to synthesize (max 10,000 characters)
- voiceId: Voice to use (e.g., “v_meklc281” for Urdu female)
- outputFormat: Audio format (optional, defaults to PCM_22050_16)

message event:
Format | Description | Use Case |
---|---|---|
PCM_22050_16 | Raw PCM, 22.05kHz, 16-bit | Direct audio processing |
MP3_22050_32 | MP3, 22.05kHz, 32kbps | Small file size, web |
MP3_22050_128 | MP3, 22.05kHz, 128kbps | High quality streaming |
WAV_22050_32 | WAV, 22.05kHz, 32-bit | Lossless audio |
ULAW_8000_8 | μ-law, 8kHz, 8-bit | Telephony systems |
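
As a rough illustration of how the request fields and the format table fit together, the sketch below assembles one request payload and validates it on the client before sending. The JSON encoding, the helper name `build_request`, and passing the `type` value in as a parameter are all assumptions: this section does not show the exact wire format or the value that `type` must take.

```python
import json
import uuid

# Format names copied from the format table in this section.
OUTPUT_FORMATS = {
    "PCM_22050_16",
    "MP3_22050_32",
    "MP3_22050_128",
    "WAV_22050_32",
    "ULAW_8000_8",
}
MAX_TEXT_CHARS = 10_000  # documented limit on `text`


def build_request(message_type, text, voice_id, output_format=None, request_id=None):
    """Return a JSON string for one synthesis request (hypothetical wire shape)."""
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError("text exceeds 10,000 characters; split it into chunks")
    payload = {
        "type": message_type,  # value not given in this section; supplied by caller
        "requestId": request_id or str(uuid.uuid4()),
        "text": text,
        "voiceId": voice_id,
    }
    # outputFormat is optional; the service is documented to default to PCM_22050_16.
    if output_format is not None:
        if output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unknown output format: {output_format}")
        payload["outputFormat"] = output_format
    return json.dumps(payload, ensure_ascii=False)
```

Checking the length and format locally avoids a round trip that would otherwise end in a `text_too_long` error or a rejected format.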
- v_meklc281: Urdu female
- v_8eelc901: Info/Education
- v_30s70t3a: Nostalgic News
- v_yypgzenx: Dada Jee (storytelling)

Code | Description | Action |
---|---|---|
auth_failed | Invalid API key | Check your API key |
synthesis_failed | TTS service error | Retry with backoff |
duplicate_request | Request ID already used | Use unique IDs |
rate_limit_exceeded | Too many requests | Slow down requests |
text_too_long | Text > 10,000 chars | Split into chunks |
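
The actions in the error table can be sketched as a small dispatch helper. This is one possible client-side policy, not the service's prescribed behavior: the function names, the backoff constants, the choice of full jitter, and treating `rate_limit_exceeded` as retryable alongside `synthesis_failed` are assumptions.

```python
import random

# Codes from the error table that this sketch chooses to retry.
RETRYABLE = {"synthesis_failed", "rate_limit_exceeded"}
MAX_TEXT_CHARS = 10_000  # documented limit on `text`


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Exponential backoff with full jitter; `attempt` starts at 0."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def split_text(text, limit=MAX_TEXT_CHARS):
    """Split text into chunks of at most `limit` characters, breaking at spaces when possible."""
    chunks = []
    while len(text) > limit:
        cut = text.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: fall back to a hard cut
        chunks.append(text[:cut])
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks


def next_step(code, attempt):
    """Map an error code to (action, delay_seconds)."""
    if code in RETRYABLE:
        return ("retry", backoff_delay(attempt))
    if code == "text_too_long":
        return ("split", None)
    if code == "duplicate_request":
        return ("new_id", None)
    return ("fail", None)  # auth_failed and any unrecognized code
```

Each chunk returned by `split_text` would be sent as its own request with its own request ID, which also keeps clear of `duplicate_request`.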
Use Unique Request IDs
Maintain Single Connection
Buffer Audio Chunks
Handle Reconnection
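
Two of these practices, unique request IDs and buffering audio chunks, can be sketched without any network code. The helper names are hypothetical, and the sketch assumes audio arrives as raw byte chunks tagged with the request ID they belong to; the actual event shape is not shown in this section.

```python
import uuid
from collections import defaultdict


def new_request_id():
    """Generate a request ID that is unique for practical purposes."""
    return f"req-{uuid.uuid4().hex}"


class AudioBuffer:
    """Accumulates audio chunks per request ID until the caller drains them."""

    def __init__(self):
        self._chunks = defaultdict(list)

    def add(self, request_id, chunk):
        # Append in arrival order; chunks are assumed to arrive in sequence.
        self._chunks[request_id].append(chunk)

    def drain(self, request_id):
        # Return everything buffered so far and forget the request.
        return b"".join(self._chunks.pop(request_id, []))
```

Keying the buffer by request ID lets a single connection carry several requests without mixing their audio.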