This endpoint initiates text-to-speech synthesis and immediately returns a mediaId and token. The audio is generated asynchronously and can be retrieved using the returned credentials.
When to use this endpoint:
For best results with Urdu, use Urdu script. For English words within Urdu text, use ASCII characters. Example: “یہ ایک exerted force ہے”
The generated audio URL can be shared directly with end users or services without proxying through your server.
API key with format "Bearer sk_api_..."
Request for asynchronous text-to-speech synthesis
The text to synthesize
Maximum length: 2500. Example: "سلام، آپ اِس وقت اوریٹر کی آواز سن رہے ہیں۔"
Format of the output audio. WAV files are usually about 10x larger, so we recommend MP3 or OGG for the best compression while maintaining quality.
Available options: PCM_22050_16, WAV_22050_16, WAV_22050_32, MP3_22050_32, MP3_22050_64, MP3_22050_128, OGG_22050_16, ULAW_8000_8
Identifier for the voice to use. Named voices: v_meklc281 (Urdu female), v_8eelc901 (Info/Edu), v_kwmp7zxt (Gen Z), v_yypgzenx (Dada Jee), v_30s70t3a (Nostalgic News)
"v_meklc281"
Optional ID of a phrase replacement configuration to apply
Successfully initiated audio synthesis
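The following Python sketch shows one way a client might assemble the request described above. It is illustrative only: the URL `https://api.example.com/v1/synthesize/async` and the JSON field names (`text`, `voiceId`, `format`) are assumptions, since this section does not state them, and the 2500 limit is assumed to be a character count. Only the Bearer header format, the format identifiers, and the voice ID come from the reference above.

```python
import json
import urllib.request

# Hypothetical endpoint URL: the real path is not given in this section.
API_URL = "https://api.example.com/v1/synthesize/async"

# Assumed to be a character limit on the text field.
MAX_TEXT_LENGTH = 2500

# Output formats listed in the reference above.
OUTPUT_FORMATS = {
    "PCM_22050_16", "WAV_22050_16", "WAV_22050_32",
    "MP3_22050_32", "MP3_22050_64", "MP3_22050_128",
    "OGG_22050_16", "ULAW_8000_8",
}


def build_synthesis_request(api_key, text, voice_id="v_meklc281",
                            output_format="MP3_22050_64"):
    """Validate inputs locally and return an unsent urllib Request."""
    if not text or len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text must be 1 to {MAX_TEXT_LENGTH} characters long")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")

    # Field names below are hypothetical; check the request schema before use.
    payload = {"text": text, "voiceId": voice_id, "format": output_format}

    # ensure_ascii=False keeps Urdu script readable in the encoded body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")

    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. According to the description above, a successful response carries a mediaId and token, but the exact response shape is not shown here, so inspect the returned JSON before relying on specific keys.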