Stream text to speech Example JS

async function streamTextToSpeech() {
  try {
    const response = await fetch('https://api.upliftai.org/v1/synthesis/text-to-speech/stream', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer YOUR_API_KEY'
      },
      body: JSON.stringify({
        voiceId: "v_8eelc901",
        text: "سلام، آپ اِس وقت اوریٹر کی آواز سن رہے ہیں۔",
        outputFormat: "MP3_22050_128"
      })
    });

    if (!response.ok) {
      throw new Error(`HTTP error! Status: ${response.status}`);
    }

    // Get the reader from the stream
    const reader = response.body.getReader();
    
    // Process each chunk as it arrives
    while (true) {
      const { done, value } = await reader.read();
      
      if (done) {
        break;
      }
      
      // `value` is a Uint8Array holding the next piece of encoded audio.
      // Forward it to your clients, append it to a buffer, or feed it to a player here.
    }
    
  } catch (error) {
    console.error('Error:', error);
  }
}

Orator API Endpoints

Stream Text to Speech

Converts the provided text to speech audio using the specified voice. The response is streamed with the “Transfer-Encoding: chunked” header. We currently target a p90 first-chunk latency of around 300 ms in Pakistan and are actively working to reduce it.

For best results, write the text in Urdu script. For better pronunciation of English words, write them in ASCII characters. Example: “یہ ایک exerted force ہے”

Returns the audio data directly in the response.
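Because the audio arrives as a chunked stream, first-chunk latency can be checked on the client by timing the first read. The following is a minimal sketch and not part of the Orator API: `readWithFirstChunkTiming` and its parameters are hypothetical names, and it assumes a runtime that provides global `ReadableStream` and `performance` (modern browsers, Node 18+).

```javascript
// Hypothetical helper: reads a stream to the end and reports how long the
// first chunk took to arrive, measured from `startedAt`.
// `stream` is any WHATWG ReadableStream, e.g. `response.body` from fetch().
async function readWithFirstChunkTiming(stream, startedAt = performance.now(), onChunk = () => {}) {
  const reader = stream.getReader();
  let firstChunkMs = null;
  let totalBytes = 0;

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Record the elapsed time only once, when the first chunk shows up.
    if (firstChunkMs === null) {
      firstChunkMs = performance.now() - startedAt;
    }
    totalBytes += value.byteLength;
    onChunk(value);
  }

  return { firstChunkMs, totalBytes };
}

// Usage (inside an async function), timing from just before the request is sent:
//   const startedAt = performance.now();
//   const response = await fetch(url, options);
//   const { firstChunkMs } = await readWithFirstChunkTiming(response.body, startedAt);
```

The number measured this way includes network time from your own location, so it will not necessarily match the latency figure quoted above.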

POST https://api.upliftai.org/v1/synthesis/text-to-speech/stream

Authorizations

Authorization (string, header, required)

API key in the format "Bearer sk_api_..."
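As an illustration of the header format, the sketch below builds the request headers from a raw key. `buildAuthHeaders` is a hypothetical helper, and the check for the `sk_api_` prefix is only inferred from the format shown above; it is an assumption, not a documented validation rule.

```javascript
// Hypothetical helper: builds the headers for an Orator request.
// The "sk_api_" prefix check is an assumption based on the documented key format.
function buildAuthHeaders(apiKey) {
  if (typeof apiKey !== 'string' || !apiKey.startsWith('sk_api_')) {
    throw new Error('Expected an API key starting with "sk_api_"');
  }
  return {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${apiKey}`
  };
}

// Usage: fetch(url, { method: 'POST', headers: buildAuthHeaders(myKey), body });
```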

Body

application/json

Request for text-to-speech synthesis

voiceId (enum<string>, required)

Identifier for the voice to use.

Available options: v_8eelc901 (Info/Edu), v_kwmp7zxt (Gen Z), v_yypgzenx (Dada Jee), v_30s70t3a (Nostalgic News)

text (string, required)

The text to synthesize. Maximum length: 2500
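Longer input has to be sent as several requests. One possible approach, sketched below, is to split at sentence boundaries so that each piece stays within the limit. `splitText` is a hypothetical helper; it assumes the limit counts characters as JavaScript's `String.length` does (UTF-16 code units), which this page does not state, and the list of sentence-ending marks is only a starting point.

```javascript
// Hypothetical helper: splits text into pieces of at most `maxLength` characters,
// preferring to break after sentence-ending punctuation.
const MAX_TEXT_LENGTH = 2500; // from the `text` field's documented maximum length

function splitText(text, maxLength = MAX_TEXT_LENGTH) {
  // Break on whitespace that follows the Urdu full stop, Arabic question mark, or . ! ?
  const sentences = text.split(/(?<=[۔؟.!?])\s+/u);
  const chunks = [];
  let current = '';

  for (const sentence of sentences) {
    let rest = sentence;

    // A single sentence longer than the limit is cut at the limit.
    while (rest.length > maxLength) {
      if (current) {
        chunks.push(current);
        current = '';
      }
      chunks.push(rest.slice(0, maxLength));
      rest = rest.slice(maxLength);
    }
    if (!rest) continue;

    // Append to the current piece if it still fits, otherwise start a new one.
    const candidate = current ? `${current} ${rest}` : rest;
    if (candidate.length <= maxLength) {
      current = candidate;
    } else {
      chunks.push(current);
      current = rest;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

Each returned piece can then be sent as the `text` of its own request, in order.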

outputFormat (enum<string>, required)

Format of the output audio. WAV files are usually 10x larger; we recommend MP3 or OGG for the best compression while maintaining quality.

Available options: WAV_22050_16, WAV_22050_32, MP3_22050_32, MP3_22050_64, MP3_22050_128, OGG_22050_16, ULAW_8000_8

phraseReplacementConfigId (string, optional)

ID of a phrase replacement configuration to apply
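The body fields above can be checked on the client before a request is sent. The sketch below is one way to do that; `buildSynthesisBody` is a hypothetical helper, the option lists are copied from this page and may go out of date, and rejecting empty text is an assumption rather than a documented rule.

```javascript
// Values copied from the field descriptions on this page.
const VOICE_IDS = ['v_8eelc901', 'v_kwmp7zxt', 'v_yypgzenx', 'v_30s70t3a'];
const OUTPUT_FORMATS = [
  'WAV_22050_16', 'WAV_22050_32',
  'MP3_22050_32', 'MP3_22050_64', 'MP3_22050_128',
  'OGG_22050_16', 'ULAW_8000_8'
];
const TEXT_LIMIT = 2500;

// Hypothetical helper: validates the fields and returns the JSON request body.
function buildSynthesisBody({ voiceId, text, outputFormat, phraseReplacementConfigId }) {
  if (!VOICE_IDS.includes(voiceId)) {
    throw new RangeError(`Unknown voiceId: ${voiceId}`);
  }
  // Assumption: empty text is treated as invalid here.
  if (typeof text !== 'string' || text.length === 0) {
    throw new TypeError('text must be a non-empty string');
  }
  if (text.length > TEXT_LIMIT) {
    throw new RangeError(`text is ${text.length} characters; the limit is ${TEXT_LIMIT}`);
  }
  if (!OUTPUT_FORMATS.includes(outputFormat)) {
    throw new RangeError(`Unknown outputFormat: ${outputFormat}`);
  }

  const body = { voiceId, text, outputFormat };
  // Only include the optional field when the caller supplied it.
  if (phraseReplacementConfigId !== undefined) {
    body.phraseReplacementConfigId = phraseReplacementConfigId;
  }
  return JSON.stringify(body);
}
```

The server remains the authority on what is valid; a check like this only catches obvious mistakes before a network round trip.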

Response

Successful audio synthesis

The response is of type file.
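When the whole file is needed rather than live playback, the streamed chunks can be concatenated into a single byte array. This is a minimal sketch: `collectStream` is a hypothetical helper that works on any WHATWG `ReadableStream` of `Uint8Array` chunks, such as `response.body` from `fetch`.

```javascript
// Hypothetical helper: reads every chunk of a stream and joins them
// into one Uint8Array containing the complete audio file.
async function collectStream(stream) {
  const reader = stream.getReader();
  const chunks = [];
  let total = 0;

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    total += value.byteLength;
  }

  // Copy each chunk into its position in the final buffer.
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes;
}

// Usage (inside an async function):
//   const bytes = await collectStream(response.body);
```

The resulting bytes can be written to disk or wrapped in a `Blob` for playback; buffering the full response does give up the latency benefit of streaming.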
