Async TTS Concepts

Learn when to use async text-to-speech for optimal performance in your applications.

Choosing the Right Approach

Quick Rule: Use async TTS when you don’t want audio data passing through your server.

Comparison

Method	Best For	Response Time	How It Works
Sync `/text-to-speech`	Direct playback, small texts	~500ms-2s total	Returns complete audio in response
Streaming `/text-to-speech/stream`	Real-time playback through your server	~300ms first chunk	Streams audio chunks through your server
Async `/text-to-speech-async`	Bots, webhooks, CDN delivery	Instant (returns URL)	Returns URL, complete audio available in 1-2s
Async Streaming `/text-to-speech/stream-async`	Frontend streaming without proxy	Instant (returns URL)	Returns URL, ~300ms first chunk when retrieved

When to Use Each Method

Use Async (`/text-to-speech-async`):

WhatsApp/Telegram bots that need complete audio files
Webhook workflows where you process the audio later
Batch processing of multiple texts

Use Async Streaming (`/text-to-speech/stream-async`):

Frontend apps that want streaming without a proxy
Low-latency first-byte delivery (~300ms)
Direct client streaming from a CDN
Real-time playback that starts before generation finishes

Use Regular Streaming (`/text-to-speech/stream`):

Processing audio on your server before it reaches the client
Adding custom headers or authentication

Use Sync (`/text-to-speech`):

Simple, one-time conversions
Small texts with immediate playback
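
The rules above reduce to two questions: does the audio need to pass through your server, and does playback need to start before generation finishes? A minimal sketch of that decision follows. The function name `choose_endpoint` and its two flags are illustrative and not part of the Uplift AI API; only the four endpoint paths come from this page.

```python
def choose_endpoint(audio_through_server: bool, needs_streaming: bool) -> str:
    """Return the TTS endpoint path that matches the comparison table.

    Hypothetical helper: only the four paths are taken from the docs.
    """
    if audio_through_server:
        # Your server handles the audio bytes itself.
        return "/text-to-speech/stream" if needs_streaming else "/text-to-speech"
    # The client (or a CDN, or a messaging API) fetches the audio by URL.
    if needs_streaming:
        return "/text-to-speech/stream-async"
    return "/text-to-speech-async"


# A WhatsApp bot needs a complete file and no proxying.
print(choose_endpoint(audio_through_server=False, needs_streaming=False))
# A browser player should start early, without a proxy.
print(choose_endpoint(audio_through_server=False, needs_streaming=True))
```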

How Async TTS Works

An async request returns a `mediaId` and a `token` right away, and the audio is generated in the background. You combine the two into a `stream-audio` URL and pass that URL to whatever plays the audio (a messaging API, a browser, a CDN), so the audio itself never passes through your server.

Simple Example

WhatsApp Bot Integration

import requests

def send_voice_to_whatsapp(text: str, phone_number: str) -> dict:
    """Generate speech with Uplift AI and send it as a WhatsApp audio message."""
    # Step 1: Get audio URL from Uplift AI
    response = requests.post(
        "https://api.upliftai.org/v1/synthesis/text-to-speech-async",
        headers={
            'Authorization': 'Bearer YOUR_API_KEY',
            'Content-Type': 'application/json'
        },
        json={
            "voiceId": "v_meklc281",  # Urdu female voice
            "text": text,
            "outputFormat": "MP3_22050_64"  # Smaller for WhatsApp
        },
        timeout=30
    )
    response.raise_for_status()  # Stop here on auth or validation errors

    # The response carries a media ID and an access token, not the audio itself
    result = response.json()
    audio_url = f"https://api.upliftai.org/v1/synthesis/stream-audio/{result['mediaId']}?token={result['token']}"

    # Step 2: Send URL directly to WhatsApp
    whatsapp_response = requests.post(
        "https://graph.facebook.com/v17.0/YOUR_PHONE_ID/messages",
        headers={'Authorization': 'Bearer WHATSAPP_TOKEN'},
        json={
            "messaging_product": "whatsapp",
            "to": phone_number,
            "type": "audio",
            "audio": {"link": audio_url}  # Direct URL - no download needed!
        },
        timeout=30
    )
    whatsapp_response.raise_for_status()

    return whatsapp_response.json()
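
The same two-step pattern should carry over to the async streaming endpoint. The sketch below pulls the URL construction into a helper and adds basic checks on the response. It assumes that `/text-to-speech/stream-async` sits under the same base URL and returns the same `mediaId` and `token` fields as the example above, which this page does not state explicitly. The helper names and the injectable `post` parameter are illustrative, not part of the Uplift AI API.

```python
from typing import Any, Callable, Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.upliftai.org/v1/synthesis"


def build_audio_url(result: dict) -> str:
    """Build the stream-audio URL from an async response body."""
    missing = [key for key in ("mediaId", "token") if not result.get(key)]
    if missing:
        raise ValueError(f"async response is missing: {', '.join(missing)}")
    media_id = quote(str(result["mediaId"]), safe="")
    query = urlencode({"token": result["token"]})
    return f"{BASE_URL}/stream-audio/{media_id}?{query}"


def request_audio_url(
    text: str,
    voice_id: str,
    api_key: str,
    endpoint: str = "/text-to-speech/stream-async",  # assumed to mirror the async endpoint
    output_format: str = "MP3_22050_64",
    post: Optional[Callable[..., Any]] = None,
) -> str:
    """Request async audio and return a URL that a client can play directly."""
    if post is None:
        import requests  # imported lazily so a stub can replace it in tests
        post = requests.post
    response = post(
        f"{BASE_URL}{endpoint}",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"voiceId": voice_id, "text": text, "outputFormat": output_format},
        timeout=30,
    )
    response.raise_for_status()
    return build_audio_url(response.json())
```

On a web backend, you would return the resulting URL to the browser and use it as the source of an audio element, so the audio bytes never pass through your server.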

Key Benefits

No Proxy Needed

Audio goes directly from Uplift AI to your users

Instant Response

Get URL immediately, audio generates in background

Secure Access

JWT tokens ensure only authorized access

CDN Ready

URLs work with any CDN or caching layer
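
The Secure Access point mentions JWT tokens. If a URL is cached or queued before use, it can help to check whether its token is still valid. The sketch below reads the standard `exp` claim from a JWT payload without verifying the signature. It assumes the token is a standard three-part JWT that carries `exp`, which this page does not confirm; the helper names are illustrative.

```python
import base64
import json
import time
from typing import Optional


def jwt_payload(token: str) -> dict:
    """Decode the payload of a JWT. The signature is NOT verified."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT")
    payload_b64 = parts[1]
    # base64url segments in a JWT drop their padding, so restore it first
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> Optional[float]:
    """Return seconds left before `exp`, or None if the claim is absent (assumed claim)."""
    exp = jwt_payload(token).get("exp")
    if exp is None:
        return None
    current = time.time() if now is None else now
    return exp - current
```

A negative result would suggest requesting a fresh URL rather than reusing the cached one.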

Voice & Format Options

Use the same voice IDs and output formats as regular TTS:

Output Formats

MP3_22050_64 - Best for messaging apps (smaller files)
MP3_22050_128 - Best quality/size balance
WAV_22050_32 - When you need lossless audio
ULAW_8000_8 - For telephony systems
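
Each format name appears to follow a `CODEC_SAMPLERATE_DETAIL` pattern. The sketch below splits a name into its parts and picks a format by use case. Reading the last field as a bitrate in kbps for MP3 and as a bit depth for WAV and ULAW is an assumption, and `OutputFormat`, `parse_output_format`, and `FORMAT_BY_USE_CASE` are illustrative names, not part of the API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputFormat:
    codec: str
    sample_rate_hz: int
    detail: int  # assumed: kbps for MP3, bits per sample for WAV and ULAW


# Use cases taken from the list above, mapped to the documented format names.
FORMAT_BY_USE_CASE = {
    "messaging": "MP3_22050_64",
    "balanced": "MP3_22050_128",
    "lossless": "WAV_22050_32",
    "telephony": "ULAW_8000_8",
}


def parse_output_format(name: str) -> OutputFormat:
    """Split a format name such as 'MP3_22050_64' into its three fields."""
    parts = name.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected output format: {name!r}")
    codec, sample_rate, detail = parts
    return OutputFormat(codec, int(sample_rate), int(detail))


fmt = parse_output_format(FORMAT_BY_USE_CASE["telephony"])
print(fmt.codec, fmt.sample_rate_hz)  # prints: ULAW 8000
```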

Next Steps

API Reference

See the full API documentation

Voice Samples

Listen to available voices

Regular TTS

Learn about sync TTS
