Stream text-to-speech audio in real time over a WebSocket connection. Perfect for conversational AI applications that need low-latency audio synthesis.

When to Use WebSocket TTS

Best for: Real-time conversational AI, voice agents, and applications needing continuous TTS streaming with multiple concurrent requests. Visit this tutorial for an implementation.

Key Benefits

  • Low latency: ~300ms to first audio chunk
  • Multiple requests: Handle multiple synthesis requests on a single connection
  • Real-time streaming: Audio chunks stream as they’re generated
  • Persistent connection: Reuse connection for entire conversation

Connection

Endpoint

wss://api.upliftai.org/text-to-speech/multi-stream

Authentication

Connect using your API key:
import { io } from 'socket.io-client';

const socket = io('wss://api.upliftai.org/text-to-speech/multi-stream', {
  auth: {
    token: 'sk_api_your_key_here'
  },
  transports: ['websocket']
});
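
If the token is rejected or the network drops, the standard Socket.IO connection events fire. A minimal sketch of listening for them (the event names are standard Socket.IO client events; the exact error payload returned by this API is not specified here):

// Fired once the underlying connection is established
socket.on('connect', () => {
  console.log('WebSocket connected, waiting for "ready"...');
});

// Fired when the connection attempt is rejected (e.g. a bad API key)
socket.on('connect_error', (err) => {
  console.error('Connection failed:', err.message);
});

// Fired when the server or network closes the connection
socket.on('disconnect', (reason) => {
  console.warn('Disconnected:', reason);
});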

Message Protocol

All messages use a unified format with a type field.

Client → Server Messages

Synthesize Text

{
  "type": "synthesize",
  "requestId": "unique_request_id",
  "text": "سلام، آپ کیسے ہیں؟",
  "voiceId": "v_meklc281",
  "outputFormat": "MP3_22050_32"
}
Parameters:
  • requestId: Unique ID for tracking this request
  • text: Text to synthesize (max 10,000 characters)
  • voiceId: Voice to use (e.g., "v_meklc281" for Urdu female)
  • outputFormat: Audio format (optional, defaults to PCM_22050_16)
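
For example, a minimal sketch that emits a synthesize message with a generated request ID (crypto.randomUUID() needs a modern browser or Node 19+; the synthesize helper name is illustrative, not part of the API):

// Sketch: send one synthesize request and return its ID for tracking
function synthesize(socket, text) {
  const requestId = crypto.randomUUID(); // unique per request
  socket.emit('synthesize', {
    type: 'synthesize',
    requestId,
    text,
    voiceId: 'v_meklc281',
    outputFormat: 'MP3_22050_32'
  });
  return requestId;
}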

Cancel Request

{
  "type": "cancel",
  "requestId": "unique_request_id"
}
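
A sketch of sending a cancel for an in-flight request. It assumes the emitted event name matches the message type, the same pattern the synthesize example below uses; that mapping is an assumption, not stated explicitly by the protocol description:

// Sketch: cancel a previously submitted request
// (assumes the event name mirrors the message type, as with 'synthesize')
socket.emit('cancel', {
  type: 'cancel',
  requestId: 'req_001'
});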

Server → Client Messages

All server messages come through the message event:

Connection Ready

{
  "type": "ready",
  "sessionId": "session_abc123"
}

Audio Start

{
  "type": "audio_start",
  "requestId": "unique_request_id",
  "timestamp": 1234567890
}

Audio Chunk

{
  "type": "audio",
  "requestId": "unique_request_id",
  "audio": "base64_encoded_audio_data",
  "sequence": 0
}

Audio End

{
  "type": "audio_end",
  "requestId": "unique_request_id",
  "timestamp": 1234567890
}
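
One way to handle these events is to buffer chunks per requestId and join them once audio_end arrives. A minimal Node.js sketch (Buffer is Node-only; in the browser decode with atob instead):

// Sketch: buffer chunks per request and assemble them when audio_end arrives
const pending = new Map(); // requestId -> array of { seq, buf }

socket.on('message', (data) => {
  if (data.type === 'audio') {
    if (!pending.has(data.requestId)) pending.set(data.requestId, []);
    pending.get(data.requestId).push({
      seq: data.sequence,
      buf: Buffer.from(data.audio, 'base64')
    });
  } else if (data.type === 'audio_end') {
    const parts = (pending.get(data.requestId) ?? [])
      .sort((a, b) => a.seq - b.seq)   // use the sequence field to guarantee order
      .map((p) => p.buf);
    pending.delete(data.requestId);
    const full = Buffer.concat(parts); // complete audio for this request
    // hand `full` to your player or file writer...
  }
});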

Error

{
  "type": "error",
  "requestId": "unique_request_id",
  "code": "synthesis_failed",
  "message": "Voice not found"
}
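
Error messages carry a machine-readable code (the full list is under Error Codes below). A sketch of splitting retryable errors from caller errors; the grouping shown is a suggestion, adjust it to your needs:

// Sketch: route server errors by code
socket.on('message', (data) => {
  if (data.type !== 'error') return;
  switch (data.code) {
    case 'rate_limit_exceeded':
    case 'synthesis_failed':
      // transient: retry the request later with backoff
      console.warn(`Retryable error for ${data.requestId}: ${data.message}`);
      break;
    case 'auth_failed':
    case 'text_too_long':
    case 'duplicate_request':
      // caller error: fix the request before resending
      console.error(`Request error for ${data.requestId}: ${data.message}`);
      break;
  }
});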

Simple Example

import { io } from 'socket.io-client';

// Connect to WebSocket
const socket = io('wss://api.upliftai.org/text-to-speech/multi-stream', {
  auth: { token: 'sk_api_your_key' },
  transports: ['websocket']
});

// Handle messages
socket.on('message', (data) => {
  switch(data.type) {
    case 'ready':
      console.log('Connected!');
      // Start synthesis
      socket.emit('synthesize', {
        type: 'synthesize',
        requestId: 'req_001',
        text: 'سلام، یہ ایک ٹیسٹ ہے۔',
        voiceId: 'v_meklc281',
        outputFormat: 'MP3_22050_32'
      });
      break;
      
    case 'audio': {
      // Decode and play audio chunk
      const audioData = Buffer.from(data.audio, 'base64');
      // Play audioData...
      break;
    }
      
    case 'audio_end':
      console.log('Audio complete!');
      break;
      
    case 'error':
      console.error('Error:', data.message);
      break;
  }
});
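
The example above uses Node's Buffer to decode the base64 audio. In the browser, where Buffer does not exist, a small sketch using atob does the same job:

// Browser-side base64 decoding (no Buffer available)
function decodeAudioChunk(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes; // pass to Web Audio, MediaSource, etc.
}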

Output Formats

Format        | Description                | Use Case
PCM_22050_16  | Raw PCM, 22.05 kHz, 16-bit | Direct audio processing
MP3_22050_32  | MP3, 22.05 kHz, 32 kbps    | Small file size, web
MP3_22050_128 | MP3, 22.05 kHz, 128 kbps   | High-quality streaming
WAV_22050_32  | WAV, 22.05 kHz, 32-bit     | Lossless audio
ULAW_8000_8   | μ-law, 8 kHz, 8-bit        | Telephony systems
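
For example, raw PCM_22050_16 can be played directly with the Web Audio API by converting the 16-bit samples to floats. A sketch, assuming mono output (the channel count is not stated above):

// Sketch: play a decoded PCM_22050_16 chunk with the Web Audio API
// (`bytes` is a Uint8Array of little-endian 16-bit samples)
function playPcmChunk(audioCtx, bytes) {
  const samples = new Int16Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 2);
  const floats = Float32Array.from(samples, (s) => s / 32768);
  const buffer = audioCtx.createBuffer(1, floats.length, 22050); // mono, 22.05 kHz
  buffer.copyToChannel(floats, 0);
  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.connect(audioCtx.destination);
  source.start();
}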

Available Voices

Use the same voice IDs as REST API:
  • v_meklc281 - Urdu female
  • v_8eelc901 - Info/Education
  • v_30s70t3a - Nostalgic News
  • v_yypgzenx - Dada Jee (storytelling)

Error Codes

Code                | Description               | Action
auth_failed         | Invalid API key           | Check your API key
synthesis_failed    | TTS service error         | Retry with backoff
duplicate_request   | Request ID already used   | Use unique IDs
rate_limit_exceeded | Too many requests         | Slow down requests
text_too_long       | Text > 10,000 characters  | Split into chunks

Rate Limits

  • Synthesis requests: 60 per minute per connection
  • Cancel requests: 100 per minute per connection
  • Max text length: 10,000 characters per request
  • Daily limit: Based on your plan

Best Practices

  • Generate unique IDs (such as UUIDs) for each synthesis request so audio chunks can be tracked reliably.
  • Keep one WebSocket connection open and reuse it for multiple synthesis requests.
  • Buffer audio chunks before playback for a smooth streaming experience.
  • Implement exponential backoff for reconnection attempts on connection loss (see the sketch below).
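
The Socket.IO client already reconnects with exponential backoff and jitter; a sketch of tuning it (these are standard Socket.IO client options, not specific to this API):

// Sketch: tune Socket.IO's built-in exponential-backoff reconnection
const socket = io('wss://api.upliftai.org/text-to-speech/multi-stream', {
  auth: { token: 'sk_api_your_key' },
  transports: ['websocket'],
  reconnection: true,
  reconnectionDelay: 1000,      // first retry after ~1s
  reconnectionDelayMax: 10000,  // cap the delay at 10s
  randomizationFactor: 0.5      // add jitter to avoid thundering herds
});

// Manager-level event with the current attempt number
socket.io.on('reconnect_attempt', (attempt) => {
  console.log(`Reconnect attempt ${attempt}`);
});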

Testing with wscat

Quick test using command line:
# Install wscat
npm install -g wscat

# Connect
wscat -c wss://api.upliftai.org/text-to-speech/multi-stream \
  -H "Authorization: Bearer sk_api_your_key"

# Send synthesize message
{"type":"synthesize","requestId":"test-1","text":"Hello world","voiceId":"v_meklc281"}

Next Steps
