client.tts.connect()

Opens a persistent WebSocket connection for low-latency, real-time TTS streaming. Supports multiplexed concurrent streams over a single connection. Use one connection per conversation/user session. Defaults to PCM_22050_16 output format.

Connection Setup

import UpliftAI from '@upliftai/sdk-js';
import { writeFileSync } from 'fs';

const client = new UpliftAI({
  apiKey: 'your-api-key',
});

const ws = await client.tts.connect();
console.log(`Connected! Session ID: ${ws.sessionId}`);

ws.on('error', (err) => console.error('WS Error:', err));
ws.on('close', (code, reason) => console.log(`WS Closed: ${code} ${reason}`));

Single Stream

const audioChunks: Buffer[] = [];

const stream = ws.stream({
  text: 'ویب ساکٹ سے سلام',
  voiceId: 'v_meklc281',
  outputFormat: 'PCM_22050_16',
});

for await (const event of stream) {
  switch (event.type) {
    case 'audio_start':
      console.log(`audio_start (requestId: ${event.requestId})`);
      break;
    case 'audio':
      audioChunks.push(event.audio);
      process.stdout.write(
        `\r  chunks: ${audioChunks.length}, bytes: ${audioChunks.reduce((s, c) => s + c.length, 0)}`
      );
      break;
    case 'audio_end':
      console.log(`\n  audio_end (requestId: ${event.requestId})`);
      break;
    case 'error':
      console.error(`  error: ${event.code} - ${event.message}`);
      break;
  }
}

const audio = Buffer.concat(audioChunks);
writeFileSync('output.pcm', audio);
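
A raw `.pcm` file has no header, so most players will not open it directly. The sketch below wraps the collected bytes in a minimal 44-byte WAV header. It assumes that `PCM_22050_16` means 22,050 Hz, 16-bit signed little-endian, mono audio; that reading is inferred from the format name, not confirmed by this page. `pcmToWav` and `pcmDurationSeconds` are hypothetical helpers, not part of the SDK.

```typescript
// Hypothetical helper (not an SDK function): prepend a canonical 44-byte
// RIFF/WAVE header to raw PCM samples.
// Assumption: PCM_22050_16 = 22,050 Hz, 16-bit little-endian, mono.
function pcmToWav(
  pcm: Buffer,
  sampleRate = 22050,
  channels = 1,
  bitsPerSample = 16,
): Buffer {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second

  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4); // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);             // size of the fmt chunk
  header.writeUInt16LE(1, 20);              // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);     // size of the sample data

  return Buffer.concat([header, pcm]);
}

// Playback duration implied by a byte count under the same assumptions.
function pcmDurationSeconds(
  byteLength: number,
  sampleRate = 22050,
  channels = 1,
  bitsPerSample = 16,
): number {
  return byteLength / (sampleRate * channels * (bitsPerSample / 8));
}
```

With these helpers, `writeFileSync('output.wav', pcmToWav(audio))` would produce a file that standard audio players can open, provided the format assumption holds.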

Real-time Voice Agent Pattern

Pipe LLM output sentence-by-sentence into the WebSocket for continuous speech:
const ws = await client.tts.connect();

// LLM streams tokens -> your tokenizer emits complete sentences
for await (const sentence of tokenizeSentences(llmStream)) {
  const stream = ws.stream({ text: sentence, voiceId: 'v_meklc281' });

  for await (const event of stream) {
    if (event.type === 'audio') player.write(event.audio);
  }
}

// When the user interrupts mid-response (for example, from a barge-in handler):
ws.cancelAll(); // stops all in-flight audio immediately

// When the session ends:
ws.close();
We are planning a context-aware streaming solution so that you won't have to handle tokenization and sentence breaking yourself. Stay tuned!
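
Until then, `tokenizeSentences` in the snippet above is something you supply yourself. One possible sketch, assuming the LLM stream is an `AsyncIterable<string>` of text fragments: buffer the fragments and emit a sentence whenever a terminator (`.`, `!`, `?`, the Urdu full stop `۔`, or the Arabic question mark `؟`) is followed by whitespace. The terminator set and the whitespace rule are assumptions chosen for illustration, not SDK behavior.

```typescript
// Hypothetical implementation of the tokenizeSentences helper used above.
// Assumption: `tokens` yields arbitrary text fragments from an LLM.
async function* tokenizeSentences(
  tokens: AsyncIterable<string>,
): AsyncGenerator<string> {
  // One or more terminators followed by whitespace.
  // \u06D4 is the Urdu full stop, \u061F is the Arabic question mark.
  // Requiring trailing whitespace avoids splitting inside "3.14".
  const boundary = /[.!?\u06D4\u061F]+\s+/u;
  let buffer = '';

  for await (const token of tokens) {
    buffer += token;

    // Emit every complete sentence currently sitting in the buffer.
    let match = boundary.exec(buffer);
    while (match !== null) {
      const end = match.index + match[0].length;
      const sentence = buffer.slice(0, end).trim();
      buffer = buffer.slice(end);
      if (sentence) yield sentence;
      match = boundary.exec(buffer);
    }
  }

  // Flush whatever remains once the LLM stream ends.
  const rest = buffer.trim();
  if (rest) yield rest;
}
```

The trade-off in this design is latency: a sentence is only released once the next fragment brings trailing whitespace, or once the stream ends.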

Concurrent Streams

Send multiple requests over the same connection. They are multiplexed and processed concurrently.
const s1 = ws.stream({ text: 'پہلا جملہ', voiceId: 'v_meklc281' });
const s2 = ws.stream({ text: 'دوسرا جملہ', voiceId: 'v_meklc281' });

console.log(`Active streams: ${ws.activeStreams}`); // 2

async function consumeStream(s: AsyncIterable<TTSStreamEvent>, label: string) {
  const chunks: Buffer[] = [];
  for await (const event of s) {
    if (event.type === 'audio') chunks.push(event.audio);
  }
  const total = Buffer.concat(chunks);
  console.log(`  ${label}: ${total.length} bytes`);
  return total;
}

await Promise.all([
  consumeStream(s1, 'stream 1'),
  consumeStream(s2, 'stream 2'),
]);

Cancel / Barge-in

Cancel in-flight streams when the user interrupts mid-response:
ws.stream({ text: 'یہ جملہ منسوخ ہو جائے گا', voiceId: 'v_meklc281' });
ws.stream({ text: 'یہ بھی منسوخ ہو جائے گا', voiceId: 'v_meklc281' });

console.log(`Active streams before cancel: ${ws.activeStreams}`);

ws.cancelAll(); // stops all in-flight audio immediately

console.log(`Active streams after cancel: ${ws.activeStreams}`); // 0

ws.close();
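
In a voice agent, the interrupt usually arrives while a playback loop is still running, so the loop needs a way to stop forwarding audio. The sketch below wires a standard `AbortSignal` to `cancelAll()`. It is written against a minimal structural interface (`TtsSocketLike`, hypothetical) that mirrors only the `stream` and `cancelAll` members shown on this page. How the SDK's iterators behave after `cancelAll()` is not documented here, so the loop checks the signal itself instead of relying on the iterator ending.

```typescript
// Hypothetical minimal shapes; the real SDK types may differ.
interface AudioEventLike { type: string; audio?: Buffer }
interface TtsSocketLike {
  stream(req: { text: string; voiceId: string }): AsyncIterable<AudioEventLike>;
  cancelAll(): void;
}

// Speak sentences one by one until done or until `interrupt` fires.
async function speakWithBargeIn(
  ws: TtsSocketLike,
  sentences: AsyncIterable<string> | Iterable<string>,
  voiceId: string,
  onAudio: (chunk: Buffer) => void,
  interrupt: AbortSignal,
): Promise<'completed' | 'interrupted'> {
  if (interrupt.aborted) {
    ws.cancelAll();
    return 'interrupted';
  }

  // Cancel server-side synthesis as soon as the interrupt fires.
  const onAbort = () => ws.cancelAll();
  interrupt.addEventListener('abort', onAbort, { once: true });

  try {
    for await (const sentence of sentences) {
      if (interrupt.aborted) return 'interrupted';
      for await (const event of ws.stream({ text: sentence, voiceId })) {
        // Drop any audio that arrives after the interrupt.
        if (interrupt.aborted) return 'interrupted';
        if (event.type === 'audio' && event.audio) onAudio(event.audio);
      }
    }
    return interrupt.aborted ? 'interrupted' : 'completed';
  } finally {
    interrupt.removeEventListener('abort', onAbort);
  }
}
```

A caller would create an `AbortController`, pass `controller.signal` as `interrupt`, and call `controller.abort()` when voice activity is detected from the user.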

Stream Events

| Event | Fields | Description |
| --- | --- | --- |
| audio_start | requestId, timestamp | Synthesis has started |
| audio | requestId, sequence, audio | Audio chunk (Buffer) |
| audio_end | requestId, timestamp | Synthesis complete |
| error | requestId, code, message | Error occurred |
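
The event table maps naturally onto a discriminated union. The sketch below defines a local stand-in type (`LocalTTSStreamEvent`, hypothetical) and a `collectAudio` helper that gathers one stream's chunks and rejects on an `error` event. The field types are guesses based on the table; in particular, it assumes `sequence` is a number that increases per chunk within a request, which this page does not state. Prefer the SDK's own exported types if they are available.

```typescript
// Local stand-in inferred from the table above; field types are assumptions.
type LocalTTSStreamEvent =
  | { type: 'audio_start'; requestId: string; timestamp: number | string }
  | { type: 'audio'; requestId: string; sequence: number; audio: Buffer }
  | { type: 'audio_end'; requestId: string; timestamp: number | string }
  | { type: 'error'; requestId: string; code: string | number; message: string };

// Collect all audio for one stream, ordered by `sequence`.
// Rejects with an Error if the stream reports an `error` event.
async function collectAudio(
  events: AsyncIterable<LocalTTSStreamEvent>,
): Promise<Buffer> {
  const chunks: { sequence: number; audio: Buffer }[] = [];

  for await (const event of events) {
    switch (event.type) {
      case 'audio':
        chunks.push({ sequence: event.sequence, audio: event.audio });
        break;
      case 'error':
        throw new Error(`TTS error ${event.code}: ${event.message}`);
      default:
        // audio_start and audio_end carry no audio payload.
        break;
    }
  }

  // Assumption: a lower sequence number means earlier audio.
  chunks.sort((a, b) => a.sequence - b.sequence);
  return Buffer.concat(chunks.map((c) => c.audio));
}
```

Sorting by `sequence` is defensive; if chunks always arrive in order, the sort changes nothing.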

WebSocket Properties

| Property | Type | Description |
| --- | --- | --- |
| activeStreams | number | Number of in-flight streams |
| readyState | string | connecting · open · closing · closed |
| sessionId | string | Current session identifier |

WebSocket Methods

| Method | Description |
| --- | --- |
| stream(request) | Start a new TTS stream (returns async iterable) |
| cancelAll() | Cancel all in-flight streams |
| close() | Close the connection |
| on('error', fn) | Listen for connection errors |
| on('close', fn) | Listen for connection close |