client.tts.connect()
Opens a persistent WebSocket connection for low-latency, real-time TTS streaming. Supports multiplexed concurrent streams over a single connection. Use one connection per conversation/user session. Defaults to PCM_22050_16 output format.
Connection Setup
import UpliftAI from '@upliftai/sdk-js';
import { writeFileSync } from 'fs';

const client = new UpliftAI({
  apiKey: 'your-api-key',
});

const ws = await client.tts.connect();
console.log(`Connected! Session ID: ${ws.sessionId}`);

ws.on('error', (err) => console.error('WS Error:', err));
ws.on('close', (code, reason) => console.log(`WS Closed: ${code} ${reason}`));
Single Stream
const audioChunks: Buffer[] = [];

const stream = ws.stream({
  text: 'ویب ساکٹ سے سلام',
  voiceId: 'v_meklc281',
  outputFormat: 'PCM_22050_16',
});

for await (const event of stream) {
  switch (event.type) {
    case 'audio_start':
      console.log(`audio_start (requestId: ${event.requestId})`);
      break;
    case 'audio':
      audioChunks.push(event.audio);
      process.stdout.write(
        `\r chunks: ${audioChunks.length}, bytes: ${audioChunks.reduce((s, c) => s + c.length, 0)}`
      );
      break;
    case 'audio_end':
      console.log(`\n audio_end (requestId: ${event.requestId})`);
      break;
    case 'error':
      console.error(` error: ${event.code} - ${event.message}`);
      break;
  }
}

const audio = Buffer.concat(audioChunks);
writeFileSync('output.pcm', audio);
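The `output.pcm` file written above is raw sample data with no header, so most audio players will not open it directly. A minimal sketch of wrapping the bytes in a standard 44-byte WAV header follows. `pcmToWav` is a hypothetical helper and not part of the SDK, and it assumes that `PCM_22050_16` means 22,050 Hz, 16-bit little-endian, mono audio; the channel count in particular is not stated on this page.

```typescript
// Hypothetical helper (not part of the SDK): wraps raw PCM bytes in a WAV header.
// Assumption: PCM_22050_16 is 22,050 Hz, 16-bit little-endian, mono.
function pcmToWav(
  pcm: Uint8Array,
  sampleRate = 22050,
  channels = 1,
  bitsPerSample = 16,
): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign; // bytes per second of audio
  const wav = new Uint8Array(44 + pcm.length);
  const view = new DataView(wav.buffer);

  // Writes an ASCII tag such as "RIFF" at the given byte offset.
  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) wav[offset + i] = tag.charCodeAt(i);
  };

  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, true); // size of the fmt chunk for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeTag(36, 'data');
  view.setUint32(40, pcm.length, true); // size of the sample data
  wav.set(pcm, 44);
  return wav;
}
```

Because a Node.js `Buffer` is a `Uint8Array`, the concatenated `audio` buffer can be passed in as-is, for example `writeFileSync('output.wav', pcmToWav(audio))`.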
Real-time Voice Agent Pattern
Pipe LLM output sentence-by-sentence into the WebSocket for continuous speech:
// tokenizeSentences, llmStream, and player are placeholders for your own code.
const ws = await client.tts.connect();

// LLM streams tokens -> your tokenizer emits complete sentences
for await (const sentence of tokenizeSentences(llmStream)) {
  const stream = ws.stream({ text: sentence, voiceId: 'v_meklc281' });
  for await (const event of stream) {
    if (event.type === 'audio') player.write(event.audio);
  }
}

// User interrupts mid-response (call this from your interrupt handler)
ws.cancelAll(); // stops all in-flight audio immediately

// When the conversation is over
ws.close();
We plan to build a context-aware streaming solution so that you won't have to handle tokenization and sentence breaking yourself. Stay tuned!
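Until then, the `tokenizeSentences` step in the pattern above is left to your application. As a rough illustration only, here is one way such a helper could look. It is a hypothetical sketch and not part of the SDK: it buffers streamed tokens and emits a sentence whenever terminal punctuation is followed by whitespace. The terminator set (`.`, `!`, `?`, plus the Arabic-script full stop and question mark used in Urdu) is an assumption, and abbreviations such as "Dr. Khan" will still be split.

```typescript
// Hypothetical helper (not part of the SDK): turns a token stream into sentences.
// Terminators: . ! ? plus U+06D4 (Arabic full stop) and U+061F (Arabic question mark).
// Requiring trailing whitespace avoids splitting decimals such as "3.5".
const SENTENCE_END = /[.!?\u06D4\u061F]+\s+/;

async function* tokenizeSentences(
  tokens: AsyncIterable<string>,
): AsyncGenerator<string> {
  let buffer = '';
  for await (const token of tokens) {
    buffer += token;
    // Emit every complete sentence currently sitting in the buffer.
    let match = SENTENCE_END.exec(buffer);
    while (match !== null) {
      const end = match.index + match[0].length;
      const sentence = buffer.slice(0, end).trim();
      if (sentence) yield sentence;
      buffer = buffer.slice(end);
      match = SENTENCE_END.exec(buffer);
    }
  }
  // Flush the remainder when the token stream ends.
  const rest = buffer.trim();
  if (rest) yield rest;
}
```

The last sentence is only emitted once more text arrives or the token stream ends, which trades a small delay for fewer false splits.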
Concurrent Streams
Send multiple requests over the same connection; they are multiplexed and processed concurrently.
const s1 = ws.stream({ text: 'پہلا جملہ', voiceId: 'v_meklc281' });
const s2 = ws.stream({ text: 'دوسرا جملہ', voiceId: 'v_meklc281' });
console.log(`Active streams: ${ws.activeStreams}`); // 2

async function consumeStream(s: AsyncIterable<TTSStreamEvent>, label: string) {
  const chunks: Buffer[] = [];
  for await (const event of s) {
    if (event.type === 'audio') chunks.push(event.audio);
  }
  const total = Buffer.concat(chunks);
  console.log(` ${label}: ${total.length} bytes`);
  return total;
}

await Promise.all([
  consumeStream(s1, 'stream 1'),
  consumeStream(s2, 'stream 2'),
]);
Cancel / Barge-in
Cancel in-flight streams when the user interrupts mid-response:
ws.stream({ text: 'یہ جملہ منسوخ ہو جائے گا', voiceId: 'v_meklc281' });
ws.stream({ text: 'یہ بھی منسوخ ہو جائے گا', voiceId: 'v_meklc281' });
console.log(`Active streams before cancel: ${ws.activeStreams}`);
ws.cancelAll(); // stops all in-flight audio immediately
console.log(`Active streams after cancel: ${ws.activeStreams}`); // 0
ws.close();
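On the playback side, a barge-in usually also means discarding any audio that has not been played yet. The sketch below shows one possible way to stop forwarding chunks once an interrupt is signalled, using a standard `AbortSignal`. `playUntilInterrupted` and the `AudioEvent` shape are hypothetical and not part of the SDK; the loop only notices the interrupt when the next event arrives, and this page does not specify how a cancelled stream's iterator terminates.

```typescript
// Minimal event shape for this sketch; the SDK's real event type may differ.
type AudioEvent = { type: string; audio?: Uint8Array };

// Hypothetical helper: forwards audio chunks to `write` until the stream ends
// or `signal` is aborted. Returns the number of chunks that were played.
async function playUntilInterrupted(
  events: AsyncIterable<AudioEvent>,
  signal: AbortSignal,
  write: (chunk: Uint8Array) => void,
): Promise<number> {
  let played = 0;
  for await (const event of events) {
    if (signal.aborted) break; // drop chunks that arrive after the interrupt
    if (event.type === 'audio' && event.audio) {
      write(event.audio);
      played++;
    }
  }
  return played;
}
```

In an interrupt handler you would then call both `controller.abort()` on your own `AbortController` and `ws.cancelAll()`, so that local playback and server-side synthesis stop together.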
Stream Events
| Event | Fields | Description |
|---|---|---|
| audio_start | requestId, timestamp | Synthesis has started |
| audio | requestId, sequence, audio | Audio chunk (Buffer) |
| audio_end | requestId, timestamp | Synthesis complete |
| error | requestId, code, message | Error occurred |
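To make the table concrete, the events can be modelled as a discriminated union on `type`. The sketch below is inferred from the table only: the name `StreamEventSketch` and the field types (for example `timestamp` as a number and `code` as a string) are assumptions, and the types the SDK actually exports may differ. The hypothetical `assembleAudio` helper shows how `requestId` and `sequence` could be used to rebuild per-request audio from a mixed list of events, for instance events collected in a log.

```typescript
// Hypothetical shapes inferred from the table; the SDK's own types may differ.
type StreamEventSketch =
  | { type: 'audio_start'; requestId: string; timestamp: number }
  | { type: 'audio'; requestId: string; sequence: number; audio: Uint8Array }
  | { type: 'audio_end'; requestId: string; timestamp: number }
  | { type: 'error'; requestId: string; code: string; message: string };

// Groups audio chunks by requestId, orders them by sequence, and joins them.
function assembleAudio(events: StreamEventSketch[]): Map<string, Uint8Array> {
  const chunksByRequest = new Map<string, { sequence: number; audio: Uint8Array }[]>();
  for (const event of events) {
    if (event.type !== 'audio') continue; // narrowing gives access to sequence/audio
    const list = chunksByRequest.get(event.requestId) ?? [];
    list.push({ sequence: event.sequence, audio: event.audio });
    chunksByRequest.set(event.requestId, list);
  }

  const result = new Map<string, Uint8Array>();
  chunksByRequest.forEach((list, requestId) => {
    list.sort((a, b) => a.sequence - b.sequence);
    const total = list.reduce((n, c) => n + c.audio.length, 0);
    const joined = new Uint8Array(total);
    let offset = 0;
    for (const chunk of list) {
      joined.set(chunk.audio, offset);
      offset += chunk.audio.length;
    }
    result.set(requestId, joined);
  });
  return result;
}
```

When you iterate a single stream returned by `ws.stream()`, this grouping is not needed; the sketch is only meant to show how the fields relate to each other.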
WebSocket Properties
| Property | Type | Description |
|---|---|---|
| activeStreams | number | Number of in-flight streams |
| readyState | string | connecting · open · closing · closed |
| sessionId | string | Current session identifier |
WebSocket Methods
| Method | Description |
|---|---|
| stream(request) | Start a new TTS stream (returns async iterable) |
| cancelAll() | Cancel all in-flight streams |
| close() | Close the connection |
| on('error', fn) | Listen for connection errors |
| on('close', fn) | Listen for connection close |