The Realtime Assistants API is currently in beta. Features and specifications may change as we continue to improve the platform.

What are Realtime Assistants?

Realtime Assistants are AI-powered voice agents that hold natural, real-time conversations with users. They combine UpliftAI models, agent hosting, and WebRTC delivery, so you can add a voice assistant to a web or mobile app without assembling that stack yourself. They provide:
  • End-to-end latency of about 1 second for natural conversations (actual latency depends on the STT, LLM, and TTS models you choose, among other factors)
  • Natural conversation flow with interruption handling
  • Multi-modal capabilities supporting voice, text, and custom tools
  • Dynamic configuration for real-time behavior updates
    • Update the tools available to the agent mid-session
    • Completely replace the agent prompt mid-session
  • Scalable infrastructure supporting thousands of concurrent sessions

Use Cases

Realtime Assistants are perfect for:
  • Customer Support - 24/7 voice-enabled support agents
  • Virtual Receptionists - Automated call handling and routing
  • Educational Tutors - Interactive learning experiences
  • Healthcare Assistants - Patient intake and appointment scheduling
  • Sales Agents - Product demonstrations and lead qualification
  • Personal Assistants - Task management and information retrieval

Provider Support

Speech-to-Text (STT)

  • Groq Whisper (whisper-large-v3; recommended for Pakistani languages)
  • Deepgram (nova-3; recommended for English)
  • OpenAI (gpt-4o-transcribe or gpt-4o-mini-transcribe)
  • UpliftAI (coming soon: an STT model built for Pakistani languages)

Text-to-Speech (TTS)

  • UpliftAI Orator (ultra-fast, natural voices - supports Urdu, Sindhi, Balochi)
  • OpenAI (standard voices; use model gpt-4o-mini-tts)

Language Models (LLM)

  • Groq (recommended)
    • openai/gpt-oss-120b (best quality)
    • openai/gpt-oss-20b (faster responses)
  • OpenAI GPT-4 (alternative)
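
The recommendations in the provider lists above can be captured in a small helper that picks STT and LLM settings for a given language. This is a minimal sketch, not part of any UpliftAI SDK: the function names, the set of language codes, and the fallback branch are illustrative choices, and the provider id "deepgram" is an assumed spelling (only "groq" and "upliftai" appear as ids in this page's request body).

```typescript
// Shape of one provider entry, mirroring the "default" objects in the
// assistant config shown in Getting Started (provider + model).
interface ProviderChoice {
  provider: string;
  model: string;
}

// Illustrative set of language codes treated as Pakistani languages here.
const PAKISTANI_LANGUAGE_CODES = ["ur", "sd"];

// Pick an STT provider following the recommendations in the STT list.
function pickStt(languageCode: string): ProviderChoice & { language: string } {
  if (PAKISTANI_LANGUAGE_CODES.indexOf(languageCode) !== -1) {
    return { provider: "groq", model: "whisper-large-v3", language: languageCode };
  }
  if (languageCode === "en") {
    // ASSUMPTION: "deepgram" as a provider id is a guess at the spelling.
    return { provider: "deepgram", model: "nova-3", language: languageCode };
  }
  // Fallback chosen for this sketch, not a documented default.
  return { provider: "groq", model: "whisper-large-v3", language: languageCode };
}

// Pick a Groq-hosted LLM: the 20b model for faster responses,
// the 120b model for best quality.
function pickLlm(preferSpeed: boolean): ProviderChoice {
  return {
    provider: "groq",
    model: preferSpeed ? "openai/gpt-oss-20b" : "openai/gpt-oss-120b",
  };
}
```

For example, `pickStt("ur")` selects Groq with whisper-large-v3, and `pickLlm(true)` selects openai/gpt-oss-20b.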

Getting Started

1. Create an Assistant

Use the API or the platform to create your first assistant configuration:
curl -X POST https://api.upliftai.org/v1/realtime-assistants \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Dream Psychologist",
    "description": "Understands the mysterious world of your dreams",
    "config": {
      "agent": {
        "instructions": "You are a dream interpreter who helps people understand their dreams.",
        "initialGreeting": true,
        "greetingInstructions": "Salam! How may I assist you today?"
      },
      "stt": {
        "default": {
          "provider": "groq",
          "model": "whisper-large-v3",
          "language": "en"
        }
      },
      "tts": {
        "default": {
          "provider": "upliftai",
          "voiceId": "v_meklc281",
          "outputFormat": "MP3_22050_32"
        }
      },
      "llm": {
        "default": {
          "provider": "groq",
          "model": "openai/gpt-oss-120b"
        }
      },
      "session": {
        "ttl": 1800
      }
    }
  }'
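
The same request can be assembled in TypeScript. The sketch below only builds the request (URL, headers, JSON body) so it can be inspected before sending; `HttpRequest` and `buildCreateAssistantRequest` are hypothetical names introduced here, and the config is trimmed to a few of the fields from the curl example.

```typescript
// Minimal description of an HTTP request; a hypothetical helper type.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const API_BASE = "https://api.upliftai.org/v1";

// Build the "create assistant" request shown in the curl example.
function buildCreateAssistantRequest(
  apiKey: string,
  assistant: Record<string, unknown>
): HttpRequest {
  return {
    url: `${API_BASE}/realtime-assistants`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(assistant),
  };
}

const createRequest = buildCreateAssistantRequest("YOUR_API_KEY", {
  name: "Dream Psychologist",
  config: {
    agent: {
      instructions: "You are a dream interpreter who helps people understand their dreams.",
      initialGreeting: true,
    },
    stt: { default: { provider: "groq", model: "whisper-large-v3", language: "en" } },
    tts: { default: { provider: "upliftai", voiceId: "v_meklc281", outputFormat: "MP3_22050_32" } },
    llm: { default: { provider: "groq", model: "openai/gpt-oss-120b" } },
    session: { ttl: 1800 },
  },
});

// To send it in an environment with a global fetch (for example Node 18+):
//   const res = await fetch(createRequest.url, createRequest);
//   const created = await res.json(); // response shape is not documented here
```

Separating request construction from sending keeps the API key handling in one place and makes the payload easy to log or unit test.
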
2. Create a Session

Generate a session token for your client application. Replace {assistantId} with your assistant's ID:
curl -X POST https://api.upliftai.org/v1/realtime-assistants/{assistantId}/createSession \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "participantName": "User"
  }'
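
A TypeScript version of this call might look like the sketch below. It only builds the request; `buildCreateSessionRequest` is a hypothetical helper, and "asst_123" is a made-up ID used purely for illustration. The assistant ID is URL-encoded and checked for emptiness so a missing value fails early instead of producing a malformed path.

```typescript
// Minimal description of an HTTP request; a hypothetical helper type.
interface SessionHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const SESSION_API_BASE = "https://api.upliftai.org/v1";

// Build the "create session" request for one assistant.
function buildCreateSessionRequest(
  apiKey: string,
  assistantId: string,
  participantName: string
): SessionHttpRequest {
  if (assistantId.trim().length === 0) {
    throw new Error("assistantId must not be empty");
  }
  return {
    url: `${SESSION_API_BASE}/realtime-assistants/${encodeURIComponent(assistantId)}/createSession`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ participantName }),
  };
}

// "asst_123" is an illustrative ID, not a real assistant.
const sessionRequest = buildCreateSessionRequest("YOUR_API_KEY", "asst_123", "User");

// To send it in an environment with a global fetch (for example Node 18+):
//   const res = await fetch(sessionRequest.url, sessionRequest);
```

Because the request carries your API key, a call like this would normally be made from your own backend, with only the resulting session token passed to the client.
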
3. Connect Your Client

Use our SDKs to connect to the session:
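import { UpliftAIRoom } from '@upliftai/assistants-react';

// sessionToken is the token generated in step 2; wsUrl is the server's WebSocket URL.
export function AssistantRoom({ sessionToken, wsUrl }) {
  return (
    <UpliftAIRoom
      token={sessionToken}
      serverUrl={wsUrl}
      connect={true}
      audio={true}
    >
      {/* YourAssistantUI is a placeholder for your own component */}
      <YourAssistantUI />
    </UpliftAIRoom>
  );
}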
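
Before rendering the room, it can help to validate the connection details your client received. The sketch below assumes the session response is a JSON object with string fields named `token` and `wsUrl`; those field names are assumptions, since the response shape is not shown on this page, and `toRoomConnection` is a hypothetical helper.

```typescript
// Values needed by the token and serverUrl props in the snippet above.
interface RoomConnection {
  token: string;
  serverUrl: string;
}

// Validate an untyped session response and map it to room props.
// ASSUMPTION: the response carries "token" and "wsUrl" string fields.
function toRoomConnection(body: unknown): RoomConnection {
  if (typeof body !== "object" || body === null) {
    throw new Error("session response is not an object");
  }
  const record = body as Record<string, unknown>;
  const token = record["token"];
  const wsUrl = record["wsUrl"];
  if (typeof token !== "string" || token.length === 0) {
    throw new Error("session response has no token");
  }
  if (typeof wsUrl !== "string" || !/^wss?:\/\//.test(wsUrl)) {
    throw new Error("session response has no WebSocket URL");
  }
  return { token, serverUrl: wsUrl };
}

// Example with made-up values:
const connection = toRoomConnection({
  token: "example-token",
  wsUrl: "wss://example.invalid",
});
```

Failing fast here gives a clear error message instead of a silent connection failure inside the room component.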
