The Realtime Assistants API is currently in beta. We’re actively improving performance and adding features based on user feedback.

How Realtime Assistants Work

Realtime Assistants combine several advanced technologies to create seamless voice interactions:
  1. WebRTC Connection - Low-latency audio/video streaming
  2. Speech Processing - Real-time STT and TTS
  3. AI Agent - Intelligent conversation management
  4. Tool Execution - Dynamic function calling via RPC

Key Components

Sessions

A session represents a single conversation between a user and an assistant. Each session:
  • Has a unique room identifier
  • Supports multiple participants (user + agent)
  • Maintains conversation context
  • Can be configured with custom settings
  • Has a configurable TTL (time-to-live)
{
  "token": "eyJ0eXAiOiJKV1...",
  "wsUrl": "wss://upliftai-livekit-url...",
  "roomName": "assistant-room-abc123"
}
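Before connecting, a client will usually want to confirm that all three fields are present. The following is a minimal TypeScript sketch of that check for the session payload shown above. The `SessionCredentials` type and `parseSessionCredentials` helper are illustrative names, not part of the UpliftAI SDK, and the real payload may carry additional fields.

```typescript
// Shape of the session payload shown above (field names taken from the example).
interface SessionCredentials {
  token: string;    // access token used to join the room
  wsUrl: string;    // WebSocket URL of the realtime server
  roomName: string; // unique room identifier for this session
}

// Hypothetical helper: parse a JSON string and verify the expected fields.
function parseSessionCredentials(json: string): SessionCredentials {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("session response is not a JSON object");
  }
  const record = data as Record<string, unknown>;
  const fields = ["token", "wsUrl", "roomName"] as const;
  for (const field of fields) {
    const value = record[field];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`missing or invalid field: ${field}`);
    }
  }
  return {
    token: record["token"] as string,
    wsUrl: record["wsUrl"] as string,
    roomName: record["roomName"] as string,
  };
}
```

Failing early here gives a clearer error than a connection attempt with an undefined token or URL.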

Agents

The AI agent is the brain of your assistant. It:
  • Processes user speech in real-time
  • Manages conversation flow
  • Handles interruptions gracefully
  • Executes tools when needed
  • Maintains context throughout the session

Configuration

Each assistant can be configured with:
  • Agent Settings
  • STT Settings
  • TTS Settings
  • LLM Settings
{
  "agent": {
    "instructions": "You are a helpful assistant",
    "initialGreeting": true,
    "greetingInstructions": "Say hello and ask how you can help",
    "tools": []
  }
}
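One way to keep configuration consistent across assistants is to build it from a typed set of defaults. The sketch below covers only the agent settings shown above. The default values and the `buildAssistantConfig` helper are assumptions for illustration, and the STT, TTS, and LLM groups are omitted because their fields are not listed in this section.

```typescript
// Field names follow the example above; tool entries are left untyped here.
interface AgentSettings {
  instructions: string;
  initialGreeting: boolean;
  greetingInstructions?: string;
  tools: unknown[];
}

interface AssistantConfig {
  agent: AgentSettings;
  // STT, TTS, and LLM settings presumably sit alongside `agent`;
  // their fields are not shown here, so they are left out of this sketch.
}

// Illustrative defaults (assumed values, not documented ones).
const DEFAULT_AGENT: AgentSettings = {
  instructions: "You are a helpful assistant",
  initialGreeting: false,
  tools: [],
};

// Hypothetical helper: merge caller overrides onto the defaults.
function buildAssistantConfig(overrides: Partial<AgentSettings> = {}): AssistantConfig {
  const agent: AgentSettings = {
    ...DEFAULT_AGENT,
    ...overrides,
    // Copy the array so callers cannot mutate the shared default.
    tools: [...(overrides.tools ?? DEFAULT_AGENT.tools)],
  };
  if (agent.instructions.trim().length === 0) {
    throw new Error("instructions must not be empty");
  }
  return { agent };
}
```

For example, `buildAssistantConfig({ initialGreeting: true, greetingInstructions: "Say hello and ask how you can help" })` yields an object with the same fields as the JSON above.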

Tools and Functions

Tools extend your assistant’s capabilities. The agent can invoke a tool through RPC to the user’s device, where your frontend executes it. Direct external API calls from the agent are coming soon. Custom tools executed by your frontend can:
  • Access external APIs
  • Perform calculations
  • Query databases
  • Execute custom business logic
The agent will also support MCP (Model Context Protocol) tools, which will:
  • Connect to MCP servers
  • Bridge to enterprise systems and popular platforms such as Shopify

Tool Definition

{
  name: "get_weather",
  description: "Get current weather for a location",
  parameters: {
    type: "object",
    properties: {
      location: {
        type: "string",
        description: "City and state"
      }
    },
    required: ["location"]
  },
  timeout: 10
}
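Because the `parameters` block describes the arguments a tool expects, a handler can check incoming arguments against it before doing any work. The sketch below is a deliberately small validator, not a full JSON Schema implementation. It handles only flat `string`, `number`, and `boolean` properties, and the `ToolDefinition` type and `validateToolArgs` function are hypothetical names.

```typescript
// Types mirroring the tool definition above.
interface ToolParameter {
  type: "string" | "number" | "boolean";
  description?: string;
}

interface ToolDefinition {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, ToolParameter>;
    required?: string[];
  };
  timeout?: number; // presumably seconds; the unit is not stated in this section
}

const getWeatherTool: ToolDefinition = {
  name: "get_weather",
  description: "Get current weather for a location",
  parameters: {
    type: "object",
    properties: {
      location: { type: "string", description: "City and state" },
    },
    required: ["location"],
  },
  timeout: 10,
};

function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Returns a list of problems; an empty list means the arguments are acceptable.
function validateToolArgs(tool: ToolDefinition, args: unknown): string[] {
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return ["arguments must be a JSON object"];
  }
  const record = args as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of tool.parameters.required ?? []) {
    if (!hasOwn(record, key)) errors.push(`missing required parameter: ${key}`);
  }
  for (const key of Object.keys(record)) {
    if (!hasOwn(tool.parameters.properties, key)) {
      errors.push(`unexpected parameter: ${key}`);
      continue;
    }
    const spec = tool.parameters.properties[key] as ToolParameter;
    if (typeof record[key] !== spec.type) {
      errors.push(`parameter ${key} must be of type ${spec.type}`);
    }
  }
  return errors;
}
```

Rejecting unexpected parameters is a design choice in this sketch. It keeps handlers strict at the cost of failing when the model supplies extra fields.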

Tool Execution Flow
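Going by the description under Tools and Functions, the agent sends a tool call over RPC, the frontend runs the matching handler, and the result travels back to the agent. The sketch below shows the frontend side of that flow. The `{ name, args }` request shape, the `{ ok, result }` response shape, and every function name are assumptions for illustration, and `timeout` is treated as seconds. The actual RPC payload format is not specified in this section.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown> | unknown;

interface RegisteredTool {
  handler: ToolHandler;
  timeoutSeconds: number;
}

const toolRegistry = new Map<string, RegisteredTool>();

function registerTool(name: string, handler: ToolHandler, timeoutSeconds = 10): void {
  toolRegistry.set(name, { handler, timeoutSeconds });
}

// Reject if the work has not settled within the given number of seconds.
function withTimeout<T>(work: Promise<T>, seconds: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${seconds}s`)),
      seconds * 1000
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

// Hypothetical RPC entry point: takes a JSON request, returns a JSON response.
async function handleToolCall(payload: string): Promise<string> {
  try {
    const request = JSON.parse(payload) as { name: string; args?: Record<string, unknown> };
    const tool = toolRegistry.get(request.name);
    if (!tool) {
      return JSON.stringify({ ok: false, error: `unknown tool: ${request.name}` });
    }
    const result = await withTimeout(
      Promise.resolve(tool.handler(request.args ?? {})),
      tool.timeoutSeconds
    );
    return JSON.stringify({ ok: true, result });
  } catch (error) {
    // Malformed payloads, handler exceptions, and timeouts all end up here.
    const message = error instanceof Error ? error.message : String(error);
    return JSON.stringify({ ok: false, error: message });
  }
}
```

Note that the timeout only abandons the response. A handler that has already started keeps running unless it implements its own cancellation.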

Dynamic Configuration

Assistants can be updated in real time without disrupting active sessions. The following examples use React with the @upliftai/assistants-react package:

Update Instructions

Change the assistant’s behavior on the fly:
await updateInstruction("You are now a pirate. Speak like one.");

Manage Tools

Add or remove tools dynamically:
// Add a new tool
await addTool(calculatorTool);

// Remove a tool
await removeTool("calculator");

// Add new tools or update existing ones
await upsertTools([tool1, tool2]);
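The exact semantics of these helpers are defined by the package. As a mental model only, the sketch below implements add, remove, and upsert over an in-memory map keyed by tool name. The `ToolSet` class is hypothetical, does not communicate with an agent, and its choice to reject duplicate names on `add` is an assumption.

```typescript
interface ToolSpec {
  name: string;
  description: string;
}

// Hypothetical in-memory model of a session's tool list.
class ToolSet {
  private tools = new Map<string, ToolSpec>();

  // Add a tool; this sketch treats a duplicate name as an error.
  add(tool: ToolSpec): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`tool already exists: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Remove a tool by name; returns false if it was not present.
  remove(name: string): boolean {
    return this.tools.delete(name);
  }

  // Insert each tool, replacing any existing tool with the same name.
  upsert(tools: ToolSpec[]): void {
    for (const tool of tools) {
      this.tools.set(tool.name, tool);
    }
  }

  get(name: string): ToolSpec | undefined {
    return this.tools.get(name);
  }

  names(): string[] {
    return Array.from(this.tools.keys());
  }
}
```

Keying by name is what makes `upsert` well defined: a tool with a known name replaces the old definition, and any other tool is appended.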

Connection Lifecycle

  1. Session Creation - Client requests a session token from your backend
  2. WebRTC Negotiation - Client connects to LiveKit server using the token
  3. Agent Join - AI agent joins the room and initializes
  4. Conversation - Real-time audio streaming and processing
  5. Tool Execution - Agent requests tool execution via RPC when needed
  6. Session End - Client disconnects or session expires
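Client code that tracks these stages can model them as a small state machine, which makes invalid transitions fail loudly. The sketch below is one possible encoding of the six stages above. The stage names and allowed transitions are illustrative assumptions (for example, that tool execution returns to conversation), not states exposed by the SDK.

```typescript
type Stage =
  | "created"        // 1. session token issued
  | "connecting"     // 2. WebRTC negotiation
  | "agent_joining"  // 3. agent joins the room
  | "conversation"   // 4. audio streaming
  | "tool_execution" // 5. agent requested a tool via RPC
  | "ended";         // 6. disconnect or expiry

// Allowed next stages; any stage may end early (disconnect or TTL expiry).
const TRANSITIONS: Record<Stage, Stage[]> = {
  created: ["connecting", "ended"],
  connecting: ["agent_joining", "ended"],
  agent_joining: ["conversation", "ended"],
  conversation: ["tool_execution", "ended"],
  tool_execution: ["conversation", "ended"],
  ended: [],
};

class SessionLifecycle {
  private current: Stage = "created";

  stage(): Stage {
    return this.current;
  }

  advance(next: Stage): void {
    if (TRANSITIONS[this.current].indexOf(next) === -1) {
      throw new Error(`invalid transition: ${this.current} -> ${next}`);
    }
    this.current = next;
  }
}
```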

Public vs Private Assistants

For agents created with the public flag enabled, a session can be created without an API key. This is recommended for public-facing agents, such as those embedded on websites, when user authentication isn’t required. You can use the UpliftAI widget for public agents. See npm for more options.
<script src="https://cdn.jsdelivr.net/npm/@upliftai/assistant-widget@0.0.1"></script>
<upliftai-assistant 
  assistant-id="YOUR_ASSISTANT_ID">
</upliftai-assistant>
If you want more control over the UI, you can create the session from the frontend without an API key:
// Public assistant session
const response = await fetch(
  `https://api.upliftai.org/v1/realtime-assistants/${id}/createPublicSession`,
  {
    method: 'POST',
    body: JSON.stringify({ participantName: 'User' })
  }
);
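When the assistant ID comes from user input or a URL, it helps to build the request in one place so the ID is always escaped. The sketch below wraps the call above in a small helper. The endpoint path and body field come from the example, while the `buildPublicSessionRequest` name and the `Content-Type` header are assumptions.

```typescript
interface PublicSessionRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

const API_BASE = "https://api.upliftai.org/v1/realtime-assistants";

// Hypothetical helper: build the URL and fetch options for a public session.
function buildPublicSessionRequest(
  assistantId: string,
  participantName: string
): PublicSessionRequest {
  if (assistantId.trim().length === 0) {
    throw new Error("assistantId is required");
  }
  return {
    // Escape the ID so characters such as "/" cannot alter the path.
    url: `${API_BASE}/${encodeURIComponent(assistantId)}/createPublicSession`,
    init: {
      method: "POST",
      // Assumed header: the body is JSON, so it is declared as such.
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ participantName }),
    },
  };
}

// Usage (not executed here):
//   const { url, init } = buildPublicSessionRequest(id, "User");
//   const response = await fetch(url, init);
//   if (!response.ok) throw new Error(`session request failed: ${response.status}`);
```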

Performance Optimization

Latency Reduction

  • Geographic Distribution: Agents and models are deployed close to Pakistan
  • Provider Selection: Choose fastest providers for your use case
  • Connection Pooling: Pre-warmed connections to providers

Security Considerations

Always validate and sanitize tool inputs and outputs. Never expose sensitive API keys or credentials in client-side code. Session tokens issued by UpliftAI through createSession are safe to use in client-side code.

Best Practices

  1. Authentication: Always use backend session creation for production
  2. Tool Validation: Implement strict input validation in tool handlers. Always make sure inputs are in the format you expect.

Next Steps

I