In this quickstart, we’ll create a voice AI agent using LiveKit that can have real-time conversations with users. This example demonstrates how to integrate LMNT text-to-speech into LiveKit’s Agents framework.

Set up your project

Create a project directory

mkdir livekit-lmnt-agent && cd livekit-lmnt-agent

Set up a virtual environment

python -m venv venv
source venv/bin/activate
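
On Windows, the activation command is different:

venv\Scripts\activate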

Install dependencies

pip install "livekit-agents[lmnt,deepgram,openai,silero,turn-detector]" python-dotenv
The quotes keep shells like zsh from interpreting the square brackets.

Configure the environment

Create a file named .env in your project directory and add:
LMNT_API_KEY=your_lmnt_api_key
LIVEKIT_URL=wss://your-livekit-server.com
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
DEEPGRAM_API_KEY=your_deepgram_api_key
OPENAI_API_KEY=your_openai_api_key
Replace the placeholder values with your actual credentials.
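
Optionally, sanity-check that these values load before wiring up the agent. A minimal sketch using python-dotenv:
from dotenv import load_dotenv
import os

load_dotenv()

# Warn about any key that didn't load from .env
for key in ("LMNT_API_KEY", "LIVEKIT_URL", "LIVEKIT_API_KEY",
            "LIVEKIT_API_SECRET", "DEEPGRAM_API_KEY", "OPENAI_API_KEY"):
    if not os.getenv(key):
        print(f"Missing {key} in .env")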

Create the agent

Create a file named agent.py:
from dotenv import load_dotenv
from livekit import agents
from livekit.agents import AgentSession, Agent
from livekit.plugins import (
    openai,
    lmnt,
    deepgram,
    silero,
)
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv()


class VoiceAssistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are a helpful voice assistant. "
                "Keep your responses concise and conversational. "
                "Avoid using punctuation that doesn't translate well to speech."
            )
        )


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()  # Join the LiveKit room before starting the session

    session = AgentSession(
        stt=deepgram.STT(model="nova-2", language="en-US"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=lmnt.TTS(
            voice="leah",   # Voice ID from LMNT library
        ),
        vad=silero.VAD.load(),  # Voice activity detection
        turn_detection=MultilingualModel(),  # Contextual turn detection
        preemptive_generation=True,  # Preemptive generation for faster response times
    )

    await session.start(
        room=ctx.room,
        agent=VoiceAssistant(),
    )

    await session.generate_reply(
        instructions="Greet the user and ask how you can help them today."
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))

Run the agent

Start your agent:
python agent.py dev
The agent will connect to your LiveKit server and wait for participants to join rooms. When someone joins a room, the agent will automatically start a conversation.
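
The Silero VAD and turn detector plugins rely on locally downloaded model files. If the agent complains about missing models on the first run, recent livekit-agents releases can fetch them ahead of time:
python agent.py download-files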

Understanding the code

Let’s examine the key components:

Agent class definition

class VoiceAssistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are a helpful voice assistant. "
                "Keep your responses concise and conversational. "
                "Avoid using punctuation that doesn't translate well to speech."
            )
        )
The agent class defines the personality and behavior of your voice assistant.
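
Beyond instructions, the Agent class is also where you can expose tools for the LLM to call mid-conversation. A minimal sketch, assuming the function_tool decorator from recent livekit-agents releases (get_weather and its body are hypothetical placeholders):

from livekit.agents import Agent, RunContext, function_tool


class VoiceAssistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice assistant.")

    @function_tool
    async def get_weather(self, context: RunContext, city: str) -> str:
        """Look up the current weather for a city."""
        # Hypothetical placeholder: call a real weather service here
        return f"It's sunny in {city} today."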

LMNT TTS configuration

tts=lmnt.TTS(
    model="blizzard",           # High-quality TTS model
    voice="leah",               # Voice ID from LMNT library
    language="en",              # ISO 639-1 language code
    temperature=0.7,            # Speech expressiveness (0.3-1.0)
    top_p=0.9,                  # Speech generation stability
)
The LMNT TTS service supports these parameters (an example configuration follows the list):
  • model: TTS model (default: “blizzard”)
  • voice: Voice ID from LMNT’s voice library
  • language: Two-letter ISO 639-1 language code
  • temperature: Controls expressiveness; lower values (around 0.3) give neutral speech, higher values (up to 1.0) a more dynamic range
  • top_p: Controls stability; lower values favor consistency, higher values favor flexibility
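
For example, the same voice could be tuned for flat, predictable narration or for livelier conversation (the values are illustrative, not recommendations):

# Neutral, consistent delivery
tts = lmnt.TTS(voice="leah", temperature=0.3, top_p=0.5)

# Expressive, more dynamic delivery
tts = lmnt.TTS(voice="leah", temperature=1.0, top_p=0.9)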

Agent session pipeline

session = AgentSession(
    stt=deepgram.STT(model="nova-2", language="en-US"),  # Speech-to-text
    llm=openai.LLM(model="gpt-4o-mini"),                 # Language model
    tts=lmnt.TTS(...),                                   # Text-to-speech with LMNT
    vad=silero.VAD.load(),                               # Voice activity detection
    turn_detection=MultilingualModel(),                  # Contextual turn detection
    preemptive_generation=True,                          # Start replying before the user's turn fully ends
)
This creates a complete STT-LLM-TTS pipeline with:
  • Speech recognition with Deepgram’s Nova-2 model
  • Language generation with OpenAI’s GPT-4o-mini
  • Voice synthesis with LMNT
  • Voice activity detection and contextual turn detection for natural turn-taking
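
Each stage is a pluggable component, so you can swap one model without touching the rest of the pipeline. For example (the alternate LLM name here is only an example; check each provider’s docs for availability):

session = AgentSession(
    stt=deepgram.STT(model="nova-2", language="en-US"),
    llm=openai.LLM(model="gpt-4o"),  # Larger model: higher quality, higher latency
    tts=lmnt.TTS(voice="leah"),
    vad=silero.VAD.load(),
    turn_detection=MultilingualModel(),
)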

Customize your agent

Try these modifications to enhance your agent:
  • Swap voice for a different voice ID from LMNT’s voice library
  • Tune temperature and top_p to move between neutral and expressive delivery
  • Rewrite the instructions to give the assistant a different personality
  • Change the greeting passed to generate_reply
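
For instance, a storytelling variant might pair a new persona with more expressive speech (a sketch; the values are illustrative):

class StorytellerAssistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are an enthusiastic storyteller. "
                "Tell short, vivid stories and keep each turn under a minute."
            )
        )

Then raise the TTS expressiveness to match in entrypoint():

tts=lmnt.TTS(voice="leah", temperature=1.0),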

Testing your agent

To test your agent:
  1. Make sure your LiveKit server is running
  2. Clone LiveKit’s frontend example and run it with your LiveKit credentials (or test without a frontend; see the note after this list)
  3. Join a room - your agent will automatically connect and start the conversation
  4. Speak naturally and experience real-time voice interactions
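
For quick iteration without a frontend, recent livekit-agents releases also include a console mode that runs the agent locally against your microphone and speakers:
python agent.py console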

Next steps