In this quickstart, we'll create a voice AI agent using LiveKit that can have real-time conversations with users. This example demonstrates how to integrate LMNT text-to-speech into LiveKit's Agents framework.
Set up your project
Create a project directory
mkdir livekit-lmnt-agent && cd livekit-lmnt-agent

Set up a virtual environment
python -m venv venv
source venv/bin/activate

Install dependencies

pip install "livekit-agents[lmnt,deepgram,openai,silero,turn-detector]" python-dotenv

Configure the environment
Create a file named .env in your project directory and add:
LMNT_API_KEY=your_lmnt_api_key
LIVEKIT_URL=wss://your-livekit-server.com
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
DEEPGRAM_API_KEY=your_deepgram_api_key
OPENAI_API_KEY=your_openai_api_key

Replace the placeholder values with your actual API keys:
- Get your LMNT API key from the LMNT playground
- Set up a LiveKit server or use LiveKit Cloud
- Get your Deepgram API key from Deepgram Console
- Get your OpenAI API key from OpenAI Platform
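Optionally, you can sanity-check the configuration before writing any agent code. This short script is a sketch (check_env.py is a hypothetical filename) that uses the python-dotenv package you just installed to confirm every key is set:

# check_env.py - quick sanity check that .env is populated (illustrative)
import os
from dotenv import load_dotenv

load_dotenv()

required = [
    "LMNT_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
]
missing = [name for name in required if not os.getenv(name)]
print("Missing keys:", missing if missing else "none")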
Create the agent
Create a file named agent.py:
from dotenv import load_dotenv

from livekit import agents
from livekit.agents import AgentSession, Agent
from livekit.plugins import (
    openai,
    lmnt,
    deepgram,
    silero,
)
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv()


class VoiceAssistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are a helpful voice assistant. "
                "Keep your responses concise and conversational. "
                "Avoid using punctuation that doesn't translate well to speech."
            )
        )


async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt=deepgram.STT(model="nova-2", language="en-US"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=lmnt.TTS(
            voice="leah",  # Voice ID from LMNT library
        ),
        vad=silero.VAD.load(),  # Voice activity detection
        turn_detection=MultilingualModel(),  # Contextual turn detection
        preemptive_generation=True,  # Preemptive generation for faster response times
    )

    await session.start(
        room=ctx.room,
        agent=VoiceAssistant(),
    )

    await ctx.connect()  # Connect to the LiveKit room

    await session.generate_reply(
        instructions="Greet the user and ask how you can help them today."
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))

Run the agent
Start your agent:
python agent.py dev

The agent will connect to your LiveKit server and wait for participants to join rooms. When someone joins a room, the agent will automatically start a conversation.
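If you'd like to hear the agent without standing up a frontend first, recent livekit-agents releases also ship a console mode that runs the pipeline directly against your terminal's microphone and speakers (availability depends on your installed version):

python agent.py console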
Understanding the code
Let's examine the key components:
Agent class definition

class VoiceAssistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are a helpful voice assistant. "
                "Keep your responses concise and conversational. "
                "Avoid using punctuation that doesn't translate well to speech."
            )
        )

The agent class defines the personality and behavior of your voice assistant.
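Beyond personality, the agent class is also where you can add capabilities. As a sketch (assuming a recent livekit-agents release; get_current_time is a hypothetical tool, not part of this quickstart), you can expose methods to the LLM with the function_tool decorator:

import datetime

from livekit.agents import Agent, RunContext, function_tool


class VoiceAssistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice assistant.")

    @function_tool
    async def get_current_time(self, context: RunContext) -> str:
        """Returns the current time so the assistant can answer time questions."""
        return datetime.datetime.now().strftime("%H:%M")

The docstring is what the LLM reads when deciding whether to call the tool, so keep it descriptive.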
LMNT TTS configuration

tts=lmnt.TTS(
    model="blizzard",  # High-quality TTS model
    voice="leah",  # Voice ID from LMNT library
    language="en",  # ISO 639-1 language code
    temperature=0.7,  # Speech expressiveness (0.3-1.0)
    top_p=0.9,  # Speech generation stability
)

The LMNT TTS service supports these parameters:
- model: TTS model (default: "blizzard")
- voice: Voice ID from LMNT's voice library
- language: Two-letter ISO 639-1 language code
- temperature: Controls expressiveness - lower values (0.3) for neutral speech, higher (1.0) for dynamic range
- top_p: Controls stability - lower values for consistency, higher for flexibility
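For example, you can trade consistency for expressiveness by adjusting only the last two parameters. These pairings follow the descriptions above; the exact values are illustrative starting points, not LMNT recommendations:

# Neutral, consistent delivery: low expressiveness, low variability
tts = lmnt.TTS(voice="leah", temperature=0.3, top_p=0.5)

# Dynamic, expressive delivery: wide range, more variability
tts = lmnt.TTS(voice="leah", temperature=1.0, top_p=0.9)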
Agent session pipeline

session = AgentSession(
    stt=deepgram.STT(model="nova-2"),  # Speech-to-text
    llm=openai.LLM(model="gpt-4o-mini"),  # Language model
    tts=lmnt.TTS(...),  # Text-to-speech with LMNT
    vad=silero.VAD.load(),  # Voice activity detection
    turn_detection=MultilingualModel(),  # Contextual turn detection
    preemptive_generation=True,  # Preemptive generation for faster response times
)

This creates a complete STT-LLM-TTS pipeline with:
- Speech recognition with Deepgram Nova-2 model
- Language generation with OpenAI GPT-4o-mini
- Speech generation with LMNT
- Voice activity detection for turn-taking
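Each stage of this pipeline reports timing and usage metrics you can log, which is useful when tuning response latency. A minimal sketch, assuming a recent livekit-agents release, added inside entrypoint after the session is created:

from livekit.agents import metrics, MetricsCollectedEvent

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    metrics.log_metrics(ev.metrics)  # logs STT, LLM, and TTS latency/usage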
Customize your agent
Try these modifications to enhance your agent:
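For example (a sketch; the persona text and parameter values are illustrative), you could give the assistant a different personality and pair it with more expressive speech:

# A different persona for the same pipeline
class StorytellerAssistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "You are a warm, engaging storyteller. "
                "Use short sentences that flow naturally when read aloud."
            )
        )

# In entrypoint, pair the new persona with a more dynamic voice
session = AgentSession(
    stt=deepgram.STT(model="nova-2", language="en-US"),
    llm=openai.LLM(model="gpt-4o-mini"),
    tts=lmnt.TTS(voice="leah", temperature=1.0),  # any voice ID from LMNT's library works here
    vad=silero.VAD.load(),
    turn_detection=MultilingualModel(),
)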
Testing your agent
To test your agent:
- Make sure your LiveKit server is running
- Clone LiveKit's frontend example and run it with your LiveKit room credentials
- Join a room - your agent will automatically connect and start the conversation
- Speak naturally and experience real-time voice interactions
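If the frontend you use asks for a join token rather than raw API credentials, you can mint one with the livekit-api package (a minimal sketch; the identity and room names are placeholders):

from dotenv import load_dotenv
from livekit import api  # provided by the livekit-api package

load_dotenv()  # makes LIVEKIT_API_KEY / LIVEKIT_API_SECRET available

token = (
    api.AccessToken()  # reads the API key and secret from the environment
    .with_identity("test-user")
    .with_grants(api.VideoGrants(room_join=True, room="test-room"))
    .to_jwt()
)
print(token)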