LMNT's capabilities are organized into two main areas:
- Model capabilities: Control how LMNT generates speech so that it matches the feeling you want.
- Streaming & realtime: Realtime serving for latency-sensitive use cases like voice agents.
If you're new, familiarize yourself with model capabilities first.
Model capabilities
Ways to steer the model to generate speech that matches the feeling you're looking for.
| Feature | Description |
|---|---|
| Voice cloning | Create custom voices from 5–10 seconds of reference speech. |
| Accents | Steer the accent of generated speech. |
| Languages | Generate speech in 20 languages with native code-switching. |
| Word timestamps | Get precise per-word timing to sync subtitles, lip movement, and other modalities. |
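As an illustration of what per-word timing makes possible, here is a minimal sketch that groups word timestamps into SRT subtitle cues. The input shape (`text`, `start`, and `duration` fields, with times in seconds) is an assumption made for this example and may not match the actual response format, so adapt the field names to whatever the API returns. `WordTiming`, `format_srt_time`, and `words_to_srt` are hypothetical helper names, not part of any LMNT SDK.

```python
from typing import List, TypedDict


class WordTiming(TypedDict):
    # Assumed shape of one word-timestamp entry (hypothetical field names).
    text: str
    start: float     # seconds from the beginning of the audio
    duration: float  # seconds


def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[WordTiming], max_words: int = 4) -> str:
    """Group consecutive words into cues of at most `max_words` words."""
    if not words:
        return ""
    cues = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        start = group[0]["start"]
        # A cue ends when its last word finishes.
        end = group[-1]["start"] + group[-1]["duration"]
        text = " ".join(w["text"] for w in group)
        cues.append(
            f"{len(cues) + 1}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{text}"
        )
    return "\n\n".join(cues) + "\n"
```

Calling `words_to_srt(timings)` on a list of such entries returns a string that can be written directly to a `.srt` file. Grouping by a fixed word count keeps the sketch short; splitting on punctuation or pauses would usually read better.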
Streaming & realtime
Ways to meet latency deadlines for your specific use cases.
| Feature | Description |
|---|---|
| Speech API | Streams generated speech back to you. The full text must be known ahead of time. |
| Speech Sessions API | Stream in text from an LLM, and LMNT streams speech back to you. Great for voice agents. |
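The difference between the two rows is the interaction shape: with a session, text is written while audio is read concurrently, so the full text never has to exist up front. The sketch below shows only that duplex pattern with `asyncio`. `FakeSpeechSession`, its methods, and `fake_llm_tokens` are hypothetical stand-ins invented for this example. They are not the LMNT SDK, and the "audio" here is just the encoded text.

```python
import asyncio
from typing import AsyncIterator, List


class FakeSpeechSession:
    """Hypothetical stand-in for a duplex speech session (not the LMNT SDK)."""

    def __init__(self) -> None:
        self._queue = asyncio.Queue()

    async def append_text(self, text: str) -> None:
        # Pretend each text fragment is synthesized into one audio chunk.
        await self._queue.put(text.encode("utf-8"))

    async def finish(self) -> None:
        # Signal that no more text will arrive.
        await self._queue.put(None)

    async def audio_chunks(self) -> AsyncIterator[bytes]:
        # Yield chunks as they become available, until the end marker.
        while (chunk := await self._queue.get()) is not None:
            yield chunk


async def fake_llm_tokens() -> AsyncIterator[str]:
    # Stand-in for an LLM that produces text incrementally.
    for token in ["Hello, ", "how can ", "I help?"]:
        await asyncio.sleep(0)
        yield token


async def run() -> List[bytes]:
    session = FakeSpeechSession()

    async def writer() -> None:
        # Forward text to the session as soon as the LLM emits it.
        async for token in fake_llm_tokens():
            await session.append_text(token)
        await session.finish()

    async def reader() -> List[bytes]:
        # Consume audio concurrently with the writer.
        return [chunk async for chunk in session.audio_chunks()]

    _, audio = await asyncio.gather(writer(), reader())
    return audio
```

Running `asyncio.run(run())` drives both tasks to completion. In a real voice agent the reader would hand each chunk to an audio player instead of collecting them in a list.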