Text-to-speech example

Text-to-speech (TTS) is exactly what it sounds like: text goes in and speech comes out. LMNT’s TTS model is the best in the industry, with consistently low latency (~150ms between input and output), reliable service (competitors often experience multi-hour outages), and superb speech quality (we infuse the human “LMNT” in our speech models!). It is useful in a variety of contexts, including accessibility services, language learning, and content creation.

The process consists primarily of two stages: (a) text processing, which analyzes the input text for linguistic structure (including pronunciation, intonation, and rhythm), and (b) speech synthesis, which generates audio output from the processed text, imitating pre-recorded audio.

This example shows how to synthesize a short piece of text using the leah voice and save the result to output.mp3. The same API can be used for more complex use cases, such as synchronizing speech with captions or a video. See the API reference for more details on its capabilities.

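import asyncio
from lmnt import AsyncLmnt

async def main():
    # Create the asynchronous LMNT client.
    client = AsyncLmnt()
    # Request speech for the given text using the 'leah' voice.
    response = await client.speech.generate(
        text='Hello world',
        voice='leah',
    )
    # Read the audio bytes from the response and write them to an MP3 file.
    with open('output.mp3', 'wb') as f:
        f.write(await response.read())

asyncio.run(main())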
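
Since the introduction quotes a latency figure, it can be handy to time a call yourself. The following is a minimal sketch, not part of the LMNT SDK: `timed` is a hypothetical helper that awaits any async call and reports the elapsed wall-clock time, and `fake_generate` is a made-up stand-in for a network request so the snippet runs without credentials. It measures the total time until the awaited call returns, which is not necessarily the same quantity as the latency figure quoted above.

```python
import asyncio
import time
from typing import Awaitable, Callable, Tuple, TypeVar

T = TypeVar("T")


async def timed(make_call: Callable[[], Awaitable[T]]) -> Tuple[T, float]:
    """Await the call produced by make_call; return (result, elapsed milliseconds).

    Hypothetical helper for illustration only; it is not part of any SDK.
    """
    start = time.perf_counter()
    result = await make_call()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


async def fake_generate() -> bytes:
    # Made-up stand-in for a speech request: sleeps briefly, returns dummy bytes.
    await asyncio.sleep(0.05)
    return b"audio-bytes"


async def demo() -> Tuple[bytes, float]:
    # Time the stand-in call the same way a real request could be timed.
    return await timed(fake_generate)


if __name__ == "__main__":
    audio, ms = asyncio.run(demo())
    print(f"received {len(audio)} bytes in {ms:.0f} ms")
```

With the example above, the same helper could presumably wrap the real request, for instance `await timed(lambda: client.speech.generate(text='Hello world', voice='leah'))`. Whether that timing includes downloading the full audio body depends on how the SDK returns its response, so treat the number as a rough indication.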
