Reference for the Speech class in the Python SDK v1
This is the reference for the v1 Lmnt Python SDK. The v2 SDK has a different API and is not compatible with this reference.
The Speech class is your primary touch-point. Instantiate a Speech object with your API key:
from lmnt.api import Speech

speech = Speech('LMNT_API_KEY')
When you’re done with the Speech instance, make sure to clean up by calling its close() method.
await speech.close()
Alternatively, you can use this class as an async context manager, which will call close() for you:
async with Speech('LMNT_API_KEY') as speech:
    pass
While you can provide an api_key argument, we recommend using python-dotenv to add LMNT_API_KEY="My API Key" to your .env file so that your API key is not stored in source control.
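The recommendation above can be sketched roughly as follows. This assumes python-dotenv is installed and that a .env file containing LMNT_API_KEY sits in the working directory. The helper name load_api_key is hypothetical and not part of the SDK; it only makes a missing key fail with a clear message.

```python
import os


def load_api_key(env=None):
    """Return LMNT_API_KEY from the given mapping (defaults to os.environ).

    Hypothetical helper for illustration; not part of the lmnt package.
    """
    env = os.environ if env is None else env
    key = env.get("LMNT_API_KEY")
    if not key:
        raise RuntimeError("LMNT_API_KEY is not set; add it to your .env file")
    return key


async def main():
    # Imported lazily so the helper above works without these packages installed.
    from dotenv import load_dotenv  # provided by python-dotenv
    from lmnt.api import Speech

    load_dotenv()  # loads entries from .env into os.environ
    async with Speech(load_api_key()) as speech:
        pass  # use `speech` here
```

To try it, run asyncio.run(main()) from a script whose working directory contains the .env file.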
Applies processing to attempt to improve the quality of unclean audio with background noise. Off by default, as it can also degrade quality in some circumstances.
The desired language of the synthesized speech, given as a two-letter ISO 639-1 code. One of de, en, es, fr, pt, zh, ko, hi. Does not work with professional clones or the blizzard model.
Creates a new, full-duplex streaming session. You can use the returned session object to concurrently stream text content to the server
and receive speech data from the server.
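To illustrate what "full-duplex" means here, the sketch below runs a writer task and a reader task concurrently with asyncio. It does not use the SDK: FakeSession is a hypothetical stand-in that echoes text back as bytes, and the method names append_text and finish are assumptions made for this example, so check the SDK for the actual session interface.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for a streaming session; not the SDK's class."""

    def __init__(self):
        self._queue = asyncio.Queue()

    async def append_text(self, text):
        # A real session would send text to the server; here we echo bytes back.
        await self._queue.put(text.encode("utf-8"))

    async def finish(self):
        # Signal that no more text will be sent.
        await self._queue.put(None)

    def __aiter__(self):
        return self

    async def __anext__(self):
        chunk = await self._queue.get()
        if chunk is None:
            raise StopAsyncIteration
        return chunk


async def send_text(session, parts):
    # Writer side: push text as it becomes available, then close the input.
    for part in parts:
        await session.append_text(part)
    await session.finish()


async def receive_audio(session):
    # Reader side: collect chunks until the session ends.
    received = bytearray()
    async for chunk in session:
        received.extend(chunk)
    return bytes(received)


async def stream(parts):
    session = FakeSession()
    # Run the writer and the reader concurrently on the same session.
    _, audio = await asyncio.gather(
        send_text(session, parts), receive_audio(session)
    )
    return audio
```

Splitting sending and receiving into separate coroutines lets audio playback begin before all of the text has been submitted.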
mp3: 96kbps MP3 audio. This format is useful for applications that need to play the audio directly to the user.
raw: 16-bit little-endian linear PCM audio. This format is useful for applications that need to process the audio further, such as adding effects or mixing multiple audio streams.
ulaw: 8-bit G.711 µ-law audio with a WAV header. This format is most useful for telephony applications.
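As an example of processing the raw format further, the sketch below wraps 16-bit little-endian PCM bytes in a WAV container using only the standard library. The function name pcm16le_to_wav is hypothetical, and the default sample rate of 24000 Hz and the single channel are assumptions; use the values that match the audio you actually requested.

```python
import io
import wave


def pcm16le_to_wav(pcm, sample_rate=24000, channels=1):
    """Wrap raw 16-bit little-endian PCM bytes in a WAV container.

    sample_rate and channels are assumed defaults, not SDK guarantees.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16 bits = 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)  # WAV stores PCM little-endian, so no conversion
    return buffer.getvalue()
```

The resulting bytes can be written to a .wav file or handed to any tool that expects a WAV header.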