GET /v1/ai/speech

Specify either speed or length, not both. A request that sets both returns a 500 server error, because the requested speed may not be compatible with the requested length.
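
A client can catch this conflict before the request is sent. The following is a minimal sketch and not part of the API itself; the helper name `check_speed_length` and the dict-based parameter shape are assumptions made for illustration.

```python
def check_speed_length(params):
    """Raise ValueError if both 'speed' and 'length' are set.

    `params` is a plain dict of query parameters (a hypothetical shape
    chosen for this sketch); keys whose value is None count as unset.
    """
    has_speed = params.get("speed") is not None
    has_length = params.get("length") is not None
    if has_speed and has_length:
        raise ValueError("specify either 'speed' or 'length', not both")
    return params
```

Calling `check_speed_length({"speed": 1.2})` returns the dict unchanged, while passing both keys raises `ValueError` locally instead of triggering the server error.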

For more detailed output such as the duration of each spoken word, use the Speech POST request.

Query Parameters

X-API-Key
string
required

Your API key; get it from your LMNT account page.

voice
string
required

The voice id of the voice to use for synthesis; voice ids can be retrieved by calls to List voices or Voice info.

text
string
required

The text to synthesize; max 5000 characters per request (including spaces).
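
Longer input has to be split across several requests. The sketch below shows one possible way to do that on whitespace boundaries; `split_text` is a hypothetical helper, and it assumes the limit is counted the way Python's `len` counts characters (Unicode code points), which this page does not confirm.

```python
MAX_CHARS = 5000  # per-request limit stated above, spaces included


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters.

    Chunks break on whitespace, so runs of whitespace are collapsed to
    single spaces. A single word longer than `limit` is rejected rather
    than cut in the middle.
    """
    chunks = []
    current = ""
    for word in text.split():
        if len(word) > limit:
            raise ValueError("a single word exceeds the character limit")
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be passed as the `text` parameter of its own request.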

language
string
default: en

The desired language of the synthesized speech, given as a two-letter ISO 639-1 code. One of de, en, es, fr, pt, zh, ko, hi. Does not work with professional clones or the blizzard model.

model
string
default: aurora

The model to use for synthesis. One of aurora (default) or blizzard. Learn more about models here.

format
string
default: mp3

The file format of the synthesized audio output; one of aac, mp3, mulaw, raw, or wav.

conversational
boolean
default: false

Set this to true to generate conversational-style speech rather than reading-style speech. Does not work with the blizzard model.

sample_rate
number
default: 24000

The desired output sample rate in Hz; one of 8000, 16000, or 24000. Defaults to 24000 for all formats except mulaw, which defaults to 8000.
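
The format-dependent default can be mirrored on the client when the effective rate needs to be known ahead of time, for example to configure an audio player. This is a sketch of the rule described above; `resolve_sample_rate` is a hypothetical helper, not an API function.

```python
ALLOWED_SAMPLE_RATES = (8000, 16000, 24000)


def resolve_sample_rate(fmt="mp3", sample_rate=None):
    """Return the sample rate in Hz that a request would use.

    With no explicit rate, mulaw falls back to 8000 and every other
    format to 24000, following the defaults documented above.
    """
    if sample_rate is None:
        return 8000 if fmt == "mulaw" else 24000
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"sample_rate must be one of {ALLOWED_SAMPLE_RATES}")
    return sample_rate
```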

speed
number
default: 1.0

The talking speed of the generated speech, a floating point value between 0.25 (slow) and 2.0 (fast).

length
number

Produce speech of this length in seconds; maximum 300.0 (5 minutes). Does not work with the blizzard model.

seed
integer

Seed used to specify a different take; defaults to random (see here for more details).
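
Putting the parameters together, a request URL could be assembled as in the sketch below. Several details are assumptions rather than facts from this page: the host in `BASE_URL`, the lowercase `true`/`false` encoding for booleans, the helper name `build_speech_url`, and the placeholder voice id used in the usage note.

```python
from urllib.parse import urlencode

# Assumed host; this page only documents the path /v1/ai/speech.
BASE_URL = "https://api.lmnt.com"


def build_speech_url(api_key, voice, text, **optional):
    """Build a GET URL for the speech endpoint from query parameters.

    Optional parameters (language, model, format, conversational,
    sample_rate, speed, length, seed) are passed as keyword arguments;
    any that are None are omitted so the server defaults apply.
    """
    if optional.get("speed") is not None and optional.get("length") is not None:
        raise ValueError("specify either 'speed' or 'length', not both")
    query = {"X-API-Key": api_key, "voice": voice, "text": text}
    for name, value in optional.items():
        if value is None:
            continue
        if isinstance(value, bool):
            # Assumption: booleans are sent as lowercase true/false.
            value = "true" if value else "false"
        query[name] = value
    return f"{BASE_URL}/v1/ai/speech?{urlencode(query)}"
```

For example, `build_speech_url("YOUR_API_KEY", "example-voice", "Hello, world!", format="wav")` yields a URL whose query string percent-encodes the text. The result can be fetched with any HTTP client, such as `urllib.request.urlopen`.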

Response

200 - */*

The response is of type object.
