POST /v1/ai/speech/bytes

Authorizations

X-API-Key
string
header
required

Your API key; get it from your LMNT account page.
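
A minimal sketch of building the authorization header. Only the X-API-Key header name comes from this reference; the LMNT_API_KEY environment variable and the auth_headers helper are hypothetical names chosen for illustration.

```python
import os


def auth_headers(env=os.environ):
    """Build the authorization header for a request.

    LMNT_API_KEY is a hypothetical environment variable name used by this
    sketch; the reference only specifies the X-API-Key header itself.
    """
    api_key = env.get("LMNT_API_KEY")
    if not api_key:
        raise RuntimeError("Set LMNT_API_KEY to your API key")
    return {"X-API-Key": api_key}
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.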

Body

multipart/form-data
text
string
required

The text to synthesize; max 5000 characters per request (including spaces)

voice
string
required

The voice id of the voice to use for synthesis; voice ids can be retrieved by calls to List voices or Voice info

conversational
boolean
default: false

Set this to true to generate conversational-style speech rather than reading-style speech. Does not work with the blizzard model.

format
enum<string>
default: mp3

The file format of the synthesized audio output

Available options: aac, mp3, mulaw, raw, wav

language
enum<string>
default: en

The desired language of the synthesized speech, given as a two-letter ISO 639-1 code. Does not work with professional clones or the blizzard model.

Available options: de, en, es, fr, pt, zh, ko, hi

length
number

Produce speech of this length in seconds; maximum 300.0 (5 minutes). Does not work with the blizzard model.

Required range: x < 300

model
enum<string>
default: aurora

The model to use for synthesis: aurora (default) or blizzard.

Available options: aurora, blizzard

sample_rate
enum<number>
default: 24000

The desired output sample rate in Hz

Available options: 8000, 16000, 24000

seed
integer

Seed that selects a particular take of the speech; defaults to a random value.

speed
number
default: 1

The talking speed of the generated speech: a floating-point value between 0.25 (slow) and 2.0 (fast).

Required range: 0.25 < x < 2
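
The constraints on the body fields can be checked on the client before a request is sent. The sketch below is illustrative only: validate_speech_params is a hypothetical helper, not part of any LMNT SDK, and it assumes the range endpoints (300 for length, 0.25 and 2 for speed) are accepted, since the prose and the stated ranges differ on that point.

```python
FORMATS = {"aac", "mp3", "mulaw", "raw", "wav"}
LANGUAGES = {"de", "en", "es", "fr", "pt", "zh", "ko", "hi"}
MODELS = {"aurora", "blizzard"}
SAMPLE_RATES = {8000, 16000, 24000}


def validate_speech_params(text, voice, **opts):
    """Return a list of problems; an empty list means the fields look valid."""
    problems = []
    if not text:
        problems.append("text is required")
    elif len(text) > 5000:
        problems.append("text exceeds 5000 characters")
    if not voice:
        problems.append("voice is required")

    # Enumerated fields must use one of the listed options.
    enums = (("format", FORMATS), ("language", LANGUAGES),
             ("model", MODELS), ("sample_rate", SAMPLE_RATES))
    for name, allowed in enums:
        if name in opts and opts[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")

    # Assumption: endpoints of both ranges are treated as valid.
    if "length" in opts and opts["length"] > 300:
        problems.append("length must not exceed 300 seconds")
    if "speed" in opts and not 0.25 <= opts["speed"] <= 2:
        problems.append("speed must be between 0.25 and 2")

    # Options the reference marks as not working with the blizzard model.
    # Flagging any explicit language or length is a conservative choice.
    if opts.get("model") == "blizzard":
        if opts.get("conversational"):
            problems.append("conversational does not work with blizzard")
        for name in ("language", "length"):
            if name in opts:
                problems.append(f"{name} does not work with blizzard")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once; the server remains the authority on what is actually accepted.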

Response

200 - application/octet-stream

The response body is a file containing the synthesized audio.
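
A sketch of a complete call using only the Python standard library: it encodes the body fields as multipart/form-data, sends the X-API-Key header, and writes the returned bytes to disk. The function names are hypothetical, the base URL is left as a parameter because this page does not state the API host, and sending booleans as lowercase "true"/"false" is an assumption.

```python
import urllib.request
import uuid


def encode_multipart(fields):
    """Encode simple text fields as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        if isinstance(value, bool):
            # Assumption: booleans are sent as lowercase strings.
            value = "true" if value else "false"
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_speech_request(base_url, api_key, fields):
    """Build (but do not send) the POST request for /v1/ai/speech/bytes."""
    body, content_type = encode_multipart(fields)
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/ai/speech/bytes",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": content_type},
        method="POST",
    )


def synthesize_to_file(base_url, api_key, fields, out_path):
    """Send the request and save the binary audio response to out_path."""
    request = build_speech_request(base_url, api_key, fields)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

A call would pass the API host, the key, a dictionary holding at least text and voice, and an output path whose extension matches the requested format. Non-200 responses surface as urllib.error.HTTPError from urlopen.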
