Create voice
Submits a request to create a voice given configuration data and some source audio.
For Professional Voices, at least 5 minutes of source audio is required for a clone; more audio is better, up to a total source file size of 250 MB.
For Instant Voices, as little as 5 seconds of source audio gets you an instant clone.
For more on voices in general, visit our guide.
Authorizations
Your API key; get it from your LMNT account page.
Body
One or more input audio files to train the voice, attached as binary wav, mp3, mp4, m4a, or webm files.
- Max attached files: 20.
- Max total file size: 250 MB.
- Professional voices require at least 5 minutes of source audio to train from.
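The attachment limits above can be checked locally before a request is sent. The following is a minimal sketch, not part of the API: the helper name `check_attachments` is hypothetical, and it assumes "250 MB" means 250,000,000 bytes (the conservative reading). It does not check audio duration, so the 5-minute minimum for professional voices still has to be verified separately.

```python
from pathlib import Path

# Limits taken from the list above.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".mp4", ".m4a", ".webm"}
MAX_FILES = 20
# Assumption: "250 MB" is read as 250 * 10**6 bytes.
MAX_TOTAL_BYTES = 250 * 10**6


def check_attachments(files):
    """Return a list of problems for (filename, size_in_bytes) pairs.

    An empty list means the attachments satisfy the documented limits.
    """
    problems = []
    if not files:
        problems.append("at least one audio file is required")
    if len(files) > MAX_FILES:
        problems.append(f"too many files: {len(files)} (max {MAX_FILES})")
    for name, _size in files:
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"unsupported format: {name}")
    total = sum(size for _name, size in files)
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size {total} bytes exceeds {MAX_TOTAL_BYTES}")
    return problems
```

For real files, the sizes could be gathered with `Path(p).stat().st_size` and passed in alongside the file names.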
Information about the voice you are creating; a stringified JSON object containing the following fields:
- name: required, string. The display name for this voice.
- enhance: required, bool. For unclean audio with background noise, applies processing to attempt to improve quality. Default is false, as this can also degrade quality in some circumstances.
- type: optional, string. The type of voice to create. Defaults to instant.
- gender: optional, string. A tag describing the gender of this voice. Has no effect on voice creation.
- description: optional, string. A text description of this voice.
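Because the voice information is sent as a stringified JSON object rather than as separate form fields, it can help to build it with a small helper. The following is a sketch under stated assumptions: the function name `build_voice_metadata` and its parameter names are hypothetical, and only the JSON keys (name, enhance, type, gender, description) and the two type values come from this page.

```python
import json

# Voice types named on this page.
VOICE_TYPES = {"instant", "professional"}


def build_voice_metadata(name, enhance=False, voice_type=None,
                         gender=None, description=None):
    """Return the voice information as a JSON string.

    Optional fields left as None are omitted, so the server-side
    defaults (for example, type defaulting to instant) apply.
    """
    if not name:
        raise ValueError("name is required")
    if voice_type is not None and voice_type not in VOICE_TYPES:
        raise ValueError(f"unknown voice type: {voice_type}")
    metadata = {"name": name, "enhance": bool(enhance)}
    optional = {"type": voice_type, "gender": gender, "description": description}
    metadata.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(metadata)
```

The resulting string would be sent in the multipart request body together with the audio file attachments.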
Response
Voice details
The unique identifier of this voice.
The display name of this voice.
The owner of this voice: system, me, or other.
The state of this voice in the training pipeline (e.g., ready, training).
A text description of this voice.
A tag describing the gender of this voice, e.g. male, female, nonbinary.
Whether this voice has been starred by you or not.
The method by which this voice was created: instant or professional.