POST /v1/ai/voice
curl --request POST \
  --url https://api.lmnt.com/v1/ai/voice \
  --header 'Content-Type: multipart/form-data' \
  --header 'X-API-Key: <x-api-key>' \
  --form 'metadata={"name": "new-voice", "type": "instant", "enhance": false}; type=application/json' \
  --form files=@/Users/user/filename.wav
{
  "description": "a newly created voice",
  "gender": "male",
  "id": "123456789abcdef",
  "name": "new-voice",
  "owner": "me",
  "starred": false,
  "state": "ready",
  "type": "instant"
}

The metadata field must come before the files in your request.
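
The ordering rule above is easiest to see when the multipart body is assembled by hand. The following is a minimal Python sketch using only the standard library, not an official client. The function names `build_voice_request_body` and `create_voice` are hypothetical, and the `application/octet-stream` content type on the file parts is an assumption.

```python
import json
import urllib.request
import uuid


def build_voice_request_body(metadata, files):
    """Return (body, content_type) with the metadata part ahead of every file part.

    metadata: dict that will be serialized to JSON.
    files: list of (filename, content_bytes) tuples.
    """
    boundary = uuid.uuid4().hex
    parts = []

    # The metadata part goes first, as the endpoint requires.
    parts.append(
        (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="metadata"\r\n'
            "Content-Type: application/json\r\n\r\n"
            f"{json.dumps(metadata)}\r\n"
        ).encode("utf-8")
    )

    # Each audio file follows as its own "files" part.
    for filename, content in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
            # Assumption: a generic binary content type is acceptable here.
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")

    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def create_voice(api_key, metadata, files):
    """Send the request to the endpoint shown above and return the decoded JSON."""
    body, content_type = build_voice_request_body(metadata, files)
    request = urllib.request.Request(
        "https://api.lmnt.com/v1/ai/voice",
        data=body,
        headers={"Content-Type": content_type, "X-API-Key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping body construction separate from the network call makes the part ordering easy to inspect before anything is sent.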

For Professional Voices, at least 5 minutes of source audio is required for a clone. More audio generally gives a better result, up to a total source file size of 250 MB.

For Instant Voices, as little as 30 seconds of source audio is enough to create a clone.

For more on voices in general, visit our guide.

Headers

X-API-Key
string
required

Your API key; get it from your LMNT account page.

Body

multipart/form-data

metadata
string
required

Information about the voice you are creating; a stringified JSON object containing the following fields:

  • name required: string; The display name for this voice.
  • enhance required: boolean; For unclean audio with background noise, applies processing to attempt to improve quality. Default is false, as this can also degrade quality in some circumstances.
  • type optional: string; The type of voice to create. Defaults to instant.
  • gender optional: string; A tag describing the gender of this voice. Has no effect on voice creation.
  • description optional: string; A text description of this voice.
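
Since metadata is sent as a stringified JSON object, it can help to build it from typed arguments instead of writing the string by hand. This is a minimal sketch under that assumption. `build_metadata` is a hypothetical helper, and the allowed values for type are taken from the response schema on this page.

```python
import json

# Voice types listed on this page.
VOICE_TYPES = {"instant", "professional"}


def build_metadata(name, enhance=False, voice_type="instant",
                   gender=None, description=None):
    """Return the metadata field as a JSON string."""
    if not name:
        raise ValueError("name is required")
    if not isinstance(enhance, bool):
        raise ValueError("enhance must be a boolean")
    if voice_type not in VOICE_TYPES:
        raise ValueError(f"type must be one of {sorted(VOICE_TYPES)}")

    # enhance is marked required, so it is always sent explicitly.
    metadata = {"name": name, "enhance": enhance, "type": voice_type}

    # Optional fields are only included when provided.
    if gender is not None:
        metadata["gender"] = gender
    if description is not None:
        metadata["description"] = description
    return json.dumps(metadata)
```

For example, `build_metadata("new-voice")` yields a JSON object with name, enhance, and type set, equivalent to the metadata in the curl sample.
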
files
string
required

One or more input audio files used to train the voice, attached as binary mp3 or wav files.

  • Max attached files: 20.
  • Max total file size: 250 MB.
  • Professional voices require at least 5 minutes of source audio to train from.
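
The file-count and size limits above can be checked locally before uploading. The sketch below is illustrative only: `check_source_files` is a hypothetical helper, it assumes 1 MB means 1024 × 1024 bytes, and it identifies mp3 and wav files by extension alone. It does not check audio duration, which would require decoding the files.

```python
import os

MAX_FILES = 20
# Assumption: "250 MB" is interpreted as 250 * 1024 * 1024 bytes.
MAX_TOTAL_BYTES = 250 * 1024 * 1024
ALLOWED_EXTENSIONS = {".mp3", ".wav"}


def check_source_files(files):
    """Return a list of problems found; an empty list means the set looks valid.

    files: list of (filename, size_in_bytes) tuples.
    """
    problems = []

    if not files:
        problems.append("at least one audio file is required")
    if len(files) > MAX_FILES:
        problems.append(f"too many files: {len(files)} > {MAX_FILES}")

    # Report each file whose extension is not mp3 or wav.
    for filename, _size in files:
        extension = os.path.splitext(filename)[1].lower()
        if extension not in ALLOWED_EXTENSIONS:
            problems.append(f"unsupported file type: {filename}")

    # The size limit applies to all attachments combined.
    total = sum(size for _filename, size in files)
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size {total} bytes exceeds {MAX_TOTAL_BYTES}")

    return problems
```

For files on disk, the sizes can be gathered with `os.path.getsize(path)` before calling the helper.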

Response

200 - application/json

description
string | null

A text description of this voice.

gender
string

A tag describing the gender of this voice, e.g. male, female, nonbinary.

id
string
required

The unique identifier of this voice.

name
string
required

The display name of this voice.

owner
enum<string>
required

The owner of this voice.

Available options: system, me, other

starred
boolean

Whether this voice has been starred by you or not.

state
string
required

The state of this voice in the training pipeline (e.g., ready, training).

type
enum<string>

The method by which this voice was created: instant or professional.

Available options: instant, professional
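
On the client side, the response fields described above could be read into a small typed record. This is a minimal sketch, not part of any official SDK. The `Voice` dataclass and `parse_voice` function are hypothetical names, and treating only `ready` as usable is an assumption based on the example states on this page.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional

# Fields the response schema marks as required.
REQUIRED_FIELDS = ("id", "name", "owner", "state")


@dataclass
class Voice:
    id: str
    name: str
    owner: str
    state: str
    description: Optional[str] = None
    gender: Optional[str] = None
    starred: Optional[bool] = None
    type: Optional[str] = None

    @property
    def is_ready(self):
        # Assumption: "ready" means the voice can be used.
        return self.state == "ready"


def parse_voice(payload):
    """Parse a JSON response string into a Voice, checking required fields."""
    data = json.loads(payload)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")

    # Ignore any keys the schema above does not describe.
    known = {f.name: data[f.name] for f in fields(Voice) if f.name in data}
    return Voice(**known)
```

Checking `is_ready` after parsing is a convenient way to tell a finished voice from one that is still in the training pipeline.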