Generate speech
speech.generate(body: SpeechGenerateParams): Promise<Response>
POST /v1/ai/speech/bytes
Generates speech from text and streams the audio as binary data chunks in real time, as they are generated.
This is the recommended endpoint for most text-to-speech use cases. You can either stream the chunks for low-latency playback or collect all chunks to get the complete audio file.
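The two consumption patterns described above can be sketched as follows. This is a minimal sketch, assuming the `Response` returned by `speech.generate` behaves like a standard Fetch API `Response` with a readable byte stream as its body; the helper names `concatChunks` and `readAudio` are hypothetical and not part of the SDK.

```typescript
// Hypothetical helper: join binary chunks into one contiguous buffer.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Hypothetical helper: read a Fetch-style Response body chunk by chunk.
// Assumption: the SDK's Response exposes a standard ReadableStream body.
// `onChunk` fires as each chunk arrives (low-latency playback); the
// returned promise resolves with the complete audio once the stream ends.
async function readAudio(
  response: Response,
  onChunk?: (chunk: Uint8Array) => void,
): Promise<Uint8Array> {
  if (!response.body) {
    throw new Error("response has no body");
  }
  const reader = response.body.getReader();
  const chunks: Uint8Array[] = [];
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value) {
      onChunk?.(value);
      chunks.push(value);
    }
  }
  return concatChunks(chunks);
}
```

Passing an `onChunk` callback covers the streaming case, while ignoring it and awaiting the result covers the "collect everything" case. The audio format of the collected bytes depends on the request parameters, which this section does not list.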
Generate speech with timestamps
speech.generateDetailed(body: SpeechGenerateDetailedParams): Promise<SpeechGenerateDetailedResponse>
POST /v1/ai/speech
Generates speech from text and returns a JSON object that contains a base64-encoded audio string and optionally word-level timestamps. This endpoint waits for the entire synthesis before responding, so it is not ideal for latency-sensitive applications.
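Because the audio arrives as a base64 string rather than raw bytes, it has to be decoded before it can be played or written to disk. The following is a minimal sketch of that step; `decodeBase64Audio` is a hypothetical helper, and the name of the response field holding the base64 string is not given in this section, so the caller is expected to pass the string in directly.

```typescript
// Hypothetical helper: decode a base64-encoded audio string into raw bytes.
// Relies on the global `atob`, available in browsers and in Node.js 16+.
function decodeBase64Audio(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

The resulting `Uint8Array` can be handed to an audio decoder or saved as a file. Any word-level timestamps in the response are separate JSON data and need no decoding.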
Models
class DurationObject: …
class SpeechRequest: …
class StreamSpeechRequest: …