StreamingSynthesisConnection

This is the reference for the v1 Lmnt Node SDK. The v2 SDK has a different API and is not compatible with this reference.

This class represents a full-duplex streaming connection with the server. The expected use is to call appendText as text is produced and to iterate over the object to read audio. Make sure to call finish() when you’re done submitting the entire text snippet.

When you’re done with the StreamingSynthesisConnection instance, you can explicitly clean up its resource utilization by calling the close() method.

To create an instance of this class, use the synthesizeStreaming method on the Speech class.

const speech = new Speech('LMNT_API_KEY')
const connection = await speech.synthesizeStreaming('mara-wilson')

appendText

appendText(text)

Sends additional text to synthesize to the server. The text can be split at any point. For example, the two snippets below are semantically equivalent:

connection.appendText('This is a test of ')
connection.appendText('the emergency broadcast system.')

connection.appendText('This is a test of the eme')
connection.appendText('rgency broadcast system.')

Parameters

text

string

required

Some or all of the text to synthesize.

Note

The text is guaranteed to be synthesized in the order that it is sent.

You can access the synthesized data by using the async iterator on the connection object.

Streaming Data Iterator

The connection object provides an async iterator that yields data from the server as it arrives. See the notes below for details on the return value. Here’s a short snippet that shows how to iterate over the data:

for await (const message of connection) {
  // `message` is an object as described below
  const audio = message.audio;
  const audioBytes = Buffer.byteLength(audio);
  process.stdout.write(`Received ${audioBytes} bytes.`);
  audioFile.write(audio);
}

Each returned object contains the following properties:

audio

Buffer

A binary string containing the synthesized audio data as 96kbps mono MP3 stream with a sampling rate of 24kHz.

durations

array of duration objects

An array of text duration objects. Only returned if return_extras was set to true in when the connection was created.

Show properties

Each object describes the duration of a chunk of text (e.g., words, punctuation, and spaces) with the following keys:

text

string

The text itself.

start

number

The time at which the text starts, in seconds

duration

number

The overall duration of the text, in seconds

The durations array resets its start time for each chunk of audio returned.

warnings

string

If return_extras was set to true in when the connection was created, this will include any warnings about the synthesis.

close

close()

Releases resources associated with this instance.

flush

flush()

Call this when you want to trigger the server to synthesize all the text it currently has and return the audio data. Audio will be returned via the async iterator. This is recommended to be used only sparingly, if at all, as it can result in a less natural sounding speech. This could be useful if you are sure you have sent text that comes at a natural stop and want all the audio returned without closing the connection.

finish

finish()

Call this function when you’ve written all the text you’re expecting to submit. It will flush any remaining data on the server, return the last chunks of audio as described above, and close the streaming connection.

Overview

Python

Node

Unity

StreamingSynthesisConnection

appendText

Parameters

Note

Streaming Data Iterator

close

flush

finish

Overview

Python

Node

Unity

​appendText

​Parameters

​Note

​Streaming Data Iterator

​close

​flush

​finish

appendText

Parameters

Note

Streaming Data Iterator

close

flush

finish