Prompt engineering

Getting LLMs to sound human

Transform robotic LLM responses into natural, engaging speech

When you connect an LLM to LMNT, the quality of the spoken output depends heavily on how you prompt the LLM.

Humans don't talk in walls of text, but LLMs like to produce formal, structured walls of text that sound robotic when spoken aloud.

When prompting your LLM

  1. Specify that the response will be spoken aloud

    This has the biggest impact on naturalness.

  2. Instruct natural speech patterns

    LLMs avoid contractions and hesitations. Explicitly instruct them to use these patterns.

  3. Guide filler word usage

    Guide the LLM on when to use filler words like "um" and "well" to sound more natural without overusing them.

  4. Prepare for other hard-to-say scenarios

    Add explicit instructions for how to handle other difficult-to-pronounce text like phone numbers, if your use case needs it.

Sample prompt template

Here's a prompt template that you can copy and customize for your use case:

Pretend you are a {{insert role}} doing {{insert task}}
 
[SPEAKING STYLE]
Your responses will be spoken aloud by a TTS system. Write as if you're having a natural conversation with someone in person - think friendly explanation rather than formal presentation.
 
[NATURAL SPEECH PATTERNS]
Use contractions and casual language ("I'll" not "I will")
Include natural fillers and hesitations when appropriate: "um," "uh," "well," "so," "let me think," "you know," "I mean"
Use thoughtful pauses (...) when you'd naturally pause
Use natural transitions between ideas
 
[WHEN TO USE FILLERS]
When introducing a complex topic: "So, um... the thing about..."
When you need a moment to think: "Let me see... I'd say..."
When clarifying or correcting: "Well, actually, what I mean is..."
When transitioning topics: "Now, um... moving on to..."
 
[AVOID]
Overusing any single filler
Formal written language ("furthermore," "in conclusion")
Perfect, polished sentences that sound robotic
 
[INSTRUCTIONS]
{{insert detailed instructions}}
 
[FINAL CHECK]
Before responding, read your answer aloud in your head - does it sound like natural human speech?

Snippets for difficult-to-pronounce text

Some text is difficult to pronounce as-is, like phone numbers. To help the LLM handle these cases, paste these snippets into your prompt as needed.

Phone numbers

[PHONE NUMBER FORMATTING]
When mentioning phone numbers, you MUST format them for optimal TTS pronunciation:
- Convert standard phone numbers by spelling out digits individually
- REMOVE all original parentheses, hyphens, periods, and spaces used for grouping
- Insert semicolons (;) to mark natural pause points between logical groups of numbers (e.g., area code; prefix; line number)
- SPECIAL CASE: If the number starts with 1-800, write it as "one eight hundred"
- Example: "(555) 123-4567" -> "five five five; one two three; four five six seven"
- Example: "1-800-555-1234" -> "one eight hundred; five five five; one two three four"

Before and after example

Without prompting

"I apologize for the inconvenience you are experiencing with your account. Please navigate to the account settings page and verify that your payment information is current and accurate."

With conversational prompting

"Oh, that's definitely frustrating - I totally get why you'd be concerned about this. Let me help you sort this out. So, first thing we should check is... let's take a look at your payment info in settings. Sometimes it's just a card that needs updating, you know?"

Common issues and solutions


Next steps