📚 SSML Documentation

Complete guide to Speech Synthesis Markup Language for advanced voice control


😊 Emotions & Styles

Add emotional expression to your speech with the mstts:express-as tag.

Syntax:


<mstts:express-as style="STYLE" styledegree="0.01-2">
  Your text here
</mstts:express-as>

Available Styles:

whispering

Soft, quiet whisper tone


<mstts:express-as style="whispering" styledegree="2">
  I have a secret
</mstts:express-as>

cheerful

Happy, upbeat tone


<mstts:express-as style="cheerful">
  What a wonderful day!
</mstts:express-as>

sad

Sorrowful, melancholic tone


<mstts:express-as style="sad" styledegree="1.5">
  I miss you so much
</mstts:express-as>

angry

Angry, forceful tone


<mstts:express-as style="angry" styledegree="2">
  How could you do this?!
</mstts:express-as>

Other styles: excited, friendly, hopeful, terrified, shouting, unfriendly. Style support varies by voice, so check that the voice you select offers the style you need.

styledegree (optional): Controls emotion intensity from 0.01 (very subtle) to 2.0 (very strong). Default is 1.0 when omitted.
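If you generate SSML from code, a small helper can keep styledegree inside the 0.01 to 2.0 range described above. The sketch below is only an illustration: the function name express_as and the choice to clamp out-of-range values (rather than reject them) are assumptions, not part of any SDK.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr

# Range for styledegree as stated in the text above.
MIN_DEGREE, MAX_DEGREE = 0.01, 2.0


def express_as(text: str, style: str, degree: Optional[float] = None) -> str:
    """Wrap text in an mstts:express-as element (hypothetical helper)."""
    attrs = f"style={quoteattr(style)}"
    if degree is not None:
        # Clamp rather than raise, so out-of-range input still yields usable markup.
        clamped = min(max(degree, MIN_DEGREE), MAX_DEGREE)
        attrs += f' styledegree="{clamped:g}"'
    # escape() protects characters such as < and & inside the spoken text.
    return f"<mstts:express-as {attrs}>{escape(text)}</mstts:express-as>"


print(express_as("I have a secret", "whispering", 5))
```

Omitting degree leaves the attribute out entirely, so the default intensity of 1.0 applies.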

⏸️ Pauses & Breaks

Add silence between words or sentences with the break tag.

Syntax:


Hello <break time="500ms"/> World

Or use strength:
Hello <break strength="medium"/> World

Time Values:

• 250ms - Quarter second
• 500ms - Half second
• 1s - One second
• 2s - Two seconds

Strength Values:

• none - No pause
• x-weak - Extra weak
• weak, medium, strong
• x-strong - Extra strong
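When break tags are built programmatically, it can help to validate the value before it reaches the synthesizer. The following is a minimal sketch under stated assumptions: the helper name pause is hypothetical, it accepts only the ms and s units shown above, and it requires exactly one of time or strength as a simplification.

```python
import re
from typing import Optional

# Strength keywords listed above.
STRENGTHS = {"none", "x-weak", "weak", "medium", "strong", "x-strong"}
# A number followed by ms or s, e.g. 250ms or 1s.
TIME_PATTERN = re.compile(r"\d+(\.\d+)?(ms|s)")


def pause(time: Optional[str] = None, strength: Optional[str] = None) -> str:
    """Return a break tag for either a duration or a strength keyword."""
    if (time is None) == (strength is None):
        raise ValueError("pass exactly one of time or strength")
    if time is not None:
        if not TIME_PATTERN.fullmatch(time):
            raise ValueError(f"unrecognized time value: {time!r}")
        return f'<break time="{time}"/>'
    if strength not in STRENGTHS:
        raise ValueError(f"unrecognized strength value: {strength!r}")
    return f'<break strength="{strength}"/>'


print("Hello " + pause(time="500ms") + " World")
```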

🎚️ Prosody (Voice Modulation)

Control speaking rate, pitch, and volume with the prosody tag.

Syntax:


<prosody rate="RATE" pitch="PITCH" volume="VOLUME">
  Your text here
</prosody>

Rate (Speed)

Predefined:

• x-slow, slow
• medium (default)
• fast, x-fast

Percentage:

rate="-50%" to "+100%"

Pitch

Predefined:

• x-low, low
• medium (default)
• high, x-high

Percentage:

pitch="-50%" to "+50%"

Volume

Predefined:

• silent, x-soft, soft
• medium (default)
• loud, x-loud

Decibels:

volume="-10dB" to "+10dB"

Example:


<prosody rate="slow" pitch="-20%" volume="soft">
  This is spoken slowly, with lower pitch, and quietly.
</prosody>
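The same prosody tag can be assembled in code. This sketch assumes a hypothetical helper named prosody and checks only the value formats listed above (predefined keywords, signed percentages for rate and pitch, signed decibels for volume); it does not enforce the numeric limits.

```python
import re
from typing import Optional
from xml.sax.saxutils import escape

RATES = {"x-slow", "slow", "medium", "fast", "x-fast"}
PITCHES = {"x-low", "low", "medium", "high", "x-high"}
VOLUMES = {"silent", "x-soft", "soft", "medium", "loud", "x-loud"}
PERCENT = re.compile(r"[+-]\d+(\.\d+)?%")   # e.g. -20% or +100%
DECIBEL = re.compile(r"[+-]\d+(\.\d+)?dB")  # e.g. -10dB or +6dB


def prosody(text: str, rate: Optional[str] = None,
            pitch: Optional[str] = None, volume: Optional[str] = None) -> str:
    """Wrap text in a prosody element, validating each attribute's format."""
    parts = ["<prosody"]
    checks = (
        ("rate", rate, RATES, PERCENT),
        ("pitch", pitch, PITCHES, PERCENT),
        ("volume", volume, VOLUMES, DECIBEL),
    )
    for name, value, keywords, pattern in checks:
        if value is None:
            continue  # attribute omitted, so the voice default applies
        if value not in keywords and not pattern.fullmatch(value):
            raise ValueError(f"unrecognized {name} value: {value!r}")
        parts.append(f'{name}="{value}"')
    return " ".join(parts) + f">{escape(text)}</prosody>"


print(prosody("Slow and quiet.", rate="slow", pitch="-20%", volume="soft"))
```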

✨ Complete Examples

Emotional Storytelling


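<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">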
  <voice name="en-US-JennyNeural">
    <mstts:express-as style="sad">
      I can't believe you're gone.
    </mstts:express-as>
    <break time="700ms"/>
    <mstts:express-as style="hopeful">
      But I know we'll meet again someday.
    </mstts:express-as>
  </voice>
</speak>

Dramatic Reading


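<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">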
  <voice name="en-US-AriaNeural">
    <mstts:express-as style="whispering">
      The door creaked open slowly.
    </mstts:express-as>
    <break time="1s"/>
    <mstts:express-as style="terrified" styledegree="2">
      Someone was inside!
    </mstts:express-as>
  </voice>
</speak>

Voice Modulation


<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice name="en-US-GuyNeural">
    <prosody rate="fast" pitch="+10%">
      This is read quickly with a higher pitch!
    </prosody>
    <break time="500ms"/>
    <prosody rate="slow" pitch="-20%">
      Now it's slow and deep.
    </prosody>
  </voice>
</speak>
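Before sending a document to the speech service, you may want to confirm that it is well-formed XML. The sketch below wraps a fragment in a speak root carrying the namespace declarations used in Azure's SSML documentation, then parses it with Python's standard library. The helper names wrap and element_names are hypothetical, and the check covers XML well-formedness only, not whether a given voice supports a style.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"


def wrap(body: str, voice: str, lang: str = "en-US") -> str:
    """Wrap an SSML fragment in speak and voice elements with namespaces declared."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" '
        f'xmlns:mstts="{MSTTS_NS}" xml:lang="{lang}">'
        f'<voice name="{voice}">{body}</voice></speak>'
    )


def element_names(ssml: str) -> list:
    """Parse the document and return element names in document order.

    ET.fromstring raises xml.etree.ElementTree.ParseError on malformed
    markup, including an mstts: prefix with no namespace declaration.
    """
    root = ET.fromstring(ssml)
    # Tags look like "{namespace}name"; keep only the local name.
    return [el.tag.split("}")[-1] for el in root.iter()]


body = (
    '<mstts:express-as style="sad">I can\'t believe you\'re gone.</mstts:express-as>'
    '<break time="700ms"/>'
)
print(element_names(wrap(body, "en-US-JennyNeural")))
```

Parsing a fragment that uses mstts:express-as without the xmlns:mstts declaration fails with a ParseError, which is a quick way to catch a missing namespace.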
