memo@amazon posted

New SSML Tag: Audio

The audio tag allows you to provide a URL for an MP3 file that will be played as part of your service's response to a request.

To use it, include an audio tag with a src attribute as part of your text-to-speech response. For example:

 That is the wrong answer.
 <audio src=""/>
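A response like the one above could be assembled in code roughly as follows. This is only a sketch: the function name build_ssml and the example.com URL are hypothetical, and wrapping the content in a speak element is an assumption that this post does not state.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, audio_url: str) -> str:
    """Return spoken text followed by a single audio tag."""
    # escape() protects &, < and > in the spoken text.
    # quoteattr() escapes the URL and wraps it in quotes.
    # The <speak> wrapper is an assumption, not taken from this post.
    return f"<speak>{escape(text)} <audio src={quoteattr(audio_url)}/></speak>"


# Hypothetical URL, for illustration only.
print(build_ssml("That is the wrong answer.", "https://example.com/buzzer.mp3"))
```

Escaping the text and the attribute value keeps the markup well-formed even when the spoken text contains characters such as an ampersand.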

The MP3 must be:

  • A valid (MPEG version 2) MP3 file
  • No longer than 90 seconds
  • Encoded with a bit rate of exactly 48 kbps
  • Encoded with a sample rate of 16000 Hz
  • Hosted on an HTTPS endpoint
  • Hosted on a domain that presents a trusted SSL certificate (not self-signed)
  • Free of any customer-specific or other sensitive information
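If you already know a file's properties (for example from your encoder settings), the measurable requirements in this list can be checked before you publish. The sketch below is illustrative only: AudioInfo and find_problems are hypothetical names, the values must be supplied by you, and it does not inspect the MP3 itself, the SSL certificate, or the audio content.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse

MAX_SECONDS = 90
BIT_RATE_KBPS = 48
SAMPLE_RATE_HZ = 16000


@dataclass
class AudioInfo:
    # Hypothetical container; fill it in from your own tooling.
    url: str
    mpeg_version: int
    duration_seconds: float
    bit_rate_kbps: int
    sample_rate_hz: int


def find_problems(info: AudioInfo) -> List[str]:
    """Return one message per requirement the file does not meet."""
    problems = []
    if urlparse(info.url).scheme.lower() != "https":
        problems.append("not hosted on an HTTPS endpoint")
    if info.mpeg_version != 2:
        problems.append("not MPEG version 2")
    if info.duration_seconds > MAX_SECONDS:
        problems.append("longer than 90 seconds")
    if info.bit_rate_kbps != BIT_RATE_KBPS:
        problems.append("bit rate is not 48 kbps")
    if info.sample_rate_hz != SAMPLE_RATE_HZ:
        problems.append("sample rate is not 16000 Hz")
    return problems


# Hypothetical example values.
clip = AudioInfo("http://example.com/clip.mp3", 2, 12.5, 48, 16000)
print(find_problems(clip))
```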

An example of exporting your audio using ffmpeg:

ffmpeg -i <input-file> -ac 2 -codec:a libmp3lame -b:a 48k -ar 16000 <output-file.mp3>
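If you convert files from a script, the same ffmpeg options can be passed as an argument list. This is a sketch under assumptions: ffmpeg must already be installed and on your PATH, and the function names and file names are hypothetical.

```python
import subprocess
from typing import List


def ffmpeg_args(input_path: str, output_path: str) -> List[str]:
    """Build the ffmpeg command from this post as an argument list."""
    return [
        "ffmpeg",
        "-i", input_path,          # source audio file
        "-ac", "2",                # two audio channels, as in the command above
        "-codec:a", "libmp3lame",  # encode with the LAME MP3 encoder
        "-b:a", "48k",             # 48 kbps bit rate
        "-ar", "16000",            # 16000 Hz sample rate
        output_path,
    ]


def convert(input_path: str, output_path: str) -> None:
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(ffmpeg_args(input_path, output_path), check=True)


# Hypothetical file names; call convert(...) to actually run ffmpeg.
print(" ".join(ffmpeg_args("input.wav", "output.mp3")))
```

Passing a list rather than a single shell string avoids quoting problems when file names contain spaces.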

When using Audacity:

Set your project sample rate, shown in the lower left-hand corner of the application, to 16000 Hz.

Click File > Export Audio and set the format dropdown to MP3. You will see an Options button where you can set the bit rate to 48 kbps.

You will need to make sure you have the LAME library installed in order to export to MP3, which can be found at:

Note: When including multiple audio tags, please be aware that no more than five audio tags may be included in a single response, and the total duration of the audio must not exceed 90 seconds.
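The two limits in this note can be checked before a response is sent. The sketch below is only illustrative: check_audio_tags is a hypothetical helper, it assumes the response is well-formed XML with a single root element, and it assumes you supply the duration of each clip yourself, keyed by its src URL.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

MAX_AUDIO_TAGS = 5
MAX_TOTAL_SECONDS = 90


def check_audio_tags(ssml: str, durations: Dict[str, float]) -> List[str]:
    """Return a message for each limit the response exceeds."""
    root = ET.fromstring(ssml)
    # Collect the src attribute of every audio tag in the response.
    sources = [el.get("src", "") for el in root.iter("audio")]
    problems = []
    if len(sources) > MAX_AUDIO_TAGS:
        problems.append(f"{len(sources)} audio tags, limit is {MAX_AUDIO_TAGS}")
    # Raises KeyError if a clip's duration was not supplied.
    total = sum(durations[src] for src in sources)
    if total > MAX_TOTAL_SECONDS:
        problems.append(f"{total} seconds of audio, limit is {MAX_TOTAL_SECONDS}")
    return problems


# Hypothetical URLs and durations.
response = (
    '<speak>Round one. <audio src="https://example.com/a.mp3"/> '
    'Round two. <audio src="https://example.com/b.mp3"/></speak>'
)
known = {"https://example.com/a.mp3": 50.0, "https://example.com/b.mp3": 50.0}
print(check_audio_tags(response, known))
```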





kiruluta contributed to this article doringme contributed to this article