question

gnuhc asked ·

Problem with Audio Tag in SSML when using with SpeakItem APL command directive

When using the SpeakItem APL Command Directive, the audio tag does not work properly.

{
    "datasources": {
        "catFactData": {
            "type": "object",
            "properties": {
                "backgroundImage": "https://.../catfacts.png",
                "title": "Cat Fact #9",
                "logoUrl": "https://.../logo.png",
                "image": "https://.../catfact9.png",
                "catFactSsml": "<speak>Not all cats like <emphasis level='strong'>catnip</emphasis>.</speak>"
            },
            "transformers": [
                {
                    "inputPath": "catFactSsml",
                    "outputName": "catFactSpeech",
                    "transformer": "ssmlToSpeech"
                },
                {
                    "inputPath": "catFactSsml",
                    "outputName": "catFact",
                    "transformer": "ssmlToText"
                }
            ]
        }
    }
}

If an <audio> tag is placed inside the <speak> element in catFactData.properties.catFactSsml to play an audio snippet, the tag is ignored and the audio does not play when the skill reaches this part. The same <audio> tag works fine when used inside the .speak method. Is this a bug? Is there a way to work around this problem? Thanks.

alexa skills kit, apl, alexa presentation language

Daniel M. | DigiVoice.io answered ·

It's not a bug; it's just not implemented right now. I think it will be added in a future APL release.

3 comments

Yup. The <audio> tag currently isn't supported by the SpeakItem command. It's on our backlog.


Hi Arun@Amazon, is this possible now?

I want to use this with a SpeakList command.

In my response, I have something like:

"ssmlText": "<speak> this is my audio: <audio src='https://url.com/myFile.mp3'/></speak>"


I want to transform this using ssmlToSpeech and then play it as part of my SpeakList command.

Thanks,

Thomas


Hi @TommyS,


This is not implemented yet, but there is a workaround you can use.

A component's speech property can take a URL, and a response can include multiple SpeakItem commands.

So in your case, you can remove the audio from ssmlText and add a new component to your document whose speech property holds the audio URL.

Then, in your response, you can send two SpeakItem commands in sequence.
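
A minimal Node.js sketch of this workaround is shown here. It relies on the statement in this comment that the speech property accepts a plain audio URL, which is not verified independently. The component ids (factText, factAudio), the token, the helper name buildSpeakItemSequence, and the APL version are hypothetical; the audio URL is the placeholder from the comment above.

```javascript
// Hypothetical token shared by the document and the commands.
const TOKEN = "catFactToken";

// Data source: the SSML no longer contains the <audio> tag.
const datasources = {
  catFactData: {
    type: "object",
    properties: {
      catFactSsml: "<speak>this is my audio:</speak>"
    },
    transformers: [
      { inputPath: "catFactSsml", outputName: "catFactSpeech", transformer: "ssmlToSpeech" },
      { inputPath: "catFactSsml", outputName: "catFact", transformer: "ssmlToText" }
    ]
  }
};

// APL document with two components: the first speaks the transformed SSML,
// the second carries the audio clip URL in its speech property
// (assumption taken from this thread: speech accepts a plain audio URL).
const aplDocument = {
  type: "APL",
  version: "1.1",
  mainTemplate: {
    parameters: ["payload"],
    items: [
      {
        type: "Container",
        items: [
          {
            type: "Text",
            id: "factText",
            text: "${payload.catFactData.properties.catFact}",
            speech: "${payload.catFactData.properties.catFactSpeech}"
          },
          {
            type: "Text",
            id: "factAudio",
            text: "",
            speech: "https://url.com/myFile.mp3"
          }
        ]
      }
    ]
  }
};

// Build an ExecuteCommands directive that runs one SpeakItem per component id,
// wrapped in a Sequential command so they play one after another.
function buildSpeakItemSequence(componentIds) {
  return {
    type: "Alexa.Presentation.APL.ExecuteCommands",
    token: TOKEN,
    commands: [
      {
        type: "Sequential",
        commands: componentIds.map((componentId) => ({ type: "SpeakItem", componentId }))
      }
    ]
  };
}

// Directives to attach to the skill response, in this order.
const directives = [
  { type: "Alexa.Presentation.APL.RenderDocument", token: TOKEN, document: aplDocument, datasources },
  buildSpeakItemSequence(["factText", "factAudio"])
];
```

With the ASK SDK for Node.js, each entry could be attached with handlerInput.responseBuilder.addDirective(...) before calling getResponse().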


Thanks,

Tarun Dagar

newuser-3e39f4d8-6cf4-47aa-9350-a74cdb3009ad answered ·

To everyone coming here:


There is a workaround for it.

You just need to separate your speech and audio into different components, and then send sequential SpeakItem commands to play them.


Thanks,

Tarun Dagar


ColinB answered ·

Thanks for this. How does the transformer work? Could you post a short sample, please?


Jiro answered ·

Can someone show an example of the workaround? I have audio tags using Amazon's sound library mixed into all my speech text. The audio tags work great on any non-visual Alexa device (like the Echo Dot), and even when a card is shown in the Alexa app via withStandardCard, but when APL is used on a device with a screen (like the Echo Show), none of the audio tags play and the game is terribly boring. When y'all say to use "different components with URL" as the workaround, what exactly do you mean?

For example, if my speech text is:

<speak> Is that a bear? <audio src="soundbank://soundlibrary/animals/amzn_sfx_bear_groan_roar_01"/><break time="1s"/><say-as interpret-as="interjection">Whew! </say-as><amazon:emotion name="excited" intensity="high">That was too close! </amazon:emotion> </speak>


How would I implement my code so that the audio and emotion are used with APL on the Echo Show? I am using Node.js for my skill. Does anyone have any code samples?

1 comment

As suggested by Alexander Martin, please check out APL-A (APL for Audio), which should cover your scenario.

Thanks,
Gaetano

Alexander Martin answered ·

Still looking for a solution to use audio tags in APL? Check out APLA!

https://developer.amazon.com/de-DE/docs/alexa/alexa-presentation-language/apl-for-audio-reference.html
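
For the mixed speech and sound-effect case raised earlier in this thread, here is a rough Node.js sketch of how the SSML might be split and replayed through APL for Audio. The helper names (splitSsml, buildAplaDirective), the token, and the document version are assumptions for illustration, and the regular expression only handles self-closing <audio src="..."/> tags. Whether every SSML tag (for example amazon:emotion) is honored inside an APL-A Speech component is not confirmed here; the reference linked above is the authority on component properties.

```javascript
// Split an SSML string into alternating speech and audio segments.
// Simplification: only self-closing <audio src="..."/> tags are recognised.
function splitSsml(ssml) {
  const inner = ssml.replace(/^\s*<speak>/, "").replace(/<\/speak>\s*$/, "");
  const audioTag = /<audio\s+src=(["'])(.*?)\1\s*\/>/g;
  const segments = [];
  let cursor = 0;
  let match;
  while ((match = audioTag.exec(inner)) !== null) {
    // Text before the audio tag becomes its own <speak> block.
    const speech = inner.slice(cursor, match.index).trim();
    if (speech) segments.push({ kind: "speech", ssml: `<speak>${speech}</speak>` });
    segments.push({ kind: "audio", source: match[2] });
    cursor = match.index + match[0].length;
  }
  // Whatever follows the last audio tag is the final speech segment.
  const tail = inner.slice(cursor).trim();
  if (tail) segments.push({ kind: "speech", ssml: `<speak>${tail}</speak>` });
  return segments;
}

// Map the segments onto APL-A Speech and Audio components inside a Sequencer,
// which plays its items one after another.
function buildAplaDirective(ssml, token) {
  const items = splitSsml(ssml).map((segment) =>
    segment.kind === "audio"
      ? { type: "Audio", source: segment.source }
      : { type: "Speech", contentType: "SSML", content: segment.ssml }
  );
  return {
    type: "Alexa.Presentation.APLA.RenderDocument",
    token,
    document: {
      type: "APLA",
      version: "0.91", // assumed version; use the one your skill targets
      mainTemplate: {
        item: { type: "Sequencer", items }
      }
    }
  };
}

// Example input taken from the question earlier in this thread.
const exampleSsml =
  '<speak> Is that a bear? <audio src="soundbank://soundlibrary/animals/amzn_sfx_bear_groan_roar_01"/>' +
  '<break time="1s"/><say-as interpret-as="interjection">Whew! </say-as>' +
  '<amazon:emotion name="excited" intensity="high">That was too close! </amazon:emotion> </speak>';

const exampleDirective = buildAplaDirective(exampleSsml, "bearScene");
```

The resulting directive can be added to the response alongside a visual APL document, so the sound effects play on devices with a screen without relying on SpeakItem.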


Regards,

Alex

