jjaquinta asked

Streaming text

The quality of text-to-speech has come a long way. While learning about the Echo, I've often thought it would be great if I could get it to read me any book I had bought on Amazon. I'd rather listen to that while puttering around the house than to music or a podcast. I've recorded one of my books as an audiobook. It's a lot of work and, if you are an amateur, it's hard to do well. I'm pretty sure the text-to-speech on Echo would do a better job than I did.

The problem is that right now we can only send back a single block of text for Echo to say. I don't know what the limit is, but it's probably less than a full novel. More cannot be sent until there is another interaction, and it's not good usability to have it read something to you and then have to say "more" every now and again.

It would be great if we could "stream" text to Echo to read to the user. A URL could be provided in an attribute. The app would remain active, and if the user said something, further intents would be triggered. When those intents were sent to the app, they could include the current playback position in the text in the attributes. Responses could then set a new position, cancel the playback, or start a new playback.
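To make the "position in the text" idea concrete, here is a minimal Python sketch of how an app might cut a long text into speakable blocks while tracking a playback position. Everything in it is an assumption for illustration: `MAX_CHARS` is a made-up limit (the real limit isn't known in this thread), and `next_chunk` is a hypothetical helper, not part of the Alexa Skills Kit.

```python
import re

# Hypothetical per-response limit; the actual limit is not stated in this thread.
MAX_CHARS = 200


def next_chunk(text, position, max_chars=MAX_CHARS):
    """Return (chunk, new_position) for the block of text starting at `position`.

    The chunk ends at the last sentence boundary that fits within `max_chars`;
    if no boundary fits, it falls back to a hard cut at `max_chars`.
    """
    if position >= len(text):
        return "", position

    # The rest of the text fits in one block: send it all.
    if position + max_chars >= len(text):
        return text[position:], len(text)

    window = text[position:position + max_chars]
    # Sentence boundary = terminal punctuation followed by whitespace.
    ends = [m.end() for m in re.finditer(r"[.!?]\s", window)]
    cut = ends[-1] if ends else len(window)
    return window[:cut], position + cut
```

The returned position could be stored in the session attributes, so that a later intent (for example the user saying "more") picks up where the previous block ended.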
alexa skills kit

jjaquinta answered
On my morning walk I thought of a way to implement this that doesn't have much impact on your existing structure. Say that in the Reprompt Object you also had the option of adding a "callBack" section instead of (or in addition to) the "outputSpeech" section. Something like:

    "reprompt": {
      "callBack": {
        "intent": {
          "name": "string",
          "slots": {
            "string": {
              "name": "string",
              "value": "string"
            }
          }
        }
      }
    }

If the timeout is reached and a re-prompt is triggered, and there is a "callBack" section, the Echo would submit that intent back to the application instead of (or in addition to) speaking the re-prompt. This would allow the application to stream text by having the client, in effect, automatically hit the "more" key.

Additionally, this could be a stop-gap for all those users asking for push messages. The app could use it to make the client poll it at every timeout while a long-running task is in progress. There is the potential for abuse, sure, but I think there are already plenty of ways for the user to drop a conversation from the Echo, so it wouldn't be much of a problem.
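As a rough illustration of the proposal, the Python sketch below builds a response that speaks one block of text and asks to be called back for the next one. Note the assumptions: the "callBack" section is only the extension proposed in this post and is not part of the actual Alexa Skills Kit response format, and `build_streaming_response`, the `MoreIntent` name, and the `position` attribute are hypothetical names chosen for the example.

```python
def build_streaming_response(chunk, position, done):
    """Build a response that speaks `chunk` and remembers `position`.

    While more text remains (`done` is False), the session stays open and the
    PROPOSED "callBack" reprompt asks the device to send a follow-up intent.
    """
    response = {
        "version": "1.0",
        # Carry the playback position between requests.
        "sessionAttributes": {"position": position},
        "response": {
            "outputSpeech": {"type": "PlainText", "text": chunk},
            "shouldEndSession": done,
        },
    }
    if not done:
        # Hypothetical: "callBack" is the extension proposed above,
        # not a field the real service understands.
        response["response"]["reprompt"] = {
            "callBack": {
                "intent": {"name": "MoreIntent", "slots": {}}
            }
        }
    return response
```

Under this scheme, the handler for the called-back intent would read `position` from the session attributes, fetch the next block, and return another response of the same shape until the text runs out.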

memo@amazon answered
Thanks for the feedback, jjaquinta. We've made the Product team aware of the feature request.