I am trying to visualize how one would create a dialogue with Echo to create complex data types. For example, let's say I create a calendar app. How should we approach adding an event to the calendar? Ideally, I think it would go something like this (Echo responses in caps):

Alexa, launch calendar
Add item
[b]WHAT DATE IS THE EVENT?[/b]
November third, 2019
[b]WHAT IS THE TITLE OF THE EVENT?[/b]
Mom and Dad's anniversary
[b]WHAT IS THE DESCRIPTION OF THE EVENT?[/b]
Mom and Dad have been married for 25 years

Any thoughts on how to achieve this? I suppose the state could be held in the session on the endpoint, but perhaps there is something built in we can use?
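One way to sketch the "hold the state in the session" idea: keep a "current step" marker in the session attributes, which Alexa echoes back on every request within a session. This is a minimal, hypothetical sketch (the step names and prompts are made up; the JSON field names follow the Alexa Skills Kit response format):

```python
# Prompts for each step of the hypothetical add-event dialog.
PROMPTS = {
    "need_date": "What date is the event?",
    "need_title": "What is the title of the event?",
    "need_description": "What is the description of the event?",
}

# Which step follows which; None marks the end of the dialog.
NEXT_STEP = {
    "need_date": "need_title",
    "need_title": "need_description",
    "need_description": None,
}

def handle_answer(attributes, slot_value):
    """Store the user's answer under the current step, then advance.

    `attributes` is the sessionAttributes dict that Alexa sends back
    with every request in the same session; `slot_value` is whatever
    the user just said (None on the very first turn).
    """
    step = attributes.get("step", "need_date")
    if slot_value is not None:
        attributes[step] = slot_value
        step = NEXT_STEP[step]
    attributes["step"] = step
    speech = "Event added." if step is None else PROMPTS[step]
    return {
        "version": "1.0",
        "sessionAttributes": attributes,  # echoed back on the next request
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": step is None,  # keep the session open mid-dialog
        },
    }
```

The key point is that the endpoint itself stays stateless; the "where are we in the conversation" marker rides along in `sessionAttributes`.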
Currently, I think Amazon would say you should do it all in one statement, such as: "Alexa, open 'My Calendar App' and add an event called My Anniversary on July 19th," or something to that effect. They do have some ability (last I checked) to have a back and forth with Alexa, but it didn't cover specific data that your app is dealing with...just "Can I run another command?" prompts. However, you could certainly keep the state active as long as the session is the same, and store it in your own structures. From what I understand of their interface, the best you could do right now is a clunky back and forth:

You: "Alexa, open my calendar and add an event on July 19th"
Alexa: Event added, can I do something else?
Y: Yes, add title "my anniversary"
A: Title added, can I do something else?
Y: Add a reminder for 4 hours
A: blah
Y: Set description
A: blah
Y: Thank you, close my calendar

Maybe you can do more with their commands than I assume, but I can't get a response from them on how to diagnose the error I am getting with .NET on a simple HelloWorld command. Once I get past that, I can experiment more with the actual commands.
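The "store it in your own structures" approach could look like this: key your own server-side store on the session ID, which stays constant across requests within one Alexa session. A minimal sketch (the in-memory dict, the `add_event`/`set_title` helpers, and the reply strings are all hypothetical; `session.sessionId` is a real field in Alexa requests):

```python
# Hypothetical server-side session store, keyed by Alexa's session ID.
SESSIONS = {}

def state_for(request):
    """Look up (or create) our own state for this Alexa session."""
    sid = request["session"]["sessionId"]
    return SESSIONS.setdefault(sid, {"events": []})

def add_event(request, date):
    """First command in the clunky back-and-forth: create the event."""
    state = state_for(request)
    state["events"].append({"date": date})
    return "Event added, can I do something else?"

def set_title(request, title):
    """Follow-up command: attach a title to the event just created."""
    state = state_for(request)
    state["events"][-1]["title"] = title
    return "Title added, can I do something else?"
```

A real skill would want this in something persistent (a database rather than a process-local dict), since each request may hit a different server, but the session-ID keying is the same.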
There aren't wildcards yet, so you would literally have to predict every possible combination of event names, which isn't going to happen. This will probably only work once they figure out wildcards, which seems to be one of the biggest roadblocks.
Agreed. I was surprised to see that the API doesn't give me access to the full command spoken by the user, to parse on my own. I wanted to create a simple "repeat after me" app that took whatever sentence the user said and spoke it back via the Echo. It doesn't seem possible unless I create lots of slots ("junk1 junk2 junk3", etc.) to cover every possible number of words spoken by the user.
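For what it's worth, that "junk1 junk2 junk3" workaround can at least be generated rather than typed by hand: one sample utterance per possible word count, with each word captured by its own slot. A hypothetical generator (the intent name, slot naming scheme, and word cap are all made up; the `IntentName {slot} {slot}` line format follows the classic sample-utterances file):

```python
def junk_utterances(intent="RepeatIntent", max_words=5):
    """Emit one sample utterance per sentence length, each word a slot.

    E.g. for max_words=2:
        RepeatIntent {junk1}
        RepeatIntent {junk1} {junk2}
    """
    lines = []
    for n in range(1, max_words + 1):
        slots = " ".join("{junk%d}" % i for i in range(1, n + 1))
        lines.append("%s %s" % (intent, slots))
    return lines
```

You would still have to declare each `junkN` slot in the intent schema, and recognition quality on free-form words is poor, which is the real limitation being discussed here.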
In your example blog, don't you still have to have a lot of utterance permutations? They may be short snippets, but you still need to provide many samples, right? I agree with your thoughts on the conversational approach to building; I have to do the same thing. But I'm stopped by the shortness of the sessions: my app isn't so quick!
Hi there - I've read this blog post several times and do understand what you're doing (it's analogous to having a default route in a web application to handle any form of path_info on a URL, as far as I can tell). But what I don't really get is: to make that work, you must presumably have a 'default handler' to accept all intents? The session intent (as started by the first question) and the sequence path dictate the response. For the life of me, I can't see any other way to make it work :) Any info on this would be hugely appreciated. Meanwhile I'll keep hacking :) Many thanks, David
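If I understand the "default handler" idea correctly, it could be sketched as a dispatcher that ignores the incoming intent name entirely and routes on where the session is in its sequence path. A hypothetical sketch (handler names, the `state` key, and the prompts are all invented for illustration):

```python
def ask_title(attributes, intent):
    """First hop on the sequence path: move to awaiting the description."""
    attributes["state"] = "awaiting_description"
    return "What is the title of the event?"

def ask_description(attributes, intent):
    """Second hop: after this, the dialog is finished."""
    attributes["state"] = "done"
    return "What is the description of the event?"

# The sequence path: session state -> handler for the next step.
HANDLERS = {
    "awaiting_title": ask_title,
    "awaiting_description": ask_description,
}

def dispatch(attributes, intent):
    """Route ANY incoming intent from the dialog state, not the intent name."""
    state = attributes.get("state", "awaiting_title")
    handler = HANDLERS.get(state)
    if handler is None:
        return "Sorry, I wasn't expecting that."
    return handler(attributes, intent)
```

In other words, the "default handler" is just this one `dispatch` entry point that every registered intent funnels into; the session state carries the routing information that the intent name can't.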
You might also look at the design section in my book:
http://www.amazon.com/How-Program-Amazon-Echo-Development-ebook/dp/B011J6AP26 It suggests you first design many "ideal conversations" showing how, in a perfect world, you would interact with Alexa. Then it gives a systematic method for breaking those conversations down and, ultimately, deriving the utterances and intents you need to implement them. You do have to remember that Alexa is very poor at general text transcription; that's why they make you define all the intents and utterances. Any skill design that relies on Alexa transcribing arbitrary text will ultimately be poor, because it is limited by that quality level. It's better to design around Alexa's limitations than to wish they weren't there.