O. Smith asked

Echo seems to get confused by similar utterances

I realize this is probably an abuse of what you guys want people to do for apps, but I've been trying to play the old interactive fiction game "Colossal Cave Adventure" through Echo. I set up a server that delivers your commands to the game like:

    Text say {one|Query}
    Text say {one two|Query}
    Text say {one two three|Query}

Originally I wanted to pass through everything you say verbatim, but I was never successfully able to get Echo to register "take keys" (Echo parsed it as "take he's", "take these", "take cheese", "take take", etc.) until I made a specific intent for it:

    TakeKeys say take keys

Unfortunately, it seems like once you have an utterance for "say take keys", Echo will never correctly parse "say take food" and pass that to the Text utterance like I was expecting. Instead, Echo passes everything that starts with "take" to TakeKeys. It could just be a general speech recognition issue though.

What does it mean when Echo beeps but doesn't say anything? That also happens a lot.
alexa skills kit

Jeff Capron answered
This does seem to be the single biggest issue with Echo right now. It is really hard to code for these contingent words. I don't know about the rest of you, but it seems like Echo is getting worse at speech recognition over time, not better.

aholmes0 answered
I am also experiencing this; however, I wonder if it is because I keep adding more combinations of phrases. The documentation states that this may ultimately make the experience worse. At the moment, my app doesn't recognize a single phrase, and I can only get "open appname" to work.

rgr@amazon answered
Thanks for the feedback! Firstly, the beep you are hearing is when the Echo system doesn’t understand your utterance within an application session.

On the recognition problem, looking at the definitions above, it seems that the samples for your Text intent are too open for the system to go on, and the sample for the TakeKeys intent is too limiting, given the kind of input you expect. It’s good practice to think of several of the *most likely real-world inputs* to your system. And as you put them together, consider:

1. Specifying multiple salient phrases for each intent. For example, given the “say take X” case, think of several different (but likely) values of X and add those to your training data as different “say take…” samples. Also try to match expected word counts with different samples: e.g. if a phrase like “say take the shiny brass lamp” is likely, include that, not just for the phrase itself and related variants, but because the system will also learn in general that 4-word values for X are possible.

2. Breaking out the different values following a verb into a separate slot type, e.g.:

    TakeItem say take {keys|Item}
    TakeItem say take {food|Item}
    TakeItem say take {the shiny brass lamp|Item}

   This will not only help the recognition component deal better with different values for the slot (because the generalization will be more scoped), it would also be of practical use if your app is going to be conducting different actions based on slot values after the main verb, rather than the verb itself.

3. Don’t overdo it in the training data: keep to salient examples with major differences in phrasing rather than trying to cover all minor variations.

4. See also the guidance in the document “Defining the Spoken Language Interface”, including the phrase tokenization conventions.

We are working on improving our recognition of speech directed at applications, but the tips above are good practices that should help improve accuracy now and in the future.
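To make tips 1 and 2 concrete, here is a minimal sketch of how you might generate the "say take…" samples from a short list of likely item values rather than typing each line by hand. The helper name `build_samples` and the list `ITEM_VALUES` are made up for illustration, and the output format is simply the one used in this thread; treat it as an assumption rather than a statement of what the service accepts.

```python
# Hypothetical helper (not part of any Amazon SDK): builds sample lines in the
# "IntentName <carrier phrase> {example value|SlotName}" format used in this thread.

def build_samples(intent, carrier_phrase, slot, values):
    """Return one sample-utterance line per example slot value."""
    return [f"{intent} {carrier_phrase} {{{value}|{slot}}}" for value in values]


# Illustrative values only: choose things players are likely to say,
# and deliberately vary the word count (1, 2 and 4 words here).
ITEM_VALUES = ["keys", "food", "brass lamp", "the shiny brass lamp"]

samples = build_samples("TakeItem", "say take", "Item", ITEM_VALUES)

# Distinct word counts covered by the example values.
word_counts = sorted({len(value.split()) for value in ITEM_VALUES})

for line in samples:
    print(line)
```

Keeping the list short and varied is deliberate: per tip 3, a few salient values with different lengths are likely to serve better than an exhaustive inventory of every object in the game.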
We hope this helps - please let us know. And keep the feedback coming!