Chaanmaan asked

Alexa's Responses: Grammar, Syntax, Semantics

I am interested in hearing how developers decide on the phrasing of Alexa's responses. As an example, when she doesn't understand me, she says, "I didn't understand the question that I heard." This is very different from a more straightforward "I didn't understand your question." In humans, such linguistic choices say a lot about personality, gender, status, etc. I'm curious whether there are any guidelines for developers with regard to Alexa's responses. If there are no guidelines, how do you or your company make these decisions internally?
alexa skills kit

Stefan Negritoiu answered
There are no comprehensive guidelines as of yet, although the ASK documentation does touch upon style here and there. Historically, there are these design guidelines for voice user interfaces: http://demos.jellyvisionlab.com/downloads/The_Jack_Principles.pdf

You can also get involved in developer communities building bots for Slack, Telegram, etc., where similar discussions are taking place.

Regarding "I didn't understand the question I heard": I suspect this phrasing was chosen over "I didn't understand your question" because it's very neutral and doesn't make the user feel like they did something wrong. Given the state of the technology (not just Alexa, but voice recognition in general), I think it's the right attitude, and in its responses our skill tends to take responsibility for misunderstandings.

Chaanmaan answered
Thanks for your response. I was just watching your video on conversational programming with Alexa and freebusy. I agree that "I didn't understand the question I heard" was probably chosen for the reasons you state. It just doesn't seem *natural* like the exchanges that you are creating in the freebusy skill. I also think that avoiding such uses of "I" can help build a stronger AI personality without necessarily offending the user. In your video, for instance, I really like Alexa's response, “OK, but you have a conflicting event at 12 p.m. titled ‘Catch up.’ Do you still want to schedule another event at that time?” Her use of "but" introduces a contradiction (also a level of cognitive processing I haven't seen before) but not one that would seem offensive to the user.

jjaquinta answered
I have a whole section on design in my book: http://www.amazon.com/How-Program-Amazon-Echo-Development-ebook/dp/B011J6AP26

One key thing: I suggest you design your application by first coming up with a large number of "ideal conversations", preferably written by a non-technical person. This way you capture the conversational style you are aiming for.

In my own design, one of the things I made key was to vary the response. One of the criticisms I made of the "number guessing" game in my review (https://www.linkedin.com/pulse/weeks-skills-joseph-jaquinta) was how repetitive it got. So, for my upcoming Starlanes game, I made sure of two things:

1) No generic answers. For example, when you try to drop drones in a system, it distinguishes between not having any to drop, not having the type you are saying to drop, not having as many as requested, not having permission, not being aligned yet, etc.
2) Every answer has at least three variations. No matter how trivial the response, I came up with at least three different ways of saying it (often more). The specific answer is chosen randomly at run-time.

It's a lot of work. And it will be a bitch to localize if they ever support more than English. But I was keen on producing a high quality skill for this. We'll find out if it was worth it once they release the marketplace.
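To make the two rules concrete, here is a rough sketch in Python. It is not the Starlanes code: the failure reasons, the `DropFailure` and `pick_response` names, and all of the wording are made up for illustration. It only shows the general shape of the idea, which is one specific message group per failure case, at least three phrasings per group, and a random pick at run-time.

```python
import random
from enum import Enum, auto


class DropFailure(Enum):
    """Hypothetical reasons a 'drop drones' request can fail."""
    NONE_OWNED = auto()
    WRONG_TYPE = auto()
    NOT_ENOUGH = auto()


# Rule 1: a separate group per specific situation, no generic "can't do that".
# Rule 2: at least three phrasings per group. All wording here is invented.
RESPONSES = {
    DropFailure.NONE_OWNED: [
        "You don't have any drones to drop.",
        "Your drone bay is empty.",
        "There are no drones on board to drop.",
    ],
    DropFailure.WRONG_TYPE: [
        "You don't have any {kind} drones.",
        "There are no {kind} drones in your bay.",
        "I couldn't find a {kind} drone on board.",
    ],
    DropFailure.NOT_ENOUGH: [
        "You only have {owned} drones, not {asked}.",
        "You asked for {asked} drones, but only {owned} are on board.",
        "That's too many. You have {owned} drones and asked for {asked}.",
    ],
}


def pick_response(reason, rng=random, **details):
    """Choose one phrasing at random and fill in the situation's details."""
    template = rng.choice(RESPONSES[reason])
    return template.format(**details)


if __name__ == "__main__":
    print(pick_response(DropFailure.NOT_ENOUGH, owned=2, asked=5))
    print(pick_response(DropFailure.WRONG_TYPE, kind="scout"))
```

Keeping the phrasings in a plain table like this, apart from the game logic, is one possible way to make later localization less painful, since translators would only need to touch the table.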

Chaanmaan answered
Hi! I did read your book last night. I did not see a lot of material on the crafting of Alexa's responses. What you say below is very smart (no generic answers, three versions of every answer). I think it's important to make her verbal interactions with her humans as natural as possible. I am not a developer but rather a college professor thinking about artificial intelligences, and in particular about the linguistic patterns that create different personalities among humans, and how we will (or won't) replicate these in AIs like Alexa.

jjaquinta answered
You are probably right; I could discuss responses more explicitly. If you have published any papers on the subject that you think I should read and cite, please send me a link to them (or to any other extant literature). We just pushed out the 3rd edition of the book yesterday. With e-publishing there is a low bar on new editions, so it's easy to add new sections and publish updates.

Chaanmaan answered
Thanks! I'm working on a short essay on this topic now and would love some feedback if you have the time. I see your email on your personal webpage and will send it to you if you are interested.

jjaquinta answered
Sure, go ahead!