By far, the launching of apps is going to make the Echo feel much less streamlined. I'm not sure what users will think; they just want answers. I'm hoping someday apps could be associated not with keywords but with contextual areas. If the user has, say, financial apps activated, there's no reason why "what's my savings balance" couldn't be sent to two bank apps. In the meantime, why not suffix launching? "How many messages are in Gmail" would be sent to Gmail; "what's the closing price for Amazon in Stocks" to the stock app.
I agree that launching apps does feel a little awkward, as I've discovered testing mine with family/friends. Users get used to more naturally spoken phrases with Echo, but then they have to be very specific in their sentence construction when invoking an app. Although it does impact the UX somewhat, it's probably more reliable than suffixing - at least with ML technology right now :) IIUC, you're proposing: "[some phrase] in [app name]" ...where 'in' is the keyword that'll instruct Alexa to look for [app name] and send the [some phrase] data over to it. Maybe restricting things to a small, fixed set of keyword templates like that could help reduce the ambiguity a little for Alexa, narrowing the near-infinite set of possible phrases down to a focus around app invocation. I like it though, and it's a feature request I'd like to see some thought given to - anything to help adoption and support a better UX :)
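For what it's worth, the suffix routing proposed above could be sketched in a few lines. This is a hypothetical illustration only - the app registry, function name, and example app names ("gmail", "stocks") are all invented, and a real implementation would sit behind Alexa's speech recognition rather than on raw text:

```python
# Hypothetical sketch of suffix-based app routing. Assumes we already
# have recognized text and a registry of installed app names; splits an
# utterance on a trailing "in [app name]" keyword as proposed above.

INSTALLED_APPS = {"gmail", "stocks"}  # invented example registry

def route(utterance):
    """Return (phrase, app) if the utterance ends with the keyword 'in'
    followed by a known app name, else (utterance, None)."""
    words = utterance.lower().split()
    if len(words) >= 3 and words[-2] == "in" and words[-1] in INSTALLED_APPS:
        return " ".join(words[:-2]), words[-1]
    return utterance, None

# route("how many messages are in gmail") -> ("how many messages are", "gmail")
# route("what time is it")               -> ("what time is it", None)
```

The nice property is that an unrecognized suffix degrades gracefully: the whole phrase just falls through to Alexa's default handling.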
I worry greatly that allowing multiple apps to respond to a single request is a minefield. The worst-case scenario *always* happens in software. Namespacing requests by requiring an app name keeps things much safer. I also worry that allowing multiple responses will lead users to believe that all of their apps are listening to all requests, and I doubt many people will want to use the Echo if they think that's the case. As for suffixing, doesn't the current 'prefix' approach work just as well? For example, "Alexa, ask emailmonitor how many emails are in my inbox" (where emailmonitor is the name of the app which can check such status for me). The "Alexa, ask..." option already exists, and the Getting Started docs indicate you can also use "Alexa, tell...", etc. A problem I see with suffixing is that it expands the phrase length needed just to get the proper app identified; the longer phrase will require more processing time by the Alexa servers and increase the possibility of misunderstanding or error.
I still think users having to remember and even install apps is going to take the magic out of it. A user doesn't want to identify apps; they want to identify subjects. Maybe it will evolve to suggest apps when it can't answer a question, and auto-install them. Why don't we have to say "ask calendar" or "tell Hue" for the built-in features? Because requiring that isn't nearly as intuitive! So for me the apps approach is very duct-tapeish. But it's all we have right now, I suppose.
Let me also add to your comment about users thinking apps are always listening. That's my point: I don't think users even want to know about apps. Alexa should be the brains that magically uses the right resources to answer; users shouldn't have to know more than that.
I completely agree. End users don't want to have to know to say "Alexa, open _____" before they can do something. They want to be able to just say "Alexa, blah blah blah." While I understand the need for having a launch command, end users may not be so understanding.
I think that users are very comfortable with the idea that there are different apps for different activities. Further, I think they want to be confident that the information held by different apps is siloed. It's not as if there is just one app on everyone's smartphone - we all know that banking requires the use of a banking app, Facebook has its app, checking the weather requires a weather app, a calendar requires a calendar app, etc. It's a paradigm users understand, and I think they would be very leery of the feeling that a single application does everything.
Years ago, one of those text-driven multi-user dungeons came on the scene called "MOO" - MUD, Object Oriented. Text adventure games were kind of noted for their clunky syntax; half the game was working out how to express what you wanted to do so that the computer understood it. They also tended to be static and hard to extend. MOO brought a different approach, tied in with object orientation. Commands were always of the form "[verb] [object] [preposition] [object]" (or shorter). What was nifty was that any object in the environment could declare contextual commands. So, for example, "hit goblin with sword" could be defined as "hit [this] with [anything]" and attached to the "goblin" object, or as "hit [anything] with [this]" and attached to the "sword" object, or even as "hit [anything] with [anything]" and attached to either the player object or the containing room object. When the user typed a phrase, MOO parsed it and worked out which object had a defined command that could handle it. What was nice is that the set of possible commands was contextual: it depended on what was available in the resolution path, and each object could dynamically contribute to the command set. I've thought that something similar would be useful for the Echo. Say, for example, a user's context was comprised of all the apps they had installed, or some sort of "most recently used" set, or, once there are more Echo-enabled Internet-of-Things devices, the physical context. When the user says something, Alexa could first try to resolve it among all of the apps in the current context. If there is a unique match, it can just invoke that application. If there is ambiguity, it can ask for clarification. Worst case, the user can fall back to the existing "tell [app name] to [some phrase]" form to make their intent absolutely clear. So it should be doable with existing hardware, and without changing every app written so far.
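The resolution scheme described above is easy to prototype. Here is a minimal sketch, assuming each app declares phrase templates with `*` wildcards (the `App` class, template syntax, and the sword/goblin examples are all invented for illustration - real MOO uses `this`/`any` argument specifiers, and a real Echo version would match against recognized intents, not raw text):

```python
# Minimal sketch of MOO-style contextual command resolution: each app
# in the current context declares templates it can handle, and a
# command is dispatched only when exactly one app matches.
import re

class App:
    def __init__(self, name, templates):
        self.name = name
        # Compile templates like "hit * with sword" into regexes,
        # where "*" stands for any non-empty word or phrase.
        self.patterns = [
            re.compile("^" + re.escape(t).replace(r"\*", ".+") + "$")
            for t in templates
        ]

    def matches(self, utterance):
        return any(p.match(utterance) for p in self.patterns)

def resolve(utterance, context):
    """Return the unique app in the current context that can handle the
    utterance, or None when there is no match or the match is ambiguous
    (in which case the assistant would ask the user to clarify)."""
    candidates = [app for app in context if app.matches(utterance)]
    return candidates[0] if len(candidates) == 1 else None

sword = App("sword", ["hit * with sword"])
goblin = App("goblin", ["hit goblin with *"])

# "hit rat with sword" resolves uniquely to the sword app, while
# "hit goblin with sword" matches both apps, so resolve() returns None
# and the caller would ask for clarification.
```

Note how ambiguity falls out naturally: the dispatcher never guesses between two matching apps, which addresses the earlier worry about multiple apps responding to one request.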
I agree with everything you're saying, and I have to believe the Amazon engineers are thinking along the same lines; they're just looking for the best way to implement it. It would be too easy for someone else to eat Amazon's lunch if they keep Echo in a world where users have to memorize a specific "app" name and phrase requests that way. The direction I would envision Echo going is more of a cloud-based OS, where voice is merely the primary user interface; but for this to work, Echo will need to be a lot smarter about discovering actions than it is.