Nicolai B. asked

Speech recognition, using Alexa

Hi everyone, we're currently developing an Android app that helps hearing-impaired people read spoken words, using Google's speech recognition service. Is it possible to write an Alexa voice skill that simply returns the result of the speech recognition, without any interpretation, running on Amazon Echo with its 7 powerful microphones? Thank you in advance and best regards, cyberfish007
alexa skills kit, debugging

jjaquinta answered
It's possible. I've written a skill that does it. But it really isn't what you are looking for: you will not get the quality you expect. A lot of this has to do with how Alexa does voice recognition. I just did a data dump on that in this thread: https://forums.developer.amazon.com/forums/thread.jspa?threadID=8925&tstart=0 Have a look, and it might make things clearer.

Personally, for a generic app, I'd suggest using Watson's Speech To Text on IBM's Bluemix. The feature I like most is that it returns the text marked up with alternative interpretations, each with a confidence score. Here's an example showing how you get back the confidence rating on each word:

[[code]]
{
  "result_index": 0,
  "results": [
    {
      "final": true,
      "alternatives": [
        {
          "transcript": "hello world",
          "confidence": 0.9,
          "timestamps": [["hello", 0.0, 1.2], ["world", 1.2, 2.5]],
          "word_confidence": [["hello", 0.95], ["world", 0.866]]
        }
      ]
    }
  ]
}
[[/code]]

If there were other ways to interpret what was said, they would come back as additional entries in the "alternatives" array, each with its own "confidence".
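For a captioning app, the per-word confidence is probably the most useful part of that response, since uncertain words could be shown differently to the reader. Below is a minimal Python sketch of how such a response might be consumed. It only parses the sample JSON from this answer and makes no call to the service; the helper names `best_alternative` and `uncertain_words` and the 0.9 threshold are made up for illustration and are not part of any Watson SDK.

```python
import json

# Sample response copied from the example in this answer.
SAMPLE_RESPONSE = """
{
  "result_index": 0,
  "results": [
    {
      "final": true,
      "alternatives": [
        {
          "transcript": "hello world",
          "confidence": 0.9,
          "timestamps": [["hello", 0.0, 1.2], ["world", 1.2, 2.5]],
          "word_confidence": [["hello", 0.95], ["world", 0.866]]
        }
      ]
    }
  ]
}
"""


def best_alternative(response):
    """Return the highest-confidence alternative of the first final result.

    Hypothetical helper. Returns None if there is no final result.
    Alternatives without a "confidence" field are treated as 0.0.
    """
    for result in response.get("results", []):
        if not result.get("final"):
            continue
        alternatives = result.get("alternatives", [])
        if alternatives:
            return max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    return None


def uncertain_words(alternative, threshold=0.9):
    """List the words whose confidence is below the threshold.

    Hypothetical helper; the default threshold is an arbitrary choice.
    """
    return [
        word
        for word, confidence in alternative.get("word_confidence", [])
        if confidence < threshold
    ]


response = json.loads(SAMPLE_RESPONSE)
best = best_alternative(response)
print(best["transcript"])
print(uncertain_words(best))
```

With the sample above, "world" (0.866) falls below the threshold and "hello" (0.95) does not, so an app could, for instance, render "world" in a lighter color to signal doubt.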

Nicolai B. answered
Thanks for the fast response and the explanation of Alexa skills and voice interpretation. I'll definitely give Watson a try. Another approach would be to use the seven microphones of the Amazon Echo with another speech recognition service, e.g. Watson. Do you know whether it is possible to get the audio signal from the Echo's microphone array, and maybe also the estimated direction of the speech source, via Bluetooth? So far our Bluetooth class, which follows the instructions from http://stackoverflow.com/questions/14991158/using-the-android-recognizerintent-with-a-bluetooth-headset/14993590#14993590, allows us to play sound over the Echo, but we're still not able to get the microphone signal. Thank you in advance and best regards! cyberfish007