I've dealt with TTS before when it has had problems with words that are spelt one way but pronounced another. English has so many loanwords that there's no way it is going to get everything correct. The situation is worse in parts of the world where English is the primary language but other languages still have a heavy influence. (I used to live in Ireland. I'd love to see Echo try to pronounce Cliodna!) It gives a poor impression of an application if a word is consistently pronounced wrong. It would be great if our applications could provide more precise guidance for words we know ahead of time are difficult. The International Phonetic Alphabet (https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) defines its own symbol set with precise phonetic meanings. There are even conventions for enclosing phonetic transcriptions in [square brackets] or /slashes/. This could be supported seamlessly in all of the existing data structures for Echo (assuming the file formats are all UTF-8). Since slashes are not pronounced anyway, they could be recognized by whatever part of the pipeline converts text to phonemes, and the IPA symbols between them taken directly instead of being processed. For bonus points, you could let us pass IPA specifics into our utterances to better adapt to regional variations.
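To make the slash idea concrete, here is a rough Python sketch of how a text-to-phoneme front end might separate ordinary text from inline IPA. Everything in it is an assumption for illustration: the function name split_phonetic, the segment labels, and the rule that any /.../ pair is IPA are mine, not anything Echo is known to do. A real implementation would also need to cope with ordinary slashes such as "and/or" or URLs.

```python
import re

# Assumed convention (hypothetical): anything between a pair of slashes is IPA.
IPA_SPAN = re.compile(r"/([^/]+)/")

def split_phonetic(text):
    """Split text into ("text", ...) and ("ipa", ...) segments, in order."""
    segments = []
    pos = 0
    for match in IPA_SPAN.finditer(text):
        # Ordinary text before the IPA span goes through normal processing.
        if match.start() > pos:
            segments.append(("text", text[pos:match.start()]))
        # The IPA span is passed along untouched, without the slashes.
        segments.append(("ipa", match.group(1)))
        pos = match.end()
    # Whatever follows the last IPA span is ordinary text again.
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments
```

Calling split_phonetic("Press /ɹɪˈkɔɹd/ now") gives three segments: plain text, the IPA string without its slashes, then plain text again. The "ipa" segments could then skip the usual letter-to-sound step.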
I ran into this issue when trying to get my Echo to say "sh\it" (writing a joke app related to the book The Martian). Anyway - I got around it by modifying the spelling until Echo pronounced the word correctly. It took a decent bit of trial and error, but it might be something to try if they don't add phonetic support. Edit: added a backslash because otherwise the forum bleeps it out. Message was edited by: RJ S.
Ha ha. Same thing here. My wife and I do Android apps. We have a whole slew of them that take your picture, tint it, slap a slogan on it, and then read the slogan to you. I ended up having to invent a syntax so we could mark up the text and send one interpretation of it to be printed and another one to the TTS engine. I could do that so one stream went to the card and the other to the Echo. It's a pain. I'd like to see Amazon go above and beyond those sorts of tricks.
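For anyone who wants to try the same trick, here is a minimal Python sketch of that kind of dual markup. The {display|spoken} syntax and the render function are made up for this example; they are not the syntax from the Android apps above, which isn't shown in this thread.

```python
import re

# Hypothetical markup: {display|spoken}, e.g. "{record|ree-kord}".
MARKUP = re.compile(r"\{([^|{}]*)\|([^|{}]*)\}")

def render(text):
    """Return (display_text, spoken_text) for a marked-up string."""
    # Keep the first alternative for whatever gets printed (e.g. the card).
    display = MARKUP.sub(r"\1", text)
    # Keep the second alternative for whatever gets sent to the TTS engine.
    spoken = MARKUP.sub(r"\2", text)
    return display, spoken
```

With this, render("Press {record|ree-kord} to start") returns one string for the card ("Press record to start") and a respelled one for speech ("Press ree-kord to start"). The respelling itself still has to be found by trial and error.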
Another use case came up today. My skill has the word "record" in its name. My intention is the verb "record" /ɹɪˈkɔɹd/, as in to record something, like a tape recorder. However, Alexa pronounces it as the noun "record" /ˈɹɛkɚd/, as in "database record". Presently there is no way to correct Alexa's pronunciation. Support for the IPA would allow this fault to be addressed.