question

jjaquinta asked

Allow for more precise pronunciation

I've dealt with TTS before when it has had problems with words that are spelt one way but pronounced another. English has so many loan words that there's no way it is going to get everything correct. The situation is worse in parts of the world where English is the primary language but other languages still have a heavy influence. (I used to live in Ireland. I'd love to see Echo try to pronounce Cliodna!) It gives a poor impression of an application if a word is consistently pronounced wrong.

It would be great if our applications could provide more precise guidance for words we know ahead of time are difficult. The International Phonetic Alphabet (https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) defines its own symbol set with precise phonetic meanings. There are even conventions for enclosing phonetic transcriptions in [square brackets] or /slashes/. This could be supported seamlessly in all of the existing data structures for Echo (assuming file formats are all UTF-8). Since [, ], and / are not pronounced anyway, they could be recognized by whatever part of the pipeline converts text to phonemes, and the IPA symbols taken directly instead of being processed.

For bonus points, you could let us add IPA specifics to our utterances to better adapt to regional variations.
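To make the idea concrete, here is a rough sketch of how a pre-processing step might pull the IPA spans out of a string before it reaches the text-to-phoneme stage. This is only an illustration, not anything Echo actually does: the `split_ipa` function, the segment labels, and the rule that a marked-up span contains no whitespace are all assumptions made for the example.

```python
import re

# Assumed convention (not an Echo feature): /.../ or [...] encloses IPA
# that should bypass normal text-to-phoneme conversion. To keep the sketch
# simple, a span may not contain whitespace or its own delimiter.
IPA_SPAN = re.compile(r"/([^/\s]+)/|\[([^\]\s]+)\]")


def split_ipa(text):
    """Split text into ordered ("text", s) and ("ipa", s) segments."""
    segments = []
    pos = 0
    for match in IPA_SPAN.finditer(text):
        # Plain text between the previous span and this one.
        if match.start() > pos:
            segments.append(("text", text[pos:match.start()]))
        # Exactly one of the two groups is set, depending on the delimiter.
        segments.append(("ipa", match.group(1) or match.group(2)))
        pos = match.end()
    # Trailing plain text after the last span.
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments


print(split_ipa("Press /ɹɪˈkɔɹd/ to start"))
```

The "text" segments would go through the normal pipeline and the "ipa" segments would be handed to the synthesizer as phonemes. A real implementation would need a stricter rule than this one, since ordinary text such as URLs also contains slashes.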
alexa skills kit

The Stig answered
I ran into this issue when trying to get my Echo to say "sh\it" (writing a joke app related to the book The Martian). Anyway, I got around it by modifying the spelling until Echo pronounced the word correctly. It took a fair bit of trial and error, but it might be something to try if they don't add phonetic support. Edit: added a backslash because otherwise the forum bleeps it out. Message was edited by: RJ S.

jjaquinta answered
Ha ha. Same thing here. My wife and I do Android apps. We have a whole slew of them that take your picture, tint it, slap a slogan on it, and then read the slogan to you. I ended up having to invent a syntax so we could mark up the text and send one interpretation of it to be printed and another to the TTS engine. I could do the same here, with one stream going to the card and the other to the Echo. It's a pain. I'd like to see Amazon go above and beyond those sorts of tricks.
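For anyone who wants to try the two-stream trick, a minimal sketch might look like the following. The `{display|spoken}` syntax and the `render` function are invented for this example; they are not the syntax used in our apps and not part of any Amazon API.

```python
import re

# Hypothetical markup: {display|spoken}. The first half is what gets
# printed on the card, the second is the respelling sent to the TTS engine.
DUAL = re.compile(r"\{([^|{}]*)\|([^|{}]*)\}")


def render(marked_up):
    """Return (card_text, speech_text) built from one marked-up string."""
    card = DUAL.sub(lambda m: m.group(1), marked_up)
    speech = DUAL.sub(lambda m: m.group(2), marked_up)
    return card, speech


card, speech = render("Visit {Cobh|cove} today")
print(card)    # text for the card
print(speech)  # text for the speech output
```

Anything outside the braces is passed through unchanged to both streams, so only the troublesome words need marking up. The respellings themselves still have to be found by trial and error.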

Jamie Grossman answered
Hey guys, Thanks for the good feedback. We've passed this on to the appropriate team for future consideration, so we'll be investigating this. Regards, Jamie

jjaquinta answered
Another use case came up today. My skill has the word "record" in its name. I intend the verb "record" /ɹɪˈkɔɹd/, as in to record something, like a tape recorder. However, Alexa pronounces it as the noun "record" /ˈɹɛkɚd/, as in "database record". Presently there is no way to correct Alexa's pronunciation. Support for the IPA would allow this fault to be addressed.