There's a reason for the word omission: I'm using the CMUdict Grapheme -> Phoneme database. It has around 140,000 entries, but it doesn't cover everything, so I've had to add words like "pokemon" and "fortnite" by hand.
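A rough sketch of that manual-additions setup, assuming the usual CMUdict text layout (`WORD  PH1 PH2 ...`, `;;;` comment lines, alternates marked as `WORD(1)`). The function names and the two custom pronunciations are hypothetical, written only for illustration; they are not taken from CMUdict.

```python
import re

def parse_cmudict(lines):
    """Parse CMUdict-style lines into {word: [phones]}, keeping the first variant."""
    lexicon = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):  # skip blanks and comments
            continue
        word, *phones = line.split()
        word = re.sub(r"\(\d+\)$", "", word).lower()  # "HELLO(1)" -> "hello"
        lexicon.setdefault(word, phones)
    return lexicon

def lookup(word, base, custom):
    """Return ARPABET phones for a word, preferring hand-added entries."""
    key = word.lower()
    if key in custom:
        return custom[key]
    return base.get(key)  # None means the word is missing from both

# Tiny stand-in for the real dictionary file.
SAMPLE = [
    ";;; sample lines only",
    "HELLO  HH AH0 L OW1",
    "HELLO(1)  HH EH0 L OW1",
    "WORLD  W ER1 L D",
]
BASE = parse_cmudict(SAMPLE)

# Hand-added entries; these pronunciations are illustrative guesses.
CUSTOM = {
    "pokemon": "P OW1 K IY0 M AA2 N".split(),
    "fortnite": "F AO1 R T N AY2 T".split(),
}

print(lookup("Hello", BASE, CUSTOM))
print(lookup("fortnite", BASE, CUSTOM))
print(lookup("zzyzx", BASE, CUSTOM))
```

Anything that comes back as `None` is exactly the kind of word a learned grapheme -> phoneme model would need to cover.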
I'm looking for a model that handles arbitrary grapheme/word -> phoneme/polyphone transformation. I'm also interested in perhaps replacing ARPABET with IPA (using the existing ARPABET database to construct the IPA version). IPA might also work well for non-English languages.
Do you happen to know an existing model that does this?
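For the ARPABET -> IPA part, a minimal sketch of how the existing entries could be converted. The table below follows one commonly used ARPABET-to-IPA correspondence for the 39 CMUdict phonemes; it is an assumption, not an official mapping. Stress marks are placed directly before the stressed vowel rather than at the syllable boundary, since proper placement would need syllabification. `arpabet_to_ipa` is a hypothetical name.

```python
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ", "IY": "i", "OW": "oʊ",
    "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}

# Unstressed AH and ER are usually written as reduced vowels in IPA.
REDUCED = {"AH": "ə", "ER": "ɚ"}

# CMUdict stress digits: 1 = primary, 2 = secondary, 0 = unstressed.
STRESS = {"1": "ˈ", "2": "ˌ"}

def arpabet_to_ipa(phones):
    """Convert a list of ARPABET phones (with stress digits) to an IPA string."""
    out = []
    for phone in phones:
        if phone[-1].isdigit():
            base, stress = phone[:-1], phone[-1]
        else:
            base, stress = phone, ""
        if stress == "0" and base in REDUCED:
            ipa = REDUCED[base]
        else:
            ipa = ARPABET_TO_IPA[base]
        # Simplification: the stress mark goes right before the vowel.
        out.append(STRESS.get(stress, "") + ipa)
    return "".join(out)

print(arpabet_to_ipa("HH AH0 L OW1".split()))  # həlˈoʊ
```

This only covers English, since ARPABET has no symbols for sounds outside it; other languages would need their own IPA source data.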
So much work to do... :)