Hacker News | ilyausorov's comments

The voice standardization model is certainly not perfect, but it was important for us to do, especially for voice privacy. It’s still pretty early tech.


Thanks, we love you too


Yeh they seem to be in the same "major" cluster, although Serbian/Croatian, Romanian, Bulgarian, Turkish, Polish and Czech are all close.

Turkish and Persian seem to be the nearest neighbors.


Plotly is great! Much love.


Yeh, we would've loved to see that too. It's on our roadmap for sure. Same for some of the other languages with a large number of distinct accents, e.g. French, Chinese, and Arabic.


Nothing too secret in there! We anonymized everything and anyway it's just a basic Plotly plot. Feel free to check it out.


Good question! It's likely because there are lots of different accents of Spanish that are distinct from each other. Our labels only capture the native language of the speaker right now, so they're all grouped together, but it's definitely on our to-do list to go deeper into the sub-accents of each language family!


Spanish is one of those languages I would love to see as a breakdown by country. I’m sure Chilean Spanish looks very different from Catalonian Spanish.


Did you mean Catalan (which is not Spanish) or Castilian Spanish?


Yes, the Spanish spoken in Spain, especially the one that’s like /ˈɡɾaθjas/ and /baɾθeˈlona/.


But Spanish sounds very different in Spain depending on what region of the country you are talking about.


Yeah, and not all Spaniards have a distinct pronunciation for "c" and "s". For those curious: https://en.wikipedia.org/wiki/Phonological_history_of_Spanis...


We did build two free tools, which are geared towards non-native English speakers. You can find them at https://accentoracle.com and https://accentfilter.com. They're less effective for native English speakers, but could still be fun.


Correct, not LLM


What if it was already available? Try it out at https://accentfilter.com!


Is the approach being used to do accented TTS (or just reference recordings), and then a tone color conversion model that just changes the timbre? Because if I say a completely different sentence it still says the original words, haha.


Hmmm. Initially impressive but upon retries and reflection ... not that great. It doesn't even maintain timing ... unless that's part of the transform.


Indeed, yeah, that’s one of the key weaknesses of the approach that we’re using. It overrides the speaker’s cadence and accent while keeping their voice profile / timbre in place. Different techniques may not do this, but they also may not copy the accent over to the resulting clip as effectively. So far we’re using this to support pedagogical (and lead-gen) use cases where we think it works well enough.


Let's put it a different way. I grew up in the UK till 24. I've lived in the USA for 36 years. The UK/US accent conversions dramatically altered my voice/accent; the AU one left it mostly unchanged.

This is offensive :))

