Ask HN: Best free text-to-speech plugins for browsers?
38 points by ducktective on Oct 11, 2022 | 17 comments
Considering the recent advances in DL methods, is there a good (natural-sounding) TTS technology available as a Firefox plugin?

I currently use "Read Aloud: A Text to Speech Voice Reader" but the free version's voice is a bit robotic.

Can OpenAI Whisper be used as a plugin?

[edit]: An offline method generating audio for an input text is also fine (non-realtime TTS)




I think A.I. voices will always seem robotic for some applications. I have been thinking about projecting Unreal Framework characters into a mirror like

https://en.wikipedia.org/wiki/Pepper%27s_ghost

and think even a generic character with low-end motion capture and some out-of-the-box motion animations should be "good enough" but I'd still need a voice actor to get acceptable vocals.

The trouble I see there isn't just that the default voice is "robotic" but that a real voice actor can take direction. You'd hope a voice actor has a good intuition for how to make a character come alive but you can always ask for adjustments. For current A.I. voices you can at best talk to the hand.

Reading text on a page is less demanding, but the system still has to adjust the tone of voice, prosody and such to match the emotional tone of the content. Maybe this can be done without emotion or a simulation of emotion on an end-to-end basis, but it has to be done.


You probably already know about the Web Speech API and have decided it's not natural sounding enough? Just checking.

https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_...
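For reference, here is a minimal sketch of reading text aloud with the Web Speech API. The helper names (`splitIntoChunks`, `speak`) and the 200-character chunk limit are assumptions made for illustration; only `speechSynthesis.speak`, `SpeechSynthesisUtterance` and its `rate` property come from the API itself. The voice you hear is whatever the browser and OS provide.

```typescript
// Split text into sentence-based chunks so each utterance stays short.
// maxLen is an arbitrary, assumed limit, not something the API requires.
function splitIntoChunks(text: string, maxLen = 200): string[] {
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    // Start a new chunk when adding this sentence would exceed the limit.
    if (current.length > 0 && current.length + sentence.length > maxLen) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim().length > 0) chunks.push(current.trim());
  return chunks;
}

// Queue each chunk on the browser's speech synthesizer.
// Returns false when the Web Speech API is not available (e.g. outside a browser).
function speak(text: string, rate = 1.0): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;
  for (const chunk of splitIntoChunks(text)) {
    const utterance = new g.SpeechSynthesisUtterance(chunk);
    utterance.rate = rate; // 1.0 is the default speaking rate
    g.speechSynthesis.speak(utterance); // utterances are queued in order
  }
  return true;
}
```

In a page or extension content script, something like `speak(document.body.innerText)` would read the visible text with the default voice.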


This snippet of code will use whatever your browser uses for TTS: https://www.locserendipity.com/TTS.html


Huh.. this is actually something I'm working on at the moment. The value is being able to asynchronously listen to audio of an article while keeping going with your work.

At the moment I'm relying primarily on AWS. It has a couple of good neural voices that I enjoy listening to, and I sync the output to S3, possibly with an SNS (Simple Notification Service) notification. Glad to see someone else has seen a need for it, but I've also been thinking about how to do it independently of AWS.

It's possible at the moment for me to go into reader view, copy and paste the content into the AWS Polly interface, paste the bucket name, paste the SNS ARN, and then wait for it to finish, find it in the bucket and then open it.
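As a rough sketch, the manual steps above correspond to a single request to Polly's StartSpeechSynthesisTask operation. The field names below mirror that operation; the function name `buildSynthesisTask`, the local interface, and the default voice "Joanna" are assumptions for illustration, and nothing here actually calls AWS.

```typescript
// Local description of the request, so the sketch needs no SDK dependency.
// Field names follow Polly's StartSpeechSynthesisTask operation.
interface SynthesisTaskRequest {
  Engine: "standard" | "neural";
  OutputFormat: "mp3" | "ogg_vorbis" | "pcm";
  OutputS3BucketName: string;
  SnsTopicArn?: string;
  Text: string;
  TextType: "text" | "ssml";
  VoiceId: string;
}

// Build the request from the pasted article text, bucket name and SNS ARN.
// voiceId defaults to "Joanna" purely as an example (assumption).
function buildSynthesisTask(
  articleText: string,
  bucket: string,
  snsTopicArn?: string,
  voiceId = "Joanna"
): SynthesisTaskRequest {
  const text = articleText.trim();
  if (text.length === 0) throw new Error("nothing to synthesize");
  const request: SynthesisTaskRequest = {
    Engine: "neural",
    OutputFormat: "mp3",
    OutputS3BucketName: bucket,
    Text: text,
    TextType: "text",
    VoiceId: voiceId,
  };
  // The SNS topic is optional; only include it when one was given.
  if (snsTopicArn) request.SnsTopicArn = snsTopicArn;
  return request;
}
```

With the AWS SDK for JavaScript v3, an object like this would be passed to `StartSpeechSynthesisTaskCommand` from `@aws-sdk/client-polly`; the finished audio then lands in the bucket and the SNS topic gets the completion notice.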

I want all of that in 3 steps: start the conversion, find it more easily in a better user interface, hit play.

And then from there, start implementing an SSML builder to modify the speed and prosody of different paragraphs and stuff, but that's super far down the line.
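A very small version of such an SSML builder might look like the sketch below. The `<speak>`, `<p>` and `<prosody rate="...">` elements are standard SSML; the `Paragraph` type and the function names are made up for illustration, and which rate values a given voice honors is up to the TTS service.

```typescript
// One paragraph of the article plus an optional speaking rate,
// e.g. "slow", "medium", "fast" or a percentage such as "90%".
interface Paragraph {
  text: string;
  rate?: string;
}

// Escape characters that would otherwise break the XML.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Wrap each paragraph in <p>, adding <prosody> only where a rate is set.
function buildSsml(paragraphs: Paragraph[]): string {
  const body = paragraphs
    .map((p) => {
      const text = escapeXml(p.text);
      const inner = p.rate
        ? `<prosody rate="${escapeXml(p.rate)}">${text}</prosody>`
        : text;
      return `<p>${inner}</p>`;
    })
    .join("");
  return `<speak>${body}</speak>`;
}
```

For example, `buildSsml([{ text: "Intro" }, { text: "Dense part", rate: "slow" }])` gives a document where only the second paragraph is slowed down.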


From what I understand working with web accessibility, the majority of visually impaired users of screen readers use either JAWS[1] on Windows (there's also Narrator built into Windows) or VoiceOver[2] on Mac. These are going to be more robust than a browser plugin.

[1] https://www.freedomscientific.com/products/software/jaws/

[2] https://www.apple.com/voiceover/info/guide/_1121.html


Related question: is there software that asks you to read a few pages of text and is then able to do text-to-speech with a result close to your voice?

Note: the text is chosen by the software to cover most sounds in the given language.


https://www.descript.com/overdub

> Descript's Overdub lets you create a text-to-speech model of your voice or select one from our ultra-realistic stock voices.


Many browser extensions use the paid Azure, Google and Amazon TTS APIs for the best quality. I really like the Azure English (Canada) voice, and it's free through Edge's Read Aloud. My main browser is Chrome, so I have to use a small AHK script: get the current URL, open a small Edge window in reader mode (--app=read://https://"%chromeURL% "--new-window") and press the Read Aloud hotkey. In general, Edge has become more and more attractive lately.


There are some TTS products on ClickBank. It may be worth trying them:

- https://speechelo-offer.com/

- http://newscastervocalizer.net/

- https://humatars.net/

- http://getscriptvocalizer.com/


https://tinygem.org/listen/

I think it uses the AWS API under the hood.


Microsoft has some amazing TTS technology. https://azure.microsoft.com/en-us/products/cognitive-service...

I know it is not Firefox, but Edge has it built-in and it works great! Highly recommended to try.


NaturalReader (paid version)

https://www.naturalreaders.com/

Google TTS

https://cloud.google.com/text-to-speech/

So far, these are the best TTS tools I have found.


If you use a Mac, browsers using WebKit natively (Safari, Orion…) will have system text-to-speech built in, no extension needed. It is decent and definitely usable.


AFAIK that works for other browsers on Mac too; Firefox and Brave both use the same macOS API under the hood.


Do you want to run speech synthesis on a remote server (so all visitors get the same experience), or on your own computer (Web Speech API)?


My own computer, as a consumer.


> Can OpenAI Whisper be used as a plugin?

I believe Whisper is speech-to-text only.



