What does HackerNews think of TTS?
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
https://github.com/coqui-ai/TTS
Support for it was recently added to vocode:
https://github.com/coqui-ai/TTS
I can never remember the name but always google: incessant loud chirp of the invasive frog
The weak link was the available free/open datasets. You needed a single speaker with a pleasant voice and 20+ hours of material from varied sources, recorded in a good environment with a good mic, etc. For English, the go-to was LJSpeech, which doesn't fulfill all these requirements. I say 'was', as I haven't followed developments recently.
Last year we decided to make our own dataset with an Irish woman, Jenny. She has a soft Irish lilt.
Never got around to training the model, but I will upload the raw audio and prompts here in a few hours (need to pay my internet bill in town..):
https://github.com/dioco-group/jenny-tts-dataset/blob/main/R...
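As a rough illustration of the 20-hour requirement mentioned above, the total recorded time of a dataset can be measured with a few lines of Python. This is only a sketch: it assumes the audio is stored as uncompressed WAV files under a single directory, and the function names `total_hours` and `meets_target` are hypothetical, not part of any dataset's tooling.

```python
import wave
from pathlib import Path


def total_hours(wav_dir):
    """Sum the duration of every .wav file under wav_dir, in hours."""
    total_seconds = 0.0
    for path in sorted(Path(wav_dir).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # Duration of one file = number of frames / frames per second.
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 3600.0


def meets_target(wav_dir, target_hours=20.0):
    """Return True if the directory holds at least target_hours of audio.

    The 20-hour default mirrors the figure quoted in the comment above;
    it is a rule of thumb, not a hard limit.
    """
    return total_hours(wav_dir) >= target_hours
```

Calling `meets_target("path/to/wavs")` on a local copy of a dataset gives a quick yes/no answer. It says nothing about the other criteria (single speaker, varied sources, recording quality), which still need a human ear.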
[1] https://github.com/coqui-ai/TTS [2] https://tts.readthedocs.io/en/latest/
But really, the situation is pretty good, with a lot of code and datasets available as open source. Notably, if you're not constrained to smartphones and the like, you can run quite a number of modern models on your own computer, see for instance https://github.com/coqui-ai/TTS/ (which itself contains many different models).
The work that needs to be done is """just""" to turn those models into something suitable for smartphones (which will most likely include re-training), and to plug them back into Android's TTS API.
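For the desktop case described above, running one of the Coqui models might look roughly like the sketch below. It assumes the `TTS` package from the linked repository is installed and exposes the `TTS.api.TTS` class with a `tts_to_file` method; the model name is one of the project's published English models and is used here only as an example. The `synthesize` wrapper is a hypothetical helper, not part of the library.

```python
def synthesize(text, out_path,
               model_name="tts_models/en/ljspeech/tacotron2-DDC"):
    """Write spoken `text` to a WAV file at `out_path` and return the path.

    Assumption: the Coqui `TTS` package is installed. The first call
    for a given model downloads its weights.
    """
    # Imported lazily so this module still loads when the package is absent.
    from TTS.api import TTS

    tts = TTS(model_name=model_name)
    tts.tts_to_file(text=text, file_path=out_path)
    return out_path
```

Something like `synthesize("Hello from an open-source voice.", "hello.wav")` would then produce a file playable with any audio player. None of this addresses the smartphone side, which is the harder part the comment points at.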
https://github.com/NVIDIA/tacotron2
https://github.com/mozilla/TTS
https://github.com/CorentinJ/Real-Time-Voice-Cloning
https://github.com/coqui-ai/TTS
They're not all easy to set up, however.
[1] The actively developed version of Mozilla TTS, named coqui-TTS. My understanding is that the original team was let go from Mozilla and they formed coqui.
https://github.com/coqui-ai/TTS
They are also on Element Matrix:
https://matrix.to/#/#coqui-ai_TTS:gitter.im
[2] FOSS automated accessibility testing engine for websites and other HTML-based user interfaces:
https://github.com/dequelabs/axe-core
[3] Emacspeak, developed by someone who was blind since childhood:
https://en.wikipedia.org/wiki/Emacspeak
[4] UK government websites are famous for being accessible. They have design guidelines:
https://design-system.service.gov.uk/
[5] Similar system for the US govt.
https://designsystem.digital.gov/
[6] Mozilla MDN learn accessibility:
https://github.com/coqui-ai/TTS
Mozilla TTS is not maintained anymore (at least ATM).
Disclaimer: I've created both of the projects.