This kind of stuff is going to be amazing for indie gamedevs. I want a model trained for "powerful narrator voice" and villain speeches.
I created these samples in a relatively short time using the Free/Open Source (which I think is an important factor for indies) text-to-speech project Larynx & a narrative editor I finally released the other weekend:
* https://github.com/rhasspy/larynx/
* https://rancidbacon.itch.io/dialogue-tool-for-larynx-text-to...
Now, I would really like to link you directly to audio of the next two samples, but since Mimic 3 is currently in beta, gated behind an email address (with an automated response), I think that may not be appropriate. So, instead:
* Visit & get access to the beta here: https://mycroft.ai/blog/mimic-3-preview/
* Copy & paste this SSML into the form: https://pastebin.com/Bwd7LCbj
It's definitely a noticeable step up again in quality.
There's an alternate pair of voices if you move the "_" from one "name" attribute to the other in each "voice" element.
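If you'd rather not edit the attributes by hand, here is a minimal Python sketch of that swap. It assumes each "voice" element holds the active voice in "name" and the alternate in "_name", and that the SSML has no XML namespace. The voice names and sentences below are hypothetical placeholders, not the ones in the pastebin.

```python
import xml.etree.ElementTree as ET

# Hypothetical SSML: the voice names and text are placeholders,
# not the contents of the pastebin linked above.
SSML = """<speak>
  <voice name="voice_a" _name="voice_b"><s>The narrator speaks.</s></voice>
  <voice name="voice_c" _name="voice_d"><s>The villain replies.</s></voice>
</speak>"""


def swap_voices(ssml: str) -> str:
    """Swap the 'name' and '_name' attributes on every <voice> element.

    Assumes the active voice is in 'name' and the parked alternate
    is in '_name' (an assumption about the SSML's layout).
    """
    root = ET.fromstring(ssml)
    for voice in root.iter("voice"):
        active = voice.attrib.pop("name", None)
        parked = voice.attrib.pop("_name", None)
        if parked is not None:
            voice.set("name", parked)
        if active is not None:
            voice.set("_name", active)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(swap_voices(SSML))
```

Running the function twice should get you back to the original pair of voices.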
I intentionally didn't edit the text to remove some of the artifacts both to give a realistic impression of the current state & because sometimes they add interesting texture. :)
Note the beta voices are "low" quality.