Are there any free/open-source TTS options out there that are on the same level as Google Cloud’s? I tried a lot of free ones, but they are absolutely awful and still sound like my Amiga did 30 years ago. With LLMs being available as open source, I am hoping there’s also a good TTS offering I just haven’t found yet.
Piper is my choice: very easy to use from the command line, with fairly good-sounding voices. Prior to that, for years (decades?) I used espeak-ng, which has a very robotic voice but articulates almost everything very clearly; I got used to it, so I didn’t actually mind.
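For anyone who wants to script it, here is a minimal sketch of driving the Piper command line from Python. It assumes the `piper` executable is on your PATH and that you have already downloaded a voice model (an `.onnx` file plus its `.json` config). The function names and file names are placeholders of mine, not anything Piper ships.

```python
import subprocess
from pathlib import Path


def piper_command(model_path: str, output_wav: str) -> list[str]:
    """Build the argument list for a Piper run that writes a WAV file."""
    # --model and --output_file are the options shown in Piper's README;
    # the paths passed in are whatever you downloaded or chose yourself.
    return ["piper", "--model", model_path, "--output_file", output_wav]


def speak_to_file(text: str, model_path: str, output_wav: str) -> Path:
    """Pipe text to Piper on stdin and return the path of the WAV it wrote."""
    subprocess.run(
        piper_command(model_path, output_wav),
        input=text.encode("utf-8"),  # Piper reads the text to speak from stdin
        check=True,                  # raise if Piper exits with an error
    )
    return Path(output_wav)
```

Something like `speak_to_file("Hello there.", "my-voice.onnx", "hello.wav")` should then leave a playable WAV in the current directory, assuming the model path is valid.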
tal@lemmy.today 1 year ago
Festival is not cutting edge, but it will definitely be better than your Amiga, and it can handle long text. Last time I set it up, IIRC I wanted some voices generated by Tokyo University or something, which took some setting up. It’ll probably be packaged in your Linux distro.
You can listen to a demo here.
www.cstr.ed.ac.uk/projects/…/onlinedemo.html
It’s not LLM-based.
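If you want Festival to turn a long text file into audio rather than speak it live, a rough sketch in Python could look like the following. It assumes the `text2wave` script that comes with Festival is installed and on your PATH. The function names and file names are just illustrative.

```python
import subprocess


def text2wave_command(text_file: str, output_wav: str) -> list[str]:
    """Build the argument list for Festival's text2wave script."""
    # text2wave takes the input text file and writes the audio to the -o path.
    return ["text2wave", text_file, "-o", output_wav]


def festival_to_wav(text_file: str, output_wav: str) -> str:
    """Render a text file to a WAV with Festival's default voice."""
    subprocess.run(text2wave_command(text_file, output_wav), check=True)
    return output_wav
```

Calling `festival_to_wav("chapter1.txt", "chapter1.wav")` would use whatever voice your Festival install defaults to; picking a different voice needs extra setup not shown here.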
For short snippets, offline, one can use Tortoise TTS, which is LLM-based. But it’s slow and can only generate clips of a limited length. Whether it’s reasonable for you will depend a lot on your application.
github.com/neonbjb/tortoise-tts
Examples at:
nonint.com/static/tortoise_v2_examples.html
state_electrician@discuss.tchncs.de 1 year ago
Ah, I looked at Tortoise, but I don’t have an NVIDIA GPU, so I couldn’t try it. I did try Festival, and the results were bad: not so much the voice, but the intonation and pronunciation.
tal@lemmy.today 1 year ago
I use it on an AMD GPU.