
Topic: Google's AI voice synthesizer sounds eerily more human (Read 114 times)

newbie
Activity: 21
Merit: 0
More human? That's funny. How could it be inhuman when the people who create the technology are human? Of course it ends up sounding similar.
jr. member
Activity: 200
Merit: 1
Zaydo
I use Jarvis from the Play Store. It makes my Android awesome, like the real A.I. Jarvis.
jr. member
Activity: 41
Merit: 3
Interesting, I have never heard of this one. I'll go look it up.

I think a more natural sounding voice will revolutionize the way we interact with AI.
newbie
Activity: 48
Merit: 0
I use a text to voice program called Natural Reader. It's a few years old and I don't know about the newer versions of it, but those examples are light years ahead. Will look more into them.
jr. member
Activity: 41
Merit: 3
For those interested in AI technology developments, check out this story:

Google’s new AI voice synthesizer, a service named Cloud Text-to-Speech, will be available for businesses and developers that want to integrate voice synthesis in apps, websites or virtual assistants.

Traditionally, voice synthesizers use what is called concatenative synthesis, in which the program pieces individual syllables together to form words and sentences. While the resulting speech is understandable, it lacks the imperfections of human speech that make a voice sound realistic, despite years of development.
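
To make the piece-it-together approach concrete, here is a toy sketch. Everything in it is my own assumption for illustration: the syllable list, the fake "recordings" (constant-valued clips rather than real audio), and the names `UNIT_BANK` and `synthesize` are hypothetical and not taken from any real synthesizer.

```python
# Toy concatenative synthesis: look up one stored clip per syllable
# and join the clips end to end. The "recordings" are fake placeholders.
SAMPLE_RATE = 24_000  # samples per second (assumed; real systems vary)


def fake_recording(syllable, duration_ms=100):
    """Stand-in for a stored recording: a constant-valued clip (hypothetical data)."""
    n = SAMPLE_RATE * duration_ms // 1000
    level = (sum(ord(c) for c in syllable) % 100) / 100.0
    return [level] * n


# Hypothetical bank of pre-recorded units, one per syllable.
UNIT_BANK = {s: fake_recording(s) for s in ["hel", "lo", "world"]}


def synthesize(syllables):
    """Concatenate stored units in order; raises KeyError for an unknown syllable."""
    audio = []
    for s in syllables:
        audio.extend(UNIT_BANK[s])
    return audio


audio = synthesize(["hel", "lo", "world"])
print(len(audio))  # 3 clips of 2,400 samples each
```

Each clip is pasted in unchanged and the joins are hard cuts, which hints at why this kind of output can come across as stiff.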

In comparison, Google’s new AI voice synthesizer is powered by WaveNet, which uses machine learning to generate audio from scratch. It analyzes the waveforms in a huge database of human speech and re-creates them at a rate of 24,000 samples per second. The end result is a voice with subtleties like lip smacks and accents!
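
To get a feel for what 24,000 samples per second means, here is a rough sketch that builds a waveform one sample at a time. The story does not describe WaveNet's internals, so `next_sample` is a purely hypothetical stand-in that emits a 220 Hz sine wave where a trained model would predict each value; only the sample-rate figure comes from the story.

```python
import math

SAMPLE_RATE = 24_000  # samples per second, the figure quoted in the story


def next_sample(history):
    """Hypothetical stand-in for a learned model: emits a 220 Hz sine wave.

    A real model would predict the next value from the samples so far;
    here only the length of the history is used, to work out the time.
    """
    t = len(history) / SAMPLE_RATE
    return math.sin(2 * math.pi * 220 * t)


def generate(duration_ms):
    """Build a waveform sample by sample for the requested duration."""
    total = SAMPLE_RATE * duration_ms // 1000
    samples = []
    for _ in range(total):
        samples.append(next_sample(samples))
    return samples


clip = generate(500)
print(len(clip))  # 12000 samples for half a second of audio
```

Even half a second of audio takes 12,000 separate values, so a generator working at this rate has a lot of room for fine detail.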

Check out the full story here (includes audio samples of WaveNet vs. a traditional audio synthesizer): https://www.theverge.com/2018/3/27/17167200/google-ai-speech-tts-cloud-deepmind-wavenet

I don't hear much of a difference in the English sample, though; maybe something is wrong with my hearing. Does anyone else hear a big difference? The Japanese sample definitely shows one. I am kind of excited about this because I have been wondering when that very robotic synthesizer sound would finally be made a bit more natural.