At the Interspeech 2021 conference, Nvidia presented a tool that lets AI voices learn the natural pronunciation of words. With the RAD-TTS tool, researchers can use recordings of their own voice to train a speech algorithm.
At the GPU Technology Conference in 2017, Nvidia researchers demonstrated the progress they had made in developing artificial intelligence. They also released a synthetic voice at the time, but weren’t entirely satisfied with its performance.
In 2020, a new AI voice was introduced: Flowtron. This artificial voice sounded more natural and human, but the researchers weren’t done yet. The next step, according to the researchers, was to correct the algorithm when it made pronunciation errors in the same way humans are corrected: by imitation.
For this purpose, the researchers developed an AI model called RAD-TTS, with which they can teach a text-to-speech algorithm how to pronounce a word or group of words. They do this by feeding their own audio recordings to the model, which converts them into parameters that the algorithm can then imitate.
With RAD-TTS, the pitch of the recorded voice can also be changed dramatically. This enabled one researcher to transform his male voice into a synthetic female voice, which was used as the voiceover in a promotional video. Some of the new technology is open source, according to Nvidia, and will be available in the Nvidia NeMo toolkit.
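To give a rough feel for what changing the pitch of a recorded voice means in practice, here is a minimal Python sketch. It is not RAD-TTS or NeMo code. It assumes the recording has already been reduced to a pitch contour, meaning a list of fundamental-frequency values in hertz with 0.0 marking unvoiced frames. The function name `shift_pitch_contour` and the sample values are hypothetical and chosen only for illustration.

```python
def shift_pitch_contour(f0_hz, semitones):
    """Scale the voiced frames of a pitch contour by a number of semitones.

    Assumption: unvoiced frames are encoded as 0.0 and are left unchanged.
    """
    # One semitone corresponds to a frequency ratio of 2^(1/12).
    factor = 2.0 ** (semitones / 12.0)
    return [f * factor if f > 0 else 0.0 for f in f0_hz]


# Made-up contour in a typical male speaking range, with one unvoiced frame.
male_contour = [110.0, 0.0, 120.0, 115.0]

# Shifting up by 12 semitones (one octave) doubles every voiced frequency.
shifted = shift_pitch_contour(male_contour, 12)
print(shifted)
```

A real system would then have to synthesize audio that follows the shifted contour, which is the part a model such as RAD-TTS is reported to handle; this sketch covers only the arithmetic of the shift.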