OpenAI showcases text-to-speech AI that can imitate a voice after 15 seconds of audio - IT Pro

OpenAI has created a tool that can imitate a voice based on a fifteen-second audio sample. The company has released samples from the audio engine, but doesn't want to release the entire model right away.

OpenAI, the AI company that also makes ChatGPT, describes the tool in a blog post. The model is called Voice Engine and it reads aloud text that the user supplies as input. Based on a voice sample, OpenAI claims that the AI can perfectly imitate the voice, including tone and emotion. The company says such a sample only needs to last fifteen seconds.

The company does not disclose any technical details about the tool, and no white paper or other technical description is available. So it's not clear, for example, what audio clips Voice Engine was trained on. OpenAI told TechCrunch that the training data is a mix of licensed and publicly available data. According to the company, Voice Engine is not trained on user data. Samples that users create are also deleted afterward.

According to TechCrunch, the tool will be a paid product in the future, although OpenAI has said nothing about this publicly. According to documents cited by TechCrunch, the company will charge $15 per million characters, or about 160,000 spoken words.
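To put the reported price in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the $15 rate and the 160,000-word figure are the ones reported above and have not been confirmed by OpenAI, and the function and constant names are made up for this example.

```python
# Reported (not officially confirmed) price: $15 per one million characters.
PRICE_PER_MILLION_CHARS_USD = 15.0
# Reported rough equivalent of one million characters, in words.
WORDS_PER_MILLION_CHARS = 160_000


def estimate_cost_usd(text: str) -> float:
    """Estimate the cost of synthesizing `text` at the reported rate."""
    return len(text) * PRICE_PER_MILLION_CHARS_USD / 1_000_000


def implied_chars_per_word() -> float:
    """Average characters per word implied by the two reported figures."""
    return 1_000_000 / WORDS_PER_MILLION_CHARS


if __name__ == "__main__":
    sample = "Hello, this is a short test sentence."
    print(f"{len(sample)} characters would cost about ${estimate_cost_usd(sample):.6f}")
    print(f"Implied average: {implied_chars_per_word()} characters per word")
```

At that rate, the two reported figures imply an average of 6.25 characters per word, spaces and punctuation included.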

Voice Engine is not yet available to users, as is often the case with similar services these days. Last year, Meta showed Voicebox, which can also generate speech based on short audio files, but the company isn't making that tool available either. OpenAI says it is being cautious for now because the tool could quickly be abused. OpenAI specifically refers to the United States, where the presidential election will be held at the end of this year and the campaign has already begun.

The company has posted a number of examples on its blog showing what the tool can do. Additionally, OpenAI is testing Voice Engine with a limited number of testers. They had to sign a statement in advance agreeing not to generate speech in someone's voice without that person's permission. The tool will also add a watermark showing that the audio was AI-generated, and OpenAI says it is “proactively monitoring” how the system is being used. When the tool is released in the future, OpenAI also wants to create a list of voices that may not be cloned.

Magdalena Zlatica
