Via FryAI
“What’s new? OpenAI has unveiled a preview of Voice Engine, a revolutionary tool which offers synthetic speech generation capabilities.
How does it work? From just a 15-second recording, Voice Engine is capable of mimicking speech that sounds very similar to the original voice. This voice can be used to read text prompts, even if they are not in the speaker’s native language.
When will it be available? OpenAI, cognizant of the risks associated with Voice Engine, has opted for a cautious rollout, prioritizing responsible deployment and soliciting feedback from diverse stakeholders before putting this powerful tool in the hands of the public.”
OpenAI’s voice cloning AI model only needs a 15-second sample to work
https://www.theverge.com/2024/3/29/24115701/openai-voice-generation-ai-model
Called Voice Engine, the model has been in development since late 2022 and powers the Read Aloud feature in ChatGPT.
Via AI Tool Report
“OpenAI can replicate your voice with a 15-second sample
Our Report: OpenAI has launched a voice cloning AI model called Voice Engine, which can replicate a person’s voice from just a 15-second sample.
Key Points:
Why you should care: OpenAI’s Voice Engine represents a significant advancement in voice cloning technology, highlighting the potential for innovative applications and addressing ethical and security challenges.”
Via The Neuron
“OpenAI’s new AI can say anything in your voice.
OpenAI just announced Voice Engine, an AI capable of replicating any voice in any language based on a brief audio sample.
In layman’s terms, upload a 15-second sample of Pete speaking → Generate Pete saying something new while maintaining his “vibe.”
The tech isn’t new (shoutout ElevenLabs and Play.ht). In fact, the same researcher who built Voice Engine had also built an earlier tool that ended up being the tech behind Play.ht.
Still, more people know OpenAI than the other two. So what does this mean in practice?
On the flip side, this technology also poses significant risks, granting bad actors the potential to misuse someone’s voice—a dilemma we’re all too aware of by now.
Remember those deepfake robocalls that imitated Biden discouraging voting in New Hampshire? Or the scammer who, posing as an employee’s superior, swindled $25.6M?
One can only imagine the awkwardness of explaining that scenario to their employer…
For the time being, OpenAI plans to keep Voice Engine under wraps until it’s ready for widespread deployment.”
Via Unwind AI
“OpenAI Teases Again with a New Voice Cloning Model
OpenAI has been releasing demos of its text-to-video model Sora and we’re eagerly waiting for it to be publicly available. But before the anticipation could settle, OpenAI has announced a new model but it’s not available for use, again. The company has developed Voice Engine, an AI model for voice cloning that uses a 15-second audio sample and text input to almost perfectly clone the voice.
Key Highlights:
- Development and Testing: Voice Engine was first developed in late 2022. It is being tested with “trusted partners” for applications like reading assistance for non-readers and children, content translation, and improving essential service delivery in remote settings.
- Training and Data Use: The model is trained on a mix of licensed and publicly available data, with details on the training data being closely guarded considering the ramifications of copyright issues.
- Editing: Voice Engine currently doesn’t allow editing the generated output. There are no options for adjusting the tone, pitch, or cadence of the voice.
- Pricing: Voice Engine will cost $15 per 1 million characters. That is quite cheap compared with the current best in the industry, ElevenLabs, which charges $11 for 100,000 characters per month but also provides editing features.
Following is an example of translation from the HeyGen platform that is using OpenAI’s Voice Engine model.
Reference Audio: [audio clip, 0:16]
Generated Audio in German: [audio clip, 0:21]
”
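To put the two prices quoted in the Unwind AI excerpt on a common footing, here is a minimal sketch that normalizes both to cost per 1 million characters. It uses only the figures from that excerpt ($15 per 1 million characters for Voice Engine, $11 per 100,000 characters for ElevenLabs). The function name `cost_per_million` is illustrative, and the comparison assumes simple linear pricing, ignoring plan tiers, monthly quotas, and feature differences.

```python
def cost_per_million(price_usd: float, characters: int) -> float:
    """Normalize a price for `characters` characters to USD per 1,000,000 characters."""
    return price_usd * 1_000_000 / characters

# Figures as quoted in the excerpt; linear scaling is an assumption.
voice_engine = cost_per_million(15, 1_000_000)  # Voice Engine: $15 per 1M characters
eleven_labs = cost_per_million(11, 100_000)     # ElevenLabs: $11 per 100k characters

print(f"Voice Engine: ${voice_engine:.2f} per 1M characters")
print(f"ElevenLabs:   ${eleven_labs:.2f} per 1M characters")
print(f"Ratio: {eleven_labs / voice_engine:.1f}x")
```

On those quoted figures, ElevenLabs works out to $110 per 1 million characters, roughly 7.3 times the Voice Engine price.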
OpenAI unveils AI voice cloning tool
Via The Rundown AI: “OpenAI has unveiled a preview of Voice Engine, a model that can clone human voices from a 15-second audio sample and generate natural-sounding speech.
The details:
Why it matters: OpenAI is clearly far ahead in the space, with Voice Engine being deployed internally since 2022. However, with no public release in sight, the company seems to understand the risks, such as deepfake scams during an election year.”
Via OpenAI
Navigating the Challenges and Opportunities of Synthetic Voices
We’re sharing lessons from a small scale preview of Voice Engine, a model for creating custom voices.
https://openai.com/blog/navigating-the-challenges-and-opportunities-of-synthetic-voices