What is Meta’s Voicebox AI Tool?

In mid-June 2023, Meta introduced Voicebox in a post on its Meta Newsroom blog. The post, “Introducing Voicebox: The Most Versatile AI for Speech Generation,” opened with key takeaways, including: “Voicebox is a generative AI model that can help with audio editing, sampling, and styling.” Want to learn more about Meta’s project? Keep reading!

What Makes Meta’s Voicebox Different? 

Most other voice-generation AI tools do just that: generate voices. Voicebox goes further. It can generate multiple synthetic voices from text, and it can also edit pre-recorded audio clips, modifying any part of a clip rather than just the end.

Meta’s Voicebox can recreate a portion of speech interrupted by noise or replace misspoken words without re-recording the clip. In addition, the model is multilingual, producing audio in six languages (English, French, German, Spanish, Polish, and Portuguese). 

Voicebox’s versatility opens up many possible uses:

  • Make text-based platforms more accessible by reading articles aloud to blind users, possibly in their native language.
  • Make English-language platforms accessible to people who do not know English by reading the content aloud in their native language.
  • Make audio editing easier for content creators, and make their videos accessible to more viewers by offering them in more languages.

How Well Does Meta’s Voicebox Work? 

Source: Introducing Voicebox, Meta NewsRoom Research.

Voicebox currently outperforms VALL-E, Microsoft’s state-of-the-art English model, on both intelligibility and audio similarity while generating audio as much as 20 times faster. In cross-lingual style transfer, Voicebox outperforms YourTTS, reducing the average word error rate from 10.9 percent to 5.2 percent and improving audio similarity from 0.335 to 0.481.
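
To put those numbers in context, here is a minimal sketch of how word error rate is commonly computed (word-level edit distance divided by the number of reference words), along with the relative change implied by the figures quoted above. The function names are illustrative assumptions, and this is not Meta's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after` (negative means a drop)."""
    return (after - before) / before


# Figures quoted in the article for cross-lingual style transfer.
wer_change = relative_change(10.9, 5.2)           # roughly -0.52
similarity_change = relative_change(0.335, 0.481)  # roughly +0.44
```

Read this way, the quoted results amount to roughly a 52 percent relative drop in word error rate and roughly a 44 percent relative gain in audio similarity.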

How Does Meta’s Voicebox Work?

Voicebox was built on Flow Matching, Meta’s latest advance in non-autoregressive generative models. This model allows Voicebox to learn a highly non-deterministic mapping between text and speech. Non-deterministic mapping is useful because it lets Voicebox learn from varied speech data without those variations having to be carefully labeled. This enabled Voicebox to train on more diverse data at a much larger scale.
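
The article names Flow Matching without spelling out its training objective. As a rough illustration only, here is a minimal sketch of one common formulation from the flow matching literature: the model is asked to predict the velocity along a straight-line path from random noise to a real data sample. This is not Meta's implementation; the function names, the `SIGMA_MIN` value, and the use of plain lists of numbers in place of audio features are all assumptions.

```python
import random

SIGMA_MIN = 1e-5  # assumed small constant; the article gives no value


def flow_matching_pair(x0, x1, t, sigma_min=SIGMA_MIN):
    """Return (x_t, target_velocity) on the straight path from noise x0 to data x1."""
    scale = 1.0 - (1.0 - sigma_min) * t
    x_t = [scale * noise + t * data for noise, data in zip(x0, x1)]
    target = [data - (1.0 - sigma_min) * noise for noise, data in zip(x0, x1)]
    return x_t, target


def flow_matching_loss(predict_velocity, x1, rng):
    """Mean squared error between predicted and target velocity for one sample."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # random noise, same length as the data
    t = rng.random()                        # random time in [0, 1)
    x_t, target = flow_matching_pair(x0, x1, t)
    predicted = predict_velocity(x_t, t)
    return sum((p - u) ** 2 for p, u in zip(predicted, target)) / len(x1)


# Usage with a placeholder "model" that always predicts zero velocity.
rng = random.Random(0)
loss = flow_matching_loss(lambda x_t, t: [0.0] * len(x_t), [1.0, 2.0], rng)
```

In a real system the placeholder would be a neural network conditioned on the text and surrounding audio, and the loss would be averaged over many samples and minimized by gradient descent.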

Meta’s Voicebox was trained on more than 50,000 hours of recorded speech and transcripts from public-domain audiobooks in English, French, Spanish, German, Polish, and Portuguese. During training, Voicebox learned to predict a segment of speech given the surrounding audio and the segment’s transcript. In other words, it learned to infill speech from context, which is what lets it edit the middle of a clip.
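
The infilling setup described above can be sketched in a few lines: hide a contiguous span of audio frames, let the model see only the surrounding context, and score its prediction on the hidden span alone. This is a simplified illustration, not Meta's training code. The helper names are hypothetical, each frame is a single number here (real frames would be vectors of audio features), and the transcript input is omitted for brevity.

```python
def make_infilling_example(frames, start, end, fill=0.0):
    """Hide frames[start:end] and return (context, mask, target)."""
    if not 0 <= start < end <= len(frames):
        raise ValueError("start and end must describe a non-empty span inside frames")
    mask = [start <= i < end for i in range(len(frames))]
    # The model sees the original frames outside the span and a filler value inside it.
    context = [fill if hidden else frame for frame, hidden in zip(frames, mask)]
    target = frames[start:end]
    return context, mask, target


def masked_mse(predicted, frames, mask):
    """Mean squared error computed only over the hidden frames."""
    errors = [(p - f) ** 2 for p, f, hidden in zip(predicted, frames, mask) if hidden]
    return sum(errors) / len(errors)


frames = [1.0, 2.0, 3.0, 4.0, 5.0]
context, mask, target = make_infilling_example(frames, start=1, end=3)
# A model that just echoes the masked context gets no credit for the hidden span.
baseline_error = masked_mse(context, frames, mask)
```

Because the hidden span can sit anywhere in the clip, the same training setup covers fixing a word in the middle of a recording as well as continuing speech from the end.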

Having learned from a diverse set of audio data, Voicebox can generate speech that is more representative of how people actually speak in the six languages it supports. Meta’s results show that speech recognition models trained on Voicebox-generated speech perform almost as well as models trained on real speech.

The Future of Audio AI 

Meta believes Voicebox could “usher in a new era of generative AI for speech.” However, all innovations in AI come with the potential for misuse and harm. Meta anticipated this and built a safeguard: a highly effective classifier that can distinguish between authentic speech and audio generated with Voicebox.

Though Meta’s Newsroom post does not show an example of this classifier in use, Meta was thoughtful and savvy to build it before misuse became a problem rather than after.

Closing Thoughts

Meta’s Voicebox is a major innovation in the world of AI-generated audio. Because it can edit any part of an audio clip rather than only generate new speech, the sky truly seems to be the limit for what AI-generated audio can do and where it can go. It is also comforting to know that Meta has invested time, energy, and resources in making its AI tool less likely to be misused or to cause harm. The future of AI-generated audio is out there and ready to be explored!

This article was written by Alyssa Micalizzi
