On Wednesday, August 2, 2023, Meta released a new AI tool called AudioCraft. It is open-source software that lets users create music and audio from simple text prompts.
AudioCraft is made up of three models: MusicGen, AudioGen, and EnCodec. The first model, MusicGen, was trained using music owned by Meta and licensed specifically for this purpose. It creates music based on the text prompts provided.
The second model, AudioGen, was trained on public sound effects data and generates audio from text prompts, Meta explained in its official news article.
Meta also announced an improved version of its EnCodec decoder, which allows higher-quality music generation with fewer artifacts.
Alongside this, the company is making its pre-trained AudioGen models available, so users can create a wide range of environmental sounds and sound effects, including dogs barking, cars honking, and footsteps on a wooden floor.
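For readers who want to try this, the following is a minimal sketch of generating such sound effects in Python. It assumes the `audiocraft` package is installed and that the `facebook/audiogen-medium` checkpoint can be downloaded. The function names `prompt_to_stem` and `generate_sound_effects` are hypothetical helpers written for this illustration and are not part of Meta's code.

```python
from typing import List


def prompt_to_stem(prompt: str) -> str:
    """Hypothetical helper: turn a text prompt into a simple file-name stem."""
    return "_".join(prompt.lower().split())


def generate_sound_effects(prompts: List[str], duration_s: float = 5.0) -> List[str]:
    """Generate one audio clip per prompt and return the file-name stems used.

    Assumes the audiocraft package is installed; the imports are kept inside
    the function so the helper above can be used without it.
    """
    from audiocraft.models import AudioGen
    from audiocraft.data.audio import audio_write

    # Load the pre-trained AudioGen checkpoint (downloads weights on first use).
    model = AudioGen.get_pretrained("facebook/audiogen-medium")
    model.set_generation_params(duration=duration_s)

    # One waveform per prompt, shaped [batch, channels, samples].
    waveforms = model.generate(prompts)

    stems = []
    for prompt, wav in zip(prompts, waveforms):
        stem = prompt_to_stem(prompt)
        # Writes <stem>.wav with loudness normalization.
        audio_write(stem, wav.cpu(), model.sample_rate, strategy="loudness")
        stems.append(stem)
    return stems
```

Calling `generate_sound_effects(["dogs barking", "cars honking", "footsteps on a wooden floor"])` would, under these assumptions, write one audio file per prompt to the current directory.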
Open-source access to all models
The models are now being open-sourced, giving researchers and practitioners access to them so they can train their own models on their own datasets, Meta reported.
Despite widespread enthusiasm for generative AI in images, video, and text, progress in audio generation has been somewhat slower.
Previous attempts have often involved complex and closed systems, making it difficult for people to experiment freely.
“ We're open sourcing the code for AudioCraft, which generates high-quality, realistic audio and music by listening to raw audio signals and text-based prompts.” — Zuck pic.twitter.com/XDyS8vMW9g
— Aiman (@aymen2080) August 2, 2023
Creating high-quality audio of any type requires modeling intricate signals and patterns at different scales.
Among audio types, music is arguably the most challenging to generate because it combines local and long-range patterns. As Meta explains, these range from a sequence of individual notes to an overall musical structure involving multiple instruments.
‘Giving people the full recipe to play’
The AudioCraft models produce high-quality audio with long-term consistency, and they are easy to use.
AudioCraft simplifies the design of generative audio models compared with previous approaches in the field, as pointed out by Meta.
This gives users a complete set of tools for experimenting with the existing models that Meta has been refining over the past few years.
Furthermore, users are empowered to explore new possibilities and create their own models, pushing the boundaries of AI-generated audio even further.
Meta further explained that AudioCraft covers music, sound, compression, and generation within a single code base, and that it is designed to be easy to build on and reuse.
With AudioCraft, users aiming to develop enhanced sound generators, compression algorithms, or music generators can achieve this efficiently using the same code base.