AudioCraft is a new open-source AI tool from Meta. According to the company, it is intended to let both professional musicians and everyday users generate audio and music from simple text prompts.
AudioCraft comprises three models: MusicGen, AudioGen, and EnCodec. MusicGen generates music from text inputs and was trained on Meta’s own music library. AudioGen, by contrast, was trained on public sound effects and generates audio from text inputs. In addition, the EnCodec decoder has been improved, enabling higher-quality music generation with fewer unwanted artifacts.
Using the New AudioCraft AI Tool
Meta is releasing pre-trained AudioGen models that let users generate environmental sounds and sound effects such as a dog barking, cars honking, or footsteps on a wooden floor. Meta is also releasing all of the model weights and code for AudioCraft. The tool can be used for a variety of tasks, including music composition, sound effects creation, audio compression, and audio production.
Meta hopes that open-sourcing these models will let researchers and practitioners train their own models on their own datasets.
According to Meta, generative AI has made great advances in images, video, and text, but audio has not seen the same level of progress. AudioCraft aims to close this gap by offering a more accessible and user-friendly framework for producing high-quality audio.
Meta states on its official blog that generating realistic, high-fidelity audio is particularly difficult because it involves modeling complex signals and patterns at many scales. Music is especially challenging to generate because it combines local patterns with long-range ones.
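To give a rough sense of the scales involved, the short Python sketch below counts the raw audio samples in a clip. The 44.1 kHz sample rate, the three-minute track length, and the 10 ms window are illustrative assumptions chosen for this example, and the helper name num_samples is hypothetical; none of it is taken from AudioCraft's code.

```python
# Illustrative arithmetic only: the sample rate and clip lengths are assumed values.
SAMPLE_RATE_HZ = 44_100  # a common CD-quality sample rate (assumption for this sketch)


def num_samples(duration_seconds: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Return the number of samples (timesteps) per channel in a clip of this length."""
    return int(round(duration_seconds * sample_rate_hz))


three_minute_track = num_samples(3 * 60)  # long-range structure: a whole track
ten_ms_window = num_samples(0.010)        # local structure: a 10 ms slice

print(three_minute_track)  # 7938000
print(ten_ms_window)       # 441
```

Under these assumptions, a model has to keep a few hundred samples of local detail coherent while also keeping millions of samples consistent across the whole track.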
AudioCraft is capable of producing high-quality audio that stays consistent over long durations. The company says it also simplifies the design of generative models for audio, making it easier for users to experiment with existing models.