Nvidia drops AI music editor capable of creating unheard sounds

by ccadm November 27, 2024

The new AI tool can modify an individual’s voice, altering accents or emotional tones to make it sound angry or calm

Nvidia has announced an AI music editor that it says can produce sounds never heard before, including a trumpet that meows. The tool, named Fugatto, can generate music, sounds, and speech from text and audio inputs it has not previously encountered.

How does Fugatto demonstrate creative capabilities?

A video demonstration shows Fugatto composing music from imaginative prompts, such as a saxophone that howls and barks alongside electronic music with dogs barking.

What unique sound effects can be generated?

The company also provided further examples of Fugatto generating distinctive sound effects from text descriptions. One description called for deep, rumbling bass pulses accompanied by intermittent, high-pitched digital chirps, evoking the sound of a massive sentient machine awakening.

In what ways can voices be modified?

Fugatto can also modify an individual’s voice, changing its accent or emotional tone, for example making it sound angry or calm. It offers music editing features as well: it can isolate the vocals in a track, add different instruments, and even replace a piano melody with an opera singer’s voice.

What research and training data were used?

A research paper accompanying the announcement details the extensive datasets that Nvidia claims Fugatto was trained on, including a sound effects library sourced from the BBC.

While several AI audio tools exist from companies like Stability AI, OpenAI, Google DeepMind, ElevenLabs, and Adobe, none have claimed the ability to create entirely new and unprecedented sounds. Some AI startups are currently facing copyright lawsuits over their music generation tools, and a recent report indicated that Nvidia and others trained their AI models using subtitles from thousands of YouTube videos.

How was Fugatto developed and what is its future availability?

To develop Fugatto, Nvidia states that researchers compiled a dataset containing millions of audio samples. They also designed instructions that significantly broadened the range of tasks the model could perform, enhancing its accuracy and enabling new functionalities without the need for additional data. However, Nvidia has not disclosed when, or if, the tool will be made widely accessible.
