Max Planck Releases Moûsai for Text-to-Music Synthesis

Moûsai can generate long-context, high-quality stereo music at 48kHz.

Researchers at Germany's Max Planck Institute recently released a research paper on Moûsai, a text-to-music model that generates high-quality 48kHz stereo music from text descriptions. The model handles long context, producing pieces that run beyond the minute mark, and can generate a variety of music.

The team came up with a new, more efficient way to generate audio in real time. They designed a 1D U-Net architecture that can run on a single consumer GPU, which means the model can be trained and run even at universities that lack access to large-scale compute.

The team also introduced a new diffusion magnitude autoencoder that compresses the audio signal by a factor of 64 while keeping the quality mostly the same. The autoencoder is used in the generation stage of the new architecture.
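To put the 64x figure in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (48kHz, stereo, 64x) and assumes the factor applies to the total count of audio values. The article does not say how the compression is split across time and channels, so the function names and that interpretation are illustrative assumptions, not the paper's implementation.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the article.
# Assumption: the 64x factor is treated as a ratio of total values;
# the article does not specify how it is split across time and channels.

SAMPLE_RATE_HZ = 48_000   # 48kHz, from the article
CHANNELS = 2              # stereo, from the article
COMPRESSION_FACTOR = 64   # from the article


def waveform_values(seconds, sample_rate=SAMPLE_RATE_HZ, channels=CHANNELS):
    """Number of raw audio values in a clip of the given length."""
    return int(seconds * sample_rate) * channels


def compressed_values(seconds, factor=COMPRESSION_FACTOR):
    """Number of values left after compressing the clip by `factor`."""
    return waveform_values(seconds) // factor


if __name__ == "__main__":
    raw = waveform_values(60)
    small = compressed_values(60)
    print(f"One minute of 48kHz stereo audio: {raw:,} values")
    print(f"After 64x compression: {small:,} values")
```

Under these assumptions, one minute of audio goes from 5,760,000 raw values to 90,000, which gives a sense of why working on the compressed signal makes generation on a single consumer GPU more plausible.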

Generating music means handling several elements at once: the temporal dimension, long-term structure, multiple layers of sound, and subtleties that only trained ears can pick up.

Last week, Google joined Meta in this space by unveiling MusicLM, a generative model for creating high-fidelity music from text descriptions such as “a calming violin melody supported by a distorted guitar riff”. MusicLM generates music at 24 kHz that stays consistent over several minutes by framing conditional music synthesis as a hierarchical sequence-to-sequence modelling problem.

Diffusion models are becoming increasingly popular, and they are no longer used only for images. These models can now generate video, speech, and even music from text.

Music synthesis is the latest arena for diffusion models. While there has been some progress, there’s still much more to discover and explore in this exciting field.

Shritama Saha

Shritama (she/her) is a technology journalist at AIM who is passionate about exploring the influence of AI on different domains, including fashion, healthcare and banking.