A band known on YouTube as Dadabots has been streaming a 24/7 death metal broadcast called Relentless Doppelganger. The kicker is that the death metal sounds on the stream are completely generated by a neural network.
CJ Carr and Zack Zukowski, the minds behind Dadabots, outlined the idea for neural synthesis of metal music in a 2018 paper. This was followed up with the release of 10 albums, culminating in the Relentless Doppelganger project.
How Dadabots Works
The project is a neural network that was trained on the music of a death metal band known as Archspire. The network used to create the music is a model known as SampleRNN.
A recurrent neural network, SampleRNN was originally developed for text-to-speech purposes. It was designed to be trained on raw audio samples of speech and then to generate new raw audio that resembles that speech.
The RNN identifies patterns in a specific genre of music, predicts the elements most likely to come next, and plays them. The network also knows which output sounds ‘real’ and which does not, allowing it to continually improve itself the more it is trained.
Moreover, the model also supports conditional training with metadata. This, according to Dadabots, gives the model a “distinct sonic behaviour”. SampleRNN works by predicting what the next audio sample will be, one sample at a time, based on the audio that came before it.
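To make the sample-by-sample idea concrete, here is a minimal sketch of autoregressive generation. It is not the Dadabots code or the SampleRNN implementation: `toy_next_sample_probs` is a hypothetical stand-in for a trained network, and the 256 quantization levels and 16 kHz sample rate are assumptions chosen for illustration. What it does show is the general loop, in which each predicted sample is fed back in as input for the next prediction, so the number of predictions grows with the sample rate.

```python
import random

QUANT_LEVELS = 256  # assumption: amplitudes quantized to 8 bits


def toy_next_sample_probs(history):
    """Hypothetical stand-in for a trained network.

    Returns one probability per quantized amplitude, favouring values
    close to the previous sample so the waveform changes smoothly.
    """
    prev = history[-1]
    weights = [1.0 / (1 + abs(level - prev)) for level in range(QUANT_LEVELS)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(seconds, sample_rate, seed=0):
    """Generate audio one sample at a time, feeding each output back in."""
    rng = random.Random(seed)
    samples = [QUANT_LEVELS // 2]  # start from the mid-range level
    n_predictions = round(seconds * sample_rate)  # one prediction per sample
    for _ in range(n_predictions):
        probs = toy_next_sample_probs(samples)
        next_sample = rng.choices(range(QUANT_LEVELS), weights=probs, k=1)[0]
        samples.append(next_sample)  # the output becomes the next input
    return samples[1:]  # drop the starting sample


# Assumed sample rate of 16 kHz: 10 ms of audio needs 160 predictions.
audio = generate(seconds=0.01, sample_rate=16000)
print(len(audio))  # 160
```

At this assumed rate, a single minute of audio would take 960,000 sequential predictions, which gives a sense of why the sample rate matters so much for this kind of model.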
The predictions are based on the weights of the RNN, with the number of predictions depending on the sample rate of the audio. The TTS model was then repurposed to be trained on collections of various artists. Albums were picked for training by the researchers, as they stated,
“We choose to use one album from one artist, because it represents the longest cohesively packaged musical idea. Teams of producers and audio engineers ensure that a commercial album is normalized, dynamically compressed, denoised, and filtered to deliver a consistent aural quality across various listening environments.”
In other words, albums were picked because they represented the cleanest possible data that could be given to the model. Tests were then run with multiple genres of music, such as hip-hop, rock, skate punk and black metal.
The AI Before Relentless Doppelganger
As mentioned previously, the band released 10 albums before initiating the Relentless Doppelganger project. They released a paper in 2018 detailing how the network was trained from scratch.
One of the first albums, called Deep The Beatles, was the result of the RNN training on the album ONE by The Beatles. This did not create favourable results, as the model “never learns rhythm or song structure”. However, voice and percussion are present in the later iterations of the model, showing that the researchers were indeed on the right path.
A second attempt produced the album Calculating Calculating Infinity, which was trained on the mathcore album Calculating Infinity by The Dillinger Escape Plan. However, this run hit a bug in the autoregression part of the algorithm, which made the album too repetitive.
A breakthrough came with the third album, Coditany of Timeness, which was trained on the black metal album Diotima by Krallice. The researchers stated,
“This was the first model of ours that learned to keep time, heard in the consistent pulse of the blast beats.”
They iterated on the model with the albums Inorganimate, Megaturing and Bot Prownies, trained on Nothing by Meshuggah, Mirrored by Battles, and Punk In Drublic by NOFX respectively.
Over the course of these albums, they worked out a method of human curation as “only about 5% of the AI-generated tracks were usable”. Carr further stated,
“The remarkable part is the high quality-to-s**t ratio. Here [Relentless Doppelganger], we livestream 100 percent of it. Zero curation necessary.”
The Success Of Relentless Doppelganger
The stream, which has been going on for more than a month now, continues to blast out death metal music that sounds human-made to the untrained ear. Reportedly, human curation could be dropped here thanks, in part, to the training data itself.
Other iterations had songs destabilise and fall apart in front of the researchers’ eyes. According to Carr, the network performs better on music with faster blast beats, which brings ‘stability’ to the output, and Archspire is known for being “insanely fast”. The server is also completely autonomous. Carr went on to state,
“[It’s] running on a linux server somewhere in South Carolina. You’re hearing everything it makes.”
This opens up the interesting possibility of autonomous servers blasting out similar music across a range of other genres. It also shows that the age of AI-powered music may not be as far away as we think.