Meme culture has always been at the cutting edge of tech. From being a mainstay of BBSes in the late 1990s to picking on NFT bros, memes have come a long way. But these Internet jokers might have just hit the mother lode of meme-making technology—audio deepfakes.
While AI voice memes have been around in some form since ‘15.ai’ launched in 2020, the recent launch of Eleven Labs’ AI speech synthesiser has supercharged this trend. With the rise of accessible generative AI, it seems that the meme community has finally found its voice.
All Fun and Games
While generating voices with AI is relatively new, meme creators have been finding ways to fabricate audio of people for almost two decades now. The trend began in 2004 with YouTube Poops—humorous edits of older media. The video platform has also given aspiring voice actors a stage to dub over other media, giving rise to channels like Jaboody Dubs, Team Four Star, and Bad Lip Reading.
While these channels used older methods—overdubbing with a similar-sounding voice and careful video editing—generative AI is taking the meme world by storm. Those older methods generally require knowledge of complex software, while GenAI needs only some text and a one-minute sample of a person's voice.
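To illustrate just how little is needed, here is a minimal Python sketch of what a request to a speech-synthesis service of this kind might look like: a short text string sent to an endpoint tied to a cloned voice. The base URL, voice ID, header name, and payload fields here are illustrative assumptions modelled on typical REST text-to-speech APIs, not a verified reproduction of Eleven Labs' actual interface—consult the official documentation before relying on any of these names.

```python
import json
from urllib import request

API_BASE = "https://api.elevenlabs.io/v1"  # assumed base URL; check the official docs
VOICE_ID = "your-cloned-voice-id"          # hypothetical ID of a voice cloned from ~1 minute of audio

def build_tts_request(text: str, api_key: str) -> request.Request:
    """Build an HTTP request asking the service to speak `text` in the
    cloned voice. Endpoint path and field names are assumptions."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return request.Request(
        url=f"{API_BASE}/text-to-speech/{VOICE_ID}",
        data=payload,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

# Sending this request (and saving the audio bytes it returns) is all a
# meme-maker would need to do—no video-editing suite required.
req = build_tts_request("Any sentence the speaker never actually said.", "demo-key")
print(req.full_url)
```

The point of the sketch is the contrast with the old workflow: a single short HTTP call replaces hours of hunting for usable clips and splicing them together.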
Over the last month, there has been a huge spike in interest in memes created using generative AI, as seen in the Google Trends graph above. With the launch of Eleven Labs' accessible AI synthesiser, it's no wonder that meme creators decided to use the power of AI to crack a few jokes at the expense of some celebrities.
A popular meme-maker spoke to Analytics India Magazine on the rise of generative AI, stating, “Eleven Labs has completely changed the game. I can even make people say something they’ve never ever said before. This just wasn’t possible before. I still remember cutting together clips from the Spider-Man movies just to get him to say a swear word. With AI, I can type in even the most obscure copypasta and have Donald Trump give a speech about it in like five minutes.”
Indeed, the meme community has been overtaken by a new trend of ‘Presidents playing video games’. These videos usually involve Joe Biden, Donald Trump, and Barack Obama speaking to each other in a lingo resembling that of gamers in a voice chat. Some particularly hilarious videos include ‘The US Presidents Have A Sleepover’, ‘Presidents Debate Which Cheeto Is Best’ (featuring a special appearance from George Bush), and ‘Presidents Rank Harry Potter’.
Apart from the US presidents meme, other celebrities caught up in the GenAI wave include Joe Rogan, Elon Musk, Jordan Peterson, and Ben Shapiro. A tweet shared by Musk a few weeks ago shows how far AI can be used to fake voices: in the short video, the Tesla CEO seemingly announces the launch of his OnlyFans account, and many took it to be true until Musk himself tweeted about it.
Because the video was assembled from clips of Musk's appearance on the Joe Rogan podcast, with dynamic subtitles overlaid, many viewers failed to notice the obvious desynchronisation between the speakers' lips and the words being spoken. Packaging fake audio with a plausibly relevant video seems to be an easy way to trick viewers—but does it hide a far more sinister intent?
Until Someone Loses an Eye
While these memes have explored the light-hearted side of using GenAI to impersonate celebrities, critics are raising the alarm. According to the naysayers, this will result in a wave of unverifiable fake content fuelled by accessible services like Eleven Labs.
Termed ‘cheapfakes’, these accessible deepfakes present an unprecedented threat. Researchers from Data & Society have created a spectrum representing the differences between deepfakes and cheapfakes.
For example, even the most basic deepfakes require knowledge of software like After Effects, with more complex ones requiring an understanding of GANs or recurrent neural networks. Cheapfakes, on the other hand, can be produced in real time using filter apps or services like Eleven Labs. In their report, the researchers stated,
“Both technically sophisticated and exceedingly simple techniques can be used in works of art or fiction. But problems arise when these techniques are used to create works that are interpreted as evidence.”
Meme creators, on the other hand, have a different view on the topic. When asked about the probable misuses of voice generation, the aforementioned meme maker said,
“It’s all in good spirit. My viewers are aware that it’s not actually Joe Biden commentating on which anime waifu is the best. It can be used for negative stuff as well, but we should be allowed to have some fun with it too.”
For once, it seems that the public and researchers are in agreement. As with any other form of fake media, context is the key to decoding the true intent behind it. This is best represented by Twitter's labels, often attached to media of low veracity. The label, containing the text ‘Official sources stated that this is false and misleading’, often appears below such tweets, providing a bit of context to an often-misquoted piece of media.
Extending such principles to the watchdogs of the Internet, like Google's and Meta's platforms, could play a huge role in addressing fake news. A context cue describing a video as a deepfake or a cheapfake, as the case may be, would move the needle on GenAI from malicious tool to just another post-modern form of entertainment.