Song sung who?: The exploding landscape of deep-fake music

Updated on: Aug 25, 2023, 20:00:50 IST

By Bhanuj Kappal

Recent releases have included an ‘Eminem song’ about cats and tracks by a faux-Oasis. Above, a Midjourney rendering of the landscape of cross-bred music generated by humans and AI programs.

New programs can put any lyrics, any tune, to the voice of an icon. Some of these tracks are being mistaken for new releases. Where will it end?

In May, Indian music fans stumbled upon a rare treasure on YouTube: a cover of Pakistani singer Rahat Fateh Ali Khan’s Tumhe Dillagi (You and Infatuation), with vocals by Atif Aslam, Diljit Dosanjh and Sidhu Moosewala.

The track quickly went viral, racking up views and adulatory comments. There was just one problem. None of those artists had actually sung the track. Their voices had been generated by US-based techie and DJ Amarjit Singh, using an artificial intelligence (AI) voice model.

DJ Singh’s YouTube channel (@DJMRAsingh) is full of such “AI covers”. They simulate the inflections and intonation of artists ranging from Bollywood crooner Arijit Singh to the late Punjabi folk icon Amar Singh Chamkila.

This is the world of deep-fake music, an exploding landscape of music generated by technology.

This year alone, there have been tracks recreated by AI in the voices of a range of music megastars, dead and alive. In April, a TikToker with the handle Ghostwriter977 self-released Heart on My Sleeve, featuring AI-generated vocals moulded on the voices of Drake and The Weeknd. Before it was taken down, the song racked up 600,000 Spotify streams, and over 15.2 million views across TikTok and YouTube.

This is a path that the music industry first wandered onto in 2004, when Yamaha released the voice-synthesiser software Vocaloid as part of its popular keyboard range. These keyboards had already graduated from being mini-pianos to offering a range of built-in sounds, and then a range of built-in songs that could be embellished during live play. Vocaloid came with ready-made voices. Users could input lyrics and have their “song” play in the voice of one of the actors and singers who had signed on to feature in the system.

Fast-forward a decade and, in 2015, Sony Computer Science Laboratories used an AI program, and some help from French composer and lyricist Benoît Carré, to analyse a collection of Beatles songs and create a new track: Daddy’s Car. Neither the lyrics nor the tune is noteworthy, but they do ring a familiar bell. And this would be a turning point.

Today, a laptop and some free time are all one needs to replicate such an effort.

Feed a few moments of audio into an AI model such as Diff-SVC (Singing Voice Conversion via Diffusion), and one can have the late Kurt Cobain be the lead singer of one’s grunge revival band, or have Travis Scott rap over beats generated on Beatoven.ai.

Recent releases created in this manner include a rap song about cats “by Eminem” (“Cats cats cats, they’re always on the prowl / They’re sneaky and sly, with their eyes on the goal” goes one line of Cat Rap, created last year by YouTuber Grandayy and ChatGPT; it has since been taken down following a notice from Eminem’s music label). Similar efforts feature the “voices” of Kanye West and Rihanna; there’s an alternate-reality faux-Oasis, for fans who wish the band had never broken up.

Newer platforms allow users to recreate music with no input of their own; just a few clicks. On OpenAI’s Jukebox, released in 2020, one can mix and match song, genre and artist, to create a new rendition in seconds. So City of Stars from La La Land can play out in the “voices” of the late Frank Sinatra and Ella Fitzgerald, as a classic jazz track. Or as a pop number that sounds like the Backstreet Boys. Or as a hip-hop track in the “voice” and meter of Tupac Shakur.

Walk the line

The possibilities, both creative and commercial, are endless. Which is why AI is being compared to some of the most disruptive innovations in music history, such as the synthesiser and Napster (the peer-to-peer music sharing platform that kicked off the digital revolution in music distribution).

There is, of course, the issue of copyright. Universal Music Group (UMG) has been issuing takedown notices under existing copyright laws, to discourage the use of its artists’ voices. These artists include Eminem, Drake and The Weeknd.

There is also the question of consent. Courts around the world are ruling that AI-generated work is essentially copyright-free, so an AI music cover may ultimately not be profitable for anyone. But what about the artist’s right to control what is said and sung in their voice?

Meanwhile, legal exceptions are emerging for copyright on works with “substantial human input”. Laws will eventually need to navigate this tangle with respect to AI-generated music.

In March, about 40 music labels and artist organisations came together to form the Human Artistry Campaign, to outline possible AI best practices and argue for the protection of copyright and intellectual property rights.

UMG, in a statement released in April, summarised the problem as a question of “which side of history all stakeholders in the music ecosystem want to be on: the side of artists, fans and human creative expression, or on the side of deep fakes, fraud and denying artists their due compensation.”

Then again, this month, it emerged that UMG and Google are negotiating a deal that would allow the licensing of its artists’ voices for AI-generated music. In that other high-stakes world of storytelling — cinema — artists are pushing back against such alliances.

But the truth is that the genie is out of the bottle, and no amount of rubbing on the lamp is going to force it back in.

We’ve been here before: with the synthesiser and Napster, with television and Netflix. It is unlikely that the music industry or the world of filmmaking will be able to push back successfully against a disruptor as significant as evolving AI. By the time the dust settles, though, they’ll be just two of the many things in a world that no longer looks or sounds the same.

