Press Clipping
Music’s problem with AI training may not be the one you expect

Many of the recent controversies around creative visual AIs – those that create images mainly – have focused on what material those models were trained on.

Specifically, whether they hoovered up human artists’ and designers’ work without permission, credit or any kind of licensing fees, and then compounded that by letting anyone make new images ‘in the style of’ those creators.

We’ve seen a fair amount of sabre-rattling in the music industry in recent weeks as a follow-on from this, with rightsholders warning creative musical AI startups not to follow suit. “The industry needs to be paid and our artists need to be paid based on the AIs that are learning off their music,” as WMG’s Oana Ruxandra put it in January at the NY:LON Connect conference.

Here’s the thing, though. What if the problem for musical AIs is not that they have been trained on copyrighted music, but that they haven’t? That’s one of the points that stands out from our interview on Friday with Oleg Stavitsky, CEO of AI music startup Endel.

“The thing with AI is that the output is only as good as the input. You need high-quality data sets to train your models on,” he said. “Most of the AI music models were trained on just stock music, or stems that were created by a bunch of session musicians basically.”

“We have seen what AI did for graphic design. There has been massive outrage when people have recognised their style in the output of some of the big systems: because basically they’ve taken all of the visual design and art and fed it into their machines,” he continued.

“Fortunately, you cannot do that with music, because it belongs to someone. The music industry has been clear on that: ‘If you train your model on our content, we’re going to come after you’.

“So, in order for AI music to become as good as the actual music that we all love, it needs to be trained on actual [commercial] music, and for that, you need to take that hard road and collaborate with music labels.”

“You need to talk to musicians and get their stems to train your models. Otherwise your output will still sound like stock music.”

This is just one startup, and the “most of the AI music models” quote [our emphasis on “most”] suggests that some of the others may have taken a different path. The sabre-rattling may still have a point. But Stavitsky’s comments do shed light on one of the paradoxes of AI music.

If musical AIs have only been trained on stock music, their output may only be good enough to rival, well, stock music. Taking the next step up in quality may need much wider training sets, which likely means licensing deals with rightsholders. The same rightsholders who may worry about the impact if these deals help AI music become “as good as the actual music that we all love”…

Delicate? Just a bit. But it’s a useful encapsulation of why AI music and its impact on the music industry may not necessarily follow the same pattern or throw up the exact same challenges as creative AIs in the visual arts and design worlds.