How machine learning can generate music

Are computers starting to turn the tables on us and become creative themselves?
15 December 2020

Interview with Rebecca Fiebrink, University of the Arts London


A stylised computer network.


So far we’ve heard how computers can both analyse and inspire creativity; but are they starting to turn the tables on us and become creative themselves? The field of 'creative computing' has progressed far in the last few years alone - at least according to creative computing expert Rebecca Fiebrink. She told Phil Sansom that the scale of musical possibilities would have made poor Beethoven's head spin...

Rebecca - I think he'd be astonished. Obviously computers don't just allow us to replay music; they allow us to record it, they allow us to compose it, and to use increasingly interesting algorithms to process and generate music within the computer.

Phil - Is that true? An algorithm can actually generate music?

Rebecca - Well, it depends on your definition of music, but I think we're getting very close - if not already successful - in having computer algorithms that generate patterns of sounds that people would identify as music, and even very enjoyable music in some cases.

Phil - You're talking here about machine learning, right?

Rebecca - That's right. Machine learning is a set of computational techniques for finding patterns in data, and then generating new data that includes similar patterns.

Phil - And what are they looking for in the music? Is it which notes come after which notes, or is it other stuff?

Rebecca - It depends on what kind of machine learning you use and what kind of musical representation you use. One common way of using machine learning in music is to think about music as a sequence of notes over time. And this doesn't work equally well for all music, but if you want to get a computer listening to the melody of a pop song or a folk tune, then actually it's not too bad. Some of the current machine learning techniques that have just come out in the last year or so have been successful using a different representation of music, more like the representation of music you would use if you stored a recording on your phone in order to listen to it. That representation of music is much more complicated, right? You're not just capturing which notes are playing and when they're playing, but you're capturing information about the instrumentation, the volume, the texture; like the way that a piano sounds, or a drum machine sounds, or a singer's voice sounds.
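
Editor's illustration: the "sequence of notes" idea can be sketched with a first-order Markov chain, which is a far simpler technique than the systems discussed in this interview and is used here only as a stand-in. The sketch records which notes followed which in a short training melody, then generates a new melody by repeatedly picking one of the recorded followers at random. The training melody, the function names `learn_transitions` and `generate`, and the `seed` parameter are all assumptions made for the example, not anything from the interview.

```python
import random
from collections import defaultdict


def learn_transitions(melody):
    """Record, for each note, every note that followed it in the melody."""
    transitions = defaultdict(list)
    for current, following in zip(melody, melody[1:]):
        transitions[current].append(following)
    return transitions


def generate(transitions, start, length, seed=None):
    """Build a new melody by repeatedly sampling a recorded follower."""
    rng = random.Random(seed)  # a fixed seed makes the result repeatable
    notes = [start]
    while len(notes) < length:
        options = transitions.get(notes[-1])
        if not options:  # this note was never followed by anything; stop
            break
        notes.append(rng.choice(options))
    return notes


# Hypothetical training data: a short tune written as note names.
training_melody = ["E", "D", "C", "D", "E", "E", "E",
                   "D", "D", "D", "E", "G", "G"]

transitions = learn_transitions(training_melody)
new_melody = generate(transitions, start="E", length=8, seed=1)
print(" ".join(new_melody))
```

Every pair of neighbouring notes in the generated melody also appears somewhere in the training melody, which is one narrow sense of "new data that includes similar patterns". Because the model only looks one note back, it has no notion of timing, instrumentation or sound texture, which is the information the raw-audio representation described above has to capture.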

Phil - That last one you mentioned boggled my mind. Can it really capture how a singer sounds and can it recreate a singer as if it was that actual singer?

Rebecca - Absolutely. And I would not necessarily have believed this even five years ago, but machine learning technology has moved so quickly. In April this year a company called OpenAI came out with... it's called Jukebox. And one of the first examples I listened to there was a song; and I use that word loosely, but it's a new, fake song sung in the style of Frank Sinatra, called... I think it's "Hot Tub Christmas".

Phil - Can we take a listen now?

Rebecca - Absolutely.

Phil - Wow.

Rebecca - It's hilarious! It's timely, it's actually a pretty fun song; I'm putting it on my Christmas playlist right now. And it sounds like Sinatra.

Phil - You're saying those aren't samples of words taken out of different songs of Sinatra; those are actual bits of sound that it has generated to mimic Frank Sinatra.

Rebecca - That's right.

Phil - That's crazy to me.

Rebecca - It is crazy. To be fair, this algorithm did not also do the job of coming up with the lyrics. The researchers, along with their lyrics generation system, decided, "oh, this would be fun as lyrics for a new Frank Sinatra song".

Phil - I mean, elephant in the room: the backing track sounds like a haunted funfair.

Rebecca - Yes, yes. That is a limitation of this type of music generation algorithm. The OpenAI folks who made this system did some clever tricks to figure out how we could even do this at all. And one of those tricks involves some noise in the generated piece of music; it's going to sound a little bit like background noise, like you might hear from an old record; some of it is going to have some weird stuff happening; and the pitches that you hear... things might sound not totally in tune.

Phil - Given unlimited computing resources, do you think that they could basically recreate modern music as we know and love it? Or are there more fundamental limitations to an algorithm trying to be creative and make new music?

Rebecca - I absolutely believe that these techniques are going to get better and better in the next few years, approaching the kind of quality we would expect from a recording of a real musician. There are a couple of things that I still see as big barriers. One of the barriers is that it's really, really difficult to generate music that sounds believable in terms of the longer term structure. A Sinatra song is likely to have verses and choruses; when you look at classical music you're going to have much more complicated structures, things like symphonies, where one theme from the first movement might reappear in a changed way at the end of the piece. And these systems really don't have the ability to represent or generate structure at that scale right now. Another big barrier is that these systems tend to be very hard for people to control. You can hit redo and have it do the whole thing again, and come up with something different, because there is randomness in these systems; but if you say, "you know what, I'd really like this to be a little bit peppier," or maybe, "I want this to be in a different key," or, "I want there to be a trumpet in this song," right? There are all sorts of things that humans would want to do, if you start looking at this as a tool for making new kinds of music.

