Decoding Bird Song

21 March 2017

Interview with

Dr Dan Stowell, Queen Mary University of London

Some people say that there’s nothing like the sound of birdsong, particularly during an early morning dawn chorus, and many of us are pretty good at identifying a number of species by their distinctive calls. But our understanding of what birds are actually saying to each other when they sing is still very limited. Jane Reck has been along to Queen Mary University of London to hear about some research that is changing this...
 
Dan - The timing’s really important; it’s not just some notes but some notes with a particular sequencing, with a gap of about 200 to 300 milliseconds between each of them.

Jane - Dr Dan Stowell is a research fellow in machine listening at Queen Mary University of London. His work has already been used to develop an app called “Warbler” which identifies a UK bird from the recording a user makes. Now he hopes to take the computer analysis of the sounds birds make to a new level to discover more about the social interaction that’s going on.

Dan - Traditionally you would take explicit measures such as: how long is this sound, what frequency is that sound? But, in order to go beyond that, we use modern machine learning methods where you don’t necessarily know how a computer has made a decision about a particular sound. But by training it, which means showing it lots of previous examples, we can encourage a computer algorithm to generalise from those.

Jane - At the university’s laboratory aviary, female zebra finches provide plenty of audio examples for Dan’s research.

Dan - We’ve put the timing of the calls together with acoustic analysis of what is the content of that call. Is it a short call or a long call, for example? So with the zebra finches that we’re working with, to some extent there’s knowledge about what the calls are and what their purpose is. When the birds are just hanging around together, they very often make short calls to each other just in the ordinary course of business so they just sound a bit like “ma.”

If one of them gets separated a little bit - it doesn't have to be too far, maybe it gets separated a couple of metres from its partner - then it would do a distance call which sounds more like “maaa.” A little bit longer, a little bit more emphasis. It’s quite clear from the content that it’s for re-establishing contact and making sure that you’ve not lost your partner or your group.

We’re starting by taking small groups of birds and recording all the calls, and using the timing of those calls to decipher: is this bird, when it calls, influencing another bird, so are its calls causing another bird to call? It’s very difficult to tell that just by listening to the recording, but we can apply an analysis that asks: does the probability of one bird calling increase after this bird calls, or does it decrease, or is there some more subtle interaction? Then we can work out how strongly each bird influences each other and that gives us a kind of picture of the communication network in that group of birds.
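
To make the timing idea concrete, here is a minimal sketch in Python. It is not the method used in Dan's research, only a simplified illustration: it compares how often one bird calls in a short window after another bird's calls with how often it calls overall. The function name `influence_ratio`, the 0.5 second window and the example timestamps are all assumptions made up for this illustration. A ratio above 1 would suggest that calls become more likely after the other bird calls, and a ratio below 1 would suggest they become less likely.

```python
from bisect import bisect_right


def influence_ratio(trigger_calls, response_calls, duration, window=0.5):
    """Compare the response bird's call rate just after the trigger bird's
    calls with its overall call rate across the recording.

    trigger_calls, response_calls: call onset times in seconds.
    duration: total length of the recording in seconds.
    window: how long after each trigger call to look (assumed value).

    Simplification: overlapping windows, and windows that run past the end
    of the recording, are not corrected for.
    """
    if not trigger_calls or not response_calls or duration <= 0 or window <= 0:
        raise ValueError("need calls from both birds and positive durations")

    responses = sorted(response_calls)
    baseline_rate = len(responses) / duration  # calls per second overall

    # Count response calls falling in (t, t + window] for each trigger call.
    count = 0
    for t in trigger_calls:
        count += bisect_right(responses, t + window) - bisect_right(responses, t)

    rate_after_trigger = count / (len(trigger_calls) * window)
    return rate_after_trigger / baseline_rate


# Hypothetical timestamps: bird B tends to call a quarter of a second after bird A.
bird_a = [1.0, 5.0, 9.0, 13.0]
bird_b = [1.25, 5.25, 9.25, 13.25]

a_to_b = influence_ratio(bird_a, bird_b, duration=20.0)
b_to_a = influence_ratio(bird_b, bird_a, duration=20.0)
print("A -> B:", a_to_b)
print("B -> A:", b_to_a)
```

Computing this ratio for every ordered pair of birds in a group would give a rough table of who appears to influence whom, which is one simple way to picture a communication network.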

Jane - All of Dan’s research has been supported by the Engineering & Physical Sciences Research Council. In the longer term it could be used in a wide variety of areas…

Dan - Deciphering the dawn chorus is certainly one of the long term goals of this kind of work. Certainly something I’m very interested in. But the general application of automatic bird detection or automatic monitoring has a lot of significance in terms of monitoring populations. We know, for example, that the latitudes bird populations migrate to depend, at least in part, on the effects of climate change, and so monitoring these things is important.

Looking at the detail of bird vocalisations and how birds interact with each other is important in the long term for understanding animal communication, which includes human communication. People working on these things are looking at birdsong, at least in part, because it’s an analogy to human language. Songbirds learn to sing in a fashion analogous to how humans learn to speak a language. And so we can improve the monitoring of animal sounds, and we can improve the understanding and decoding of animal sounds.

More generally, we actually have quite a lot of applications in which machines are going to need to understand the world around them through sound as well as through vision. Whether that’s self-driving cars, whether that’s your mobile phone, whether it’s monitoring CCTV, for example. Although people have been working on speech recognition and speech technology for a long time, what this work can feed into is a more general understanding, a more general sound analysis of an ordinary sound environment.

Jane - There was also an unforeseen, but very welcome, addition to Dan’s research, which has come from the thousands of sounds collected by the great British public through the Warbler app. This big data citizen science aspect will contribute to the machine learning work to help a computer analyse whether a particular sound is, or isn’t, made by a bird.

Dan - One thing that we didn’t quite expect was that people would like to test the recognition quality for themselves by making funny noises into the phone and seeing what decision it came up with. So, as a result, we have an unexpected extra benefit of this collection of bird impressions, and whistling, and squawking children, and other things.

Part of what’s motivating this is, essentially, the big question: what is birdsong? How can a computer know this is birdsong; this is a squeaky door; this is a small child? Those kinds of questions you have to try and really address if we’re going to be able to automate this kind of detection. There are people creating projects right now where they have unattended microphone systems in the forest recording, and trying to identify which birds occur where. In order to be able to do that, in any sort of scalable way, we’re going to need algorithms that can say “yes, that is a bird” or “no, that’s just a tree creaking in the wind”.
