The tell-tale hum!

03 May 2016

Interview with

Peter French, J&P French Associates, York University

Audio can be key to cracking a case, be it wiretap evidence or identifying a suspect by their speech. Peter French is from JP French Associates, which provides expert assistance with forensic audio, as he explained to Georgia Mills.

Peter - The making of audio recordings is a very significant source of evidence in both criminal investigations and in prosecutions in the sense that if the police have reason to believe that someone is involved in criminal activity, they may record that person.

Georgia - That's Peter French, Chairman of the speech and acoustics laboratory JP French Associates. They do what's called audio forensics - not something I'd ever heard of before, so he took me through some of the things the company does...

Peter - So one of the things we're asked to do is to compare the voices in criminal recordings to the voices of known suspects. Now when I say criminal recordings, they could be a whole range of things. For instance, if the police think that you're involved in high level crime - what they often refer to as top drawer crime - they will, in fact, bug your premises under a warrant and we'd be comparing those voices with the voices of known suspects, usually from police interview recordings.

Georgia - What happens if the quality of the recording isn't too good?

Peter - This is, in fact, a very frequent scenario. A lot of the recordings that we get into the lab for processing have a lot of background noise in them or other problems of intelligibility associated with them. What we can do in those cases is to apply digital sound processing programmes to the recordings in order to reduce the noise. Typical scenarios would be things like undercover police officers posing as drug buyers, striking up deals with a drug dealer. This might take place in somewhere like a pub, so what we'd have in the background is a loud jukebox, we might have glasses clinking, and what we'd be trying to do is to reduce the level of that noise relative to the foreground conversation so that it can be more clearly heard. And there's a variety of digital processing techniques that we use in order to enhance the sound quality.

Georgia - How does it actually work when you need to get something out? I'm thinking of CSI when they just click 'enhance' and the image becomes clearer, and clearer, and clearer. What do you actually do - how does it work?

Peter - Well, not the way that CSI portray it. In fact, CSI is one of our worst enemies because it gives police officers and other clients very elevated expectations of what they're going to get back from us. The techniques we use - a whole range of them really. If it's broad spectrum noise (noise that goes right across the frequency spectrum), what we'll do is to use a technique known as spectral subtraction. This involves taking the recording and locating parts of the recording where there's no speech from the people in question - pauses between words and between sentences. And what we do is we sample that noise using the computer and then, once it's assembled a profile of the noise, it will remove noise with that profile from the sound file as a whole and that will, usually, improve intelligibility.
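For readers who want to see the idea in code, here is a minimal sketch of spectral subtraction along the lines Peter describes: build a noise profile from a pause, then subtract that profile from every frame. It is an illustration under simplifying assumptions, not the software the lab uses. The function names, the 32-sample frame size, the synthetic "speech" (a plain tone) and the noise level are all invented for the example, and real tools use overlapping, windowed frames and a fast Fourier transform.

```python
import cmath
import math
import random


def dft(frame):
    """Naive discrete Fourier transform (fine for short frames)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def frames(signal, size):
    """Split a signal into consecutive, non-overlapping frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def noise_profile(noise_only, size):
    """Average magnitude per frequency bin over a speech-free stretch."""
    spectra = [dft(f) for f in frames(noise_only, size)]
    return [sum(abs(s[k]) for s in spectra) / len(spectra) for k in range(size)]


def spectral_subtract(signal, profile, size):
    """Subtract the noise profile from each frame's magnitudes, keeping the phase."""
    out = []
    for f in frames(signal, size):
        cleaned = []
        for x, p in zip(dft(f), profile):
            magnitude = max(abs(x) - p, 0.0)  # never go below zero
            cleaned.append(cmath.rect(magnitude, cmath.phase(x)))
        out.extend(idft(cleaned))
    return out


def energy(samples):
    return sum(s * s for s in samples)


# Synthetic demo: a pause (noise only) followed by a tone standing in for speech.
rng = random.Random(0)
SIZE = 32
pause_len = 16 * SIZE
tone = [0.0] * pause_len + [math.sin(2 * math.pi * 4 * t / SIZE)
                            for t in range(16 * SIZE)]
noisy = [s + rng.gauss(0, 0.1) for s in tone]

profile = noise_profile(noisy[:pause_len], SIZE)
cleaned = spectral_subtract(noisy, profile, SIZE)
```

Comparing `energy(cleaned[:pause_len])` with `energy(noisy[:pause_len])` shows how much quieter the pause becomes, while the tone in the second half is largely preserved.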

Georgia - We had a clip of Peter working his magic on a real wiretap recording but, ironically, we couldn't get legal clearance to broadcast it. However, the techniques actually sound very similar to what we do at the Naked Scientists when we've got a poor quality recording - so I can give you a slightly less swish example. Here's a bit of audio from SeaLife in London with a lot of background noise.

And here it is post noise-reduction - the editing software simply strips out anything that matches the background hum, leaving the speech. But just as you can edit the background noise out of a clip, equally, if you were so inclined, you could chop up someone's speech just like a jigsaw and change the order or even the meaning of what they say. It's not something we'd ever do, but how do you know what you're getting is genuine?

Peter - We have available a new technique which is known as ENF analysis (Electrical Network Frequency analysis). We're talking here about the fact that we've got an alternating electric current in use, and the nominal rate of alternation in the current is 50 times a second, in other words 50 hertz, but, in reality, it's never absolutely spot on 50 hertz. It alternates unpredictably 49 point something, 51 point something, 50 point something, backwards and forwards, and this happens on a moment to moment basis in response to different levels of demand on the electrical network. The point is though that because we get these moment by moment fluctuations in the level of alternation, that means that any slice of time on a recording where this mains hum is represented, actually tells us when that recording was made.

So let's say someone submits a recording to us and says this is a whole recording, it's continuous, which took place on the 4th April. Because we record the mains hum 24/7, we'll run it against a database and we'll say well, actually, the first 10 seconds of that recording were made on the 27th March, 2014. There's then 30 seconds of speech which comes from, let's say, the 31st October, 2014 beginning at 2 o'clock in the afternoon, and so on and so forth. So, by looking at the mains hum on a recording, we can usually tell a) when it was made and b) if it's a mosaic of pieces of recording from different times and dates.
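The matching step Peter describes can be sketched in a few lines of Python. This is a toy illustration, not the lab's method: the "database" is a simulated hour of mains frequency readings (one per second), the offsets 500 and 2000 are made up, and the names `best_offset`, `mismatch` and `observe` are invented for the example. Each piece of the recording is slid along the reference until the fluctuations line up; if consecutive pieces line up at times that do not follow on from each other, the recording looks like a mosaic.

```python
import random


def mismatch(reference, trace, offset):
    """Mean squared difference between the trace and the reference at an offset."""
    return sum((reference[offset + i] - v) ** 2
               for i, v in enumerate(trace)) / len(trace)


def best_offset(reference, trace):
    """Offset (in seconds) where the trace fits the reference most closely."""
    candidates = range(len(reference) - len(trace) + 1)
    return min(candidates, key=lambda o: mismatch(reference, trace, o))


# Simulated reference: one hour of mains frequency drifting around 50 Hz.
rng = random.Random(1)
reference = []
freq = 50.0
for _ in range(3600):
    freq += 0.05 * (50.0 - freq) + rng.gauss(0, 0.005)
    reference.append(freq)


def observe(start, length):
    """Hum frequencies as a recorder might capture them, with slight measurement error."""
    return [reference[start + i] + rng.gauss(0, 0.001) for i in range(length)]


# A "recording" stitched together from two different moments in the hour.
recording = observe(500, 30) + observe(2000, 30)

first = best_offset(reference, recording[:30])
second = best_offset(reference, recording[30:])
continuous = (second == first + 30)  # a genuine recording would run straight on
```

Here `first` and `second` come back as the two moments the pieces were taken from, and `continuous` is `False`, which is the sign that the recording was assembled from separate pieces.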

What I should say is, in order to do this, the recording device doesn't necessarily have to be connected to the mains. Even if it's a battery operated device or even a mobile phone, as long as you're recording in say an urban environment or you're reasonably near to a mains source, you will often get induced mains hum down on the noise floor of the recording and what we can do is to amplify it and analyse it from there - run it against the database.
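As a rough idea of how a faint hum might be measured, here is a small Python sketch that estimates the hum frequency in successive two-second windows by testing candidate frequencies around 50 Hz and keeping the strongest. It is an assumption-laden illustration rather than a description of the lab's tools: the function names, the 400 Hz sample rate, the search range and step, and the synthetic hum and noise levels are all invented for the example.

```python
import cmath
import math
import random


def hum_frequency(window, sample_rate, lo=49.5, hi=50.5, step=0.01):
    """Return the candidate frequency (Hz) with the most power in the window."""
    n = len(window)
    # Hann taper to limit leakage from the edges of the window.
    tapered = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
               for i, s in enumerate(window)]
    best_f, best_power = lo, -1.0
    for j in range(int(round((hi - lo) / step)) + 1):
        f = lo + j * step
        acc = sum(s * cmath.exp(-2j * math.pi * f * i / sample_rate)
                  for i, s in enumerate(tapered))
        power = abs(acc) ** 2
        if power > best_power:
            best_f, best_power = f, power
    return best_f


def enf_trace(samples, sample_rate, seconds_per_window=2):
    """One hum-frequency estimate per consecutive window."""
    size = sample_rate * seconds_per_window
    return [hum_frequency(samples[i:i + size], sample_rate)
            for i in range(0, len(samples) - size + 1, size)]


# Synthetic demo: a very quiet hum whose frequency changes every two seconds.
rng = random.Random(2)
RATE = 400
true_freqs = [49.93, 50.04, 50.11]
samples = []
for f in true_freqs:
    for i in range(2 * RATE):
        hum = 0.01 * math.sin(2 * math.pi * f * i / RATE)
        samples.append(hum + rng.gauss(0, 0.001))

trace = enf_trace(samples, RATE)
```

The resulting `trace` is the kind of sequence that could then be run against a database of logged mains frequencies.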

Georgia - And is the hum different enough from day to day?

Peter - It's different enough from second to second. I mean, it's changing in frequency all the time and it's doing so totally unpredictably. It's not just that we can pin it down to a day, we can pin it down to the second that it was started and the second it was finished. It's almost like a fingerprint - a time fingerprint which is peculiar to any section of time that we have on the recording and it will be unique to that slice of time.
