# Your number's up: catching tax fraud

26 March 2015

## Interview with

Prof. David Spiegelhalter, Statistics laboratory, University of Cambridge


There is a mathematical phenomenon that shows the universe isn't as random with its numbers as we might think. You might expect that if you took all the numbers that appeared in a newspaper or on the Naked Scientists website, they would have a random distribution, with about the same proportions of numbers starting with a 1, a 2, a 3 and so on. Professor David Spiegelhalter tells Chris Smith why this isn't actually the case...

David - When people came in this evening, we asked them to record their house number and I think those all went on to the computer. But when we look at the first digit of each of these house numbers, so that would be 11 88 4444, one here, let's look at what pattern those digits form. Okay, what we see among the house numbers is that out of 25 numbers, 11 of them started with a 1. Three of them began with a 2, two began with a 3, four began with a 4, and then 5 began with an 8. If we look at that, we've got a little graph here that shows this distribution of the first digit of the house numbers and it clearly is not equally spread between 1 and 9. There's a great preponderance of low first digits, in particular the number 1. And that's an example, a very rough example, of what's called Benford's Law. Benford's Law says that the first digits of many quantities are not evenly distributed. If we took the lengths of rivers, if we took how much people earn, if we took the number of books in libraries, if we took the number of people living in cities and looked at the first digit of each of those numbers, and plotted them out in a graph like this, they would all have a similar pattern, with 1 the most common first digit. In fact, 30% of all numbers begin with 1. Isn't that extraordinary? And 18% begin with 2, 13% begin with 3, and it goes down and down, and down. And that distribution of numbers just occurs again and again, and again.
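The percentages David quotes match the usual statement of Benford's Law, in which the expected share of numbers with first digit d is log10(1 + 1/d). That formula is not given in the interview; it is the standard form of the law. The short Python sketch below compares it with the house-number counts read out above. The names `benford_expected` and `house_counts` are made up for illustration, and the zero counts for 5, 6, 7 and 9 are inferred from the fact that the quoted counts already add up to 25.

```python
import math

def benford_expected(digit: int) -> float:
    """Expected share of numbers whose first digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)

# First-digit counts quoted in the interview for the 25 house numbers.
# Digits 5, 6, 7 and 9 are assumed to be zero (the quoted counts sum to 25).
house_counts = {1: 11, 2: 3, 3: 2, 4: 4, 5: 0, 6: 0, 7: 0, 8: 5, 9: 0}
total = sum(house_counts.values())

for digit in range(1, 10):
    observed = house_counts[digit] / total
    expected = benford_expected(digit)
    print(f"{digit}: observed {observed:5.1%}   Benford {expected:5.1%}")
```

With only 25 house numbers the match is bound to be rough, which is how David describes it.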

Chris - And that's known as Benford's Law. So, if one were to take any natural phenomenon, including accountants' records, then one should see that pattern present there.

David - You've got to have numbers that cover a very wide range. If I just asked you your height in feet, then most of you would give me 4, 5, or 6, or something like that. So, that wouldn't work at all. It's got to be something with a really big spread of numbers, such as lengths of rivers. In fact, it's quite remarkable that it works with house numbers quite so well. So that was lucky. But for accounts, where you might have very small amounts or very large amounts, it works brilliantly, and Benford's Law is used as a tool to detect people fiddling accounts.

Chris - Why should it be useful for detecting people with fiddled accounts because wouldn't they just know that?

David - Well, it's quite difficult to reconstruct figures that obey Benford's Law.

Chris - What you're saying is, when people fiddle the books, they try to choose numbers that they think look plausible but in fact, they're not.

David - And the classic example of that was when people looked at the Greek national accounts for 2000 when they were under a lot of scrutiny and they were putting in for EU monitors. In 2008 for example, they looked at all the numbers in the Greek national accounts and 34% of them began with a 2, which is almost double the amount it should be. It should only be 18% that begin with a 2. And that was considered clear evidence that people were essentially making up the figures. Let's say there were lots of numbers in the accounts and they were in pounds, and I wanted to translate them into another unit, say, dollars. Let's pretend there's \$2 to the pound. All those numbers would double. Now, if the numbers didn't obey this law and there were equal numbers in those accounts beginning with 1, 2, 3, 4, 5, 6, 7, 8, 9, you realise that all the amounts that started with a 5 to a 9, when you double them to change them into dollars, would now begin with a 1. So, you get a completely different distribution coming out just by changing the currency. But if the numbers obey Benford's Law, you can change the currency and they still obey Benford's Law. That's where the mathematical form for Benford's Law comes from; it's what's called an invariant distribution. You can change the units, you can change lengths of rivers from miles to kilometres to inches, and they will still all obey Benford's Law. It's an absolutely beautiful bit of mathematics.
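David's pounds-to-dollars thought experiment can be tried out directly. The Python sketch below is only an illustration under stated assumptions: `uniform_pounds` is a made-up set of amounts (every whole number from 100 to 999) whose first digits are equally common, and `benford_pounds` is a made-up set spread evenly on a logarithmic scale from 100 up to a million, which is one simple way to get amounts that follow Benford's Law. The \$2-to-the-pound rate is the pretend rate from the interview.

```python
from collections import Counter

def first_digit(n: int) -> int:
    """Leading digit of a positive whole number."""
    return int(str(n)[0])

def digit_shares(values):
    """Share of values starting with each digit 1 to 9."""
    counts = Counter(first_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

def to_dollars(pounds):
    """Convert at the pretend rate of 2 dollars to the pound."""
    return [2 * p for p in pounds]

# Hypothetical accounts with equal numbers starting with each digit.
uniform_pounds = list(range(100, 1000))

# Hypothetical accounts spread evenly on a log scale over four
# orders of magnitude (100 up to 1,000,000), which follow Benford's Law.
n = 40_000
benford_pounds = [int(10 ** (2 + 4 * k / n)) for k in range(n)]

for label, pounds in [("equal digits", uniform_pounds),
                      ("Benford-like", benford_pounds)]:
    before = digit_shares(pounds)[1]
    after = digit_shares(to_dollars(pounds))[1]
    print(f"{label}: start with 1 in pounds {before:.1%}, in dollars {after:.1%}")
```

In the equal-digits accounts, every amount from 500 to 999 doubles to something between 1,000 and 1,998, so the share starting with 1 jumps from one in nine to five in nine. In the Benford-like accounts the share starting with 1 stays at roughly 30% in either currency.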

Chris - And the police do actually use this.

David - This is used for forensic accounting work, yes.

Chris - The clincher though, why does this apply? Why does that happen?

David - Ah! That's a really difficult thing to say. The usual way to explain this is to say, well, if it didn't, then when you change currencies you would end up with different distributions. It's almost like saying, "If it is going to have a distribution, it must obey Benford's Law." That's not a very good answer, is it?
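The invariance argument can be written down a little more explicitly. What follows is a sketch in standard notation rather than anything stated in the interview; the symbols $X$ (an amount), $c$ (a conversion rate), $d$ (a first digit) and $U$ are introduced here for illustration. Assume the fractional part of $\log_{10} X$ is spread uniformly between 0 and 1:

$$
U = \log_{10} X \bmod 1 \sim \mathrm{Uniform}[0, 1)
$$

The first digit of $X$ is $d$ exactly when $U$ lies between $\log_{10} d$ and $\log_{10}(d+1)$, so

$$
P(\text{first digit} = d) = \log_{10}(d+1) - \log_{10} d = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, \dots, 9.
$$

Changing units multiplies $X$ by a constant $c > 0$, which only shifts the logarithm:

$$
\log_{10}(cX) \bmod 1 = \left(U + \log_{10} c\right) \bmod 1 .
$$

A uniform distribution on $[0, 1)$ is unchanged by such a shift, so the first-digit proportions are the same in the new units. For $d = 1$ this gives $\log_{10} 2 \approx 0.301$, the 30% quoted earlier.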

Chris - I'll ask a question again. So, why does this apply? Why do we see this?

David - You see, it's a natural way that numbers cluster. You have to work your way all through the 1s before you get to the 2s. So, if you think of the lengths of streets, you count them up to a hundred. But then for the next big jump in the lengths of streets, they are all going to begin with a hundred. And so, a street is not equally likely to be half a mile long as 100 miles long.

Chris - So, there are more things that are in small numbers than there are things that are in big numbers and therefore, the small numbers, starting with 1 for example, are going to crop up more often.

David - I guess you could say that.