Wednesday, 23 March 2011

I'm Frank Benford and this is the law

Whilst he may not be quite on the same level as Alan Turing, or even William Gosset, Frank Benford is perhaps one of the unsung heroes of twentieth century Mathematics. True, he didn't help shorten World War II, and he didn't even enable us to compare two sample means taken from populations of unknown variance, but a knowledge of his work could help prevent you from getting caught next time you try to fiddle your tax return. Sorry, I mean it could help governments around the world spot fraudulent accounting.

What Mr Benford spotted was that, in many data sets, the distribution of the leading digits is not rectangular. That is, the digits 1 to 9 won't occur roughly the same number of times, and nor should they be expected to (so don't ask them to as it will only create tension between you). The digit 1 should be expected to appear as the leading digit roughly 30% of the time, and successive digits are decreasingly likely, following a logarithmic pattern: the chance of a leading digit d is log10(1 + 1/d), which has dropped to under 5% by the time you get to 9.
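If you'd like to see this for yourself, here is a minimal Python sketch. It uses the usual statement of the law, that a leading digit d turns up with probability log10(1 + 1/d), and compares it with a data set of my own choosing: the first 1000 powers of 2. That choice is purely an assumption for illustration (it happens to be a sequence known to behave nicely), and the function names are made up for this example.

```python
import math
from collections import Counter


def benford_probability(d):
    """Expected share of values whose leading digit is d (1 to 9)."""
    return math.log10(1 + 1 / d)


def leading_digit(n):
    """First digit of a positive integer."""
    return int(str(n)[0])


# Illustrative data set (an assumption, not from any real accounts):
# the first 1000 powers of 2.
data = [2 ** k for k in range(1, 1001)]
counts = Counter(leading_digit(n) for n in data)

for d in range(1, 10):
    expected = benford_probability(d)
    observed = counts[d] / len(data)
    print(f"{d}: expected {expected:.3f}, observed {observed:.3f}")
```

Swap in your own list of positive whole numbers for `data` (populations, river lengths, expense claims) and see how close the two columns get. Not every data set will play along, which is rather the point when someone has been making the numbers up.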

This phenomenon is now known as Benford's Law and is, would you believe, actually admissible as evidence in a court of law in the US! At least, that's what Wikipedia says. The same Wikipedia article also says that the law was previously "stated" by Simon Newcomb, which makes me wonder... what constitutes being-statedness? Did he have his work published? Did he merely write it down in his notes? Or did he just mention it in the pub one night while having a few beers with his friends? Although this being the nineteenth century, he was more likely in a tavern quaffing ale with stout companions.

Whoever said Statistics isn't interesting? This is the stuff we should be teaching in school! Which gets me on to the topic of how we teach Statistics, another one for the soap box one day. I really should write these ideas down on a list somewhere so I don't forget. In fact, I'm going to make a note to do just that right now.* Why do we teach it using the techniques as the starting points, rather than the contexts in which they are used? And I'm not just talking about really difficult stuff like formal hypothesis testing, it starts at an early age.

Every school I've ever worked in has had a "module" where I have to teach children how to draw a bar chart. Why? What's the point in ever drawing a bar chart unless you have something interesting to say about it? Unlike the quadratic formula or trigonometric identities, there is no inherent beauty in a bar chart. Try doing an image search for "beautiful bar charts" and see what you get. Are any of those so beautiful you want to hang them from your wall?** Most of the images aren't even bar charts! The blue bird in the hard hat is quite cute though.***

My point is, shouldn't all our Statistics teaching be centred on analysing and describing real data sets? Wouldn't it make so much more sense? In the earlier years of secondary, this is really just about making the content more interesting/relevant, but as it gets tougher it should also help comprehension. Rather than learning all about t-Tests and Z-Tests and Chi-Squared Tests and Mann-Whitney Tests and Product-Moment Correlation Coefficients and Stuff, just so that we can answer some pseudo-contextual questions in a book or on an exam paper, why not have students create their own data sets, ask them the right questions, and then show how the techniques can be used to answer them? Simple, no? I may put this blog on hold for a few months while I make a fortune writing a new post-16 Statistics textbook.

Still, Frank Benford, eh? Pretty cool stuff.

* I did actually go so far as to start a list of things I feel the need to get off my chest. Or that may be of interest to some people. Or, if I'm really lucky, both. And the teaching of Statistics is on that list.

** Maybe you wouldn't want the quadratic formula hanging on your wall either. The analogy is far from perfect.

*** I do realise that the Internet is an ever-changing thing and you may not get the blue bird in the hard hat, so here it is. And I'm glad I did this, as that looks like quite an interesting website. I'll add it to my list.
