Not Very Good at Statistics

A frequent argument I've had with my brother is over the effectiveness of speed cameras; as an ex-police-officer he's massively in favour and spends lots of time quoting various studies at me that 'prove' speed cameras work.

For a study to be unbiased and generate useful statistics, especially in this field, it is very important to eliminate selection effects and to use proper scientific controls, such that the effects of measuring the results do not affect the results of the experiment. In the case of speed cameras the ideal experiment would be to randomly select some speed camera sites and run two parallel universes, one with cameras and one without and compare the results. This is difficult to carry out for obvious reasons. Another method would be to randomly select sites and put cameras in half of them; however, this runs the risk that the sites with cameras might affect the sites without cameras.

However, the majority of studies done have far more serious selection effects. The methodology is to choose sites with high accident statistics, install cameras and observe that the accidents have fallen at these sites without reference to any others. The behaviour of random statistics at these levels is such that we expect this to happen even if no camera is installed. This is a well known phenomenon called "regression to the mean".

A simple try-it-at-home experiment

Hypothesis: placing a piece of paper under a die decreases the number of sixes it rolls.

The experiment: Take ten dice. Roll each of them ten times, noting the number of sixes for each. Place a piece of paper under the three highest scoring dice. Roll each of those three ten times, counting their sixes again. Note that the second total is almost always lower than the first total, proving that the piece of paper decreased the number of sixes rolled by the dice.

Reroll 5,7,8

We conclude that putting paper under dice decreases the number of sixes rolled by an average of 66%.

The problem with this experiment is that it's a pile of rubbish.

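It's a pile of rubbish because we've chosen to analyse an artificial selection of the data, rather than either all the data or a randomly chosen sample of it. The dice we picked were picked precisely because they had just scored unusually high, so their next set of rolls is likely to be closer to average whatever we do to them. This phenomenon is known as regression to the mean.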
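For anyone without ten dice to hand, the experiment can be sketched in a few lines of Python. This is only an illustration: the function name `paper_experiment`, the fixed seed and the 10,000 repeats are assumptions made for the sketch, and the "paper" deliberately does nothing at all.

```python
import random


def paper_experiment(rng, n_dice=10, n_rolls=10, n_selected=3):
    """Return (sixes before, sixes after) for the dice chosen to get 'paper'."""

    def count_sixes():
        # Number of sixes in n_rolls throws of one fair die.
        return sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)

    first = [count_sixes() for _ in range(n_dice)]
    # Biased selection step: keep only the highest scoring dice.
    chosen = sorted(range(n_dice), key=lambda i: first[i], reverse=True)[:n_selected]
    before = sum(first[i] for i in chosen)
    # The "paper" has no effect: the re-roll uses the same fair dice.
    after = sum(count_sixes() for _ in chosen)
    return before, after


rng = random.Random(1)  # hypothetical seed, only so reruns are repeatable
trials = [paper_experiment(rng) for _ in range(10_000)]
mean_before = sum(b for b, _ in trials) / len(trials)
mean_after = sum(a for _, a in trials) / len(trials)
fell = sum(1 for b, a in trials if a < b) / len(trials)
print(f"mean sixes before paper: {mean_before:.2f}")
print(f"mean sixes after paper:  {mean_after:.2f}")
print(f"fraction of experiments where the total fell: {fell:.2f}")
```

Three fair dice rolled ten times each should average about five sixes between them, which is what the "after" figure settles on. The "before" figure is higher only because of how the dice were chosen.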

Speed Camera Studies

Most speed camera studies are conducted by choosing camera sites based on areas with a high accident rate, installing the cameras and discovering that in general the accident rates fall.

I decided to do a fictional study of Cambridge, since there are two studies available, one from Cambridge City Council and one from the National Safety Camera Partnership: Cambridge Council Results, National Safety Camera Partnership.

To model this I wrote a computer program to generate traffic studies. It generates traffic data as follows:

This gives us six years' worth of accident data with typical statistics for each site. Then we place speed cameras at each site which has more than three fatal or serious accidents in the first three years, in accordance with the national policy quoted at Department of Transport Study.

On our speed camera sites we then repeat the studies previously quoted. In the case of the Cambridge study we compare the fatal and serious accident rates for the previous three years against the following three years; in the case of the National Safety Camera Partnership we weight the accidents according to their ratings and compare the results. Note that their weighting skews the study away from the slight accident statistics, which are reasonable because they weren't used to select the sites, and towards the fatal and serious accidents, which were.
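The three year comparison described above can be sketched as follows. This is not the Java program mentioned on this page, and the accident model is an assumption: each site gets independent Poisson counts of fatal or serious (KSI) accidents, with a hypothetical rate of two per three years picked so that roughly 14 sites in 100 qualify for a camera. The weighted one year study is left out because the weights aren't stated here. Python's `random` module has no Poisson sampler, so the sketch uses Knuth's multiplication method.

```python
import math
import random

N_SITES = 100         # matches "100 sites considered" in the sample output
YEARS = 6             # three years before, three years after
KSI_RATE = 2.0 / 3.0  # ASSUMPTION: mean KSI accidents per site per year
THRESHOLD = 3         # camera if MORE than three KSI in the first three years


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    product = 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return k
        k += 1


def run_study(rng):
    """Return (cameras placed, KSI before, KSI after) summed over camera sites."""
    cameras = before_total = after_total = 0
    for _ in range(N_SITES):
        # All six years are generated before any camera decision is made,
        # so a camera cannot possibly influence the figures.
        yearly = [poisson(rng, KSI_RATE) for _ in range(YEARS)]
        before, after = sum(yearly[:3]), sum(yearly[3:])
        if before > THRESHOLD:
            cameras += 1
            before_total += before
            after_total += after
    return cameras, before_total, after_total


rng = random.Random(2024)  # hypothetical seed, for repeatability only
cameras, before_total, after_total = run_study(rng)
if cameras:
    improvement = 100.0 * (before_total - after_total) / before_total
    print(f"{cameras} cameras installed")
    print(f"Overall sites {before_total} to {after_total}")
    print(f"Percentage {improvement:.1f}")
else:
    print("No site met the camera threshold in this city")
```

Under this assumed model the "after" years at camera sites average two KSI accidents, the same as every other site, while the "before" years are at least four by construction.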

A sample output looks like this for a given randomly generated city. [tidied slightly for display]

Conducting three year study
Site  0; Improvement in KSI 5 to 3 =  40.0 %
Site 15; Improvement in KSI 4 to 2 =  50.0 %
Site 18; Improvement in KSI 4 to 2 =  50.0 %
Site 19; Improvement in KSI 4 to 1 =  75.0 %
Site 20; Improvement in KSI 4 to 2 =  50.0 %
Site 33; Improvement in KSI 4 to 0 = 100.0 %
Site 42; Improvement in KSI 5 to 3 =  40.0 %
Site 47; Improvement in KSI 4 to 1 =  75.0 %
Site 60; Improvement in KSI 4 to 3 =  25.0 %
Site 64; Improvement in KSI 4 to 3 =  25.0 %
Site 65; Improvement in KSI 6 to 0 = 100.0 %
Site 80; Improvement in KSI 4 to 2 =  50.0 %
Site 86; Improvement in KSI 4 to 1 =  75.0 %
Site 88; Improvement in KSI 4 to 4 =   0.0 %
Overall sites 60 to 27
Percentage 55.0

Conducting one year study
Site  0; Improvement in weighted value 11.58 to 6.58 =   43.2 %
Site 15; Improvement in weighted value  4.00 to 4.00 =    0.0 %
Site 18; Improvement in weighted value  2.00 to 2.00 =    0.0 %
Site 19; Improvement in weighted value  2.00 to 6.58 = -229.0 %
Site 20; Improvement in weighted value  7.58 to 7.58 =    0.0 %
Site 33; Improvement in weighted value  6.58 to 2.00 =   69.6 %
Site 42; Improvement in weighted value 15.16 to 2.00 =   86.8 %
Site 47; Improvement in weighted value 20.74 to 3.00 =   85.5 %
Site 60; Improvement in weighted value  7.58 to 5.58 =   26.4 %
Site 64; Improvement in weighted value  5.58 to 6.58 =  -17.9 %
Site 65; Improvement in weighted value 48.04 to 4.00 =   91.7 %
Site 80; Improvement in weighted value 41.46 to 1.00 =   97.6 %
Site 86; Improvement in weighted value 22.74 to 1.00 =   95.6 %
Site 88; Improvement in weighted value 12.16 to 6.58 =   45.9 %
Overall sites 207.2 to 58.48
Percentage 71.8

Presenting Study Results
100 sites considered
14 Cameras installed
City Council 55.0% improvement
National Safety Camera Partnership 71.8% improvement
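The headline percentages can be checked from the per-site lines of the sample output. The short Python sketch below copies the figures from the listing; the variable names are just for illustration. It also shows that the "Overall" figure is computed from pooled totals, which is not the same as averaging the per-site percentages.

```python
# (before, after) KSI counts copied from the three year study listing.
ksi = [(5, 3), (4, 2), (4, 2), (4, 1), (4, 2), (4, 0), (5, 3),
       (4, 1), (4, 3), (4, 3), (6, 0), (4, 2), (4, 1), (4, 4)]

# (before, after) weighted values copied from the one year study listing.
weighted = [(11.58, 6.58), (4.00, 4.00), (2.00, 2.00), (2.00, 6.58),
            (7.58, 7.58), (6.58, 2.00), (15.16, 2.00), (20.74, 3.00),
            (7.58, 5.58), (5.58, 6.58), (48.04, 4.00), (41.46, 1.00),
            (22.74, 1.00), (12.16, 6.58)]


def pooled_improvement(pairs):
    """Percentage fall computed from the summed before and after totals."""
    before = sum(b for b, _ in pairs)
    after = sum(a for _, a in pairs)
    return before, after, 100.0 * (before - after) / before


ksi_before, ksi_after, ksi_pct = pooled_improvement(ksi)
w_before, w_after, w_pct = pooled_improvement(weighted)
# Averaging the per-site percentages gives a different number.
mean_of_sites = sum(100.0 * (b - a) / b for b, a in ksi) / len(ksi)

print(f"Three year study: {ksi_before} to {ksi_after} = {ksi_pct:.1f} %")
print(f"One year study:   {w_before:.2f} to {w_after:.2f} = {w_pct:.1f} %")
print(f"Mean of per-site KSI percentages: {mean_of_sites:.1f} %")
```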

Just for good measure we conduct the survey 500 times to give us the expected average results and deviation. We get

Note that the accident statistics were generated before the cameras were placed. The cameras in this study are guaranteed not to affect the accident rate.
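Repeating the survey many times is easy to sketch. The Python below is an illustration under the same kind of assumed model (independent Poisson KSI counts at a hypothetical two accidents per site per three years, 100 sites per city); it is not the Java program, and it covers only the three year KSI comparison. Because the "after" figures are drawn from exactly the same distribution as the "before" figures, any improvement it reports comes from the selection alone.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    product = 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return k
        k += 1


def city_improvement(rng, n_sites=100, rate=2.0 / 3.0, threshold=3):
    """Percentage fall in KSI at 'camera' sites for one random city, or None."""
    before_total = after_total = 0
    for _ in range(n_sites):
        before = poisson(rng, 3 * rate)  # first three years
        after = poisson(rng, 3 * rate)   # next three years, same rate
        if before > threshold:           # site would have been given a camera
            before_total += before
            after_total += after
    if before_total == 0:
        return None                      # no site qualified in this city
    return 100.0 * (before_total - after_total) / before_total


rng = random.Random(500)  # hypothetical seed, for repeatability only
runs = (city_improvement(rng) for _ in range(500))
results = [r for r in runs if r is not None]
mean_improvement = statistics.mean(results)
spread = statistics.stdev(results)
print(f"{len(results)} cities: mean improvement {mean_improvement:.1f} %, "
      f"standard deviation {spread:.1f} %")
```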

The conclusion here is clear: the Cambridge and National studies quoted on their websites are rubbish, and their results are meaningless.

If you want to try out my program, the Java source code is here

Local Mirrors of the PDFs:

Can you explain that again?

Imagine we have two locations which have an average accident rate of 1 accident per year. We can imagine that our accident figures might look like this.

Site    1990  1991  1992  1993  1994  1995  1996  1997  1998  1999  2000  2001  2002
Site A     1     0     2     1     0     1     0     3     1     0     0     1     2
Site B     2     1     0     0     1     1     2     1     1     1     0     0     3

According to the selection criteria we need 4 accidents in 3 years to place a camera. So we'd place a camera at Site A in 1997, and at Site B in 1996. Looking at the three years before and after:

Site    3 years before  3 years after  Improvement
Site A               4              1          75%
Site B               4              3          25%

We see that for these two sites the cameras are placed just after a "bad period", and immediately afterwards the accident rate returns to the mean, which is lower than the bad period. It appears that the camera caused the accident rate to fall even though the long term averages before and after the camera placement are unchanged. The measured decrease is therefore a result of the way the experiment is carried out, not of any real decrease in the accident rate.
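The worked example above can be checked mechanically. The Python sketch below takes the two rows of made-up accident figures, finds the first year in which the trailing three year total reaches 4, and compares it with the following three years. The function name and the rule "use the first qualifying year" are assumptions made for the illustration.

```python
YEARS = list(range(1990, 2003))
SITES = {
    "Site A": [1, 0, 2, 1, 0, 1, 0, 3, 1, 0, 0, 1, 2],
    "Site B": [2, 1, 0, 0, 1, 1, 2, 1, 1, 1, 0, 0, 3],
}
NEEDED = 4  # accidents in three years required to place a camera


def camera_year_and_windows(counts):
    """Return (camera year, accidents in 3 years before, in 3 years after)."""
    # Stop early enough that three full "after" years are always available.
    for i in range(2, len(counts) - 3):
        before = sum(counts[i - 2:i + 1])  # the three years ending in year i
        if before >= NEEDED:
            after = sum(counts[i + 1:i + 4])  # the following three years
            return YEARS[i], before, after
    return None


for name, counts in SITES.items():
    year, before, after = camera_year_and_windows(counts)
    improvement = 100.0 * (before - after) / before
    long_run = sum(counts) / len(counts)
    print(f"{name}: camera in {year}, {before} before, {after} after, "
          f"improvement {improvement:.0f}%, long run mean {long_run:.2f}/year")
```

The long run mean printed for each site stays close to one accident per year, which is the point: nothing about the site changed, only the window we chose to look through.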

What is your point?

The National Safety Camera Partnership are either deliberately lying to the public by presenting invalid statistics or they are incapable of understanding the analysis they are presenting. They are either deliberately deceptive or incompetent, neither of which is acceptable in a public body.

For a page detailing why I like speed cameras see a page of my opinions.

More Bad Surveys

All the studies I have seen so far about speed cameras are rubbish. This Australian study is no better - no mention of a control. National accident statistics fall, which you are supposed to infer is because of speed cameras. The Pilot study is better and tries to include a control (using other areas with no speed cameras) but suffers from comparing,

It also suffers from regression to the mean problems - see H.8:
We could not obtain data for the before period of individual sites other than at camera sites. It was therefore not possible to check fully for regression to the mean at the site level.
Whilst this conclusion is correct - the study does suffer from regression to the mean - the reasoning is wrong. The effects occur because the sites were selected according to the variable measured, not randomly.

Also, being a pilot study, it's questionable to extrapolate from 170 sites to the now more than 4000 in the country, especially since many more counties have now introduced cameras which, according to this study, should show a huge decrease in accident statistics which doesn't seem to have occurred.

Equally, the results presented by the other side of the argument are rubbish too, e.g. the Association of British Drivers 'proving' that speed cameras killed 5500 people. The argument is that the number of road deaths fell until 1993, which was roughly the year speed cameras started, and obviously the trend would have continued, so the speed cameras must be to blame. This article and this article by Chris Lightfoot utterly rubbish the ABD's claims.

The Conclusion

Speed Cameras may reduce accidents; they may increase them. However there is very little data which isn't flawed by regression to the mean and selection effect problems. Until someone does a decent study free from selection effects designed to isolate only the effects of the cameras we're unlikely to find out anything useful.

More concerning to me is that I have seen no evidence that anyone participating in the implementation of speed camera programs has the slightest idea of how to judge the results. Why are we installing cameras at considerable expense to the public and the driver at the behest of people who can't measure if it did any good? More seriously, why are we letting people carry out expensive experiments with allegedly life or death consequences with no scientific training to devise the experiment or statistical training to analyse the results?

Absence of Evidence isn't Evidence of Absence

For those of you who are confused: this page does not demonstrate that speed cameras do not work. It does not demonstrate that speed cameras cause accidents. It merely tells us that the method of measuring speed camera effectiveness is flawed, which is completely different to asserting that speed cameras themselves are flawed.

Additional Road Safety Stuff

This newsgroup posting by Chris Lightfoot is worth reading. It answers a question, asked by Tim Ward, a Cambridge city councillor, about being able to measure the changes in accident rate. He demonstrates that for works that increase the accident rate the probability of seeing a decrease can be as high as 80%.

National accident statistics for Cambridge. Comparing 2001-2002 - which admittedly is a statistical load of rubbish - we discover that the report referenced earlier gives an 18% decrease in accidents at camera sites, whereas over the whole county we see a 2% drop. Since regression to the mean occurs in both directions, we expect non-camera sites to see an increase in the number of accidents - which is exactly what occurs.
