Polly Toynbee recently wrote on the relationship between income inequality and the prevalence of obesity (more inequality leads to more obesity, she claims). Naturally this has provoked the usual ignorant rebuttals from various corners of the web. Matthew Turner lists a few of these and points out that, whatever the merits or otherwise of Toynbee's piece -- frankly I have better things to do than read it, or the rebuttals, in any detail -- if you plot the prevalence of obesity against the Gini coefficient of income inequality for a bunch of OECD countries, you do indeed get a (weak) positive correlation.

Now, there are two important things to say about this. One is that the various countries of the OECD have substantially different cultures, and this is the sort of thing which is likely to influence the prevalence of obesity. Another is that different countries have widely differing populations, whereas the only sensible causal argument that could be made here is that people who live in unequal societies are (for some reason) more likely to be obese than those who live in more equal ones. On that basis, we should be looking at the data weighted by the populations of the various countries; but if you do this the results are dominated by the appearance of the US (very unequal, very obese, and very populous) and Japan (not very unequal, not very obese, quite populous).
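To make "weighted by the populations" concrete, here is a minimal sketch of a population-weighted correlation coefficient. The function name `weighted_pearson` and the numbers in the demo are my own inventions for illustration (they are not OECD figures), and this is just one reasonable way of doing the weighting rather than a record of any particular calculation.

```python
import math


def weighted_pearson(xs, ys, weights):
    """Pearson correlation of xs and ys, each point counted in proportion to its weight."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    # Weighted covariance and variances about the weighted means.
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys)) / total
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs)) / total
    var_y = sum(w * (y - mean_y) ** 2 for w, y in zip(weights, ys)) / total
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up illustrative values only: inequality index, obesity rate, population.
    inequality = [0.25, 0.28, 0.30, 0.33, 0.40]
    obesity = [14.0, 12.0, 15.0, 11.0, 30.0]
    population = [5.0, 10.0, 8.0, 60.0, 300.0]
    equal = [1.0] * len(population)
    print("unweighted:", weighted_pearson(inequality, obesity, equal))
    print("weighted:  ", weighted_pearson(inequality, obesity, population))
```

With weights like these, the one or two most populous points largely decide the answer, which is exactly the problem with the US and Japan.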

One way around this is to look at obesity within the United States instead. It's true that there is cultural variation within the United States, but presumably it's less important than among the OECD countries; and there is a wide spread of populations among the different states (which are the unit for which population, obesity and income data are most conveniently available), but it's not as skewed as the distribution of population in the OECD. Anyway, we can get somewhere with this: plot the prevalence of obesity against the Gini coefficient for each state and fit a straight line.

The best-fit line has a slope of 11.45±11.33 (that's a standard error, not a confidence interval). So this provides *very weak* evidence for positive correlation (that is, the result is compatible with the two variables being uncorrelated or weakly positively correlated, but not with a substantial negative correlation or a strong positive one). Any relationship in these data is far from striking.
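For anyone wondering where a "slope ± standard error" figure comes from, the following is a minimal sketch of an ordinary (unweighted) least-squares fit with the usual standard error of the slope. The function name `fit_line` is hypothetical and the data in the demo are a toy example, not the state figures; I'm also assuming an unweighted fit, which is not necessarily how the number quoted above was produced.

```python
import math


def fit_line(xs, ys):
    """Least-squares line through (xs, ys): returns (slope, intercept, slope_stderr)."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three points to estimate a standard error")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    slope_stderr = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, intercept, slope_stderr


if __name__ == "__main__":
    # Toy data for illustration only.
    slope, intercept, stderr = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
    print(f"slope = {slope:.3f} +/- {stderr:.3f}, slope/stderr = {slope / stderr:.2f}")
```

The useful number is the ratio of the slope to its standard error. For the fit quoted above that ratio is 11.45/11.33, or about 1.01, well short of the value of roughly 2 that is conventionally needed before anyone calls a slope significant.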

Can we conclude anything useful from this? Not a lot, frankly, beyond that you shouldn't assume that somebody else's statistics are right just because they disagree with Polly Toynbee. I'm suspicious of this sort of thing anyway, because there's no explanation of how income inequality is supposed to make people obese. Toynbee seems to think it's (roughly) something to do with self-esteem, but doesn't really offer any evidence for this. I doubt that anyone's likely to get to the bottom of this one just using summary statistics.

(As an aside, you might be wondering what the Gini coefficient is or why it's a useful measure of income inequality. Wikipedia will tell you that it's defined as twice the area between the Lorenz curve of a distribution and the Lorenz curve of a perfectly equal distribution (one in which everybody has the same income), which sounds easy to calculate but not obviously meaningful; and MathWorld will tell you that it's the normalised mean of the absolute difference between each pair of incomes in the distribution (half the mean absolute difference, divided by the mean income), which sounds much more sensible but a pain to compute. These two definitions appear to be completely different, but surprisingly enough they turn out to be the same. Isn't that nice?)
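If you'd rather check that the two definitions agree than take it on trust, here is a small sketch that computes the Gini coefficient both ways for a finite list of incomes. The function names are my own, and I'm assuming the standard normalisations: one minus twice the area under the (piecewise-linear) Lorenz curve, and the mean absolute difference over all ordered pairs divided by twice the mean income.

```python
def gini_lorenz(incomes):
    """Gini as 1 - 2 * (area under the Lorenz curve), using trapezoids."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    area = 0.0
    cumulative = 0.0
    previous_share = 0.0
    for income in ordered:
        cumulative += income
        share = cumulative / total  # fraction of all income held so far
        # Each person occupies a strip of width 1/n under the curve.
        area += (previous_share + share) / (2 * n)
        previous_share = share
    return 1 - 2 * area


def gini_mean_difference(incomes):
    """Gini as the mean absolute difference over all ordered pairs, over twice the mean."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_difference = sum(abs(a - b) for a in incomes for b in incomes)
    return total_difference / (2 * n * n * mean)


if __name__ == "__main__":
    sample = [1, 2, 3, 4]  # toy incomes, for illustration only
    print(gini_lorenz(sample), gini_mean_difference(sample))
```

The pairwise version takes time quadratic in the number of incomes, whereas the Lorenz version needs only a sort, which is a fair summary of "sensible but a pain to compute" versus "easy to calculate but not obviously meaningful".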