Statistics, or A Long Walk Off A Short Pier

Seeing Like a Neoliberal, Part 1: Blinded by the Data

Steven Pinker recently had an op-ed in the WSJ in which he summarised his new book celebrating the progress of mankind in recent decades. Pinker relied extensively on social statistics measuring (among other things) poverty, income, violence, the environment and health to make his point that, contrary to the naysayers, things are generally getting better.

Pinker is not alone. At the beginning of last year Nicholas Kristof wrote an article called “Why 2017 May Be the Best Year Ever”, focusing largely on the continued decline in extreme poverty experienced by humanity. Two years before that Zack Beauchamp, senior reporter at Vox, published an article entitled “The world is getting better all the time, in 11 maps and charts”, an article format which is not uncommon at Vox. Sam Bowman, formerly of the Adam Smith Institute, also made heavy use of statistical indicators in his defence of existing neoliberalism to illustrate that things are improving. Max Roser has an entire website, Our World in Data, which is basically dedicated to showing the same thing.

This is a perspective I’ve been sceptical of for a long time. The idea is that if ‘good’ outcomes such as income and health are increasing (or if ‘bad’ outcomes such as violence and poverty are falling) then we can confidently say that things are getting better. A general faith in social statistics to measure these outcomes is a crucial part of this perspective, starkly illustrated by Jonathan Portes’ article entitled “Forget anecdotes. If you want to know what’s going on in the real world, look at a spreadsheet”. Bowman’s neoliberal manifesto also stressed that neoliberals prefer “rigorous quantitative evidence”. I am therefore going to associate this perspective with neoliberalism, which I think is a reasonable judgement to make since it celebrates the current set of political and economic arrangements.

I want to make a few things clear. Firstly, I am not making a partisan point: I am not intending this as a direct refutation of neoliberalism as a philosophy, or of the general arguments made by Pinker or anyone else. My point is narrower, concerning an assumption which many of these arguments lean on: that statistical social indicators, often viewed from above, are a good way to measure ‘progress’. Secondly, as you will see, much of what I am saying is not new: I will be drawing from countless articles by people who have more in-depth knowledge than I do. Finally, I am not claiming that no ‘progress’ has taken place or that things are ‘actually getting worse’. It is clear that these social indicators mean something, and also clear from casual observation that in many ways and for many people, things have improved.

Nevertheless, I think that the triumphalism I have observed based on these social indicators is usually unwarranted. Favourable discussion of certain outcomes tells us only a partial story and is sometimes actively misleading. Ultimately, to know if things are going well, we need a fuller understanding of the context from which social statistics emerge, else we risk missing important details about the state of the world. My argument will be split into several posts, and the series will start with the most obvious reason to doubt statistics: the data are bad.

Blinded by the Data

Social data do not fall from the sky; they must be gathered both by and from people, which is costly and creates practical problems. Gathering data typically entails arbitrary methodological judgements which, in some cases, create inconsistencies in the raw data. Anyone who has worked with datasets knows the difficulties this creates: names of variables, survey questions and how the variables are recorded change year to year, and so some judgement must be used in place of anything better. For example, if we have a reported income band instead of income, do we use the mid-point, interval regression or something else to fill in the missing values? These kinds of questions are inescapable — as the political scientist Adam Przeworski put it:

“Economists are, by and large, careless about the data they use, especially the political data….I believe that results have to be reproducible from observations and rules…if you have “votes by party” and then “total number of votes,” you can do a little check to see if the votes by party add up to the total number of votes. You’d be surprised, because these things often don’t add up.”
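To make Przeworski's "little check" concrete, here is a minimal sketch in Python of what it might look like. Everything in it is an assumption for illustration: the `ElectionRecord` structure, the `find_inconsistent` helper, the district names and the vote counts are all invented, not taken from any real dataset.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ElectionRecord:
    """One hypothetical row of election data (illustrative structure only)."""
    district: str
    votes_by_party: Dict[str, int]
    reported_total: int


def find_inconsistent(records: List[ElectionRecord]) -> List[Tuple[str, int, int]]:
    """Return (district, sum of party votes, reported total) for every
    record where the party votes do not add up to the reported total."""
    mismatches = []
    for record in records:
        party_sum = sum(record.votes_by_party.values())
        if party_sum != record.reported_total:
            mismatches.append((record.district, party_sum, record.reported_total))
    return mismatches


# Invented figures, purely for illustration.
records = [
    ElectionRecord("North", {"A": 1200, "B": 800}, 2000),
    ElectionRecord("South", {"A": 950, "B": 1010}, 2000),  # party votes sum to 1960
]

print(find_inconsistent(records))  # [('South', 1960, 2000)]
```

A check like this only flags that something does not add up. It cannot say whether the party counts or the total is wrong, or whether spoiled ballots explain the gap, which is exactly where judgement re-enters.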

If you take the ‘science’ part of ‘social science’ seriously (and it is not unfair to suggest that a commitment to rationalism and science is a mainstay of these types of arguments) you should worry about almost any reported data, and should not really

