
Inception Statistics

“Careful, we may be in a model…within a model.” (From an Inception movie poster.)

We’ve had a lot of very heated debates on this blog about the uses and abuses of global statistics—most recently on estimates of poverty, maternal mortality, and hunger—with a certain senior Aid Watch blogger inciting the ire of many (not least those who produce the figures) by calling them “made-up.”

A new study in the Lancet about the tragic problem of stillbirths raises similar questions: If stillbirths have been erratically and inconsistently measured in the past, especially in poor countries with weak health systems, what then are these new numbers based on?

Of the 193 countries covered in the study, the researchers were able to use actual, reported data for only 33. To produce the estimates for the other 160 countries, and to project the figures backwards to 1995, the researchers created a sophisticated statistical model. [1]

What’s wrong with a model? Well, 1) the credibility of the numbers that emerge from these models depends on the quality of the “real” (that is, actually measured or reported) data, as well as on how well those data can be extrapolated to the “modeled” setting (e.g., it would be bad if the real data came primarily from rich countries and were then “modeled” for the vastly different poor countries – oops, wait, that’s exactly the situation in this and most other “modeling” exercises), and 2) the number of people who actually understand these statistical techniques well enough to judge whether a given model has produced a good estimate or a bunch of garbage is very, very small.
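As a toy illustration of point 1 (entirely invented numbers, not the study’s data), here is a sketch of how a simple linear model fitted on one narrow range of observations can go badly wrong when extrapolated outside that range:

```python
# Fit y = a + b*x by ordinary least squares on a narrow "observed" range,
# then extrapolate to a point far outside it. The true relationship is
# nonlinear, so the fit looks fine in-sample and fails out-of-sample.
xs = [1.0, 2.0, 3.0, 4.0]      # in-sample predictor values
ys = [x ** 2 for x in xs]      # true (nonlinear) relationship

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x

in_sample_err = max(abs((a + b * x) - x ** 2) for x in xs)
out_of_sample_err = abs((a + b * 10.0) - 10.0 ** 2)

print(f"max in-sample error: {in_sample_err:.1f}")   # small
print(f"error at x = 10:     {out_of_sample_err:.1f}")  # large
```

The fitted line tracks the observed points closely but is off by more than half the true value at x = 10. The analogous worry: a model fitted mostly on rich-country data, applied to very different poor countries.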

Without enough usable data on stillbirths, the researchers looked for indicators with a close logical and causal relationship with stillbirths. In this case they chose neonatal mortality as the main predictive indicator. Uh oh. The numbers for neonatal mortality are also based on a model (where the main predictor is mortality of children under the age of 5) rather than actual data.

So that makes the stillbirth estimates numbers based on a model…which is in turn…based on a model.
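As a rough sketch of why that is worrying (the error sizes below are invented, not the study’s actual uncertainty), here is a toy Monte Carlo simulation of how relative error compounds when each modeling stage adds its own noise:

```python
import random

random.seed(0)

def noisy(value, rel_err):
    """An 'estimate' of value with multiplicative Gaussian noise."""
    return value * random.gauss(1.0, rel_err)

# Invented "true" rates per 1,000 births, and a 10% relative error per stage.
TRUE_U5, TRUE_NEONATAL, TRUE_STILLBIRTH = 80.0, 30.0, 20.0
REL_ERR = 0.10
TRIALS = 10_000

direct_errs, chained_errs = [], []
for _ in range(TRIALS):
    u5_est = noisy(TRUE_U5, REL_ERR)                 # stage 1: under-5 estimate
    neonatal_est = noisy(                            # stage 2: modeled from stage 1
        TRUE_NEONATAL * (u5_est / TRUE_U5), REL_ERR)
    stillbirth_est = noisy(                          # stage 3: modeled from stage 2
        TRUE_STILLBIRTH * (neonatal_est / TRUE_NEONATAL), REL_ERR)

    direct_est = noisy(TRUE_STILLBIRTH, REL_ERR)     # one-stage comparison

    chained_errs.append(abs(stillbirth_est - TRUE_STILLBIRTH) / TRUE_STILLBIRTH)
    direct_errs.append(abs(direct_est - TRUE_STILLBIRTH) / TRUE_STILLBIRTH)

mean_direct = sum(direct_errs) / TRIALS
mean_chained = sum(chained_errs) / TRIALS
print(f"mean relative error, one stage:    {mean_direct:.3f}")
print(f"mean relative error, three stages: {mean_chained:.3f}")
```

With independent 10% errors at each stage, the three-stage chain ends up with roughly 1.7 times the average error of a single stage, and in practice the stages are unlikely to be independent.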

Showing what a not-hot topic this is, most of the articles in the international press that covered the series focused on the startling results of the study, leaving aside the more arcane questions of how the researchers arrived at their estimates. The BBC went with “Report says 7,000 babies stillborn every day worldwide.” Canada’s Globe and Mail called stillbirths an “epidemic” that “claims more lives each year than HIV-AIDS and malaria combined.” Frequently cited statistics included the number of stillbirths worldwide in 2009 (2.6 million), the percentage of those stillbirths that occur in developing countries (98%), the number of yearly stillbirths in Africa (800,000), and the average yearly decline in stillbirths over the period studied (1.1 percent since 1995).

Only one international press article found in a Google search, by AP reporter Maria Cheng, mentioned the possible limitations of the study’s estimates. Not coincidentally, that article interviewed a source named Bill Easterly.

Despite the media’s lack of interest, this is a serious problem. Research and policy based on made-up numbers is not an appealing thought. Could the irresponsible lowering of standards on data possibly reflect an advocacy agenda rather than a scientific agenda, or is it just a coincidence that Save the Children is featured among the authors of the new data?

 

FOOTNOTES
1. From the study: “The final model included log(neonatal mortality rate) (cubic spline), log(low birthweight rate) (cubic spline), log(gross national income purchasing power parity) (cubic spline), region, type of data source, and definition of stillbirth.”
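As a sketch of what a prediction from a model of this shape looks like (coefficients invented for illustration, and the cubic splines and the categorical terms for region, data source, and stillbirth definition are omitted):

```python
import math

# Hypothetical coefficients -- NOT the study's fitted values.
COEF = {
    "intercept": 1.0,
    "log_nmr": 0.8,       # log(neonatal mortality rate)
    "log_lbw": 0.3,       # log(low birthweight rate)
    "log_gni_ppp": -0.2,  # log(gross national income, PPP)
}

def predict_stillbirth_rate(nmr, lbw_rate, gni_ppp):
    """Linear predictor on the log scale, exponentiated back to a rate."""
    log_rate = (COEF["intercept"]
                + COEF["log_nmr"] * math.log(nmr)
                + COEF["log_lbw"] * math.log(lbw_rate)
                + COEF["log_gni_ppp"] * math.log(gni_ppp))
    return math.exp(log_rate)
```

With a positive coefficient on log(neonatal mortality rate), predicted stillbirth rates rise with neonatal mortality and fall with national income, which is the qualitative behavior the footnote’s specification encodes.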

This entry was posted in Academic research, Data and statistics, Global health.

10 Comments

  1. Brett wrote:

    Thanks for this post (it’s always helpful to look at quality of estimates critically) but I think the direction of your criticism needs to be clarified. Which of the following are you upset about (choose all that apply)?
    a) the fact that the researchers used models at all? I don’t know the researchers personally, but I would imagine that they are concerned with data quality in general and would have much preferred to have reliable data from all the countries they work with. But in the absence of that data (and while working towards it), isn’t it helpful to have the best possible estimates on which to set global health policy, while acknowledging their limitations? Based on the available data, is there a better way to estimate these, or do you think we’d be better off without them (in which case stillbirth might be getting even less attention)?
    b) a misrepresentation of their data as something other than a model? If so, could you please specify where you think that mistake occurred — to me it seems like they present it in the literature as what it is and nothing more.
    c) the coverage of these data in the media? On that I basically agree. It’s helpful to have critical viewpoints on articles where there is legitimate disagreement.

    I get the impression your main beef is with (c), in which case I agree that press reports should be more skeptical. But I think calling the data “made up” also goes too far. Yes, it’d be nice to have pristine data for everything, but in the meantime we should try for the best possible estimates because we need something on which to base policy decisions. Along those lines, I think this commentary by Neff Walker (full disclosure: my advisor) in the same issue is worthwhile. Walker asks these five questions, noting areas where the estimates need improvement:
    - “Do the estimates include time trends, and are they geographically specific?” (because these allow you to crosscheck numbers for credibility)
    - “Are modelled results compared with previous estimates and differences explained?”
    - “Is there a logical and causal relation between the predictor and outcome variables in the model?”
    - “Do the reported measures of uncertainty around modelled estimates show the amount and quality of available data?”
    - “How different are the settings from which the datasets used to develop the model were drawn from those to which the model is applied?” (here Walker says further work is needed)

    Posted April 18, 2011 at 1:09 am | Permalink
  2. Nichol wrote:

    I’m missing the obvious conclusion: models might not be perfect, but they might be good enough to point out where more research and better-quality data are needed. Improving health care can clearly benefit a lot from better-quality data about the state of health of the population involved. But it is also obvious that gathering that information will always have a lower priority than first getting a minimal health-care system set up.

    Posted April 18, 2011 at 10:03 am | Permalink
  3. geckonomist wrote:

    Stillbirths in the developed world are – fortunately – extremely rare. So rare that I don’t know anyone in my first-world environment who has had one.

    In developing countries, however, I have two fewer nieces than I should, and a direct colleague is missing a daughter.

    I am no fan of Save the Children, and I agree that the data are non-existent, but I think you are wrong to discourage efforts to fight this.

    Posted April 18, 2011 at 10:10 am | Permalink
  4. It’s one thing to use modeling, but quite extraordinary to have data for only 33 countries out of 193 (17%), yet this is not at all uncommon in international development.

    Very likely, too, that the data for those 33 was collected on paper-and-clipboard, just as it might have been in 1950 — although most health workers in most countries are now carrying around a powerful pocket computer connected to the network (i.e., their mobile phone).

    We’re now working with the IRC to let midwives in rural villages use their $15 mobile phones to send in, for example, two numbers every week: births, and deaths in children under five years of age. The data will flow automatically into our web-based EpiSurveyor.org data system, with reports going out automatically.

    Imagine the impact such near-real-time vital statistics will have on our current data vacuum, and on the need for modeling.

    Posted April 18, 2011 at 11:53 am | Permalink
  5. Casey wrote:

    Brett couldn’t have responded any better! His comment hits the nail on the head.

    Models are often used to fill in data gaps, particularly when working in developing countries where data collection is limited and frequently inconsistent. Having done this myself when working with various economic models in developing-country contexts, I know it is sometimes necessary to calculate estimates based on similar data for the sake of your (in my case, economic) model (and the authors *should* note this in their study).

    I do certainly agree that the media were irresponsible with what they were given. Take what you read with a grain of salt, I suppose.

    Posted April 18, 2011 at 1:35 pm | Permalink
  6. IP Freely wrote:

    No more irresponsible than the Barro regressions you used to run as a World Bank lead economist.

    Posted April 18, 2011 at 7:24 pm | Permalink
  7. Chris wrote:

    A statistic is only as good as the dependence it creates between the aid agencies and the recipient regime. Squeezed in between are the population, dependent on a government that skews statistics for political reasons to sustain its rule, and the aid agencies, which need the regime as a partner to keep the money flowing. The problem is found at the national ‘statistics’ agency of a given ‘regime’.

    Thus, as I observe them, there are three markets in aid-dependent nations: the indigenous market, which follows the pattern of the free market while incorporating aid into its price structure; the aid market, which follows a pattern of entitlement and is full of corruption and cronyism; and the state market, which resembles a Soviet model, drawing money from taxes and aid, and cheap products and services from the indigenous population, to sustain the regime.

    The winner is the government, which gets a relatively substantial amount of free money and an abundance of cheap labor to draw from.

    What politician would refuse this deal and be willing to change the formula to live in the real world?

    By the way, there is another winner in this anarchy: the people working for aid agencies, who earn wages by Western standards and spend them by third-world standards.

    No wonder there are no statistics reliable enough to base serious decisions on, like removing the regime for incompetence.

    Posted April 18, 2011 at 9:16 pm | Permalink
  8. IP Freely wrote:

    Chris:

    Brilliant. You’re a genius.

    Posted April 18, 2011 at 9:35 pm | Permalink
  9. Brandon J wrote:

    False statistics seem to be used as tools to scare people into not only caring but giving aid to help fight stillbirths. If they really want to generate more of a buzz and really help their cause, they could start by providing real statistics. All these prominent figures from the West should take steps to help set up better systems of recording and reporting actual data concerning development in LDCs. This could be done by diverting money being spent on ineffective programs and using those funds to help solve an important issue with statistical reliability.

    Posted April 21, 2011 at 1:15 pm | Permalink
  10. James Moore wrote:

    Brandon, I believe the reason they make up the false statistics IS to generate a larger buzz. This may seem obvious, but your post suggested the actual numbers would be more beneficial to their cause. They would be, of course, if the public knew it was not getting totally “real” statistics but rather some heavily estimated guesses.

    The main thing that jumps out at me from the post is the lack of media attention paid to investigating and reporting where these statistics come from. Of course no one wants to watch CNN or the BBC and see a report about modeling techniques, but major news agencies should have someone reviewing the research methods behind the stories they present. In theory, I am sure they employ fact-checkers of some sort, but I am thinking of something more along the lines of a dedicated researcher.

    Posted April 21, 2011 at 11:52 pm | Permalink
