Sunday, November 27, 2005

What's in a number: counting Iraqi dead


I urge everyone to check out the October 28th, 2005 episode of This American Life called "What's in a Number." This episode tells the story of a number, specifically one controversial estimate of the number of Iraqi deaths during the Iraq War. The following post is based on that story.

You may have heard, at some point, about a study (published in the British medical journal The Lancet) estimating that, by September 2004, 100,000 Iraqis had died from war-related causes during the Iraq War. The number was published in the mainstream press just before the 2004 US presidential election. It was not widely reported in the US, especially on television (only NBC reported it), and when it was reported it was generally disparaged, sometimes scathingly (though some articles dealt with the study fairly). The main "authoritative" critic referred to in the press was a man named Marc Garlasco, an analyst at Human Rights Watch, who stated, "I certainly think that 100,000 is a reach." It turns out that Garlasco, while an expert in assessing specific types of war-related damage, had not read the study and did not have the statistical training to comment even if he had (he has since essentially retracted his statements). In general, the criticisms were 1) that the sampling was not random: that researchers were blocked from going to certain areas and that they decided not to go to certain areas due to the danger; 2) that they went to Fallujah (where the worst of the fighting has been) on purpose in order to inflate their numbers; 3) that those counted were not necessarily civilians, but could be insurgents; and finally 4) that the 95% confidence interval for the 100,000 was so wide as to be meaningless (8,000-194,000).

It turns out that criticism 1 is simply untrue. Not only did the group, led by a researcher at Johns Hopkins named Les Roberts, use a well-known and highly reliable technique called cluster sampling, but they also followed the randomly determined sampling procedure despite all dangers.

As for criticism 2, the researchers went to Fallujah because it had been randomly selected, just like all the other sample locations. Even so, they did not include it in their estimate because it was such an outlier (essentially it was off the charts in terms of the number of deaths). Thus, the 100,000 estimate was made without the counts from Fallujah. Certainly, no inflation there.

Criticism 3 is to some degree true because the researchers had no way of knowing whether the dead they counted were insurgents or not (something freely admitted in the paper). Still, half of all the dead were women and children, and even if all the men were insurgents (an outrageous claim), that would still leave 50,000 civilian deaths, which, it turns out, is three times the next highest estimate at the time (from Iraq Body Count, but see).

Criticism 4 is the only one that holds any water. A range of 8,000-194,000 does seem extremely wide. Fred Kaplan on slate.com stated (see), "this isn't an estimate. It's a dart board." However, not every number within that range is equally likely. In fact, the estimate follows a bell curve, with 100,000 as the most likely number and 8,000 and 194,000 as the least likely within the interval. This is evidenced by the fact that with a confidence of 90% (rather than 95%) the lower limit of the estimate becomes 44,000. The main thing this interval says about the study is that getting accurate counts of war dead is difficult when fighting is highly concentrated spatially (see map from the paper). Interestingly, this study cost only $40,000 to conduct, and Les Roberts believes that with only a bit more money to conduct further study he could increase the accuracy of the estimate by a substantial margin. In a war where a single bomb can cost hundreds of millions of dollars (and the Pentagon has dropped more than 50,000 in this war), you would think just a little bit of money would be directed toward understanding its impact. But, apparently, the US military "doesn't do body counts."
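To make the "bell curve, not dart board" point concrete, here is a rough sketch in Python. It is not the study's own calculation: it assumes the estimate is exactly normal and symmetric, centered on the midpoint of the published 95% interval, and the helper names (fit_normal, relative_likelihood) are mine. The paper's actual distribution may well be shaped differently, so treat the output as an illustration only.

```python
from statistics import NormalDist

# Published 95% confidence interval from the post.
LOWER, UPPER = 8_000, 194_000

# z-score that bounds the central 95% of a standard normal (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def fit_normal(lower, upper):
    """Fit a symmetric normal whose central 95% spans [lower, upper].

    ASSUMPTION: the estimate is exactly normal and centered on the
    midpoint of the interval; this is an illustration, not the paper's model.
    """
    center = (lower + upper) / 2
    sigma = (upper - lower) / (2 * Z95)
    return NormalDist(center, sigma)


def relative_likelihood(dist, x):
    """How likely x is compared with the most likely value (the center)."""
    return dist.pdf(x) / dist.pdf(dist.mean)


dist = fit_normal(LOWER, UPPER)

# Likelihood of the interval's edge relative to the peak of the curve.
edge_ratio = relative_likelihood(dist, LOWER)

# Probability that the true number falls below the interval's lower edge.
prob_below_lower = dist.cdf(LOWER)

print(f"center of the fitted curve: {dist.mean:,.0f}")
print(f"8,000 is about {edge_ratio:.0%} as likely as the center")
print(f"chance the true number is under 8,000: {prob_below_lower:.1%}")
```

Under these assumptions, a value at the very edge of the interval is only a small fraction as likely as one near the middle, which is the opposite of how a dart board works.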

Speaking of bombs, one of the major findings of the study was that a large share of the violent deaths (43%) were due to bombing by coalition forces, while only 14% were confirmed to be caused by insurgents. This suggests that the very style of US warfare causes an immense amount of suffering.

The only other group that has attempted to count Iraqi dead is the Iraq Body Count (IBC) website. Its current estimate is between 27,354 and 30,863. This group uses a passive data collection approach, taking its data from press reports. Interestingly, passive methods in human epidemiology studies (the best comparisons to the Lancet study, due to non-uniform spatial distributions) are notoriously inaccurate. IBC freely admits this and points out that its estimates should be seen as the minimum number of deaths, not as an actual number. Given this, studies that use direct and sound methods of counting, such as the Lancet study, should be supported and commended. Lastly, I would like to remind you that this study was conducted more than a year ago. In that time the IBC estimate has nearly doubled, from 14,181-16,312 in October 2004 to 27,354-30,863 today. If we take the Lancet number as accurate, and we use a bit of extrapolation, that means that the number of Iraqi deaths at this point could be 200,000 or more. But whatever the actual numbers, which are substantial no matter which way you count, deaths are deaths; deaths cause anger, and anger causes violence.
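For anyone who wants to check the extrapolation, here is one way it could be done, sketched in Python. The big assumption (labeled in the code too) is that the true death toll grew in the same proportion as the IBC count over the same period; that is a simple proportional scaling I am using for illustration, not a method taken from the Lancet paper or from IBC, and the function name scale_estimate is made up.

```python
# IBC minimum and maximum counts quoted in the post.
IBC_OCT_2004 = (14_181, 16_312)
IBC_NOV_2005 = (27_354, 30_863)

# Lancet estimate for deaths through September 2004.
LANCET_2004 = 100_000


def scale_estimate(base, earlier, later):
    """Scale a base estimate by the growth ratio of a second series.

    ASSUMPTION: total deaths grew in proportion to the IBC count.
    This is an illustrative shortcut, not a published method.
    """
    return base * later / earlier


# Growth in the IBC minimum and maximum counts over the period.
growth_min = IBC_NOV_2005[0] / IBC_OCT_2004[0]
growth_max = IBC_NOV_2005[1] / IBC_OCT_2004[1]

# The Lancet figure scaled by each growth ratio.
scaled_by_min = scale_estimate(LANCET_2004, IBC_OCT_2004[0], IBC_NOV_2005[0])
scaled_by_max = scale_estimate(LANCET_2004, IBC_OCT_2004[1], IBC_NOV_2005[1])

print(f"IBC growth: {growth_max:.2f}x to {growth_min:.2f}x")
print(f"scaled Lancet figure: {scaled_by_max:,.0f} to {scaled_by_min:,.0f}")
```

Both ratios come out at a bit under 2, so the scaled figure lands a little under 200,000, which is where the round number in the paragraph above comes from.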