President Trump is once again causing distress by downplaying the number of deaths caused by Hurricane Maria's devastation of Puerto Rico last year. Official estimates initially put the death toll at 15 before raising it to 64 months later, but it was clear even then that those numbers were absurdly low. The government of Puerto Rico commissioned an official report from the Milken Institute School of Public Health at George Washington University (GWU) to obtain a more accurate estimate, and with its interim publication the official toll now stands at 2,975.
Why were the initial estimates so low? I read the interim GWU report to find out. The report itself is clearly written, quite detailed, and composed by an expert team of social and medical scientists, demographers, epidemiologists and biostatisticians, and I find its analysis and conclusions compelling. (Sadly, however, the code and data behind the analysis have not yet been released; hopefully they will become available when the final report is published.) In short:
- In the earliest days of the hurricane, the death-recording office was closed and without power, which suppressed the official count.
- Even once death certificates were collected, it became clear that officials throughout Puerto Rico had not been trained on how to record deaths in the event of a natural disaster, and most deaths were not attributed correctly in official records.
Given these deficiencies in the usual data used to calculate death tolls (death certificates), the GWU team took a different approach. The basis of the method was to estimate excess mortality: in other words, how many deaths occurred in the post-Maria period compared to the number of deaths that would have been expected if the hurricane had never happened. This calculation required two quantitative studies:
- An estimate of what the population would have been if the hurricane hadn't happened. This was based on a generalized linear model (GLM) of monthly data from the prior years, accounting for factors including recorded population, normal emigration and mortality rates.
- The total number of deaths in the post-Maria period, based on death certificates from the Puerto Rico government (irrespective of how the cause of death was coded).
- (A third study examined the communication protocols before, during and after the disaster. This study did not affect the quantitative conclusions, but formed the basis of some of the report's recommendations.)
The difference between the actual mortality and the estimated "normal" mortality formed the basis for the estimate of excess deaths attributed to the hurricane. You can see those estimates of excess deaths one month, three months, and five months after the event in the table below; the last column represents the current official estimate.
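To make the arithmetic concrete, here is a minimal sketch of an excess-deaths calculation. It is not the GWU team's method: in place of their GLM it uses a plain average of the same calendar month in prior years as the "expected" figure, and every count in it is invented for illustration. The function names `expected_by_month` and `excess_deaths` are hypothetical as well.

```python
from statistics import mean


def expected_by_month(history):
    """Average deaths per calendar month across pre-disaster years.

    history: {year: {month: deaths}}
    Returns {month: mean deaths for that month}. This is a crude stand-in
    for a fitted model such as a GLM; it ignores trends and population change.
    """
    months = set()
    for counts in history.values():
        months.update(counts)
    return {
        m: mean(c[m] for c in history.values() if m in c)
        for m in months
    }


def excess_deaths(observed, baseline):
    """Sum of (observed - expected) deaths over the post-disaster months."""
    return sum(observed[m] - baseline[m] for m in observed)


# Invented illustrative counts, NOT data from the report.
history = {
    2015: {9: 2300, 10: 2400},
    2016: {9: 2400, 10: 2500},
}
observed_2017 = {9: 2900, 10: 3000}

baseline = expected_by_month(history)
total_excess = excess_deaths(observed_2017, baseline)
print(total_excess)
```

The point of the sketch is only that the estimate is a difference of two totals, so its quality depends entirely on how good the "expected" baseline is.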
These results are consistent in scale with another earlier study by Nishant Kishore et al. (The data and R code behind this study are available on GitHub.) This study attempted to quantify deaths attributed to the hurricane directly, by visiting 3299 randomly chosen households across Puerto Rico. At each household, inhabitants were asked about any household members who had died and their cause of death (related to or unrelated to the hurricane), and whether anyone had left Puerto Rico because of the hurricane. From this survey, the paper's authors extrapolated the number of hurricane-related deaths to the entire island. The headline estimate of 4,645 at three months is somewhat larger than the middle column of the study above, but due to the small number of recorded deaths in the survey sample the 95% confidence interval is also much wider: 793 to 8498 excess deaths. (Gelman's blog has some good discussion of this earlier study, including some commentary from the authors.)
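A rough sketch shows why a small number of deaths in a survey sample leads to such a wide interval once it is scaled up to a whole population. This is not the code from the study's GitHub repository; it assumes a simple Poisson approximation for the sampled death count, and the function name `extrapolate_excess` and all input numbers are invented for illustration.

```python
import math


def extrapolate_excess(sample_deaths, sample_person_years,
                       baseline_rate, population, period_years):
    """Scale a survey mortality rate up to a population-level excess.

    Returns (point, low, high), where low/high form an approximate 95%
    interval that treats the death count as Poisson (an assumption of
    this sketch, not necessarily what the study's authors did).
    """
    rate = sample_deaths / sample_person_years
    # Standard error of a Poisson count is sqrt(count).
    se = math.sqrt(sample_deaths) / sample_person_years

    def scale(r):
        return (r - baseline_rate) * population * period_years

    return scale(rate), scale(rate - 1.96 * se), scale(rate + 1.96 * se)


# Invented inputs, NOT figures from the survey.
point, low, high = extrapolate_excess(
    sample_deaths=36,
    sample_person_years=2500,
    baseline_rate=0.009,      # deaths per person-year in a normal period
    population=3_000_000,
    period_years=0.25,
)
print(round(point), round(low), round(high))
```

With only a few dozen deaths in the sample, the uncertainty in the rate is large relative to the gap between the observed and baseline rates, and multiplying by the population magnifies it.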
With two independent studies reporting excess deaths well into the thousands attributable directly to Hurricane Maria, it's a fair question to ask whether a more effective response before and after the storm could have reduced the scale of this human tragedy.
Milken Institute School of Public Health: Study to Estimate the Excess Deaths from Hurricane Maria in Puerto Rico
I stopped reading after:
"Sadly however the code and data behind the analysis have not yet been released..."
What a sham. "Lies, Damn Lies and Statistics" indeed.
No data => BS results. Let us see the raw data and we will make up our own minds.
Where were all the bodies at the morgues? Nobody went there to check.
Posted by: John Brooks | September 14, 2018 at 21:34
Is it good practice to give the central estimate to 4 significant figures when the CI range is about 40% of the central figure? The middle column is surely more fairly summarised as "2100 + or - 200 deaths".
Posted by: Jeremy Colman | September 15, 2018 at 01:27
It's sad that the code and data for one analysis were not released, as it gives people who would deny the results no matter what a seemingly legitimate reason to do so, as seen in the statement made by the first poster above, who failed to read far enough to see this:
These results are consistent in scale with another earlier study by Nishant Kishore et al. (The data and R code behind this study is available on GitHub.)
Posted by: dean | September 15, 2018 at 14:40
So let's compare the two reports. The GWU report makes the headline (https://edition.cnn.com/2018/08/28/health/puerto-rico-gw-report-excess-deaths/index.html) :
"Puerto Rico's new Hurricane Maria death toll is 46 times higher than the government's previous count"
The GWU report claims to be consistent with the previous “Mortality in Puerto Rico after Hurricane Maria”(https://www.nejm.org/doi/full/10.1056/NEJMsa1803972) But that one claims:
“...the number of excess deaths related to Hurricane Maria in Puerto Rico is more than 70 times the official estimate.”
OK… I understand the GWU report alleges to focus on the direct causality between Maria and mortality. But the headlines are frustrating and makes the “scientific Community” appear to be incapable of being consistent.
We have advocated for “reproducible research” within the R community for years. Until I see the data and methodology, I will continue to “deny the results”.
Posted by: John Brooks | September 15, 2018 at 23:01
@ John Brooks: Certainly releasing the data (and code) would be more transparent, but the study (linked above) is pretty detailed in its methodology explanation. IMO, the "Methods" section (in the actual report, pages 3-7) includes more details and is better written than most peer-reviewed publications.
To immediately conclude that "no data => BS results" seems unfair. I would rather have a clear and thorough methods section than the raw data.
Plus, what's more likely: a "6 to 18" death toll (as reported by the US president) or a ~3,000 death toll?
Posted by: Jordan Erickson | September 16, 2018 at 05:49
Vastly boring and not worthy of a technology blog - keep politics out of this blog.
Posted by: BR | September 16, 2018 at 14:56
@Jordan Erickson: Fair enough, my initial comment was harsh.
Unfortunately, this issue has been politicized, mainly due to the President's tweets. The official number of 64, which (per the above report) was attributed to direct causes from the hurricane, is probably still valid. But residual, collateral or lingering effects are much more amorphous. For example, veterans from the Vietnam War are still dying from the effects of exposure to Agent Orange... shall these be considered "excess deaths"? Is there a time limit for attributing a death to an event...?
The term, with focus on "excess", and the implication thereof, seems to be lost in common discourse.
I do hope to be able to review the actual data and methodology at some point. In any case, the report makes recommendations that are applicable to any jurisdiction facing a natural disaster.
Posted by: John Brooks | September 17, 2018 at 07:24
@Jordan Erickson: Looks like I was responding to your post before your edit... I agree, keep politics out of a tech blog. But the OP opened the discussion with a political statement:
"President Trump is once again causing distress..."
I'm finished. :) Have a good day.
Posted by: John Brooks | September 17, 2018 at 07:28
One thought I have about the methodology is that they are comparing year on year deaths to determine what the excess deaths were due to the hurricane. This implies that the control values would have stayed the same year on year. That is where I have a problem with the study. (Concern heightened by not having data or code released...)
If the island nation went into default in May and the government was unable to pay creditors, what effect did that have on hospitals, medical coverage and access to medicine? Certainly this kind of variable would have an impact on the quality of healthcare year on year. How was this impact determined? What percentage of the deaths would be attributable to this change that began months before the hurricane? What impact did this have on the government infrastructure before and after the hurricane? What is due to the weather event and what is due to the crash?
From the article and the study notes it did not seem that this was taken into account. Those facts, more than the politics, make me question the conclusions.
Posted by: ThinkData | September 17, 2018 at 17:41
Not sure why you have to make this political. Please keep politics out of technology blogs. I won't be clicking on links to Revolutions blogs as a result.
"Excess deaths" is calcuated as actual deaths minus expected deaths from Sep 2017 to Feb 2018.
However, what I don't see in the study is the accuracy of their expected deaths forecasts prior to Sep 2017. If their expected deaths forecasts have a high mean absolute percentage error, then their residuals (or "excess deaths") post-Sep 2017 are obviously going to be high.
Posted by: Michael Westen | September 19, 2018 at 11:16