Cutting through the confusion over Covid-19 fatality statistics

23 September 2020

Faculty of Medical and Health Sciences, Coronavirus

Opinion: People get mixed up over Covid-19 fatality data. Professor Rod Jackson explains why the virus is much worse than flu.

Estimates of the proportion of people who die from Covid-19 have been controversial, with some even dismissing it as similar to a bad flu. There are three main problems accounting for this controversy. In this article, I describe each of these problems and some of the ways that epidemiologists like me try to deal with them.

Problem 1: To calculate the proportion of people infected with Covid-19 who die as a result (the Infection Fatality Proportion), you need two numbers: the number of deaths (the numerator) and the number of people who have been infected (the denominator). You then divide the numerator by the denominator. It sounds simple, but accurate information on both of these numbers for Covid-19 is hard to find unless you know what to look for. For the record, I have used the correct epidemiological term, the ‘Infection Fatality Proportion’ rather than the ‘Infection Fatality Rate’, because it is not actually a rate. In epidemiology, a rate requires a time component, for example, 10 deaths per 1000 people per year.

Problem 2: Some commentators have mixed up the Infection Fatality Proportion with the Case Fatality Proportion. Both calculations use the same number of deaths as the numerator, but they use different denominators. To calculate the Infection Fatality Proportion, you divide the number of deaths by the total number of people who have been infected. In contrast, to calculate the Case Fatality Proportion, you divide by the total number of known cases, where a case is generally someone who has experienced symptoms and has a positive swab test. For Covid-19, some people don’t have symptoms and many more are missed for a range of reasons. Therefore, for Covid-19, the number of recorded cases is usually a lot lower than the true number of infected people. As a result, the Case Fatality Proportion is a larger number than the Infection Fatality Proportion.

For example, among 100 people who are infected with Covid-19, only 50 might report their symptoms and have positive tests, so the total number of recorded cases will be 50. If one of these 50 cases dies, then the Case Fatality Proportion would be 1 in 50 or 2%, but the Infection Fatality Proportion will be 1 in 100 or 1%, because the true denominator is the 100 people infected. Unfortunately, good information on the true denominator for calculating the Infection Fatality Proportion is hard to find.
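The worked example above can be written out as a minimal sketch. The function name `fatality_proportion` is illustrative only and does not come from any library; the figures are the hypothetical ones used in the text.

```python
def fatality_proportion(deaths: int, denominator: int) -> float:
    """Return deaths divided by the chosen denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return deaths / denominator


infected = 100       # everyone who caught the virus (illustrative figure from the text)
recorded_cases = 50  # the subset who reported symptoms and tested positive
deaths = 1

cfp = fatality_proportion(deaths, recorded_cases)  # denominator: known cases
ifp = fatality_proportion(deaths, infected)        # denominator: all infections

print(f"Case Fatality Proportion:      {cfp:.1%}")  # 2.0%
print(f"Infection Fatality Proportion: {ifp:.1%}")  # 1.0%
```

The numerator is the same in both calls; only the denominator changes, which is why the Case Fatality Proportion comes out larger whenever some infections go unrecorded.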

Problem 3: The Infection Fatality Proportion varies a lot in different groups of people and in particular depends on a person’s age and whether they suffer from other illnesses. If the information used in the calculation is sourced from a group of infected people who are not similar in terms of their ages or health status to people in your city or country, then the calculated Infection Fatality Proportion won’t be very meaningful. Access to high-quality healthcare can also make a difference; the load of virus to which a person is exposed might be important; and a number of other known and unknown factors influence the Infection Fatality Proportion. Another problem that causes large differences across studies is random error or chance, which is much more of an issue in small studies. Unfortunately, the vast majority of studies that have estimated the Infection Fatality Proportion suffer from problem 3.

So how can we address these three problems?

Starting with problem 3, experienced epidemiologists have realised that to get meaningful estimates of the Infection Fatality Proportion for Covid-19, they need to use information that comes from large populations of typical people, with a typical range of ages, with typical disease patterns, living in typical accommodation, and having access to the typical health services of a large city, region or country. This means we now generally discard estimates that are not based on representative samples taken from populations of at least a few million people. To deal with random error, we also need studies that include hundreds, and ideally thousands of Covid-19 deaths. Certainly, any study with fewer than several hundred Covid-19 deaths is probably not worth looking at. For example, you can’t use information from New Zealand (25 deaths) or Iceland (10 deaths) to get a meaningful estimate of the Infection Fatality Proportion. In fact, experienced epidemiologists now discard the vast majority of studies that have estimated the Infection Fatality Proportion for Covid-19. Of course, early in the pandemic all we had were small studies, so we had to make the most of what was available.
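To give a rough feel for why small death counts are so unreliable, here is a sketch that assumes a death count behaves like a Poisson variable, so its relative standard error is about 1 divided by the square root of the count. That assumption is a common textbook simplification added here for illustration; it is not a method described in this article, and it covers only chance variation in the numerator.

```python
import math


def relative_standard_error(deaths: int) -> float:
    """Approximate relative standard error of a death count,
    assuming (as a simplification) that the count is Poisson-distributed."""
    if deaths <= 0:
        raise ValueError("deaths must be positive")
    return 1 / math.sqrt(deaths)


# The first two counts are from the text; the third is a hypothetical large study.
examples = [("Iceland", 10), ("New Zealand", 25), ("hypothetical large study", 1000)]

for label, deaths in examples:
    rse = relative_standard_error(deaths)
    print(f"{label}: {deaths} deaths, chance variation of roughly +/-{rse:.0%}")
```

Under this assumption, a count of 10 deaths carries chance variation of around a third of its value, while a count of 1000 deaths carries only a few percent.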

To address problem 2, we simply discard any information in which the denominator is based on cases rather than on estimates of all infected people.

Addressing problem 1, establishing accurate numerator and denominator data, can also be very challenging, even after dealing with problems 2 and 3. To estimate the true denominator of Covid-19 infections, you need to count everyone who has had a positive swab test or a positive antibody test (from a blood sample), so you have to search out well-conducted antibody studies of many thousands of people who are representative of large cities, states, or countries. Such studies have been done in a number of US states and a number of European countries. Countries like Iceland have also done excellent antibody studies, but the population is simply too small, with too few infected people and too few deaths, to provide useful information on the Infection Fatality Proportion.

Finally, estimating the true number of Covid-related deaths for the numerator is not straightforward because many countries substantially under-report them. So, rather than just using the reported number of Covid-19 deaths in our calculations, we also use the number of excess deaths: the additional (excess) number of total deaths that have been reported in a population for, say, the last 6 months, compared to the total number of deaths that would typically have been reported during the same period over the previous few years. In the US, for example, a total of 248,400 more deaths were reported in the 6 months between March and August this year than expected based on the average number of deaths between March and August in the previous 5 years. Yet only 176,247 Covid-19 deaths were reported during this period, which suggests that the true number of Covid-related deaths could have been up to 40% higher (248,400 total excess deaths / 176,247 reported Covid-19 deaths).
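The US comparison can be checked with a few lines of arithmetic. This is only a sketch using the two figures quoted above; the variable names are illustrative.

```python
excess_deaths = 248_400          # deaths above the previous 5-year average, March to August
reported_covid_deaths = 176_247  # officially reported Covid-19 deaths, same period

ratio = excess_deaths / reported_covid_deaths
shortfall = ratio - 1  # how much higher excess deaths are than reported deaths

print(f"Excess deaths are {ratio:.2f} times the reported Covid-19 deaths")
print(f"That is about {shortfall:.0%} higher")
```

The ratio comes out at about 1.41, which matches the roughly 40% figure given in the text.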

Once epidemiologists have gone through the process of addressing these three problems, calculating the Infection Fatality Proportion gets a bit easier, mainly because we have discarded most of the studies that have been done, as they are not likely to be very useful. The media are full of stories of studies that most epidemiologists have discarded and this is a common cause of much of the apparent controversy around the true Infection Fatality Proportion. If you read about a study based on a town or, say, a ship, or a study with fewer than a few hundred deaths, it’s probably best to ignore it, because it is very unlikely to be relevant to our own communities. Fortunately, we now have some much larger studies.

One example of a good source of information on the denominator we need to calculate the Infection Fatality Proportion comes from Spain. The country has a population of approximately 47 million and between April 27 and May 11 this year, Covid-19 antibodies were measured in blood samples from 61,075 people who were a representative sample of the Spanish population. About 5 of every 100 people in the study had a positive antibody test or were already known to have been infected with Covid-19, which meant that about 5 of every 100 people among the 47 million people in Spain (about 2,350,000 people) had been infected. This is the best kind of estimate of the denominator for calculating the Infection Fatality Proportion.
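Scaling the survey result up to the whole country is a single multiplication. The sketch below uses the rounded figures from the text and relies on the survey sample being representative, as the text states; the variable names are illustrative.

```python
population = 47_000_000  # approximate population of Spain
positives_per_100 = 5    # "about 5 of every 100" in the antibody study

# Integer arithmetic keeps the result exact for these rounded inputs.
estimated_infected = population * positives_per_100 // 100

print(f"Estimated infections: {estimated_infected:,}")  # 2,350,000
```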

Death records in Spain are also readily available and the total number of reported Covid-19 deaths in mid-May in Spain was approximately 27,000. However, the total number of excess deaths during the same period was about 45,000. These two numbers represent reasonable lower and upper estimates of the numerator. If the lower number is correct, then the estimated Infection Fatality Proportion in Spain would be just over 1 death in 100 infected people or about 1%; but if the upper number is correct, which is more likely, then the estimated Infection Fatality Proportion would be just under 2 deaths in 100 infected people or about 2%.

Epidemiologists typically do several ‘what if?’ calculations, known as sensitivity analyses. The two numerators I used for Spain (reported Covid-19 deaths and excess deaths) are one example of a sensitivity analysis: ‘what if the true number of Covid-19 deaths is closer to the number of excess deaths than to the reported number of deaths?’ We can do a similar ‘what if’ calculation with the denominator. For example, what if the antibody tests missed half of all the truly infected people? This is an extreme ‘what if’ but would double the estimated denominator to about 4,700,000 infected people in Spain. If this were the true number of infected people, then the Infection Fatality Proportion in Spain would be about 1 death in 200 infected people (0.5%) if you used the reported Covid-19 deaths as the numerator and about 1 death in 100 infected people (1%) if you used the excess deaths as the numerator.
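The Spanish sensitivity analysis amounts to trying every combination of the two numerators and the two denominators. A minimal sketch, using the rounded figures from the text (the dictionary labels are illustrative):

```python
# Two candidate numerators and two candidate denominators for Spain.
numerators = {
    "reported Covid-19 deaths": 27_000,
    "excess deaths": 45_000,
}
denominators = {
    "antibody estimate": 2_350_000,
    "antibody estimate doubled": 4_700_000,  # the extreme 'what if'
}

results = {}
for d_label, infected in denominators.items():
    for n_label, deaths in numerators.items():
        ifp = deaths / infected
        results[(n_label, d_label)] = ifp
        print(f"{n_label} / {d_label}: {ifp:.2%} (about 1 in {round(1 / ifp)})")

print(f"Range: {min(results.values()):.2%} to {max(results.values()):.2%}")
```

The four results run from a little under 0.6% to a little over 1.9%, which the text rounds to 0.5% and 2%.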

These multiple calculations suggest that the Infection Fatality Proportion for Covid-19 in Spain is somewhere between 1 in 200 (0.5%) and 1 in 50 (2%), with the best estimate around the middle, at about 1 death for every 100 infected people (about 1%). I have either done or seen similar calculations for France, the UK, Sweden, New York State, and several other countries and US states, and the estimated Infection Fatality Proportions are all in the same range.

Closer to home, I have recently attempted the same calculation for Victoria in Australia (population 6.65 million). The Australian government reports that there have been about 730 Covid-19 deaths in Victoria (likely to be an underestimate of the numerator, the true number of deaths) and about 20,000 cases (definitely an underestimate of the denominator, the true number of infected people). Colleagues in Melbourne believe that only about half the total number of infected people have been identified, so the true denominator is at least 40,000. Therefore, if the numerator is correct, then the Infection Fatality Proportion in Victoria would be just under 1 in 50 or 2% (730/40,000). However, if the true denominator was, say, 4 times higher than the reported number of cases, which seems very unlikely, the Infection Fatality Proportion would be about 1 death in 100 people (730/80,000). So again, the calculations give similar estimates to the others I have done based on large populations.
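The Victorian figures can be run through the same kind of ‘what if’ loop. In this sketch the multipliers 2 and 4 are the two scenarios described above (infections at twice and four times the reported cases); the variable names are illustrative.

```python
deaths = 730             # reported Covid-19 deaths in Victoria
reported_cases = 20_000  # reported cases, an underestimate of true infections

victoria_ifp = {}
for multiplier in (2, 4):  # true infections as a multiple of reported cases
    infected = reported_cases * multiplier
    victoria_ifp[multiplier] = deaths / infected
    print(f"{multiplier}x reported cases: {infected:,} infected, "
          f"IFP about {victoria_ifp[multiplier]:.1%}")
```

Both scenarios land between roughly 0.9% and 1.8%, inside the range found for Spain.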

In comparison, seasonal flu has an Infection Fatality Proportion of well under 1 in 1000. This means that Covid-19 is at least ten times more deadly than the flu.

Media contact

Paul Panckhurst | Media adviser
paul.panckhurst@auckland.ac.nz
022 032 8475