To the Editor:
How many healthcare workers have lost their lives fighting coronavirus disease (COVID-19)? We estimate this number using the capture–recapture method.
In the capture–recapture method, two lists are drawn from a population, and the numbers on each list and in their overlap are observed.1–4 Let $n_{11}$, $n_{10}$, $n_{01}$, and $n_{00}$ be the numbers in the population on both lists, on list 1 but not list 2, on list 2 but not list 1, and on neither list, respectively. Model these counts as2
$$n_{ij} \sim \mathrm{Poisson}(\mu_{ij}) \text{ independently}, \quad \mu_{00} = \frac{N}{D}, \quad \mu_{10} = \frac{N\alpha_1}{D}, \quad \mu_{01} = \frac{N\alpha_2}{D}, \quad \mu_{11} = \frac{N\alpha_1\alpha_2\theta}{D}, \quad D = 1 + \alpha_1 + \alpha_2 + \alpha_1\alpha_2\theta. \qquad (1)$$
The parameter $N$ is the expected size of the population that we are seeking to estimate, $\alpha_1$ and $\alpha_2$ are the odds of an individual being on list 1 and list 2, respectively, and $\theta$ is the odds ratio of an individual being on list 1 given that the individual is on list 2 compared with not being on list 2. We observe only three counts ($n_{11}$, $n_{10}$, and $n_{01}$), so we cannot estimate the four parameters without further assumptions. If we assume the lists are independent, that is, $\theta = 1$, then we can estimate the other parameters by maximum likelihood.2,3 The estimated population size can be thought of as imputing the number of population members missing from both lists by assuming that the probability of being missing from both is the probability of being missing from list 1 times the probability of being missing from list 2 (i.e., list independence).1
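The imputation under list independence can be written out explicitly. The block below is a sketch in assumed notation: $n_{11}$, $n_{10}$, and $n_{01}$ denote the observed counts on both lists, on list 1 only, and on list 2 only, and $\hat{n}_{00}$ denotes the imputed count on neither list. The numbers in the comment are hypothetical and chosen only to make the arithmetic easy to follow.

```latex
% Independence implies the odds ratio n_{11} n_{00} / (n_{10} n_{01}) equals 1,
% which gives the imputed count for the unobserved cell:
\hat{n}_{00} = \frac{n_{10}\, n_{01}}{n_{11}},
\qquad
\hat{N} = n_{11} + n_{10} + n_{01} + \hat{n}_{00}
        = \frac{(n_{11} + n_{10})\,(n_{11} + n_{01})}{n_{11}}.
% Hypothetical example: n_{11} = 20, n_{10} = 30, n_{01} = 10 gives
% \hat{n}_{00} = 30 \cdot 10 / 20 = 15 and \hat{N} = 50 \cdot 30 / 20 = 75.
```

The last expression is the product of the two list sizes divided by the overlap, so a smaller overlap relative to the list sizes implies a larger estimated population.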
To apply the capture–recapture method, we consider the following two websites set up to honor healthcare workers who have died fighting COVID-19: (i) https://covid-heroes.com and (ii) https://www.medscape.com/viewarticle/927976. Both websites have online forms for submitting names and information about deceased healthcare workers worldwide. As of May 20, website (i) contains 928 names and website (ii) contains 1,136 names. Each of these numbers, 928 and 1,136, underestimates the number of healthcare workers who have died because each website’s list of names is incomplete. Using the capture–recapture method, we estimate that 1,652 healthcare workers worldwide have died fighting COVID-19 as of May 20, 2021 (95% confidence interval [CI]: 1618, 1693).
This is likely an underestimate. The lists are likely to be positively dependent because (1) a family member, friend, or colleague who submits a deceased healthcare worker’s name to one website may easily submit it to the other website and (2) healthcare workers dying in some parts of the world may be less likely to be reported to either website, for example, because of low internet access. Positive dependence causes the capture–recapture method to underestimate because the probability of being missing from both lists is then greater than the product of the probabilities of being missing from each list, so the number missing from both lists is underimputed.1 Although the list dependence parameter $\theta$ (the odds ratio between the lists) cannot be estimated, if it is known, the other parameters in model (1) can be estimated by including $\log\theta$ as an offset.3 We do not know $\theta$, but considerations (1) and (2) make it likely the lists are positively dependent, so $\theta > 1$. Table 1 shows a sensitivity analysis with different values of $\theta$.
If being listed on website (i) doubles the odds that a healthcare worker who died of COVID-19 is listed on website (ii), compared with not being listed on website (i), then we estimate that 1,879 healthcare workers have died of COVID-19; if being listed on website (i) multiplies the odds by 5, then we estimate that 2,558 healthcare workers have died. The eAppendix (https://links.lww.com/EDE/B684) contains an R script for updating these estimates.
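The point estimates in this sensitivity analysis can be reproduced with simple arithmetic. The following is a minimal Python sketch, not the R script from the eAppendix. It assumes that, for a fixed odds ratio, the fitted values of the three observed cells equal the observed counts, so the count missing from both lists is imputed as the odds ratio times (list 1 only) times (list 2 only) divided by the overlap. The function name `estimate_population` is hypothetical, and the overlap of 638 names is an assumption: it is the count that reproduces the reported estimate of 1,652. Confidence intervals are not computed here.

```python
def estimate_population(n_list1, n_list2, n_both, odds_ratio=1.0):
    """Two-list capture-recapture point estimate for a fixed odds ratio.

    n_list1, n_list2: sizes of the two lists; n_both: names on both lists.
    odds_ratio = 1 corresponds to the usual list-independence assumption.
    """
    if n_both <= 0:
        raise ValueError("the two lists must share at least one name")
    only_list1 = n_list1 - n_both   # on list 1 but not list 2
    only_list2 = n_list2 - n_both   # on list 2 but not list 1
    # Imputed number of population members missing from both lists.
    missing_both = odds_ratio * only_list1 * only_list2 / n_both
    return n_both + only_list1 + only_list2 + missing_both


# List sizes from the letter; the overlap of 638 is an assumed value (see lead-in).
for theta in (1, 2, 5):
    estimate = estimate_population(928, 1136, 638, odds_ratio=theta)
    print(f"odds ratio {theta}: estimated deaths = {round(estimate):,}")
```

With these inputs the rounded estimates are 1,652, 1,879, and 2,558 for odds ratios of 1, 2, and 5, matching the values quoted in the text.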
A limitation is that we do not have good information about the sensitivity parameter $\theta$ (the odds ratio between the lists). However, we think $\theta > 1$, in which case the usual capture–recapture estimate, 1,652 (95% CI: 1618, 1693) as of May 20, is an underestimate. This provides evidence that many healthcare workers have sacrificed their lives fighting COVID-19.
Table 1. Capture–recapture estimates of the number of healthcare workers who have died fighting COVID-19 as of May 20, 2021. There were 928 workers listed on website (i), 1,136 workers listed on website (ii), and 638 workers listed on both websites.
| Odds ratio of being listed on website (ii) given being listed on website (i) compared with not being listed on website (i) | Estimate (95% confidence interval) |
| --- | --- |
| 1 (usual capture–recapture assumption) | 1,652 (1618, 1693) |
| 2 | 1,879 (1810, 1960) |
| 3 | 2,105 (2002, 2226) |
| 4 | 2,331 (2194, 2493) |
| 5 | 2,558 (2386, 2760) |
| 7 | 3,011 (2771, 3293) |
| 10 | 3,690 (3347, 4094) |
1. Hook EB, Regal RR. Capture-recapture methods in epidemiology: methods and limitations. Epidemiol Rev. 1995;17:243–264.
2. Cormack RM. Log-linear models for capture–recapture. Biometrics. 1989;45:395–413.
3. Lum K, Ball P. Estimating undocumented homicides with two lists and list dependence. Human Rights Data Analysis Group. 2015. https://hrdag.org/wp-content/uploads/2015/07/2015-hrdag-estimating-undoc-homicides.pdf
4. Chao A, Tsay PK, Lin SH, Shau WY, Chao DY. The applications of capture-recapture models to epidemiological data. Stat Med. 2001;20:3123–3157.