Exploratory analysis of the suitability of data from the civil registration system for estimating excess mortality due to COVID-19 in Faridabad district of India : Indian Journal of Medical Research

Programme: Original Article

Exploratory analysis of the suitability of data from the civil registration system for estimating excess mortality due to COVID-19 in Faridabad district of India

Gupta, Ayon; Asadullah, Md; Kumar, Rakesh; Krishnan, Anand

Indian Journal of Medical Research 156(3):p 421-428, September 2022. | DOI: 10.4103/ijmr.ijmr_1421_21

An estimate of deaths caused by a pandemic is a good measure of its severity. Variations in testing policies and coverage and in COVID-19 death definitions limit the comparability of death tolls across geographies, as do differences in access to care and in the robustness of vital registration systems. India officially reported over 43 million cases and 516,543 COVID-19 deaths till March 20221. It was estimated that although 86 per cent of all deaths were registered, only 21.1 per cent of all registered deaths were medically certified2,3.

Countries with near-complete registration of deaths in normal times have also acknowledged that their officially recorded COVID-19 death tolls might be underestimates4,5. This has shifted the focus to indirect estimation procedures. One established approach is estimating excess mortality, i.e., the number of deaths above that expected during normal times4,6. Excess mortality reflects both direct and indirect mortality due to COVID-19. Indirect deaths include cardiac or diabetic deaths occurring as a sequela of COVID-19 or those occurring due to a reduction in healthcare access during pandemic-related restrictions. Similar approaches have been used in the past for the estimation of deaths due to influenza pandemics7. The WHO advocates the use of rapid mortality surveillance approaches with a focus on excess mortality to inform decision-makers about the scale of the epidemic6.

All deaths in India whether at home or in hospital must be registered in the jurisdiction of occurrence by next of kin or hospitals under the civil registration system (CRS)2. Two evaluations of the different sources of data for the estimation of influenza mortality scored CRS high on sample size and representativeness while rating it low on cause of death (CoD) ascertainment and timely availability of data8,9.

In the present study, the suitability of the CRS was examined for estimation of excess mortality due to COVID-19 in two tehsils (Ballabgarh and Badkhal) in Faridabad district of Haryana, India. The parameters for suitability included completeness of death registration, the feasibility of estimating excess mortality from the dataset using different statistical approaches and the validity of lay reporting of CoD in the CRS data.

Material & Methods

The study was based on secondary data analysis. The study protocol was approved by the Institutional Ethics Committee, All India Institute of Medical Sciences, New Delhi, India. Initial exploratory analysis was performed using available data from a concurrent study. Later, the Municipal Commissioner, Faridabad, was formally approached and provided CRS data for the estimation of excess deaths. The study was carried out between March and December 2021.

Study setting: Faridabad district of Haryana State lies in the National Capital Region with Delhi to its north and Uttar Pradesh to its east. The district has one functioning government medical college hospital and several tertiary care multi-speciality hospitals in the private sector10. The 2011 census recorded the population of the district as 1,809,733, with 79.5 per cent residing in urban areas; the estimated population in 2020 was 2.1 million11,12. In 2019, 11,141 deaths were registered in Faridabad district and the State of Haryana reported 100 per cent coverage of death registration2. The district reported over 45,000 cases and 400 COVID-19-related deaths in 2020, with a peak of active and confirmed cases recorded on November 20, 202013. A serosurvey in October 2020 found that almost one-third of the population (31.2%) had been exposed to the virus14.

Data sources: A line list of all deaths that occurred between January 1, 2016 and September 30, 2021 and were registered until November 30, 2021 was obtained from the Registrar of Births and Deaths Offices in Ballabgarh and Badkhal tehsils of Faridabad district. To check the level of completeness in the registration of deaths in 2020, a sample of 50 deaths from each of the registers of two large crematoria and one graveyard in the local area (total 150) was drawn through systematic random sampling and compared with death records obtained from CRS.

Data analysis: Data were analyzed using Microsoft Excel 2016 and STATA® release 15 (StataCorp LLC, College Station, Texas, USA). Three approaches were explored to estimate excess mortality by gender and age groups (0-14, 15-59 and ≥60 yr) in 2020 and 2021 against a baseline estimated from monthly deaths in the corresponding group during 2016-2019. The first approach compared monthly mortality in 2020 and 2021 (pandemic years) against the historical average, using the standard error (SE) for the confidence intervals (CIs). In the second approach, expected monthly mortality for 2020 and 2021 was estimated using the FORECAST.ETS function with seasonality set to 1 (automatic calculation) in Microsoft Excel 2016. The function applies a triple exponential smoothing method to the historical data; the square root of deaths was used as the CI15. The total (cumulative) number of excess (observed vs. baseline) all-cause deaths in both approaches was obtained by summing the excess all-cause deaths (with negative values set to 0) in each month. Weekly mortality was plotted by age and sex separately for the years 2020 and 2021 and compared with the range identified by the above two approaches. The third approach was a modification of the simple linear regression method elaborated by Gibertoni et al16, with the observed deaths as the dependent variable and an ordinal variable indicating year as the independent variable (coded 1-4 to represent the baseline years 2016 to 2019, with 5 representing 2020-21). The constant and slope obtained for each month were used to estimate, by extrapolation, the expected number of monthly deaths in 2020 and 2021, using the equation:

d (Expected Deaths) = a + (b × 5)

where a is the constant and b is the slope fitted for the month, and x = 5 corresponds to the year 2020-21.
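
The extrapolation step can be illustrated with a minimal Python sketch. It is not the authors' Stata or Excel code; it assumes that the baseline years 2016-2019 are coded 1 to 4 and that the pandemic period corresponds to x = 5. The function names and the January counts are hypothetical and are not study data.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def expected_deaths(baseline_counts, x_new=5):
    """Fit the line on baseline years coded 1..n and extrapolate to x_new."""
    xs = list(range(1, len(baseline_counts) + 1))
    a, b = fit_line(xs, baseline_counts)
    return a + b * x_new

# Hypothetical January death counts for 2016-2019 (illustrative only).
january = [500, 510, 520, 530]
print(expected_deaths(january))  # 540.0
```

One such line would be fitted for each calendar month within each age and sex stratum.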

Monthly excess mortality (observed minus expected) with negative values set to 0 was summed to arrive at cumulative all-cause excess mortality. Excess deaths were expressed as a range by taking the mean and 95 per cent upper bound of the CI for expected or predicted values.
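
The historical-average approach and the summation rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the upper 95 per cent bound is taken as mean + 1.96 × SE (a normal approximation chosen here for simplicity), and all function names and counts are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def baseline_with_upper_bound(history, z=1.96):
    """Mean of the baseline years and an upper 95% bound (mean + z*SE).

    z = 1.96 is an assumption of this sketch (normal approximation).
    """
    m = mean(history)
    se = stdev(history) / sqrt(len(history))
    return m, m + z * se

def cumulative_excess(observed, expected):
    """Sum monthly (observed - expected), with negative months set to 0."""
    return sum(max(0.0, o - e) for o, e in zip(observed, expected))

# Hypothetical monthly counts for one stratum (not study data).
history_by_month = [[100, 110, 90, 100], [80, 85, 75, 80]]  # 2016-2019
observed = [130, 78]                                         # pandemic year

bounds = [baseline_with_upper_bound(h) for h in history_by_month]
excess_vs_mean = cumulative_excess(observed, [m for m, _ in bounds])
excess_vs_upper = cumulative_excess(observed, [u for _, u in bounds])
print(excess_vs_upper, excess_vs_mean)  # lower and upper ends of the range
```

Comparing against the upper bound of the expected value gives the conservative end of the excess range, while comparing against the mean gives the higher end.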

Verbal autopsies (VA) were conducted by trained field workers using a validated instrument17 for a subset of 585 deaths in the age group of 30-69 yr which occurred in the study tehsils, as a part of an ongoing study to identify cardiac deaths. To assess the reliability of CoD from CRS, International Classification of Diseases-10 (ICD-10) codes for the underlying CoD based on the VA interview were assigned by the authors18. Appropriate ICD-10 codes were also assigned for the CoD recorded in the CRS portal for the same subset. Cohen’s κ was calculated to assess agreement between the physician-assigned CoD and that recorded in CRS for major CoD categories. Cohen’s κ ≥0.4 was interpreted as a moderate level of agreement19.
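
For readers unfamiliar with the statistic, Cohen's κ compares the observed proportion of agreement with the agreement expected by chance from each rater's marginal frequencies. The sketch below is a generic illustration, not the authors' Stata procedure; the function name and the toy labels are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of category labels."""
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Proportion of deaths on which the two sources agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from the marginal frequencies.
    p_expected = sum(
        (rater_a.count(lab) / n) * (rater_b.count(lab) / n) for lab in labels
    )
    return (p_observed - p_expected) / (1 - p_expected)

# Toy example: one category (e.g. tuberculosis) versus all other causes.
physician = ["tb", "tb", "other", "other"]
crs = ["tb", "other", "other", "other"]
print(cohens_kappa(physician, crs))  # 0.5
```

The function is undefined when chance agreement equals 1 (both sources use a single identical category throughout), a case this sketch does not handle.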


A total of 7017 all-cause deaths were registered in Ballabgarh and Badkhal tehsils between January 1, 2020 and December 31, 2020. This was 8.9 per cent higher than the mean annual deaths (6446±147; 95% CI) registered from 2016 to 2019. In 2021, 6792 all-cause deaths were recorded until September 30, representing a 43.8 per cent increase over the mean for the same period (4723±277) (Table I). There did not appear to be an increased delay in death registration as compared to previous years. All the 150 deaths identified from crematoria and a graveyard in the study area in 2020 were registered on the CRS portal. A significant increase (19% in 2020 and 56% in 2021) in deaths in the population aged ≥60 yr was seen as compared to the average for 2016-2019, with no sex differences.
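
The percentage increases quoted above follow directly from the registered counts and the 2016-2019 means. A short check in Python (the helper name is hypothetical):

```python
def percent_increase(observed, baseline_mean):
    """Percentage by which observed deaths exceed the baseline mean."""
    return (observed - baseline_mean) / baseline_mean * 100

print(round(percent_increase(7017, 6446), 1))  # 8.9  (full year 2020)
print(round(percent_increase(6792, 4723), 1))  # 43.8 (January-September 2021)
```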

Table I:
Characteristics of deaths occurring in Ballabgarh and Badkhal tehsils, Faridabad district: January 2016-September 2021

Figure 1A and B depict the week-wise COVID-19-confirmed cases and deaths in the Faridabad district reported in 2020 and 2021 (until September 30, 2021). Figure 1A shows a multimodal curve with its peak in wk 48, while Figure 1B shows a steeply sloping unimodal curve peaking at wk 18. A total of 402 COVID-19 deaths were reported in the Faridabad district till December 31, 2020 and a further 314 were reported between January 1, 2021 and September 30, 202113. The observed peaks of all-cause deaths correspond temporally and in magnitude to infection surges in the district13. Figures 2A, B and 3A-D depict the week-wise mortality reported in 2020 and 2021 along with the CIs of the historical average by sex and age groups. These show that the highest estimate of excess mortality was in the ≥60 yr age group, with little impact on children below 14 yr, and that women had parallel, though smaller, peaks than men throughout the pandemic.

Fig. 1:
(A and B) Weekly reported cases in Faridabad district: 2020 and 2021.
Fig. 2:
(A and B) Distribution of deaths in Ballabgarh and Badkhal in 2020 and 2021 by sex.
Fig. 3:
(A-D) Distribution of deaths in Ballabgarh and Badkhal in 2020 and 2021 by various age groups.

Estimates of excess mortality derived by each of the three approaches, by age group and sex, are shown in Table II. All three approaches gave overlapping estimates except for the 0-14 yr age group. The range of estimates by linear regression was wider than the others, while that of the historical average was the narrowest. The estimates were highest for the ≥60 yr age group and higher for men than for women, though the ranges overlapped. The forecasting method showed a small excess death estimate range for the 0-14 yr age group, while the other two methods indicated no excess mortality in this age group. The smaller overall number of deaths in this group and an outlier number in 2016 affected the mean deaths, which could have affected the estimates. Assuming that deaths were uniformly distributed in the district, the Badkhal and Ballabgarh Registrar’s offices would account for 58 per cent (6446/11141) of deaths in Faridabad district; accordingly, 233 of the 402 COVID-19 deaths in 2020 and 182 of the 314 COVID-19 deaths till September 30, 2021 in Faridabad district would be registered in these two tehsils. This gave a ratio of estimated excess deaths, directly or indirectly attributable to COVID-19, to officially reported COVID-19 deaths of 1.8-4 for 2020 and 10.9-13.9 for 2021.
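
The apportionment of district-level COVID-19 deaths to the two tehsils, and the resulting ratio, can be reproduced with a few lines of Python. This is a sketch of the arithmetic only; the function names are hypothetical, and the excess-death values passed to `ratio_range` below are placeholders, not the Table II estimates.

```python
def apportion(district_covid_deaths, tehsil_mean_deaths, district_deaths):
    """COVID-19 deaths attributed to the tehsils by their share of all deaths."""
    share = tehsil_mean_deaths / district_deaths
    return round(share * district_covid_deaths)

def ratio_range(excess_low, excess_high, reported_covid_deaths):
    """Ratio of estimated excess deaths to reported COVID-19 deaths."""
    return (excess_low / reported_covid_deaths,
            excess_high / reported_covid_deaths)

print(apportion(402, 6446, 11141))  # 233 (2020)
print(apportion(314, 6446, 11141))  # 182 (till September 30, 2021)

# Placeholder excess-death range, for illustration only.
print(ratio_range(200, 400, 100))   # (2.0, 4.0)
```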

Table II:
Excess mortality estimates for the years 2020 and 2021 of Ballabgarh and Badkhal tehsils, Faridabad district by age groups and sex using three different approaches

Lay-reported and physician-assigned CoD showed moderate agreement for tuberculosis (κ=0.481; SE=0.039, P<0.001) and external injuries (κ=0.453; SE=0.041, P<0.001) (Table III). There was poor agreement on all other major CoD categories. A high proportion of garbage codes (cardiac arrest or a mechanism of death like heart failure or sepsis) was recorded in the CRS. None of the 585 sampled deaths from 2020 had COVID-19 recorded as an underlying CoD.

Table III:
Agreement between lay-reported and physician-assigned cause of death for the major cause of death categories registered in Ballabgarh and Badkhal tehsils, Faridabad district in 2020 (n=585)


There is consensus that the best estimate of the COVID-19 death toll is obtained through the excess mortality approach20. Our assessment of the appropriateness of CRS data from a district in India for excess mortality estimation for COVID-19, covering both the first and the second wave, showed that the number of deaths registered increased in 2020/21, that the three commonly used approaches for excess mortality estimation provided overlapping estimates and that the data were not appropriate for cause-specific excess mortality estimation.

Pandemic-associated lockdowns and mobility restrictions did not seem to have affected the historically high death registration completeness in Ballabgarh and Badkhal tehsils2. Our independent verification further validated this finding. As compared to the mean of the previous four years, about 570 additional deaths were registered in 2020 (7017 vs. 6446), and the timeliness of death registration was not adversely affected. Excess mortality estimates arrived at using the three approaches overlapped, indicating the robustness of the methods used. Pandemic-induced changes in registration coverage, if any, would affect the excess mortality estimates for all three approaches. Although a deficiency in registration could be masked by excess deaths, we believe that this is unlikely in our study and that, if present, it would have resulted in further underestimation of excess deaths.

There is a wide variation in estimated all-cause excess mortality globally depending on the severity of the pandemic. A population-level analysis conducted in England and Wales up to November 20, 2020 estimated 15 per cent excess mortality21. Total mortality was higher by 9.9 per cent between March and November 2020 in Israel22. Brazil reported 10.7 per cent excess mortality between January and June 20204. In the United States, which has reported the highest number of COVID-19 deaths worldwide, an excess mortality of 22.9 per cent was reported between March 2020 and January 202123. At the other end of the spectrum, a mere 0.03-0.72 per cent excess mortality was observed across prefectures in Japan24, while New Zealand did not report any excess deaths in 20204. Our relatively lower excess death estimate was consistent with lower mortality rates reported from India in 2020. A 3-6 times higher mortality in 2021 was corroborated by anecdotal evidence. Consistent with the global trend, we also found higher excess mortality in the elderly (≥60 yr) age group16,21,22. Although COVID-19 affects both sexes equally, there are reports of more men than women dying from COVID-19 globally25,26. However, mortality risk for COVID-19 has been reported to be higher for women than men in India27.

The Institute for Health Metrics and Evaluation estimated the ratio of total COVID-19 deaths to reported COVID-19 deaths as 2.96 for India and later increased it to 5-25 for 2021 deaths28. An ecological analysis of 22 countries has estimated that deaths attributed to COVID-19 are underestimated by 35 per cent4. Another study tracking excess mortality figures across 89 countries reported a figure of 1.56 as a global undercount ratio of COVID-19 deaths29. Our estimate of the ratio of total excess deaths to officially reported COVID-19 deaths was 1.8-4 for 2020 and 10.9-13.9 for 2021. This ratio depends on access to testing as well as on the completeness of death registration. Our findings suggested that the excess mortality figures estimated in this study were reasonable. It is to be emphasized that these excess deaths include those directly as well as indirectly attributable to COVID-19. Officially reported COVID-19 deaths are those directly attributable to COVID-19, in which the deceased either tested positive for COVID-19 or a physician certified that the underlying CoD was COVID-19 related. Problems in reporting COVID-19 deaths arise for deaths occurring out-of-hospital and for indirect COVID-19 deaths, such as cardiac or cancer deaths which occurred due to a lack of access to care.

The most important challenge in excess death estimation, common to all methods, is the availability of a baseline of expected deaths against which to compare the observed deaths7. When estimating excess deaths, it is recommended that the estimates from the most disaggregated analyses be obtained and summed. The forecasting option should be resorted to when there are fewer than four years of historical data. Other approaches, including modelling excess deaths with a Poisson distribution, have been used by researchers, and these are likely to give similar results23,24,29,30.

This exploratory study had several limitations. Being limited to only two blocks (tehsils) of one district, it should not be used as an estimate for the country, as pandemic intensity and underreporting of deaths would vary widely between States as well as between urban and rural areas. Due to the small number of recorded deaths per week, we used the month as the unit of estimation. The accuracy of any excess mortality estimate ultimately depends on stable population dynamics. The lack of data on possible migration in and out of the district, and of recent population denominators and age distribution for the district, was an additional limitation.

In conclusion, our study indicated the usefulness of the CRS mortality data for estimating all-cause excess mortality due to COVID-19 in India. This approach may be considered for the estimation of excess deaths due to COVID-19 at the district, State or national level, subject to the availability of data.

Acknowledgment: Municipal Corporation of Faridabad is acknowledged for providing data access.

Financial support & sponsorship: None.

Conflicts of Interest: None.


1. World Health Organization. WHO COVID-19 homepage Available from: https://covid19.who.int/region/searo/country/in accessed on March 23, 2022.
2. Office of the Registrar General, India. Ministry of Home Affairs, Government of India. Vital Statistics Division. Vital statistics of India based on the civil registration system 2019 Available from: https://crsorgi.gov.in/web/uploads/download/CRS%202019%20report.pdf accessed on October 10, 2021.
3. Office of the Registrar General, India. Ministry of Home Affairs, Government of India. Vital Statistics Division. Report on medical certification of cause of death 2018 Available from: https://censusindia.gov.in/2011-documents/ MCCD_Report-2018.pdf accessed on May 14, 2021.
4. Kung S, Doppen M, Black M, Braithwaite I, Kearns C, Weatherall M, et al. Underestimation of COVID-19 mortality during the pandemic. ERJ Open Res 2021;7:00766–2020.
5. Reuters. Official COVID-19 death toll probably underestimates true total Available from: https://www.reuters.com/article/health-coronairus-who-deaths-int-idUSKBN26J2PX accessed on May 14, 2021.
6. World Health Organization. Revealing the toll of COVID-19: A technical package for rapid mortality surveillance and epidemic response Available from: https://www.who.int/publications/i/item/revealing-the-toll-of-covid-19 accessed on May 14, 2021.
7. Viboud C, Simonsen L, Fuentes R, Flores J, Miller MA, Chowell G Global mortality impact of the 1957-1959 influenza pandemic. J Infect Dis 2016;213:738–45.
8. Narayan VV, Iuliano AD, Roguski K, Haldar P, Saha S, Sreenivas V, et al. Evaluation of data sources and approaches for estimation of influenza-associated mortality in India. Influenza Other Respir Viruses 2018;12:72–80.
9. Rao C, Gupta M The civil registration system is a potentially viable data source for reliable subnational mortality measurement in India. BMJ Glob Health 2020;5:e002586.
10. National Capital Region Planning Board Study on health infrastructure in NCR, Vol 1 New Delhi Mott MacDonald 2015.
11. Directorate of Census Operations Haryana. District census handbook Faridabad; 2011 Available from: https://censusindia.gov.in/2011census/dchb/DCHB_A/06/0620_PAR T_A_DCHB_FARIDABAD.pdf accessed on May 14, 2021.
12. Office of the Registrar General, India. SRS bulletin, volume 53 No. 1; 2020 Available from: https://censusindia.gov.in/vital_statistics/SRS_Bulletins/SRS%20Bulletin_2018.pdf accessed on May 14, 2021.
13. COVID19INDIA. Haryana Available from: https://www.covid19india.org/state/HR accessed on October 10, 2021.
14. Director General Health Services, Haryana. Sero-prevalence of anti-SARS-CoV-2 IgG antibodies in Haryana, India: A population based cross-sectional study (2nd round); 2020 Available from: http://nhmharyana.gov.in/WriteReadData/userfiles/file/CoronaVirus/Final%20COVID-19%20Sero%20Survey%20in%20Haryana%20Round-II%20Booklet%2028-10-2020.pdf accessed on April 28, 2021.
15. Hyndman RJ, Athanasopoulos G. Chapter 7.7 forecasting with ETS models. In: Forecasting principles and practice 2nd ed Melbourne, Australia OTexts 2018.
16. Gibertoni D, Adja KYC, Golinelli D, Reno C, Regazzi L, Lenzi J, et al. Patterns of COVID-19 related excess mortality in the municipalities of Northern Italy during the first wave of the pandemic. Health Place 2021;67:102508.
17. Krishnan A, Kumar R, Nongkynrih B, Misra P, Srivastava R, Kapoor SK Adult mortality surveillance by routine health workers using a short verbal autopsy tool in rural north India. J Epidemiol Community Health 2012;66:501–6.
18. Krishnan A, Gupta V, Nongkynrih B, Kumar R, Kaur R, Malhotra S, et al. Mortality in India established through verbal autopsies (MINErVA):Strengthening national mortality surveillance system in India. J Glob Health 2020;10:020431.
19. Cohen J A coefficient of agreement for nominal scales. Educ Psychol Meas 1960;20:37–46.
20. Beaney T, Clarke JM, Jain V, Golestaneh AK, Lyons G, Salman D, et al. Excess mortality:The gold standard in measuring the impact of COVID-19 worldwide?. J R Soc Med 2020;113:329–34.
21. Aburto JM, Kashyap R, Schöley J, Angus C, Ermisch J, Mills MC, et al. Estimating the burden of the COVID-19 pandemic on mortality, life expectancy and lifespan inequality in England and Wales:A population-level analysis. J Epidemiol Community Health 2021;75:735–40.
22. Haklai Z, Aburbeh M, Goldberger N, Gordon ES Excess mortality during the COVID-19 pandemic in Israel, March-November 2020:When, where, and for whom?. Isr J Health Policy Res 2021;10:17.
23. Woolf SH, Chapman DA, Sabo RT, Zimmerman EB Excess deaths from COVID-19 and other causes in the US, March 1, 2020, to January 2, 2021. JAMA 2021;325:1786–9.
24. Kawashima T, Nomura S, Tanoue Y, Yoneoka D, Eguchi A, Ng CFS, et al. Excess all-cause deaths during coronavirus disease pandemic, Japan, January-May 2020. Emerg Infect Dis 2021;27:789–95.
25. The Lancet The gendered dimensions of COVID-19. Lancet 2020;395:1168.
26. Peckham H, de Gruijter NM, Raine C, Radziszewska A, Ciurtin C, Wedderburn LR, et al. Male sex identified by global COVID-19 meta-analysis as a risk factor for death and ITU admission. Nat Commun 2020;11:6317.
27. Joe W, Kumar A, Rajpal S, Mishra US, Subramanian SV Equal risk, unequal burden?Gender differentials in COVID-19 mortality in India. J Glob Health Sci 2020;2:e17.
28. Institute for Health Metrics and Evaluation Estimation of total mortality due to COVID-19 2021 Available from:http://www.healthdata.org/special-analysis/estimation-excess -mortality-due-covid-19-and-scalars-reported-covid-19-deat hs accessed on May 10, 2021.
29. Karlinsky A, Kobak D Tracking excess mortality across countries during the COVID-19 pandemic with the World Mortality Dataset. Elife 2021;10:e69336.
30. Bilinski A, Emanuel EJ COVID-19 and excess all-cause mortality in the US and 18 comparison countries. JAMA 2020;324:2100–2.

Cause of death; civil registration; COVID-19; excess mortality; India; verbal autopsy

Copyright: © 2022 Indian Journal of Medical Research