Secondary Logo

Journal Logo

Original Research Article

A hierarchical model for estimating the exposure-response curve by combining multiple studies of acute lower respiratory infections in children and household fine particulate matter air pollution

Keller, Joshua P.a,*; Katz, Joanneb; Pokhrel, Amod K.c; Bates, Michael N.c,d; Tielsch, Jamese; Zeger, Scott L.f

Author Information
doi: 10.1097/EE9.0000000000000119


What this study adds

We present two statistical models for analyzing health outcomes related to household air pollution and apply them to three studies of respiratory illness in children in Nepal. The models provide important advancements for reducing error in exposure measurements and estimating complex relationships between air pollution and health outcomes. The models provide a novel approach to combining data from multiple studies into a single analysis. We find that risk of respiratory illness increases with higher air pollution concentrations up to a point and then the risk plateaus.


Indoor air pollution from cookstoves constitutes a major source of morbidity and mortality throughout the world.1,2 Particulate matter emitted from stoves can impact respiratory and cardiovascular health of household residents, who are typically exposed to this household air pollution (HAP) on a daily basis. Acute lower respiratory infections (ALRIs) present a significant worldwide morbidity burden, particularly for children, that has been associated with HAP.3

Replacing biomass burning stoves with other, cleaner-burning stoves has been proposed as a means to mitigate this health burden.4 Multiple studies around the world have been designed to assess the health impact of cleaner-burning stoves.5–13 Many of these studies have failed to identify evidence of a significant impact on respiratory disease risk attributable to stove replacement, and better evidence about the effectiveness of cookstove interventions is needed.14–16

While cleaner-burning stoves may reduce indoor particulate matter concentrations by a large absolute or relative amount on average,17–20 the concentrations can still be quite high compared with the World Health Organization (WHO) air quality standard of 10 μg/m3 for fine particulate matter (PM2.5).21 Furthermore, the simultaneous use of multiple stove technologies (“stove stacking”) is widespread,22 so analyses based upon actual concentrations, rather than assigned stoves types, are needed. Additionally, short-term (24- or 48-hour) measurements of HAP can be highly variable,18,20 which poses challenges for studies assessing health impacts of long-term exposures.

The large range of HAP exposures means that correctly representing the nonlinearity in the exposure-response relationship is important. Burnett et al23 developed an “integrated” exposure-response (IER) curve for PM2.5 exposure and ALRI risk that included data from studies of ambient air pollution, HAP, and secondhand smoking. Limitations of the Burnett et al23 IER curve were its meta-analysis approach based on combining study-specific risk ratios rather than directly pooling data, its inclusion of only one study of HAP with limited exposure range, and its reliance on a fully parametric curve that restricted the types of relationships it can represent. In a recent analysis of the Bhaktapur study described below, Bates et al24 used flexible splines to estimate a similar curve with fewer assumptions but still included data from only one study. Burnett et al25 introduced an exposure-response curve that relaxed some of the assumptions of the IER curve and was designed for a combination of individual-level mortality data and study-level hazard ratios from multiple studies of outdoor PM2.5. This curve has several attractive features, including monotonicity, but retained a parametric form and only covered an exposure range up to 84 μg/m3.

In this article, we present an approach for estimating long-term HAP concentrations and combining data from multiple studies to estimate an exposure-response curve. We do this by developing two hierarchical models: (1) one for long-term exposure concentrations and (2) one for disease risk. We first describe three studies from Nepal that investigated the relationship between HAP and ALRI in children and motivated model development. We then present our hierarchical models for exposure concentrations and the exposure-response relationship and apply the models to the motivating studies.


Motivating studies

The first study is a case-control study conducted in Bhaktapur, Nepal.24,26 The study conducted active surveillance for respiratory illness in an open cohort of approximately 4,500 children under 3 years of age, from May 2006 to June 2007. There were 452 cases assessed to have ALRI according to WHO criteria.26 Immediately after a case occurred, a control subject of the same age was selected from the cohort. There were a total of 465 controls; logistical constraints and some refusals to participate prevented exact matching of cases and controls by age. In each child’s home, one 24-hour integrated measurement of PM2.5 concentrations was made using a light-scattering nephelometer, the University of California, Berkeley-Particle and Temperature Monitoring System (UCB-PATS).27 PM2.5 measurements were made within a week of recruitment on a weekday (Sunday to Friday). UCB-PATS were validated in the field against standard gravimetric measurements for each separate fuel type, and adjustments were made accordingly in the light-scattering efficiencies used to calculate the PM2.5 concentration. Stove type was identified by a questionnaire and categorized by fuel source: biomass, kerosene, gas, and electricity. HAP measurements are available for 393 cases and 431 controls (Table 1), with the remainder missed due to a delay in the start of monitoring or equipment malfunction. Overall, a 10 μg/m3 difference in PM2.5 was associated with 36% higher odds of ALRI (odds ratio [OR] = 1.36; 95% CI = 1.19, 1.56).24 This study was approved by the institutional review boards (IRBs) at the University of California–Berkeley, the Institute of Medicine, Tribhuvan University Teaching Hospital, Kathmandu, Nepal, and the Regional Committee for Medical and Health Research Ethics (REK VEST), Norway.

Table 1. - Characteristics of the Bhaktapur cohort
Characteristic Cases (n = 393); n (%) Controls (n = 431); n (%)
Stove type
 Biomass 109 (28) 109 (25)
 Kerosene 104 (26) 83 (19)
 Gas 110 (28) 128 (30)
 Electricity 70 (18) 111 (26)
 Female 166 (42) 197 (46)
 Male 227 (58) 234 (54)
Age (months)
 0–6 111 (28) 122 (28)
 7–12 98 (25) 92 (21)
 13–18 81 (21) 84 (19)
 19–24 52 (13) 77 (18)
 25–36 51 (13) 56 (13)

The next studies are from a cookstove intervention trial conducted in the Sarlahi district of Nepal.10,28 This trial comprised two phases, which we treat here as separate studies. The first phase was a stepped-wedge design that studied the impact of replacing traditional biomass stoves with improved biomass stoves with chimneys. Households were grouped by villages and the improved stoves were provided to groups of villages in a staged manner between December 2010 and January 2012 so that by study conclusion, all households were using the improved stove. Weekly follow-up, beginning in March 2010, was conducted to ascertain incidence of ALRI, defined as a maternal report of 2 or more consecutive days of rapid breathing accompanied by fever.10 Episodes were separated by a minimum of 7 symptom-free days. We restricted our analysis to ALRI incidence from the period July 2010 to December 2012, which includes all times at which both stove types were in use. A total of 5,254 children from 3,376 households were enrolled and, after exclusion of those with missing covariate, exposure, or follow-up information, 4,163 children from 2,460 households were included in the exposure-response model (Table 2). Between March 2010 and July 2012, 21-hour integrated measurements of PM2.5 were made using DataRAM pDR-1000 (Thermo Scientific, Franklin, MA) in each household twice during the phase I study (one measurement approximately 6 months before and one approximately 6 months after stove installation).29 A total of 3,276 households that had available PM2.5 measurements were included in the exposure concentration model analysis (Figure 1). The second phase used a traditional parallel design, covering a shorter period from April 2013 to April 2014 and a smaller cohort of 1,312 households, to compare the improved biomass stoves against liquid propane gas (LPG) stoves. Weekly outcome ascertainment was conducted similar to phase I. PM2.5 measurements were made once per household in phase II between May 2013 and February 2014, following the same procedures as phase I. In the primary analysis for phase I, the improved biomass stove intervention was associated with a small but nonsignificant reduction in ALRI incidence of 13% (risk ratio [RR] = 0.87; 95% CI = 0.67, 1.13).30 The intervention effect from phase II, which has not been previously published, is presented below. The Sarlahi studies were approved by the IRBs of the Johns Hopkins Bloomberg School of Public Health and the Institute of Medicine, Tribhuvan University, Kathmandu, Nepal. All study households and adult individuals provided informed verbal consent. The trials are registered at (NCT 00786877).

Table 2. - Characteristics of the Sarlahi cohorts
Characteristic Phase I (n = 4,163); n (%) Phase II—LPG arm (n = 786); n (%) Phase II—biomass arm (n = 782); n (%)
 Male 2,102 (50) 411 (52) 400 (51)
 Female 2,061 (50) 375 (48) 382 (49)
Age at study entry, months
 0–3 2,075 (50) 267 (34) 270 (34)
 4–7 320 (8) 136 (17) 123 (16)
 8–12 390 (9) 134 (17) 122 (16)
 13–24 891 (21) 249 (32) 267 (34)
 25–36 487 (12)
The dash indicates no individuals, since the Phase II study restricted to children 2 years of age or younger.

Figure 1.
Figure 1.:
Measured PM2.5 concentrations (in μg/m3) from the three motivating studies.

Overview of statistical models

We present two statistical models: an exposure concentration model for estimating long-term concentrations of HAP and an exposure-response model for estimating the relationship between HAP exposure and health outcomes. The exposure concentration model is study-specific, while the exposure-response model is designed to pool data from multiple studies. We present these models in the context of the motivating studies of PM2.5 and ALRIs, but they can be applied to other types of HAP, outdoor air pollution, and other health outcomes.

Exposure concentration model

The exposure concentration model is designed to estimate long-term household HAP concentrations and can accommodate any of the following study features:

  • 1) groups with different stove types, whether due to intervention or differing practices,
  • 2) cross-over in group membership over time,
  • 3) irregularly spaced repeated measurements of HAP concentration,
  • 4) a small number of measurements per household or study unit,
  • 5) clustering of study units within neighborhoods or other groups,
  • 6) an underlying temporal trend in HAP concentrations, and
  • 7) a large amount of variability in individual measurements.

The hierarchical Bayesian structure of the model shrinks observed concentrations towards cluster- and group-means. The shrinkage towards the group mean reduces the error that results from using highly variable single-day measurements to represent a long-term average, which can attenuate point estimates and reduce power. Importantly, the shrinkage is only partial, and so some individual-level and temporal variation remains, providing benefit over an analysis that averages solely by stove type.

We model the log-transformed measured pollutant concentration in household i of cluster k in stove group g on day t, as samples from the normal distribution . The mean has the form:

The group mean accounts for different average concentrations for each stove type. The cluster random effect accounts for neighborhood-level effects, which can reflect differences in practices and long-term local climate. If there is no clustering in the study design, this term can be dropped from the model. The household random effect accounts for correlation in repeated observations within the same household. The temporal trend is a linear combination of natural cubic splines with degrees of freedom (df). The group mean, cluster random effect, household random effect, and spline coefficients are each given Normal prior distributions: , , , and , respectively.

Exposure-response model

The exposure-response model combines data from multiple studies to estimate an exposure-response curve and is designed to accommodate any of these study features:

  1. 1) one or more studies with the same binary outcome and continuous exposure, variable, although the exposure ranges need not overlap between studies,
  2. 2) repeated outcome ascertainment for individuals,
  3. 3) different time periods and temporal effects by study, and
  4. 4) different measured confounders and adjustment variables by study.

Let and denote the case count and total time at risk for individual i in study s during time period t. For most studies, will be 1 and will be either 0 or 1 depending on whether or not an individual becomes a case during time period t. However, the model allows for studies in which an individual may have multiple episodes of being a case. The period t may be a day, week, month, or other duration. Importantly, t in this exposure-response model does not need to be identical to the timestep t in the exposure concentration model. We assume follows a Binomial ( , ) distribution where the probability of being a case is modeled as:

The term models differences in disease risk between study populations, whether due to underlying factors or differences in study designs. The term is a subject-level random effect that accounts for differences between individuals. The vector represents adjustment variables with coefficients . The term accounts for temporal variation in disease risk via natural cubic splines with degrees of freedom. By allowing to vary between studies, the model accommodates both different trends in risk between studies and data from different time periods. PM2.5 exposure is denoted by and represents a transformation of exposure that we describe in the following paragraphs. We use an improper uniform prior for and model the other parameters using Normal prior distributions: , , , and .

For the exposure concentration in Equation 2, we use the estimated long-term average concentration from the exposure concentration model. Specifically, we set to be the exponentiated posterior mean of the long-term average log-concentration for the household of individual i over the last time periods:

where is the vector of all observed exposure data for study s and is an indicator of subject i being in stove group g and cluster k at time t’. The averaging period controlled by captures changes in group and cluster in cross-over studies. By excluding the time trend in Equation 3, we obtain an estimate of long-term exposure concentration. In general, no information is lost through this approach because any smooth temporal variation in the exposure would be captured by the temporal trend .

We use monotonic I-splines, denoted by the transformation to represent a nonlinear exposure-response relationship. When their coefficients are restricted to be nonnegative, I-splines can be used to estimate a nondecreasing exposure-response function.31 I-splines have been used before for outdoor air pollution by Powell et al,32 but not for HAP studies. eFigure 1 ( shows the I-Spline basis functions used in the pooled analysis of the Nepal studies.

Model implementation

Sampling for both models is done via the Stan modeling framework.33 We implement this in the R package “bercs” (Bayesian Exposure-Response Curves via Stan), which is free and publicly available at

Application of models to cookstove studies

We fit the exposure model separately to each study. The hyperparameters and hyperpriors for each model are provided in eAppendix1 and eTables 1 and 2 ( We conducted two sensitivity analyses that explored the impact of varying prior specifications for the Bhaktapur study. The first used prior distributions favoring small household random effects, which is expected to lead to greater shrinkage since there is only one measurement per household. The second sensitivity analysis used a prior distribution that favored larger household random effects than the primary model, which is expected to give results closer to the measurements. For Sarlahi phase I, we included all households with available PM2.5 measurements, including those that were excluded from the exposure-response model due to missing covariate or follow-up information.

To quantify the relationship between HAP and ALRIs, we fit separate exposure-response models for each study and a model that pools the data from all three studies. All models included adjustment for age and sex. To illustrate the impact of different choices of exposure values, we fit three sensitivity analysis for the Bhaktapur study: two using the concentrations from the exposure concentration model sensitivity analyses and one using the observed measurements directly. We conducted additional sensitivity analyses for the Bhaktapur and combined models that restricted to nondecreasing exposure-response curves. The values of hyperparameters and hyperpriors for each exposure-response model are provided in eAppendix 2 and eTables 3–5 (

We plot the exposure-response curves using the posterior means of β and include pointwise 95% credible intervals that incorporate the uncertainty in the study-specific intercepts. To aide comparison, we center all of the estimated exposure-response curves to have odds ratios of 1 at 50 μg/m3.

We also present the estimated intervention effect for phase 2 of the Sarlahi study, computed from a log-linear model with offset for total time at risk, a fixed effect for stove, and a random effect for cluster.


Exposure model—Bhaktapur

In the Bhaktapur study, the geometric mean PM2.5 concentrations were 382 μg/m3 for households with biomass stoves (n = 218), 117 μg/m3 for kerosene (n = 187), 72 μg/m3 for gas (n = 238), and 55 μg/m3 for electricity (n = 181) (Table 3 and Figure 1). The exponentiated posterior group means estimated from the exposure concentration model are similar: 388 μg/m3 (95% credible interval [CI] = 346, 435 μg/m3) for households with biomass stoves, 119 μg/m3 (106, 1,134 μg/m3) for kerosene, 72 μg/m3 (65, 81 μg/m3) for gas, and 55 μg/m3 (49, 62 μg/m3) for electricity (Table 4).

Table 3. - Summary statistics of the household PM2.5 measurements (μg/m3) in each study, by primary stove type (Bhaktapur) and assigned stove type (Sarlahi studies)
Study Stove type N Geometric mean Minimum Q25 Median Q75 Maximum Mean SD
Bhaktapur Biomass 218 381.7 63.1 177.7 349.6 787.5 7,979 656.5 923.8
Kerosene 187 117.2 30.9 73.0 99.8 178.0 1,442 168.6 207.1
Gas 238 72.0 18.7 45.6 63.4 97.0 1,007 100.6 129.6
Electricity 181 55.2 11.7 33.9 53.0 79.6 775 79.9 103.3
Sarlahi phase I Traditional biomass 3,075 1,003 40 608 1,082 1,766 20,842 1,384 1,271
Improved biomass 2,836 626 9 350 652 1,166 19,410 937 1,113
Sarlahi phase II Improved biomass 659 558 9 299 618 1,104 10,642 884 975
Gas 661 264 10 128 268 594 4,115 442 513
Q25 indicates 25th percentile; Q75, 75th percentile.

Table 4. - Posterior mean (95% credible interval) for the parameters representing the average PM2.5 concentrations by stove type (ηg)
Stove group Modeling scale (natural log) Original scale (μg/m3)
 Biomass 5.96 (5.85, 6.08) 388 (346, 435)
 Kerosene 4.78 (4.66, 4.90) 119 (106, 134)
 Gas 4.28 (4.18, 4.39) 72 (65, 81)
 Electricity 4.01 (3.89, 4.13) 55 (49, 62)
Sarlahi phase I
 Traditional biomass 7.03 (6.87, 7.19) 1,134 (960, 1,326)
 Improved biomass 6.35 (6.22, 6.49) 571 (501, 657)
Sarlahi phase II
 Improved biomass 5.97 (5.73, 6.22) 392 (308, 503)
 LPG 5.22 (4.99, 5.47) 185 (147, 237)
Values are presented on the modeling scale (natural-log transformed) and back transformed to the original scale (μg/m3), which represents a geometric mean.

The posterior mean concentration for each household for the primary Bhaktapur exposure model results are plotted in Figure 2A, including the time trend. The long-term concentrations, which omit the time trend and are used in the exposure-response models, are summarized in eAppendix 3, eTable 6, and eFigure 2 ( For comparison, Figure 2B shows the model results from sensitivity model 1, when strong prior information is not included, leading to a small estimated and near-complete shrinkage to the group means. Alternatively, Figure 2C shows the modeled means from sensitivity model 2, when an informative prior is used to encourage even greater variation in the household random effects and less shrinkage towards the group mean. eTable 7 ( provides all parameter estimates for these models.

Figure 2.
Figure 2.:
Modeled PM2.5 concentrations for the Bhaktapur study. Results from (A) the primary exposure concentration model, (B) the sensitivity analysis with small household random effects, and (C) the sensitivity analysis with prior distributions favoring larger household random effects.

Exposure model—Sarlahi studies

For Sarlahi phase I, the geometric mean measured concentration was 1,003 μg/m3 in homes using traditional stoves (n = 3,075) and 626 μg/m3 in homes using improved biomass stoves (n = 2,836) (Table 3 and Figure 1). The exponentiated posterior mean PM2.5 concentrations from the exposure model fit were 1,134 μg/m3 and 571 μg/m3 for the traditional and improved biomass stove, respectively (Table 4). The posterior means from the exposure model fit for phase I of the Sarlahi study are shown in Figure 3A. The posterior means of the standard deviations of and were similar in magnitude, 0.21 and 0.30, respectively (eTable 9;, reflecting a moderate amount of between-cluster and between-household variation.

Figure 3.
Figure 3.:
Modeled PM2.5 concentrations for the Sarlahi studies. (A) Phase I and (B) Phase II.

In phase II, the geometric mean measured concentrations were 558 μg/m3 (n = 659) in biomass-stove households and 264 μg/m3 (n = 661) in LPG stove households (Table 3 and Figure 1). The exponentiated posterior mean PM2.5 concentrations from the exposure model fit were 392 μg/m3 and 185 μg/m3 for the improved biomass and LPG stove groups, respectively (Table 4, Figure 3B, and eTable 10;

Exposure-response model—Bhaktapur

The estimated exposure-response curve for the Bhaktapur study is plotted in Figure 4A and all parameter estimates are reported in eAppendix 4 and eTable 11 ( The estimated relative odds of ALRI doubles from 50 μg/m3 to 100 μg/m3, before the curve flattens out for higher concentrations. This difference corresponds to subjects in the gas, kerosene and biomass stove groups relative to those in the electricity group. Below 50 μg/m3, the estimated odds are lower but there is considerable uncertainty such that a flat exposure-response curve cannot be ruled out for that range (Figure 4A).

Figure 4.
Figure 4.:
Estimated exposure-response curves (posterior mean with pointwise 95% credible intervals) for the relationship between exposure to PM2.5 and ALRI in children from the Bhaktapur study. (A) the primary model using estimated long-term exposures with moderate household random effect, (B) sensitivity model 1, which uses exposures with large amount of shrinkage to group means, (C) sensitivity model 3, which uses the direct observations as the exposure, and (D) sensitivity model 4, which contains a nondecreasing exposure-response curve.

For the sensitivity analyses using alternative exposure values, the exposure-response curves are plotted in Figures 4B and C and eFigure 4 ( The curve estimated using the direct observations (Figure 4C) spans the largest range but has large uncertainty at the highest and lowest concentrations due to the small numbers of values far from the center of the data. The curve using exposures from sensitivity model 1 (Figure 4B) has a limited exposure range because it effectively uses only the stove group mean exposures (eTable 8; The sensitivity analysis using the modeled values with less shrinkage (eFigure 3; yielded an estimated exposure-response curve that is similar to that of the primary analysis. Figure 4D shows the estimated exposure-response curve from the model that restricts the curve to be nondecreasing. At lower concentrations, this curve resembles the primary curve (Figure 4A), but it does not plateau at higher concentrations.

Exposure-response model—Sarlahi studies

In Sarlahi phase I, there were 571 and 212 ALRI cases among the traditional and improved biomass stove groups, respectively, yielding unadjusted rates of 28.6 and 10.1 cases per 100 person-years. However, these differences are impacted by strong temporal trends in ALRI incidence (eFigure 4; that must be accounted for due to the stepped-wedge design.10 In the exposure-response model, there was little evidence of an exposure-response relationship (Figure 5A).

Figure 5.
Figure 5.:
Estimated exposure-response curve (posterior mean with pointwise 95% credible intervals) for the Sarlahi study. (A) Phase I and (B) Phase II.

Sarlahi phase II had a limited number of ALRI cases, 24 and 20 in the biomass and LPG stove groups, respectively. This corresponds to rates of 5.60 and 4.62 cases per 100 person-years and an intervention effect of a 14% reduction in ALRI incidence (RR = 0.86; 95% CI = 0.47, 1.56). But the limited amount of information yields no evidence of an exposure-response relationship (Figure 5B).

Full parameter estimates for the phase I and phase II models are reported in eTables 12 and 13 (, respectively.

Exposure-response model: three-study results

Figure 6 shows the estimated exposure-response curve from the model that combines all three studies and the corresponding parameter estimates are provided in eTable 14 ( The overall shape is similar to the Bhaktapur study curve (Figure 4A), but the inclusion of the Sarlahi studies extends the horizontal axis to 2,200 μg/m3. eTable 15 ( provides sets of odds ratios for different exposure combinations and the publicly available R package “PMerc” ( provides a tool for calculating the odds ratio from the curve in Figure 6 between any user-specific concentrations. At 150 μg/m3, the estimated odds of ALRI are more than three-times those at 50 μg/m3 (OR = 3.39; 95% credible interval = 1.89, 6.10). The estimated exposure-response relationship plateaus for concentrations above about 150 μg/m3 and then drops at the highest concentrations where there is less data. This curve is consistent with the lack of evidence for an exposure-response relationship in the Sarlahi study models since those studies only included PM2.5 concentrations of approximately 120 μg/m3 and higher.

Figure 6.
Figure 6.:
Estimated exposure-response curve (posterior mean with pointwise 95% credible intervals) for the relationship between exposure to PM2.5 and ALRI in children, for all three studies combined.

When forcing the exposure-response curve to be nondecreasing, we obtain the curve shown in Figure 7. This follows the same basic trend as the unrestricted curve at low exposure levels but continues to increase at higher exposure concentrations. However, the uncertainty around this curve is consistent with an increase in risk at low levels, followed by constant risk for concentrations above 150 μg/m3.

Figure 7.
Figure 7.:
Estimated exposure-response curve (posterior mean with pointwise 95% credible intervals) for the relationship between exposure to PM2.5 and ALRI in children, for all three studies combined in a model that restricts the curve to be nondecreasing.


We have presented exposure concentration and exposure-outcome models that provide a flexible framework for combining highly variable data from multiple studies to estimate a common exposure-response curve. These models allow for irregularly collected exposure measurements and outcome ascertainment. The spline-based approach to estimating the exposure-response curve avoids restrictive assumptions required for a parametric curve and allows for varying uncertainty across the range of exposure values. The framework can easily accommodate different approaches to defining long-term exposures (e.g., including the time trend or modifying the averaging period) for different applications.

In our analysis that combined data from three studies, we found evidence of a tripling in risk of ALRI among children exposed to long-term PM2.5 concentrations of 50 μg/m3 compared with 150 μg/m3. Above 150 μg/m3, there was little evidence of higher risk at higher PM2.5 levels. Below 50 μg/m3, there was evidence suggestive of lower risk, but a large amount of uncertainty precludes strong conclusions for that exposure range using the data available in this analysis. Compared with the IER curve of Burnett et al,23 the curve estimated here covers a wider range of HAP concentrations but does not provide as much information at the lowest concentrations. The Burnett et al23 curve showed a similar increase in risk between 50 and 150 μg/m3 but did not show the plateau in risk at higher levels that is a key feature of the estimated curve in Figure 6.

For studies that seek to reduce health burdens associated with household PM2.5, these results suggest that considerable reductions in PM2.5 concentration need to be achieved to have measurable impacts on health outcomes. In particular, the reduction in HAP achieved by the replacement of traditional biomass stoves with improved biomass stoves, like what was observed in Sarlahi phase I, is unlikely to be sufficient to achieve measurable health impacts.

Although the combined analysis incorporated individual-level data from three different studies, the majority of the information about ALRI risk came from the Bhaktapur study. When analyzed individually, the Sarlahi studies both failed to show evidence of an exposure-response relationship. For phase I, this was likely a combination of an insufficient reduction in the exposure concentrations and a strong temporal trend in the overall ALRI incidence rate (eFigure 4; In Sarlahi phase II, the LPG stoves lead to lower PM2.5 concentrations, but there was still no discernable exposure-response relationship (Figure 5B) due to the overall low numbers of cases.

A major strength of the exposure-response model presented here is its ability to combine data from different time periods and contexts, including studies with different designs. The Bhaktapur study had a matched case-control design, which necessitated including age in the exposure-response model. In contrast, the Sarlahi studies were stepped-wedge and parallel randomized trials. The study-specific intercepts, account for the differences in background ALRI rates in the randomized trials and for the fixed risk in the case-control design. In contrast to the multistudy PM mortality curve of Burnett et al,25 this work benefits from directly using all individual-level data in a single model and the flexibility of a spline basis.

By using flexible splines for the exposure-response curve, this model can estimate many different shapes for the relationship between exposure and disease risk. By not forcing a parametric form for the curve, we avoid a priori forcing structure into the exposure-response relationship. In plotting the exposure-response curve, we have included the uncertainty from the model intercepts. This better reflects the total information about the exposure-outcome relationship, which is impacted by both the spline coefficients and their correlation with the intercept terms. By plotting the curve with a fixed reference concentration in the middle of the observed values, rather than at the extreme low end, we avoid spurious interpretations of overly large effects impacted by large amounts of uncertainty at the edges of the data.

The exposure model presented here provides several advantages over using observed concentrations directly in analyses of HAP and health. First, the model provides estimates of exposure for times and settings that might not have been observed, which can be important when only a limited number of measurements are possible. Second, the shrinkage towards the group mean reduces error in using highly variable single-day measurements to represent a long-term average. The Bhaktapur exposure-response curve estimated using the measurements directly showed large amount of uncertainty and nonmonotonic behavior, particularly at the extremes of the data range.

A limitation of the analyses presented here is the limited amount of covariate adjustment included. For all three studies, we adjusted only for child age and sex, which were the only variables available for this pooled analysis. In particular, socioeconomic status may be a confounding factor and ideally would be adjusted for in these models. We note, however, that the results obtained for the Bhaktapur study in this minimally adjusted analysis were consistent with the results obtained by Bates et al,24 who explored adjustment for many additional factors. An additional limitation of the spline-based approach we use is that when forcing monotonicity, it is possible for the curve to plateau before increasing again at higher levels. We see this in Figure 7, although the uncertainty at those concentrations remains consistent with a flat curve. If desired, this behavior could be mitigated through modification or elimination of the basis functions (eFigure 1; that span the highest concentrations.

The modeling framework we have presented can flexibly include multiple studies for estimating a pooled exposure-response curve and estimating long-term exposures. Future applications to cookstove intervention studies can incorporate more than just the three studies presented here to advance our understanding of the relationship between HAP exposures and adverse health outcomes. This information can then be used to target future interventions to reduce ALRI risk.

Conflicts of interest statement

The authors declare that they have no conflicts of interest with regard to the content of this report.

Funding for the Bhaktapur study was provided by the European Commission (EU-INCO-DC contract number INCO-FP6-003740), the Danish Council of Developmental Research (project number 91128), the Research Council of Norway (RCN project numbers. 151054 and 172226), the Norwegian Council of Universities’ Committee for Development Research, Education (NUFU project number PRO 36/2002), and the Winrock International, Kathmandu, Nepal. Funding for the Sarlahi studies ( NCT 00786877) was provided by the National Institute for Environmental Health Sciences (NIEHS) R01ES015558, The Thrasher Research Fund (9177), and the Global Alliance for Clean Cookstoves, UN Foundation (UNF-12-380). The Sarlahi studies were registered on number NCT00786877.

Process for obtaining data and code: The statistical models presented here are implemented in the free R package available at Additional code for running the analyses is available from the first author. A free R package for calculating odds ratios from the primary results is available at Readers interested in obtaining the data should contact the authors about access and confidentiality requirements.


1. Gordon SB, Bruce NG, Grigg J, et al. Respiratory risks from household air pollution in low and middle income countries. Lancet Respir Med. 2014;2:823–860.
2. Mortimer K, Gordon SB, Jindal SK, Accinelli RA, Balmes J, Martin WJ 2nd. Household air pollution is a major avoidable risk factor for cardiorespiratory disease. Chest. 2012;142:1308–1315.
3. Troeger C, Blacker B, Khalil IA, et al. Estimates of the global, regional, and national morbidity, mortality, and aetiologies of lower respiratory infections in 195 countries, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet Infect Dis. 2018;18:1191–1210.
4. Grieshop AP, Marshall JD, Kandlikar M. Health and climate benefits of cookstove replacement options. Energy Policy. 2011;39:7530–7542.
5. Balakrishnan K, Sambandam S, Ghosh S, et al. Household air pollution exposures of pregnant women receiving advanced combustion cookstoves in India: implications for intervention. Ann Glob Health. 2015;81:375–385.
6. Clasen T, Checkley W, Peel JL, et al.; HAPIN Investigators. Design and rationale of the HAPIN study: a multicountry randomized controlled trial to assess the effect of liquefied petroleum gas stove and continuous fuel distribution. Environ Health Perspect. 2020;128:47008.
7. Jack DW, Asante KP, Wylie BJ, et al. Ghana randomized air pollution and health study (GRAPHS): study protocol for a randomized controlled trial. Trials. 2015;16:420.
8. Mortimer K, Ndamala CB, Naunje AW, et al. A cleaner burning biomass-fuelled cookstove intervention to prevent pneumonia in children under 5 years old in rural Malawi (the Cooking and Pneumonia Study): a cluster randomised controlled trial. Lancet. 2017;389:167–175.
9. Smith KR, McCracken JP, Weber MW, et al. Effect of reduction in household air pollution on childhood pneumonia in Guatemala (RESPIRE): a randomised controlled trial. Lancet. 2011;378:1717–1726.
10. Tielsch JM, Katz J, Zeger SL, et al. Designs of two randomized, community-based trials to assess the impact of alternative cookstove installation on respiratory illness among young children and reproductive outcomes in rural Nepal. BMC Public Health. 2014;14:1271.
11. Young BN, Peel JL, Benka-Coker ML, et al. Study protocol for a stepped-wedge randomized cookstove intervention in rural Honduras: household air pollution and cardiometabolic health. BMC Public Health. 2019;19:903.
12. Lee AG, Kaali S, Quinn A, et al. Prenatal household air pollution is associated with impaired infant lung function with sex-specific effects. Evidence from GRAPHS, a cluster randomized cookstove intervention trial. Am J Respir Crit Care Med. 2019;199:738–746.
13. Clark ML, Peel JL, Burch JB, et al. Impact of improved cookstoves on indoor air pollution and adverse health effects among Honduran women. Int J Environ Health Res. 2009;19:357–368.
14. Clark ML, Peel JL, Balakrishnan K, et al. Health and household air pollution from solid fuel use: the need for improved exposure assessment. Environ Health Perspect. 2013;121:1120–1128.
15. Rosenthal J. The real challenge for cookstoves and health: more evidence. Ecohealth. 2015;12:8–11.
16. Thakur M, Nuyts PAW, Boudewijns EA, et al. Impact of improved cookstoves on women’s and child health in low and middle income countries: a systematic review and meta-analysis. Thorax. 2018;73:1026–1040.
17. Quansah R, Semple S, Ochieng CA, et al. Effectiveness of interventions to reduce household air pollution and/or improve health in homes using solid fuel in low-and-middle income countries: a systematic review and meta-analysis. Environ Int. 2017;103:73–90.
18. Benka-Coker ML, Peel JL, Volckens J, et al. Kitchen concentrations of fine particulate matter and particle number concentration in households using biomass cookstoves in rural Honduras. Environ Pollut. 2020;258:113697.
19. Steenland K, Pillarisetti A, Kirby M, et al. Modeling the potential health benefits of lower household air pollution after a hypothetical liquified petroleum gas (LPG) cookstove intervention. Environ Int. 2018;111:71–79.
20. Yip F, Christensen B, Sircar K, et al. Assessment of traditional and improved stove use on household air pollution and personal exposures in rural western Kenya. Environ Int. 2017;99:185–191.
21. World Health Organization. WHO Air Quality Guidelines for Particulate Matter, Ozone, Nitrogen Dioxide and Sulfur Dioxide: Global Update for 2005. World Health Organization; 2006. Available at: Accessed 24 June 2020.
22. Ruiz-Mercado I, Masera O. Patterns of stove use in the context of fuel-device stacking: rationale and implications. Ecohealth. 2015;12:42–56.
23. Burnett RT, Pope CA 3rd, Ezzati M, et al. An integrated risk function for estimating the global burden of disease attributable to ambient fine particulate matter exposure. Environ Health Perspect. 2014;122:397–403.
24. Bates MN, Pokhrel AK, Chandyo RK, et al. Kitchen PM2.5 concentrations and child acute lower respiratory infection in Bhaktapur, Nepal: the importance of fuel type. Environ Res. 2018;161:546–553.
25. Burnett R, Chen H, Szyszkowicz M, et al. Global estimates of mortality associated with long-term exposure to outdoor fine particulate matter. Proc Natl Acad Sci U S A. 2018;115:9592–9597.
26. Bates MN, Chandyo RK, Valentiner-Branth P, et al. Acute lower respiratory infection in childhood and household fuel use in Bhaktapur, Nepal. Environ Health Perspect. 2013;121:637–642.
27. Pokhrel AK, Bates MN, Acharya J, et al. PM 2.5 in household kitchens of Bhaktapur, Nepal, using four different cooking fuels. Atmos Environ. 2015;113:159–168.
28. Katz J, Tielsch JM, Khatry SK, et al. Impact of improved biomass and liquid petroleum gas stoves on birth outcomes in rural Nepal: results of 2 randomized trials. Glob Health Sci Pract. 2020;8:372–382.
29. Chen C, Zeger S, Breysse P, et al. Estimating indoor PM2.5 and CO concentrations in households in Southern Nepal: the Nepal Cookstove Intervention Trials. PLoS One. 2016;11:e0157984.
30. Tielsch JM, Katz J, Khatry SK, et al. Effect of an improved biomass stove on acute lower respiratory infections in young children in rural Nepal: a cluster-randomised, step-wedge trial. Lancet Glob Health. 2016;4:S19.
31. Ramsay JO. Monotone regression splines in action. Stat Sci. 1988;3:425–461.
32. Powell H, Lee D, Bowman A. Estimating constrained concentration-response functions between air pollution and health. Environmetrics. 2012;23:228–237.
33. Stan Development Team. RStan: The R Interface to Stan. R Package Version 2.17.3. 2018. Available at: Accessed 25 June 2018.

Bayesian; Exposure-response curve; Hierarchical models; Household air pollution; Measurement error; Particulate matter

Supplemental Digital Content

Copyright © 2020 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of The Environmental Epidemiology. All rights reserved.