Original Research Article

The impact of image resolution on power, bias, and confounding

A simulation study of ambient light at night exposure

McIsaac, Michael A.; Sanders, Eric; Kuester, Theres; Aronson, Kristan J.; Kyba, Christopher C. M.

Environmental Epidemiology: April 2021 - Volume 5 - Issue 2 - p e145
doi: 10.1097/EE9.0000000000000145

Abstract

What this study adds

This study uses novel simulation studies to quantify the errors that can result when measuring an environmental exposure from a data source with low spatial resolution. We show that epidemiologic studies that rely on low-resolution images (such as those provided by the Defense Meteorological Satellite Program Operational Line-Scan system) may be particularly prone to bias, confounding, and reduced statistical power. This work may help explain some of the variations in results from epidemiologic studies of artificial light at night exposure as a potential cause of various adverse health outcomes.

Introduction

People who live in urban environments are exposed to many environmental pollutants that are hypothesized or known to affect health, including heavy metals, air pollution, noise, and light pollution. Many of these pollutants have strong geographical correlations with each other.1 A busy arterial street, for example, can have much higher levels of air pollution, noise, and light pollution than a residential street only a few blocks away. In this article, we explore how these correlations and the spatial resolution of geographical data can potentially impact epidemiologic research.

The article is motivated by studies of the potential relationship between outdoor artificial light at residences and breast cancer risk.2–8 In the ideal case, measurement of artificial light exposure would be done using personal devices that capture light exposure while a person is outside and inside, as well as in their sleeping environment. The use of such light meters9 is limited by cost and the fact that the biologically effective exposure time window occurs years before the development of adverse outcomes such as cancer. Some studies have used questionnaires to try to assess past light exposure, usually asking about past experiences with shiftwork and possible bedroom light exposure.10,11 While indoor light at night (LAN) is believed to be the more relevant exposure,12 a frequently used proxy measure in epidemiologic studies is LAN from satellite imagery corresponding to study participants’ place of residence.2–6,13–22

For many years, the only source of global light emission data was the Defense Meteorological Satellite Program Operational Line-Scan system (DMSP). This instrument has been used in epidemiologic studies,3–6,13–15,17,18 although it was not originally intended for scientific work. The radiance measurements were not calibrated, the radiance resolution was only eight bits, and as a result, city centers often reported saturated values (i.e., the maximum possible value). Furthermore, the DMSP had a low spatial resolution, in the range 2.5–5 km.23–25 Recently, higher resolution and calibrated data have become available,25 for example, from the Visible Infrared Imaging Radiometer Suite Day/Night Band (DNB) sensor24 (~750 m resolution) or from astronaut photographs from the International Space Station (ISS) (up to 10 m resolution).26 Despite this, many studies completed later than 2012 continued to use the lower resolution and uncalibrated DMSP images, rather than the higher resolution ISS photos or DNB measurements.3–5,8,13–15,17,18,20

A number of issues may arise from using lower resolution imagery to assess LAN, and only some are well understood. For example, using lower resolution imagery introduces something akin to Berkson-type measurement error to LAN assessment.27 In other words, while we wish to assess the nighttime brightness at a specific location, we actually observe a brightness that was averaged over a larger area. This likely reduces study power through exposure misclassification, and depending on the structure of the city and the chosen method of analysis, could potentially introduce bias.

Another potential issue arising from using lower-resolution imagery is the possibility of an association between LAN and other risk factors that are strongly correlated at low resolutions. Any risk factor that varies between urban, suburban, and rural environments (e.g., pollution, socioeconomic status) could hypothetically have some relationship with LAN, because LAN tends to be brighter in certain areas (e.g., city centers, industrial areas), and also brighter in more populous cities.28 There has been little investigation into the nature of such associations.

This study examines the impact of image resolution and confounding factors on bias and error rates—both type II errors (false-negative) and type I errors (false-positive). Hypothetical epidemiologic studies are conducted using Monte Carlo simulations. Here, we simulate adverse outcomes caused by one type of environmental pollutant, and then examine (1) what would be observed in the case that analyses used exposure maps of this simulated causal agent at lower resolutions, or (2) what would be observed in the case that analyses used exposure maps of other (noncausal) pollutants, that may or may not be correlated with the simulated causal agent. We hypothesized that study power will be reduced when the simulated causal agent exposure is estimated using lower-resolution spatial data23,29; and that statistically significant effects will frequently be observed even when testing noncausal pollutants instead of the simulated causal agent (due to confounding), especially when those noncausal pollutant exposures are estimated using lower-resolution data. Although the focus here is on studies of outdoor artificial light at residences, these results can be generalized to other geographic pollutants.

Methods

In this article, we compare LAN (estimated from nighttime imagery) with seven other types of environmental pollution in the region near Vancouver, Canada. We conducted two stages of simulations. In stage 1, we focus on how spatial resolution of geographical data relates to type II errors (“false-negatives”; in other words, how often a study relying on lower-resolution data might mistakenly conclude that the simulated causal agent is not related to an adverse outcome). Here we simulated a setting where an adverse outcome is caused by light exposure, and explored the impact that decreased image resolution would have on statistical power. In stage 2, we examine type I errors (“false-positives”; in other words, how often a study relying on lower-resolution data might mistakenly conclude that a noncausal agent is causing an adverse outcome). Here we run simulations in which some pollutant other than LAN is the assigned cause of an adverse outcome, and explore the impact that decreased image resolution would have on confounding (i.e., on the frequency of the technically correct but potentially misleading finding that LAN is associated with higher risk of the outcome).

Light at night data

On the night of March 30–31, 2013, an astronaut on board the ISS took dozens of photos with a 400 mm lens while passing over North America. Citizen scientists in the “Cities at Night” project later categorized the photos and identified Vancouver as the target of some of the images. Each of the images in the series covers a slightly different area, and some have more motion blur than others (the ISS travels at nearly 8 km/s). We selected image ISS035-E-13071 (available from the “Gateway to Astronaut Photography of Earth”: https://eol.jsc.nasa.gov), which has little motion blur and covers much of the Vancouver metro area. The image was radiometrically calibrated by Noktosat.com, using the techniques developed by Sánchez de Miguel et al.30–32 Briefly, the color ratio of the green and red bands was used to estimate spectral radiance (nW cm−2 sr−1 Å−1) in a synthetic luminance band (i.e., the camera green band value was slightly adjusted to account for different lamp spectra). This is the same technique that was used by Garcia-Saenz et al.2

We used this image to produce synthetic satellite imagery with lower spatial resolution. This was done by simulating the spatial sampling process of a hypothetical satellite sensor. For each desired output resolution, we convolved the original image with a 2D spatial (x, y) point spread function using a simple Gaussian kernel with a full width at half maximum equal to the output resolution (Figure 1).33 To speed processing, the spatial size of the kernel extended only to 3σ, which covers >0.997 of a complete Gaussian distribution. The pixel values therefore contain only about 99.7% of the surrounding information that would contribute to the actual ground instantaneous field of view. Note that this is different from simply averaging data and producing an image with a larger ground sample distance, and is more representative of what a real satellite with a reduced resolution would observe.34 For ease of programming, all images were saved with an identical extent in a Universal Transverse Mercator projection, with a 5-m ground sample distance. The full and reduced resolution data are shown in Figure 2.
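The resolution-reduction step described above can be sketched roughly as follows. This is an illustrative, standard-library-only sketch and not the authors' processing code: the function names, the toy 9×9 image, the renormalization of the truncated kernel, and the clamping of coordinates at the image edge are all assumptions made for the example.

```python
import math

def gaussian_kernel(fwhm_m, pixel_m, truncate_sigma=3.0):
    """Return a 2D Gaussian kernel whose FWHM equals the target resolution.

    The kernel is truncated at `truncate_sigma` standard deviations and then
    renormalized to sum to 1 (the renormalization is an assumption here).
    """
    # FWHM = 2 * sqrt(2 * ln 2) * sigma for a Gaussian.
    sigma_px = fwhm_m / (2.0 * math.sqrt(2.0 * math.log(2.0))) / pixel_m
    radius = max(1, math.ceil(truncate_sigma * sigma_px))
    kernel = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_px ** 2))
               for dx in range(-radius, radius + 1)]
              for dy in range(-radius, radius + 1)]
    total = sum(map(sum, kernel))
    return [[w / total for w in row] for row in kernel]

def convolve(image, kernel):
    """Convolve a 2D list with a symmetric kernel, keeping the pixel grid.

    Coordinates falling outside the image are clamped to the nearest edge
    pixel (an assumed boundary treatment).
    """
    rows, cols = len(image), len(image[0])
    radius = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for ky, kernel_row in enumerate(kernel):
                rr = min(max(r + ky - radius, 0), rows - 1)
                for kx, w in enumerate(kernel_row):
                    cc = min(max(c + kx - radius, 0), cols - 1)
                    acc += w * image[rr][cc]
            out[r][c] = acc
    return out

# Toy example: one bright pixel on a 5 m grid, degraded to 10 m resolution.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0
kernel = gaussian_kernel(fwhm_m=10.0, pixel_m=5.0)
blurred = convolve(image, kernel)
```

The output keeps the original 5 m sampling grid, so the image becomes blurrier without gaining larger pixels, which mirrors the distinction the text draws between a point spread function and simple block averaging.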

Figure 1. Kernel functions (upper layer) used to reduce spatial resolution of images using sets of pixels from the original image (lower layer). Shown here are a Gaussian kernel (left), with decreasing weighting from the center to the edges when reducing spatial resolution, and a uniform rectangular kernel (right), which weights pixels equally in a rectangle when reducing spatial resolution. Figure modified from the work of Bochow 2010.33

Figure 2. Street map (A) and topographic map (B) of Vancouver, British Columbia. Distribution of LAN in Vancouver as measured from the ISS (full resolution; C) and at calculated reduced resolutions (50 m, 100 m, 200 m, 500 m, 1000 m, and 2000 m; D–I) using a Gaussian Point Spread Function (PSF).

Other pollutant data

For comparison to the LAN data, we obtained maps of a number of other pollutants from colleagues at the University of British Columbia (Figure 3). The pollutants include: “all noise,”35 “street noise,”35 black carbon,36 NO,37 NO2,37 PM2.5,38 and ultrafine particles.39 These datasets were clipped and resized (using nearest neighbor interpolation) to the same extent as the LAN datasets (with 5 m ground sample distance). Each of the maps is based on a different model, with reference data obtained in different years, times of day, and times of year (see eTable 1; https://links.lww.com/EE/A130). This heterogeneity is not a problem for this study; we are using the maps as related examples of spatial distributions of pollutants, not trying to identify the underlying cause of real disease.

Figure 3. Distribution of pollutants in Vancouver, with LAN for comparison (full resolution A; low resolution B) in order PM2.5, black carbon, NO, NO2, street noise, all noise, ultrafine particles (C–I).

Simulated residence locations and exposures

For each simulated study, we produced a set of geographical coordinates for the residences of 2000 simulated study participants. Population is known to scale approximately linearly with LAN for low-resolution imagery.28 We therefore decided to use a modified version of the 50 m resolution LAN data as a proxy for population density, and assigned locations proportionally (e.g., so that areas with twice as much LAN were twice as likely to have a resident assigned). The 50 m resolution image was chosen so that simulated residences would be more common in bright neighborhoods, but would not necessarily always have bright values at full resolution. This process was repeated to create 2000 simulated data sets (each containing 2000 simulated residences).

As shown in Figure 2, the Metro Vancouver area contains a few large regions where people do not live (e.g., mountainous or water-filled areas). These areas do not always have values of zero in the LAN map, either because of sensor noise or because of the detection of scattered skyglow.40 Therefore, before generating the residences, large sections of these dark areas were set to zero LAN (preventing participants from being assigned these locations as residences). Once a location was assigned for a simulated participant, the pixel value was recorded from the maps for each of the pollutants (including LAN at different resolutions), and saved to a data file.
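A minimal sketch of the residence-assignment procedure described in this section is given below. The tiny 2×2 maps, the map names, and the helper functions are hypothetical; the sketch only illustrates masking uninhabited pixels, drawing locations with probability proportional to the 50 m LAN value, and recording every map's pixel value at each drawn location.

```python
import random

def mask_uninhabited(weight_map, uninhabited):
    """Set the weight to zero wherever the (hypothetical) mask flags a pixel."""
    return [[0.0 if flagged else value
             for value, flagged in zip(weight_row, mask_row)]
            for weight_row, mask_row in zip(weight_map, uninhabited)]

def sample_residences(weight_map, n_residences, rng):
    """Draw pixel coordinates with probability proportional to pixel weight."""
    cells = [(r, c) for r, row in enumerate(weight_map) for c in range(len(row))]
    weights = [weight_map[r][c] for r, c in cells]
    return rng.choices(cells, weights=weights, k=n_residences)

def record_exposures(residences, maps):
    """Look up each map's pixel value at every sampled residence."""
    return [{name: grid[r][c] for name, grid in maps.items()}
            for r, c in residences]

# Toy inputs (all values are made up for illustration).
lan_50m = [[0.4, 1.0], [2.0, 0.3]]
uninhabited = [[True, False], [False, True]]   # e.g. water or mountains
maps = {"lan_full": [[3.0, 5.0], [7.0, 1.0]],
        "no2":      [[10.0, 20.0], [30.0, 40.0]]}

rng = random.Random(0)
weights = mask_uninhabited(lan_50m, uninhabited)
homes = sample_residences(weights, 2000, rng)
exposures = record_exposures(homes, maps)
```

In this toy setup the pixel with twice the LAN weight should receive roughly twice as many simulated residences, and masked pixels receive none.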

Relationship between light at night and other pollutants

We examined the correlation between pollutants in two different ways. First, we paired the non-zero areas of each map to the non-zero areas of each other map, and calculated the Pearson correlation coefficient for each dataset pair (Figure 4). This is the spatial correlation for the study area, but is not necessarily the same as the correlation that would be observed at residences, because the population is not uniformly distributed across the map. We therefore also calculated the Pearson correlation coefficients between the different LAN measurements and each pollutant at all simulated residences (Figure 5).
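The map-level calculation can be illustrated with the short sketch below. It assumes that "pairing the non-zero areas" means keeping only pixels that are non-zero in both maps; that reading, the function name, and the toy grids are assumptions rather than details taken from the study.

```python
import math

def pearson_nonzero(map_a, map_b):
    """Pearson correlation over pixels that are non-zero in both maps."""
    pairs = [(a, b)
             for row_a, row_b in zip(map_a, map_b)
             for a, b in zip(row_a, row_b)
             if a != 0 and b != 0]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)

# Toy grids: the zero pixel in map_a is excluded from the comparison.
map_a = [[1.0, 2.0], [3.0, 0.0]]
map_b = [[2.0, 4.0], [6.0, 5.0]]
r_map = pearson_nonzero(map_a, map_b)
```

The residence-level correlation would use the same formula, applied to the exposure values recorded at the simulated residences instead of to every pixel.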

Figure 4. Pearson correlation coefficients between various forms of pollution, including light at different spatial scales. This figure shows the correlations calculated using the entire map, not the locations where simulated study participants lived. Numeric headers refer to resolution (m) of LAN measurement. AN indicates all noise; BC, black carbon; TN, traffic noise; UFP, ultrafine particles.

Figure 5. Correlation between other pollutants and light at residence locations for different resolutions of the light maps. The different symbols indicate which pollutant was compared in each case.

Stage 1 simulations: exploring type II errors when light at night is the cause of the adverse outcome

In stage 1, we simulated a situation in which light exposure at the place of residence in the full resolution map is the cause of an adverse outcome. We then examined what researchers would observe if they estimated light exposure using a map that had reduced spatial resolution.

Individual outcome statuses were simulated based on a logistic model. That is, the probability of an adverse outcome ($p_i$) for individual $i$ with LAN exposure of $X_i$ was $p_i = \exp(\beta_0 + \beta_1 X_i) / [1 + \exp(\beta_0 + \beta_1 X_i)]$. The logarithmic odds of an adverse outcome ($\operatorname{logit}(p_i) = \log[p_i / (1 - p_i)]$) were thus $\beta_0 + \beta_1 X_i$. So, for each unit increase in LAN exposure in the full resolution map, the odds of an adverse outcome increased by a factor of $\exp(\beta_1)$.

After the outcome data were generated using the full resolution map, we examined what would happen if researchers used logistic regression to estimate the effect of LAN using exposure estimates based on maps with lower spatial resolution. That is, for each spatial resolution under consideration, we determined the estimated LAN exposure, $X_i^*$, for each individual in the simulation. We then performed a logistic regression using these exposure estimates in order to calculate $\hat{\beta}_0^*$ and $\hat{\beta}_1^*$, the point estimates of the parameters in the expression $\operatorname{logit}(p_i) = \beta_0^* + \beta_1^* X_i^*$, which a researcher would find if analyzing this study using LAN exposure measured from a map with that particular spatial resolution.

Each simulation included 2000 simulated participants, and we performed 2000 such simulations in order to calculate empirical power and bias. For each of the LAN maps at different resolutions, the statistical power of the experiment is defined as the percentage of simulations that did not result in a type II error (i.e., as the percentage of simulations that correctly identified a statistically significant relationship between LAN and the adverse outcome). We therefore calculated the percentage of simulations in which researchers would have observed a statistically significant effect of LAN (defined here as obtaining a 95% confidence interval for $\beta_1^*$ that does not include 0). Additionally, empirical bias was calculated as the difference between the average LAN coefficient estimate (the mean of $\hat{\beta}_1^*$ across simulations) and the true LAN coefficient used for outcome generation (i.e., $\beta_1$).
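The stage 1 procedure can be sketched as a small Monte Carlo loop, shown below under several loudly stated assumptions. The exposures are synthetic (a low-resolution value plus independent scatter, a Berkson-type toy rather than the Vancouver maps), the confidence interval is a Wald interval from a hand-written Newton-Raphson fit (the text does not specify the interval type), and the sample and simulation counts are reduced so the sketch runs quickly. All names are hypothetical.

```python
import math
import random

def fit_logistic(x, y, iters=15):
    """Fit logit(p) = b0 + b1*x by Newton-Raphson; return (b0, b1, se_b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1, math.sqrt(h00 / det)

def simulate_stage1(beta0, beta1, n_people=400, n_sims=100, seed=1):
    """Return empirical power and bias for full- and low-resolution exposure."""
    rng = random.Random(seed)
    hits = {"full": 0, "low": 0}
    slope_sum = {"full": 0.0, "low": 0.0}
    for _ in range(n_sims):
        # Toy exposures (assumption): true exposure scatters around the
        # smoothed low-resolution value.
        x_low = [rng.gauss(0.0, 0.6) for _ in range(n_people)]
        x_full = [xl + rng.gauss(0.0, 0.8) for xl in x_low]
        # Outcomes are generated from the full-resolution exposure only.
        y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(beta0 + beta1 * xf)))
             else 0 for xf in x_full]
        for name, x in (("full", x_full), ("low", x_low)):
            _, b1_hat, se = fit_logistic(x, y)
            slope_sum[name] += b1_hat
            # Wald 95% interval excludes 0 when |estimate| > 1.96 * SE.
            if abs(b1_hat) > 1.96 * se:
                hits[name] += 1
    power = {k: v / n_sims for k, v in hits.items()}
    bias = {k: v / n_sims - beta1 for k, v in slope_sum.items()}
    return power, bias

# 20% outcome rate at mean exposure, odds ratio 1.3 per unit (1 SD here).
power, bias = simulate_stage1(beta0=math.log(0.2 / 0.8), beta1=math.log(1.3))
```

In this toy setting the analysis based on the low-resolution exposure should reject the null less often than the analysis based on the full-resolution exposure, which is the qualitative pattern the stage 1 simulations are designed to measure.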

These simulations were conducted using a large range of values of $\beta_1$, representing no effect of LAN up to a very strong effect of LAN, and using $\beta_0$ values representing three levels of prevalence of the adverse outcome. Specifically, $\beta_1$ values were chosen corresponding to odds ratios ranging from 1 to 2 per 1 SD increase in full-resolution LAN, and corresponding $\beta_0$ values were chosen such that outcome rates at mean LAN exposure were 5%, 20%, and 50%. This wide range of settings was explored such that the findings might be generally applicable across research settings involving a variety of environmental factors and health outcomes. Note that study power is a function of not only effect size, but also sample size, which can vary drastically from one study to another. Our goal here is not to predict power for a particular effect size, but rather to compare the relative power of theoretical studies relying on estimates of LAN exposure from data sources with different spatial resolutions.
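The mapping from an odds ratio per 1 SD and an outcome rate at mean exposure to logistic coefficients follows directly from the logistic model, and a short sketch may make it concrete. The function name is hypothetical, and the mean exposure of 0.05 used in the example is a made-up value for illustration; the SD of 0.04 is the full-resolution LAN SD reported in the Results.

```python
import math

def logistic_coefficients(odds_ratio_per_sd, exposure_sd, exposure_mean,
                          rate_at_mean):
    """Convert (OR per 1 SD, outcome rate at mean exposure) to (beta0, beta1).

    exp(beta1 * SD) = OR            =>  beta1 = ln(OR) / SD
    logit(rate) = beta0 + beta1*mean =>  beta0 = logit(rate) - beta1 * mean
    """
    beta1 = math.log(odds_ratio_per_sd) / exposure_sd
    beta0 = math.log(rate_at_mean / (1.0 - rate_at_mean)) - beta1 * exposure_mean
    return beta0, beta1

# Example: OR of 2 per SD, SD = 0.04, hypothetical mean = 0.05, 20% rate.
beta0, beta1 = logistic_coefficients(2.0, 0.04, 0.05, 0.20)
```

An odds ratio of 1 gives a slope of zero (no effect), and changing the outcome rate at mean exposure moves only the intercept.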

Stage 2 simulations: exploring type I error when light at night is not a cause of the adverse outcome

In stage 2, we simulated a situation in which a type of pollution other than LAN is the cause of adverse outcomes. We then examined what researchers would find if they explored whether LAN exposure estimated via images with various resolutions was related to the adverse outcome rates.

We used the same simulated residences as in stage 1. In this case, however, individual outcome statuses were simulated based on a pollutant, $W$, that was not LAN. Thus, the simulated logarithmic odds of an adverse outcome were $\operatorname{logit}(p_i) = \beta_0 + \beta_W W_i$.

After the outcome data were generated, we performed logistic regressions to see to what degree a researcher might (incorrectly) identify LAN as a cause of the adverse outcome. That is, for a given estimate of LAN, $X_i^*$, we estimated the corresponding logistic regression parameters in $\operatorname{logit}(p_i) = \beta_0^* + \beta_1^* X_i^*$. As before, 2000 simulations with 2000 participants were used separately for each resolution. We then evaluated the number of simulations that resulted in a type I error (i.e., observing a 95% confidence interval for $\beta_1^*$ that does not include 0, even though LAN is not actually the cause of the outcome).

Such simulations were conducted for a large range of values of $\beta_W$ corresponding to odds ratios ranging from 1 to 2 per 1 SD increase in pollutant exposure, with $\beta_0$ values corresponding to adverse outcome rates of 20% at mean pollutant exposure. This was done for each of the seven available pollutants: PM2.5, NO, NO2, all noise, street noise, black carbon, and ultrafine particles. Again, this collection of simulations represents a wide variety of possible settings, with $\beta_W$ ranging, in each set of simulations, from considerations of no effect (OR = 1) of the given pollutant up to a very strong effect (OR = 2).
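The stage 2 logic can be illustrated with the toy sketch below, which makes two substitutions that should be kept in mind. First, instead of the Vancouver maps it draws a standardized pollutant and a LAN measurement with a chosen correlation (0.2 and 0.4 are used as stand-ins for a high- and a low-resolution LAN map). Second, to stay short it tests the LAN slope with the score test for a logistic model rather than with a confidence interval from a fitted regression; the two approaches are asymptotically equivalent, but this is not the procedure described in the text. All names are hypothetical.

```python
import math
import random

def score_z(x, y):
    """Score-test z statistic for the slope of x in a logistic model for y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    u = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = y_bar * (1.0 - y_bar) * sum((xi - x_bar) ** 2 for xi in x)
    return u / math.sqrt(var)

def lan_rejection_rate(rho, beta0, beta_w, n_people=2000, n_sims=100, seed=2):
    """Share of simulations in which noncausal LAN looks significant."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Causal pollutant W, standardized (toy assumption).
        w = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
        # LAN measurement correlated with W at level rho, but not causal.
        lan = [rho * wi + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
               for wi in w]
        # Outcomes depend on W only.
        y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(beta0 + beta_w * wi)))
             else 0 for wi in w]
        if abs(score_z(lan, y)) > 1.96:
            rejections += 1
    return rejections / n_sims

beta0 = math.log(0.2 / 0.8)   # 20% outcome rate at mean pollutant exposure
beta_w = math.log(1.5)        # OR of 1.5 per 1 SD of the causal pollutant
rate_high_res = lan_rejection_rate(rho=0.2, beta0=beta0, beta_w=beta_w)
rate_low_res = lan_rejection_rate(rho=0.4, beta0=beta0, beta_w=beta_w)
rate_no_effect = lan_rejection_rate(rho=0.4, beta0=beta0, beta_w=0.0)
```

When the pollutant has no effect, the rejection rate should sit near the nominal 5% whatever the correlation; once the pollutant has an effect, the more strongly correlated LAN measurement is flagged more often.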

Results

Relationship between light at night and other pollutants

The Pearson correlation coefficients between environmental pollutants at the map level are presented in Figure 4. The impact of spatial resolution on the relationship between LAN estimates and other environmental pollutants at the residence level is highlighted graphically in Figure 5.

When considering the correlations based on equal geographic weighting (Figure 4), all environmental pollutants have a positive correlation with each of the other environmental pollutants. The size of these correlations varies depending on the pair of pollutants. The level of ultrafine particles, for example, has a Pearson correlation coefficient of 0.6 with black carbon at the citywide scale, but only a 0.22 correlation with PM2.5. Differences arise when considering the correlation at simulated residences (eTable 2; https://links.lww.com/EE/A130) rather than comparing the complete maps. In general, the correlation coefficients are higher (sometimes much higher) for the full map than they are for the residences.

LAN is positively correlated with many other environmental pollutants, and the strength of this relationship tends to increase as the spatial resolution of the LAN image decreases (Figure 5). In particular, NO, NO2, PM2.5, and, to a lesser extent, black carbon had much stronger correlations with LAN when it was recorded at lower spatial resolutions (e.g., LAN and NO2 correlation ranges from near 0.2 at high spatial resolutions to over 0.4 at low spatial resolutions; see Figure 5). There is much smaller variation in the correlation between LAN and the other pollutants as spatial resolution changes (e.g., correlation with ultrafine particles ranges only from 0.36 to 0.41). This suggests that higher resolution maps could reduce confounding of LAN with some pollutants (NO, NO2, PM2.5, and black carbon), but perhaps not with others.

Stage 1 simulations: exploring type II errors when light at night is the cause of the adverse outcome

Figure 6 shows the statistical power (i.e., one minus the probability of a type II error, or false-negative) for studies relying on LAN maps at each resolution and for three different adverse outcome rates at mean exposure (i.e., three values of $\beta_0$). Each power curve shows the percentage of simulated trials in which the relationship between LAN and the outcome was found to be statistically significant, as a function of the strength of the simulated relationship between LAN and the adverse outcome (i.e., as a function of $\beta_1$). The strength of the relationship is measured in ORs per 1 SD increase in full resolution LAN, where the SD of full-resolution LAN was found to be 0.04 nW cm−2 sr−1 Å−1. For purposes of visualization, a smoothed curve is shown (because of the stochastic nature of Monte Carlo simulation, individual sets of simulations came in higher or lower than the curves).

Figure 6. Stage 1 simulation results: power achieved for each spatial resolution as a function of the strength of the true relationship between LAN and the adverse outcome, measured in ORs per 1 SD increase in full resolution light. Results are presented from simulations with adverse outcome rate of 5% (A), 20% (B), and 50% (C) at mean exposure.

Statistical power decreased with decreasing spatial resolution in all simulations. The loss of power from using low-resolution maps was often substantial, though the extent of this reduction depended on the strength of the simulated relationship (i.e., when a small change in exposure caused a large effect, it was easier to conclude that a significant relationship existed even when using low-resolution data). In some settings, the power when estimating LAN exposure from high-resolution images was up to five times greater than the power that resulted from estimating LAN exposure from low-resolution images. We also used these results to estimate the sample size that would be required to achieve 80% power to detect a statistically significant relationship between LAN and adverse outcome; the sample size required to observe an effect with equivalent power is generally over 10 times larger for the 2 km resolution map compared to the full resolution map (eFigure 1; https://links.lww.com/EE/A130).

The bias of the estimates depends upon the resolution, adverse outcome rate at mean exposure (i.e., $\beta_0$), and simulated odds ratio between LAN and adverse outcome (i.e., $\exp(\beta_1)$) (Figure 7). Each bias curve shows the mean difference across 2000 simulated trials between the estimated LAN effect and the simulated LAN effect, as a function of the strength of the simulated relationship between LAN and the adverse outcome. As with Figure 6, smoothed curves are shown to aid interpretation. There is no simple relationship here, though in general, bias is larger when a map with lower resolution is used to assess LAN exposure.

Figure 7. Stage 1 simulation results: bias observed for each image resolution as a function of the strength of the true relationship between LAN and the adverse outcome (measured in ORs per 1 SD increase in full resolution light; the SD of full-resolution light was found to be 0.04 nW cm−2 sr−1 Å−1). Results are presented from simulations with outcome rate of 5% (A), 20% (B), and 50% (C) at mean exposure.

Stage 2 simulations: exploring type I error when light at night is not a cause of the adverse outcome

The probability of type I errors (false-positives) is shown in Figure 8 for LAN maps of each resolution and for each of the other pollutants. Each type I error curve shows the percentage of simulated trials that observe a statistically significant relationship between LAN and adverse outcomes, as a function of the strength of the relationship between the simulated causal agent and the adverse outcome. These type I error rates reflect how easy it would be to mistakenly conclude that LAN is an important cause of the outcome when the true causal agent was actually another pollutant. In the case of NO, NO2, PM2.5, and black carbon (Figure 8C–F), the resolution of the LAN image plays a clear role in the frequency of type I errors. This is because the correlation between these pollutants and LAN differs depending on the spatial resolution of the light map (Figure 5). Since these correlations were in general larger for lower-resolution maps, this suggests that using high-resolution light imagery could reduce the frequency of these errors.

Figure 8. Stage 2 simulation results: type I error rate as a function of the strength of the true relationship between pollutant and the adverse outcome (measured in ORs per 1 SD increase in pollutant level). Light resolution of the regression predictor varies by line type. Results are presented from simulations with true outcome risk determined by, respectively, all noise (A), street noise (B), black carbon (C), NO (D), NO2 (E), PM2.5 (F), and ultrafine particles (G). All simulations have outcome rate at mean exposure of 0.2.

The impact of this confounding depended not only on the correlation between the measurement of LAN and the simulated causal agent, but also on the strength of the relationship between the simulated causal agent and the adverse outcome (Figure 8C–F): when the causal agent had very little impact on the outcome (OR near 1), there would be a low type I error rate regardless of the resolution to which LAN was measured; when the causal agent had a very large impact on the outcome (a large OR for NO and NO2), there would be a high type I error rate regardless of the resolution to which LAN was measured. However, in settings with moderate effects of NO, NO2, PM2.5, and black carbon, high-resolution light imagery could reduce the chance of making incorrect conclusions about causality due to confounding.

In contrast, this type I error rate is fairly consistent across all resolutions of LAN when the simulated cause of the adverse outcome is ultrafine particle pollution (regardless of the strength of the simulated relationship) (Figure 8G). This makes sense, as the correlation between ultrafine particles and LAN does not strongly vary with the spatial resolution of the LAN map (Figure 5).

In the case of noise (Figure 8A and B), we see fairly low type I error rates regardless of the strength of the effect or the resolution of the LAN imagery. This is explained by the low correlations between noise and light pollution at the residences for all resolutions (Figure 5).

Discussion

If LAN exposure at the place of residence truly causes an adverse outcome, studies relying on higher resolution measures of LAN will be more likely to have statistically significant findings. This conclusion is almost certainly true for other pollutants as well. For example, NO2 can vary between streets with heavy traffic and neighboring parks,41 and near roadways it even varies between the height of a child and an adult.42 In addition to having lower power (Figure 6), studies that estimate exposure from lower-resolution maps may introduce biases that are not predictable (Figure 7). Spatial aliasing due to city factors such as the size of blocks and grid-type structures may potentially contribute to the complicated relationship with bias (see especially at a resolution of 200 m). However, the overall trend is still clear: maps with a higher spatial resolution tend to result in lower bias and greater statistical power.

Urban pollutants are often highly correlated with each other. Studies of the health impacts of light exposure therefore face a danger of confounding if their modeling does not account for other urban pollutants. Our results show that the effects of LAN are likely to be particularly confounded with black carbon, NO, NO2, PM2.5, and ultrafine particles, rather than with noise. Furthermore, we have shown that confounding with black carbon, NO, NO2, and PM2.5 can be somewhat mitigated through higher resolution mapping of LAN. This is because measurements from lower resolution LAN maps effectively acted as proxies for NO, NO2, PM2.5, and, to a lesser extent, black carbon in our data. These pollutants have high spatial autocorrelation. If LAN is measured at low resolution, it also has high spatial autocorrelation, so it induces a correlation across predictors. Due to the spatial correlations between the pollutants, this would also be the case for a study that accounted for people’s motion through the landscape. Studies of the effects of LAN that rely on these lower resolution LAN maps may be unable to differentiate between effects of LAN and effects of these other pollutants. Conversely, high-resolution maps (of both light and other pollutants) will in general reduce correlations between exposures, which will reduce the likelihood of false positives in epidemiologic studies estimating exposure using geographic data.

A number of investigations have shown a relationship between LAN and the risk of negative outcomes using low-resolution LAN data.3–18 More recent work using higher resolution images has led to varying results.2,6 The results presented here confirm that epidemiologic research testing the hypothesis that LAN is associated with adverse outcomes can suffer from serious problems if LAN is estimated from low-resolution data sources. Furthermore, these results confirm the value in reevaluating older studies with newly available higher resolution data, as in Rybnikova and Portnov.43 Studies that rely on estimates of LAN from satellite imagery corresponding to study participants’ place of residence should use imagery with spatial resolution near the scale of individual buildings if possible (5–20 m), and otherwise the highest spatial resolution data available.29 While higher-resolution LAN data relies on more-recent satellite imagery that will in many cases reflect a time period after the effective exposure window, the rate of change of LAN emissions is quite small compared to the spatial variation in most developed countries.44 Overall, we argue that when faced with the choice between more-recent data with higher spatial resolution (e.g., VIIRS DNB, astronaut photographs) or lower-resolution data collected at a time closer to the effective exposure window (DMSP), researchers should choose the higher resolution data.

Conclusions

To lower the risk of both type I errors and type II errors, we suggest that studies of the impact of outdoor LAN using low-resolution satellite images be interpreted with caution. Reliance on lower-resolution exposure maps will result in imprecise estimates of exposure that increase the risk of failing to identify a true association if one exists. Furthermore, epidemiologic research using geographical data for estimating exposures in urban contexts is susceptible to confounding due to correlation among geographical variables; use of lower-resolution maps may exacerbate this. Studies that rely on lower-resolution maps for estimating exposures are likely to have lower power, be more prone to bias, and be less able to examine the independent effects of the exposure of interest.

ACKNOWLEDGMENTS

We thank the volunteers and organizers of the Cities at Night project for identifying the images of Vancouver and providing the results openly. We thank Michael Brauer and Hugh Davies of the University of British Columbia for sharing the maps of other pollutants with us. We thank Alejandro Sánchez de Miguel for discussions related to the astronaut photographs.

References

1. Yazdi MD, Kuang Z, Dimakopoulou K, et al. Predicting fine particulate matter (PM2.5) in the Greater London Area: an ensemble approach using machine learning methods. Remote Sens. 2020; 12:914
2. Garcia-Saenz A, Sánchez de Miguel A, Espinosa A, et al. Evaluating the association between artificial light-at-night exposure and breast and prostate cancer risk in Spain (MCC-Spain Study). Environ Health Perspect. 2018; 126:047011
3. Rybnikova N, Stevens RG, Gregorio DI, Samociuk H, Portnov BA. Kernel density analysis reveals a halo pattern of breast cancer incidence in Connecticut. Spat Spatiotemporal Epidemiol. 2018; 26:143–151
4. Hurley S, Goldberg D, Nelson D, et al. Light at night and breast cancer risk among California teachers. Epidemiology. 2014; 25:697–706
5. Portnov BA, Stevens RG, Samociuk H, Wakefield D, Gregorio DI. Light at night and breast cancer incidence in Connecticut: an ecological study of age group effects. Sci Total Environ. 2016; 572:1020–1024
6. Ritonja J, McIsaac MA, Sanders E, et al. Outdoor light at night at residences and breast cancer risk in Canada. Eur J Epidemiol. 2020; 35:579–589
7. Xiao Q, James P, Breheny P, et al. Outdoor light at night and postmenopausal breast cancer risk in the NIH-AARP diet and health study. Int J Cancer. 2020; 147:2363–2372
8. Clarke RB, Amini H, James P, et al. Outdoor light at night and breast cancer incidence in the Danish Nurse Cohort. Environ Res. 2021; 194:110631
9. Obayashi K, Saeki K, Kurumatani N. Bedroom light exposure at night and the incidence of depressive symptoms: a Longitudinal Study of the HEIJO-KYO Cohort. Am J Epidemiol. 2018; 187:427–434
10. Davis S, Mirick DK, Stevens RG. Night shift work, light at night, and risk of breast cancer. J Natl Cancer Inst. 2001; 93:1557–1562
11. McFadden E, Jones ME, Schoemaker MJ, Ashworth A, Swerdlow AJ. The relationship between obesity and exposure to light at night: cross-sectional analyses of over 100,000 women in the Breakthrough Generations Study. Am J Epidemiol. 2014; 180:245–250
12. Huss A, van Wel L, Bogaards L, et al. Shedding some light in the dark: a comparison of personal measurements with satellite-based estimates of exposure to light at night among children in the Netherlands. Environ Health Perspect. 2019; 127:67001
13. Abay KA, Amare M. Night light intensity and women’s body weight: evidence from Nigeria. Econ Hum Biol. 2018; 31:238–248
14. Ohayon MM, Milesi C. Artificial outdoor nighttime lights associate with altered sleep behavior in the American general population. Sleep. 2016; 39:1311–1320
15. Xiao Q, Gee G, Jones RR, Jia P, James P, Hale L. Cross-sectional association between outdoor artificial light at night and sleep duration in middle-to-older aged adults: the NIH-AARP Diet and Health Study. Environ Res. 2020; 180:108823
16. Min JY, Min KB. Outdoor artificial nighttime light and use of hypnotic medications in older adults: a Population-Based Cohort Study. J Clin Sleep Med. 2018; 14:1903–1910
17. Rybnikova NA, Haim A, Portnov BA. Is prostate cancer incidence worldwide linked to artificial light at night exposures? Review of earlier findings and analysis of current trends. Arch Environ Occup Health. 2017; 72:111–122
18. Kim KY, Lee E, Kim YJ, Kim J. The association between artificial light at night and prostate cancer in Gwangju City and South Jeolla Province of South Korea. Chronobiol Int. 2017; 34:203–211
19. Helbich M, Browning MHEM, Huss A. Outdoor light at night, air pollution and depressive symptoms: a cross-sectional study in the Netherlands. Sci Total Environ. 2020; 744:140914
20. Koo YS, Song JY, Joo EY, et al. Outdoor artificial light at night, obesity, and sleep health: cross-sectional analysis in the KoGES study. Chronobiol Int. 2016; 33:301–314
21. Paksarian D, Rudolph KE, Stapp EK, et al. Association of outdoor artificial light at night with mental disorders and sleep patterns among US adolescents. JAMA Psychiatry. 2020; 77:1266–1275
22. Jones RR. Exposure to artificial light at night and risk of cancer: where do we go from here? [published online ahead of print January 22, 2021]. Br J Cancer. doi: 10.1038/s41416-020-01231-7
23. Kyba CC. Defense meteorological satellite program data should no longer be used for epidemiological studies. Chronobiol Int. 2016; 33:943–945
24. Miller SD, Straka W, Mills SP, et al. Illuminating the capabilities of the Suomi national polar-orbiting partnership (NPP) visible infrared imaging radiometer suite (VIIRS) day/night band. Remote Sens. 2013; 5:6717–6766
25. Levin N, Kyba CC, Zhang Q, et al. Remote sensing of night lights: a review and an outlook for the future. Remote Sens Environ. 2020; 237:111443
26. Sanchez de Miguel A, Castaño JG, Zamorano J, et al. Atlas of astronaut photos of earth at night. Astronomy & Geophysics. 2014; 55:4–36
27. Berkson J. Are there two regressions? J Am Stat Assoc. 1950; 45:164–180
28. Kyba C, Garz S, Kuechly H, et al. High-resolution imagery of earth at night: new sources, opportunities and challenges. Remote Sens. 2015; 7:1–23
29. Kyba CC, Aronson KJ. Assessing exposure to outdoor lighting and health risks. Epidemiology. 2015; 26:e50
30. Sánchez de Miguel A, Kyba CC, Aubé M, et al. Colour remote sensing of the impact of artificial light at night (I): the potential of the International Space Station and other DSLR-based platforms. Remote Sens Environ. 2019; 224:92–103
31. Sánchez de Miguel A, Bará S, Aubé M, et al. Evaluating human photoreceptoral inputs from night-time lights using RGB imaging photometry. J Imaging. 2019; 5:49
32. Sánchez de Miguel A. Variación espacial, temporal y espectral de la contaminación lumínica y sus fuentes: Metodología y resultados. 2015, Universidad Complutense de Madrid. (Doctoral dissertation)
33. Bochow M. Automatisierungspotenzial von Stadtbiotopkartierungen durch Methoden der Fernerkundung. 2010, Logos Verlag
34. Schowengerdt RA. Remote Sensing: Models and Methods for Image Processing. 2006, 3rd ed. Academic Press
35. Gan WQ, McLean K, Brauer M, Chiarello SA, Davies HW. Modeling population exposure to community noise and air pollution in a large metropolitan area. Environ Res. 2012; 116:11–16
36. Larson T, Henderson SB, Brauer M. Mobile monitoring of particle light absorption coefficient in an urban area as a basis for land use regression. Environ Sci Technol. 2009; 43:4672–4678
37. Wang R, Henderson SB, Sbihi H, et al. Temporal stability of land use regression models for traffic-related air pollution. Atmos Environ. 2013; 64:312–319
38. Henderson SB, Beckerman B, Jerrett M, Brauer M. Application of land use regression to estimate long-term concentrations of traffic-related nitrogen oxides and fine particulate matter. Environ Sci Technol. 2007; 41:2422–2428
39. Abernethy RC, Allen RW, McKendry IG, Brauer M. A land use regression model for ultrafine particles in Vancouver, Canada. Environ Sci Technol. 2013; 47:5217–5225
40. Sanchez de Miguel A, Kyba CCM, Zamorano J, Gallego J, Gaston KJ. The nature of the diffuse light near cities detected in nighttime satellite imagery. Sci Rep. 2020; 10:7829
41. Parra MA, Elustondo D, Bermejo R, Santamaría JM. Ambient air levels of volatile organic compounds (VOC) and nitrogen dioxide (NO2) in a medium size city in Northern Spain. Sci Total Environ. 2009; 407:999–1009
42. Kenagy HS, Lin C, Wu H, Heal MR. Greater nitrogen dioxide concentrations at child versus adult breathing heights close to urban main road kerbside. Air Qual Atmos Health. 2016; 9:589–595
43. Rybnikova NA, Portnov BA. Outdoor light and breast cancer incidence: a comparative analysis of DMSP and VIIRS-DNB satellite data. Int J Remote Sens. 2017; 21:5952–5961
44. Kyba CCM, Kuester T, Sánchez de Miguel A, et al. Artificially lit surface of earth at night increasing in radiance and extent. Sci Adv. 2017; 3:e1701528
Keywords:

Light at night; Environmental pollutants; Bias; Error; Circadian rhythm; Visible infrared imaging radiometer suite day/night band

Copyright © 2021 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of The Environmental Epidemiology. All rights reserved.