Dynamics of the Risk of Smoking-Induced Lung Cancer: A Compartmental Hidden Markov Model for Longitudinal Analysis

Chadeau-Hyam, Marc (a); Tubert-Bitter, Pascale (b,c); Guihenneuc-Jouyaux, Chantal (d); Campanella, Gianluca (a); Richardson, Sylvia (e); Vermeulen, Roel (f); De Iorio, Maria (g); Galea, Sandro (h); Vineis, Paolo (a)

doi: 10.1097/EDE.0000000000000032

Background: To account for the dynamic aspects of carcinogenesis, we propose a compartmental hidden Markov model in which each person is healthy, asymptomatically affected, diagnosed, or deceased. Our model is illustrated using the example of smoking-induced lung cancer.

Methods: The model was fitted to a case-control study nested in the European Prospective Investigation into Cancer and Nutrition study, including 757 incident cases and 1524 matched controls. Estimation was done through a Markov chain Monte Carlo algorithm, and simulations based on the posterior estimates of the parameters were used to provide measures of model fit. We performed sensitivity analyses to assess the robustness of our findings.

Results: After adjusting for its impact on exposure duration, age was not found to independently drive the risk of lung carcinogenesis, whereas age at starting smoking in ever-smokers and time since cessation in former smokers were found to be influential. Our data did not support an age-dependent time to diagnosis. The estimated time between onset of malignancy and clinical diagnosis ranged from 2 to 4 years. Our approach yielded good performance in reconstructing individual trajectories in both cases (sensitivity >90%) and controls (sensitivity >80%).

Conclusion: Our compartmental model enabled us to identify time-varying predictors of risk and provided us with insights into the dynamics of smoking-induced lung carcinogenesis. Its flexible and general formulation enables the future incorporation of disease states, as measured by intermediate markers, into the modeling of the natural history of cancer, suggesting a large range of applications in chronic disease epidemiology.
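As an illustration of the four-compartment structure described in the Background (healthy, asymptomatically affected, diagnosed, deceased), the following is a minimal simulation sketch and not the authors' fitted model. The yearly transition probabilities `P_ONSET`, `P_DIAGNOSIS`, and `P_DEATH`, and the function names, are hypothetical values and labels chosen only for illustration; in the study, transitions depend on smoking history and are estimated by Markov chain Monte Carlo.

```python
import random

# Compartments in their natural order; movement is one-directional.
STATES = ("healthy", "asymptomatic", "diagnosed", "deceased")

# Hypothetical yearly transition probabilities (illustrative assumptions,
# NOT estimates from the study).
P_ONSET = 0.01      # healthy -> asymptomatic
P_DIAGNOSIS = 0.30  # asymptomatic -> diagnosed
P_DEATH = 0.02      # any living state -> deceased


def step(state, rng):
    """Advance one person by one year through the compartments."""
    if state == "deceased":
        return state
    if rng.random() < P_DEATH:
        return "deceased"
    if state == "healthy" and rng.random() < P_ONSET:
        return "asymptomatic"
    if state == "asymptomatic" and rng.random() < P_DIAGNOSIS:
        return "diagnosed"
    return state


def simulate(years, seed=0):
    """Return the true (partly hidden) yearly trajectory of one person."""
    rng = random.Random(seed)
    path = ["healthy"]
    for _ in range(years):
        path.append(step(path[-1], rng))
    return path


def observe(path):
    """Map true states to what is observable: 'healthy' and 'asymptomatic'
    cannot be told apart, which is what makes the model 'hidden'."""
    hidden = ("healthy", "asymptomatic")
    return ["undiagnosed" if s in hidden else s for s in path]


if __name__ == "__main__":
    true_path = simulate(years=40, seed=1)
    print(true_path[-5:])
    print(observe(true_path)[-5:])
```

With a constant yearly diagnosis probability of 0.30 and death ignored, the mean waiting time from onset to diagnosis in this sketch is about 1/0.30 ≈ 3.3 years; that value was picked only so the toy example sits within the 2 to 4 year range reported in the Results.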

Supplemental Digital Content is available in the text.

From the (a) MRC/HPA Centre for Environment and Health, School of Public Health, Imperial College, London, United Kingdom; (b) Centre for Research in Epidemiology and Population Health, INSERM, Villejuif, France; (c) UMRS 1018, University Paris Sud, Villejuif, France; (d) EA 4064, University Paris Descartes, Paris, France; (e) MRC Biostatistics Unit, Institute of Public Health, Cambridge, United Kingdom; (f) Institute for Risk Assessment Sciences, Utrecht University, Utrecht, The Netherlands; (g) Department of Statistical Science, University College London, London, United Kingdom; and (h) Department of Epidemiology, Mailman School of Public Health, Columbia University, NY.

This work has been carried out within the Transdisciplinary Research in Cancer of the Lung (TRICL) project, which is supported by the National Cancer Institute, Grant U19 CA148127 02 to Chris Amos. Marc Chadeau-Hyam, Roel Vermeulen and Paolo Vineis acknowledge the European FP7 EnviroGenoMarkers (Grant Agreement 226756 to S.A. Kyrtopoulos) and Exposomics (Grant Agreement 308610 to P. Vineis) projects. Roel Vermeulen and Paolo Vineis acknowledge the European FP7 ECNIS 2 project (Grant Agreement 266198 to K. Rydzynski).

The authors report no conflicts of interest.

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article. This content is not peer-reviewed or copy-edited; it is the sole responsibility of the author.

Correspondence: Marc Chadeau-Hyam, St Mary’s Hospital, Norfolk Place, W2 1PG London, UK. E-mail:

© 2014 by Lippincott Williams & Wilkins, Inc