Background: To account for the dynamic aspects of carcinogenesis, we propose a compartmental hidden Markov model in which each person is healthy, asymptomatically affected, diagnosed, or deceased. Our model is illustrated using the example of smoking-induced lung cancer.
Methods: The model was fitted to data from a case-control study nested in the European Prospective Investigation into Cancer and Nutrition study, including 757 incident cases and 1524 matched controls. Estimation was done through a Markov chain Monte Carlo algorithm, and simulations based on the posterior estimates of the parameters were used to provide measures of model fit. We performed sensitivity analyses to assess the robustness of our findings.
Results: After adjusting for its impact on exposure duration, age was not found to independently drive the risk of lung carcinogenesis, whereas age at starting smoking in ever-smokers and time since cessation in former smokers were found to be influential. Our data did not support an age-dependent time to diagnosis. The estimated time between onset of malignancy and clinical diagnosis ranged from 2 to 4 years. Our approach yielded good performance in reconstructing individual trajectories in both cases (sensitivity >90%) and controls (sensitivity >80%).
Conclusion: Our compartmental model enabled us to identify time-varying predictors of risk and provided us with insights into the dynamics of smoking-induced lung carcinogenesis. Its flexible and general formulation enables the future incorporation of disease states, as measured by intermediate markers, into the modeling of the natural history of cancer, suggesting a large range of applications in chronic disease epidemiology.
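
Illustrative sketch: the four-compartment structure described in the Background can be pictured with a minimal discrete-time simulation. This is only a sketch under stated assumptions, not the fitted model. The yearly time step, the constant transition probabilities, and all names (`P_ONSET`, `P_DIAGNOSIS`, `P_DEATH`, `step`, `simulate`, `observed`) are hypothetical and chosen for illustration; the values are not estimates from the study. The sketch also assumes that the "hidden" part of the model is the asymptomatic state, which an observer would record as healthy.

```python
import random

# Compartments in the order a person can move through them.
STATES = ["healthy", "asymptomatic", "diagnosed", "deceased"]

# Hypothetical yearly transition probabilities (illustrative values only,
# NOT estimates from the study).
P_ONSET = 0.01       # healthy -> asymptomatic
P_DIAGNOSIS = 0.30   # asymptomatic -> diagnosed
P_DEATH = {"healthy": 0.01, "asymptomatic": 0.02, "diagnosed": 0.25}


def step(state, rng):
    """Advance one person by one year; 'deceased' is absorbing."""
    if state == "deceased":
        return state
    u = rng.random()
    if u < P_DEATH[state]:
        return "deceased"
    if state == "healthy" and u < P_DEATH[state] + P_ONSET:
        return "asymptomatic"
    if state == "asymptomatic" and u < P_DEATH[state] + P_DIAGNOSIS:
        return "diagnosed"
    return state


def simulate(n_years, seed=0):
    """Simulate one trajectory that starts in the healthy compartment."""
    rng = random.Random(seed)
    path = ["healthy"]
    for _ in range(n_years):
        path.append(step(path[-1], rng))
    return path


def observed(path):
    """Mask the hidden state: asymptomatic disease is recorded as healthy."""
    return ["healthy" if s == "asymptomatic" else s for s in path]
```

Calling `simulate(60)` returns one true trajectory, and `observed(...)` returns what would be seen in the data. In this sketch, reconstructing individual trajectories amounts to inferring the former from the latter. Making the transition probabilities depend on covariates such as smoking duration would be the natural next step, but is beyond this illustration.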