Lash, Timothy L.; Olshan, Andrew F.
The editors of EPIDEMIOLOGY are pleased to announce a new manuscript submission category for validation studies, which we broadly define as studies that have the objective of improving the quality of the evidence obtained from other epidemiologic research. Although EPIDEMIOLOGY has published such studies1 and has previously encouraged them,2 they have appeared rarely. We decided to make this category available to authors for two reasons. First, we hope that papers published in this category will enable access to the information required to support quantitative bias analyses by other authors, or will otherwise increase confidence in other authors’ research results. Second, we hope that a defined submission category will elevate the importance of validation studies among all stakeholders. Below we expound on these two goals and provide some guidance on the types of submissions we would like to receive.
EPIDEMIOLOGY’s Instructions for Authors page encourages authors to use quantitative methods to evaluate the influence of important threats to validity, including missing data, differential selection or loss to follow-up, confounding due to an unmeasured potential confounder, or measurement error. These methods rely on validation data to assign values to the parameters of a bias model. Without validation data, a range of educated guesses must suffice. By opening a submission category for validation studies, we hope to make validation data to support quantitative bias analysis more readily available. Although we encourage quantitative methods, we recognize that qualitative assessments of the importance of threats to validity are common. For example, authors may write that the validity of an outcome measurement is “good.” EPIDEMIOLOGY prefers that validation data be used quantitatively in a bias model, but other journals may not have so strong a preference. Validation data that support qualitative assessments of data quality are therefore also of value.
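To illustrate one such quantitative method, the sketch below applies the classic Bross bias-factor formula for a single binary unmeasured confounder: validation data supply the confounder–outcome risk ratio and the confounder’s prevalence among the exposed and unexposed, and the observed risk ratio is divided by the implied bias factor. All numeric values here are hypothetical and serve only to show the arithmetic.

```python
def bias_factor(rr_ud, p1, p0):
    """Bross bias factor for a binary unmeasured confounder U.

    rr_ud -- risk ratio relating U to the outcome
    p1    -- prevalence of U among the exposed
    p0    -- prevalence of U among the unexposed
    Returns the multiplicative distortion of the observed risk ratio.
    """
    return (p1 * (rr_ud - 1) + 1) / (p0 * (rr_ud - 1) + 1)

# Hypothetical estimates, e.g., drawn from an external validation study
rr_observed = 1.8
b = bias_factor(rr_ud=2.0, p1=0.5, p0=0.25)  # 1.5 / 1.25 = 1.2
rr_adjusted = rr_observed / b                # 1.8 / 1.2 = 1.5
```

When the confounder is equally prevalent in both exposure groups (p1 = p0), the bias factor is 1 and no adjustment occurs, which is a useful sanity check on the formula.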
Despite the importance of validation data,3 there is remarkably little methodologic work to inform the design and conduct of validation studies. In the realm of information bias, validation studies often do not collect data on exposure validity within categories of the outcome, or vice versa. For example, they might collect sensitivity and specificity data overall, but not by case or control status. Validation studies that avoid this mistake often collect data from a subsample of a larger study population, an independent small study, or a convenience sample. In the first case, it is tempting to select a simple random sample, but this design is often suboptimal.4 Samples selected by design provide more precise estimates of the values to assign to bias parameters, but may also limit the bias models to those of a certain form.5 For example, bias models can correct for exposure misclassification using the sensitivities and specificities of exposure classification within outcome groups, or using the positive and negative predictive values of exposure classification within outcome groups.6 When both exposure and outcome are rare, few exposed cases would be selected into a validation substudy with a simple random design. Such a substudy would therefore yield imprecise estimates, among those with the outcome, of both the sensitivity and the positive predictive value of exposure classification. A substudy design that oversamples nominally exposed cases would improve the precision of the estimate of the positive predictive value of exposure classification in those with the outcome, but the sensitivity of exposure classification could not then be validly estimated, because sample selection was conditional on the observed exposure classification. These examples illustrate the importance of further development of methods for the optimal design, analysis, and presentation of validation studies.
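As a concrete illustration of the first kind of bias model mentioned above, the sketch below back-calculates the expected number of truly exposed subjects within each outcome group from the observed (misclassified) counts, given the sensitivity and specificity of exposure classification, and then recomputes the odds ratio. The 2×2 counts and classification parameters are hypothetical; with nondifferential misclassification, the corrected odds ratio is expected to move away from the null relative to the observed one.

```python
def true_exposed(obs_exposed, total, se, sp):
    """Expected number of truly exposed subjects in one outcome group,
    back-calculated from the observed exposed count given the
    sensitivity (se) and specificity (sp) of exposure classification."""
    return (obs_exposed - total * (1 - sp)) / (se + sp - 1)

# Hypothetical data: 1,000 cases (100 classified exposed) and
# 1,000 controls (80 classified exposed); nondifferential se = 0.90, sp = 0.95
a = true_exposed(100, 1000, 0.90, 0.95)  # corrected exposed cases
c = true_exposed(80, 1000, 0.90, 0.95)   # corrected exposed controls
or_corrected = (a / (1000 - a)) / (c / (1000 - c))
or_observed = (100 / 900) / (80 / 920)
```

With these inputs the corrected odds ratio (about 1.71) exceeds the observed odds ratio (about 1.28), consistent with nondifferential misclassification biasing the observed estimate toward the null. Note that perfect classification (se = sp = 1) returns the observed counts unchanged.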
We hope to encourage this development by providing an explicit opportunity to publish the results of validation studies and, by extension, methods papers related to validation studies.
Our description of the validation study submission category is as follows:
Validation studies should follow the outline for an Original Research Article and should provide estimates to inform bias analyses or otherwise be of use in epidemiologic research. Examples include estimates of measurement error for continuous variables, classification parameters for discrete variables (sensitivity, specificity, or positive and negative predictive values), strengths of association to inform analyses of an unmeasured confounder, or participation proportions within combinations of exposures and outcomes. The validation study should be designed and the results presented to optimize their utility in other similar settings.
We have limited the submission category to 2000 words because we expect little need for lengthy introduction or discussion sections. Results will largely be presented in tables and figures. Methods sections should be complete and should occupy a substantial proportion of the word allowance. We encourage authors to make use of Supplemental Digital Content to provide detailed and expansive tables, and original record-level data when possible, so that readers can make good use of the validation study results in their own work. For example, do not just give the positive predictive value for self-report of daily exercise against an accelerometer; use Supplemental Digital Content to give this positive predictive value within all reasonable combinations of covariates and outcomes. Furthermore, we emphasize the importance of the utility of the validation study results in other similar settings. To be accepted in this category, the results must have value outside of a particular study or data analysis. Validation substudies that have value only in a particular study or analysis should be incorporated into the publication of that study, not separately submitted for consideration in this category. Authors may wish to consult guidelines for reporting of validation studies,7 although we do not require that submissions conform to these guidelines.
We look forward to receiving your submissions, and your feedback about the utility of this new endeavor.
REFERENCES
1. Niesner K, Murff HJ, Griffin MR, et al. Validation of VA administrative data algorithms for identifying cardiovascular disease hospitalization. Epidemiology. 2013;24:334–335.
2. Hernán MA. With great data comes great responsibility: publishing comparative effectiveness research in epidemiology. Epidemiology. 2011;22:290–291.
3. Ehrenstein V, Petersen I, Smeeth L, Jick SS, Benchimol EI, Ludvigsson JF, Sørensen HT. Helping everyone do better: a call for validation studies of routinely recorded health data. Clin Epidemiol. 2016;8:49–51.
4. Holcroft CA, Spiegelman D. Design of validation studies for estimating the odds ratio of exposure-disease relationships when exposure is misclassified. Biometrics. 1999;55:1193–1201.
5. Marshall RJ. Validation study methods for estimating exposure proportions and odds ratios with misclassified data. J Clin Epidemiol. 1990;43:941–947.
6. Lash TL, Fox MP, Fink AK. Applying Quantitative Bias Analysis to Epidemiologic Data. Statistics for Biology and Health. New York, NY: Springer; 2009.
7. Benchimol EI, Manuel DG, To T, Griffiths AM, Rabeneck L, Guttmann A. Development and use of reporting guidelines for assessing the quality of validation studies of health administrative data. J Clin Epidemiol. 2011;64:821–829.