Net Reclassification Indices for Evaluating Risk Prediction Instruments: A Critical Review

Kerr, Kathleen F.ᵃ; Wang, Zheyuᵃ; Janes, Hollyᵇ; McClelland, Robyn L.ᵃ; Psaty, Bruce M.ᶜ; Pepe, Margaret S.ᵇ


An Editors’ Note was omitted that refers readers to the following commentary on this paper:

Hilden J. On NRI, IDI, and “good-looking” statistics with nothing underneath. Epidemiology. 2014;25:265–267.

Epidemiology. 25(2):320, March 2014.

doi: 10.1097/EDE.0000000000000018

Net reclassification indices have recently become popular statistics for measuring the prediction increment of new biomarkers. We review the various types of net reclassification indices and their correct interpretations. We evaluate the advantages and disadvantages of quantifying the prediction increment with these indices. For predefined risk categories, we relate net reclassification indices to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for net reclassification indices and evaluate the merits of hypothesis testing based on such indices. We recommend that investigators using net reclassification indices report them separately for events (cases) and nonevents (controls). When there are two risk categories, the components of net reclassification indices are the same as the changes in the true- and false-positive rates. We advocate the use of true- and false-positive rates and suggest it is more useful for investigators to retain the existing, descriptive terms. When there are three or more risk categories, we recommend against net reclassification indices because they do not adequately account for clinically important differences in shifts among risk categories. The category-free net reclassification index is a new descriptive device designed to avoid predefined risk categories. However, it suffers from many of the same problems as other measures such as the area under the receiver operating characteristic curve. In addition, the category-free index can mislead investigators by overstating the incremental value of a biomarker, even in independent validation data. When investigators want to test a null hypothesis of no prediction increment, the well-established tests for coefficients in the regression model are superior to the net reclassification index. If investigators want to use net reclassification indices, confidence intervals should be calculated using bootstrap methods rather than published variance formulas. The preferred single-number summary of the prediction increment is the improvement in net benefit.
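The two-category relationships summarized above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a single hypothetical risk threshold, takes the event index as the net proportion of events moved up and the nonevent index as the net proportion of nonevents moved down, uses a simple percentile bootstrap, and uses the standard decision-curve definition of net benefit, which may differ in detail from the definition in the full text. All function names are illustrative.

```python
import random


def nri_components(y, risk_old, risk_new, threshold):
    """Event and nonevent NRI for two risk categories (below / at or above threshold)."""
    def net_up(idx):
        # Net proportion moving from the low- to the high-risk category.
        up = sum(risk_old[i] < threshold <= risk_new[i] for i in idx)
        down = sum(risk_new[i] < threshold <= risk_old[i] for i in idx)
        return (up - down) / len(idx)

    events = [i for i, yi in enumerate(y) if yi == 1]
    nonevents = [i for i, yi in enumerate(y) if yi == 0]
    # Moving up is an improvement for events; moving down is one for nonevents.
    return net_up(events), -net_up(nonevents)


def positive_rate(y, risk, threshold, label):
    """Fraction of subjects with outcome `label` classified high risk.

    label=1 gives the true-positive rate, label=0 the false-positive rate.
    """
    idx = [i for i, yi in enumerate(y) if yi == label]
    return sum(risk[i] >= threshold for i in idx) / len(idx)


def net_benefit(y, risk, threshold):
    """Decision-curve net benefit at one threshold (assumed definition)."""
    n = len(y)
    tp = sum(1 for yi, r in zip(y, risk) if yi == 1 and r >= threshold)
    fp = sum(1 for yi, r in zip(y, risk) if yi == 0 and r >= threshold)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def bootstrap_ci(y, risk_old, risk_new, threshold, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap intervals for the event and nonevent NRI."""
    rng = random.Random(seed)
    n = len(y)
    ev, ne = [], []
    while len(ev) < n_boot:
        s = [rng.randrange(n) for _ in range(n)]  # resample subjects with replacement
        ys = [y[i] for i in s]
        if not 0 < sum(ys) < n:
            continue  # skip resamples lacking events or nonevents
        e, c = nri_components(ys, [risk_old[i] for i in s],
                              [risk_new[i] for i in s], threshold)
        ev.append(e)
        ne.append(c)

    def percentile_interval(values):
        values = sorted(values)
        lo = values[int(alpha / 2 * n_boot)]
        hi = values[max(int((1 - alpha / 2) * n_boot) - 1, 0)]
        return lo, hi

    return percentile_interval(ev), percentile_interval(ne)
```

Under these assumptions, the event index returned by `nri_components` equals the change in the true-positive rate and the nonevent index equals the decrease in the false-positive rate, which can be checked by comparing against `positive_rate` for the old and new risk models.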

Supplemental Digital Content is available in the text.

From the ᵃDepartment of Biostatistics, University of Washington, Seattle, WA; ᵇFred Hutchinson Cancer Research Center, University of Washington, Seattle, WA; and ᶜCardiovascular Health Research Unit, Departments of Medicine, Epidemiology, and Health Services, University of Washington, Group Health Research Institute, Group Health Cooperative, Seattle, WA.

The MESA study was supported by contracts N01-HC-95159 through N01-HC-95169 from the National Heart, Lung, and Blood Institute. This work was also supported by grant NIH GM054438 to M.S.P., grant NIH R01 CA152089 to H.J., and a subcontract to the University of Washington from NIH grant HL085757-07 to K.F.K.

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article. This content is not peer-reviewed or copy-edited; it is the sole responsibility of the author.

Correspondence: Kathleen F. Kerr, University of Washington, Box 357232, F-600 Health Sciences Building, 1705 NE Pacific Street, Seattle, WA 98195-7232. E-mail:

© 2014 by Lippincott Williams & Wilkins, Inc