
Fibrosis-4 Index vs Nonalcoholic Fatty Liver Disease Fibrosis Score in Identifying Advanced Fibrosis in Subjects With Nonalcoholic Fatty Liver Disease: A Meta-Analysis

Castellana, Marco MD1; Donghia, Rossella BSc1; Guerra, Vito BSc1; Procino, Filippo PhD1; Castellana, Fabio BSc1; Zupo, Roberta BSc1; Lampignano, Luisa BSc1; Sardone, Rodolfo PhD1; De Pergola, Giovanni PhD1,2; Romanelli, Francesco PhD3; Trimboli, Pierpaolo MD4,5; Giannelli, Gianluigi MD6

The American Journal of Gastroenterology: September 2021 - Volume 116 - Issue 9 - p 1833-1841
doi: 10.14309/ajg.0000000000001337



Nonalcoholic fatty liver disease (NAFLD) is a common disorder with high prevalence, morbidity, and excess mortality. Globally, approximately 1 in 4 subjects is estimated to have this condition, and an even higher frequency is reported among specific populations (1–4). In subjects with NAFLD, fibrosis has proven to be a strong predictor of adverse liver-related events; specifically, subjects with advanced forms harbor the highest risk (5–7). The reference standard for the diagnosis and staging of NAFLD and fibrosis is liver biopsy. However, this procedure is invasive, costly, and can be associated with a small but not negligible risk of complications, and there is a major discrepancy between the burden of NAFLD and the number of procedures that can be performed. Moreover, fibrosis is often asymptomatic, and no sign or single laboratory finding raises suspicion of this condition (8,9). To overcome these limitations, noninvasive tools (NITs) for the risk stratification of fibrosis have been developed. Several options are available in the literature, which differ according to the clinical and/or laboratory data on which they are based. The most commonly used are the fibrosis-4 index (FIB-4) and the NAFLD fibrosis score (NFS), which are specifically recommended by current guidelines because they are considered to perform better (8,9).

After the advent of these tools, several articles attempted to compare their accuracy (10–29). Three specific outcomes were assessed: the performance in ruling out fibrosis, the performance in ruling in fibrosis, and the prevalence of indeterminate scores. The results of these studies have been heterogeneous, thus limiting the applicability of their findings in clinical practice (30). First, most of these studies had a retrospective design. Second, they enrolled subjects undergoing liver biopsy during clinical practice, for indications not based on these tools. Consequently, these studies were affected by a significant selection bias, which in turn affected the resulting prevalence of fibrosis (10–27). Third, some studies included subjects without NAFLD as well (28). Finally, some studies were focused on significant fibrosis, rather than advanced fibrosis (AF) (29).

Given that NITs are diagnostic tests conceived for selecting patients with NAFLD to undergo liver biopsy, we question whether the results of these studies are really comparable given the different methodologies adopted in the published reports. Simply pooling the findings of the abovementioned studies would be associated with a significant bias. To overcome these limitations, summary operating measures assumed to be independent of the disease prevalence should be used. These include the diagnostic odds ratio (DOR) and the likelihood ratio for positive results (LR+) and negative results (LR−) (30,31). These would enable a reliable comparison of different NITs to be performed using a head-to-head approach relying on relative measures, such as the relative DOR (RDOR), relative LR+ (RLR+), and relative LR− (RLR−). This study aimed at gaining information on this issue to reduce or eliminate the significant limitations of studies in the available literature. Therefore, our research methodology envisaged the following: (i) a systematic search of studies reporting the performance of both FIB-4 and NFS in identifying AF in biopsy-proven NAFLD; (ii) a meta-analysis of available data to evaluate the diagnostic performance of each NIT; and (iii) a comparison of the 2 NITs.
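The prevalence argument can be illustrated with a minimal sketch. The 2×2 counts below are hypothetical and are not taken from any included study, and the names `Accuracy` and `accuracy_measures` are ours. The sketch shows that increasing the number of subjects without AF (i.e., lowering prevalence) while keeping the test behaviour fixed changes PPV and NPV but leaves LR+, LR−, and DOR unchanged.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_pos: float
    lr_neg: float
    dor: float


def accuracy_measures(tp: float, fp: float, fn: float, tn: float) -> Accuracy:
    """Standard accuracy measures from a 2x2 table (all cells assumed > 0)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return Accuracy(
        sensitivity=sens,
        specificity=spec,
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
        lr_pos=lr_pos,
        lr_neg=lr_neg,
        dor=lr_pos / lr_neg,  # algebraically equal to (tp * tn) / (fp * fn)
    )


# Hypothetical cohort: 100 subjects with AF and 100 without (prevalence 50%).
high_prev = accuracy_measures(tp=80, fn=20, fp=30, tn=70)
# Same sensitivity and specificity, 10 times more subjects without AF.
low_prev = accuracy_measures(tp=80, fn=20, fp=300, tn=700)

for name, a in (("high prevalence", high_prev), ("low prevalence", low_prev)):
    print(f"{name}: PPV={a.ppv:.2f} NPV={a.npv:.2f} "
          f"LR+={a.lr_pos:.2f} LR-={a.lr_neg:.2f} DOR={a.dor:.2f}")
```

A relative measure such as the RDOR is then simply the ratio of the DOR of one NIT to that of the other, evaluated in the same subjects.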


This meta-analysis was registered in PROSPERO (CRD42021224766) and performed in accordance with the PRISMA-DTA statement (32).

Search strategy

First, we searched for sentinel studies in PubMed. Second, we identified keywords in PubMed. Third, the following complete search strategy was used in PubMed: (“fibrosis-4 index” [Title/abstract] OR FIB-4 [Title/abstract] OR FIB4 [Title/abstract]) AND (NFS [Title/abstract] OR “NAFLD fibrosis score” [Title/abstract]) AND (histolog* [Title/abstract] OR biopsy [Title/abstract]). Fourth, Cochrane Central Register of Controlled Trials, Scopus, and Web of Science were searched using the same strategy. Fifth, studies evaluating the performance of both FIB-4 and NFS in identifying AF in subjects with biopsy-proven NAFLD were selected. Studies meeting the following criteria were excluded: (i) focused on pediatric patients only; (ii) including mixed populations (e.g., subjects without NAFLD); (iii) not using histology as the reference standard; (iv) less than 100 subjects; (v) not adopting standardized cutoffs for the evaluation of FIB-4 and NFS (see further); (vi) letters, commentaries, and posters. Finally, the references of included studies were searched to find additional articles. The last search was performed on December 6, 2020. No language restriction was adopted. Two investigators (M.C. and F.P.) independently searched for articles, screened titles and abstracts of the retrieved articles, reviewed the full-texts, and selected articles for inclusion.

Data extraction

The following information was extracted independently by the same investigators in a piloted form: (i) general information on the study; (ii) cutoffs for the interpretation of FIB-4 and NFS; and (iii) the number of subjects classified as true-positive, false-positive, true-negative, and false-negative. Histology was the reference standard; AF was taken to be fibrosis stage 3 (bridging fibrosis) or 4 (cirrhosis). FIB-4 and NFS were the index tests. For each NIT, 2 cutoffs are reported in the literature: a lower cutoff to rule out AF and a higher cutoff to rule in AF. For NFS, these values are −1.455 and 0.676, respectively (33). For FIB-4, the cutoffs were initially developed to detect significant fibrosis in subjects with human immunodeficiency virus/HCV coinfection and later adapted to detect AF in subjects with NAFLD (34,35). This has led to heterogeneity in the assessment of the FIB-4 performance. Because the most commonly used cutoffs are those developed by Shah et al. in 2009 (1.3 and 2.67, respectively), we included only studies using these thresholds (35). FIB-4 and NFS can be interpreted using a single threshold or a dual threshold approach (i.e., upper and lower cutoffs). Separate data extractions were performed accordingly (see Text, Supplementary Digital Content 1). For each selected article, the main article and supplementary data were searched; if data were missing, the authors were contacted through e-mail. Data were crosschecked, and any discrepancy was discussed.
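The dual threshold interpretation can be sketched in code. The cutoffs are those stated above; the score formulas are not restated in this article and are given here as they are commonly reported from the original publications (33,34), so they should be checked against those sources before any use. Function names, the handling of scores exactly equal to a cutoff, and the example patient are illustrative assumptions.

```python
import math

# (lower cutoff, higher cutoff) as stated in the text
FIB4_CUTOFFS = (1.3, 2.67)     # Shah et al. (35)
NFS_CUTOFFS = (-1.455, 0.676)  # Angulo et al. (33)


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 as commonly reported: age x AST / (platelets x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def nfs(age_years, bmi_kg_m2, ifg_or_diabetes, ast_u_l, alt_u_l,
        platelets_10e9_l, albumin_g_dl):
    """NAFLD fibrosis score as commonly reported (coefficients to be verified)."""
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi_kg_m2
            + 1.13 * (1 if ifg_or_diabetes else 0)
            + 0.99 * (ast_u_l / alt_u_l)
            - 0.013 * platelets_10e9_l
            - 0.66 * albumin_g_dl)


def classify_dual(score, cutoffs):
    """Dual threshold reading. Treating a score equal to a cutoff as
    indeterminate is an assumption; the text does not specify it."""
    lower, higher = cutoffs
    if score < lower:
        return "AF unlikely"
    if score > higher:
        return "AF likely"
    return "indeterminate"


# Hypothetical patient, for illustration only
patient_fib4 = fib4(age_years=55, ast_u_l=40, alt_u_l=36, platelets_10e9_l=200)
patient_nfs = nfs(age_years=55, bmi_kg_m2=30, ifg_or_diabetes=True,
                  ast_u_l=40, alt_u_l=36, platelets_10e9_l=200,
                  albumin_g_dl=4.0)
print(round(patient_fib4, 2), classify_dual(patient_fib4, FIB4_CUTOFFS))
print(round(patient_nfs, 2), classify_dual(patient_nfs, NFS_CUTOFFS))
```

Under a single threshold approach, only one of the two comparisons in `classify_dual` would be applied, so every subject falls on one side of the chosen cutoff and no indeterminate category exists.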

Study quality assessment

The risk of bias of the included studies was assessed independently by 2 reviewers (M.C. and F.P.) applying the Quality Assessment of Diagnostic Accuracy Studies tool (36).

Data analysis

The characteristics of the included studies were summarized, and then separate analyses were performed according to the following steps. First, a meta-analysis of the diagnostic performance in identifying AF was performed. For each NIT, we plotted estimates of sensitivity and specificity on coupled forest plots. Summary operating points including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), LR+, LR−, and DOR, with 95% confidence intervals, were estimated. DOR provides a single measure of test performance, equal to LR+/LR− and corresponding to the odds for a score above the NIT specific cutoff in a subject with AF compared with the odds for a score above the NIT specific cutoff in a subject without AF. Values range from zero to infinity, with higher values indicating higher performance. A bivariate random-effects model was used for pooled analysis of the sensitivity and specificity; a random-effects model was used for pooled analysis of the remaining metrics (37). Hierarchical summary receiver operating characteristic (HSROC) curves were also constructed, and the areas under the curve (AUC) were estimated (37). Second, a head-to-head comparison of the accuracy of FIB-4 and NFS was performed. The significance of the differences between NITs was assessed on RDOR, RLR+, and RLR− (37,38). A sensitivity analysis was performed after excluding the 2 studies on biopsy-proven nonalcoholic steatohepatitis (NASH) (13,21). Heterogeneity between studies was assessed using I², with values of 50% or higher regarded as high heterogeneity. Publication bias was not evaluated because of uncertainty about the determinants for diagnostic accuracy studies and the inadequacy of tests for detecting funnel plot asymmetry (38).
All analyses were performed applying both the single threshold and the dual thresholds, per subject, using RevMan 5.3 (the Cochrane Collaboration) and STATA 16.0 (StataCorp software, 2019, Stata Statistical Software, Release 16, StataCorp LLC, College Station, TX). Significance was set at P < 0.05.
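The analyses were run in RevMan and Stata; the following is not the authors' code. It is a minimal sketch of how a DOR can be pooled with a random-effects model and how I² is obtained, assuming the DerSimonian-Laird estimator (the text states only "a random-effects model") and using hypothetical 2×2 counts. The bivariate model used for sensitivity and specificity is not reproduced here.

```python
import math


def log_dor_and_variance(tp, fp, fn, tn):
    """log(DOR) and its approximate variance. Adding 0.5 to every cell is a
    common continuity correction, applied here unconditionally for simplicity."""
    tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn)), 1 / tp + 1 / fp + 1 / fn + 1 / tn


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooling; expects at least two studies.
    Returns (pooled effect, standard error, tau^2, I^2 in percent)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, math.sqrt(1 / sum(w_re)), tau2, i2


# Hypothetical studies as (tp, fp, fn, tn)
studies = [(40, 30, 10, 120), (25, 60, 15, 200), (90, 20, 30, 160)]
ys, vs = zip(*(log_dor_and_variance(*s) for s in studies))
pooled, se, tau2, i2 = pool_random_effects(ys, vs)
pooled_dor = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
print(f"pooled DOR {pooled_dor:.1f} (95% CI {ci[0]:.1f}-{ci[1]:.1f}), "
      f"I2 = {i2:.0f}%")
```

On the log scale, a relative DOR between two NITs is the difference of their log DORs, which is why the pooling is done on logarithms and exponentiated at the end.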

This meta-analysis was conducted in accordance with the principles of the Declaration of Helsinki. Analyses were performed on data extracted from published articles.


Study characteristics

In total, 356 articles were found: 107 on PubMed, 30 on Cochrane Central Register of Controlled Trials, 127 on Scopus, and 92 on Web of Science. One additional study was retrieved from a personal database (13). After the removal of 197 duplicates, 160 articles were analyzed for titles and abstracts; 84 records were excluded. The remaining 76 articles were retrieved in full-text, and 18 articles were finally included in the meta-analysis (Figure 1) (10–27).

Figure 1. Flowchart of the systematic review. CENTRAL, Cochrane Central Register of Controlled Trials; NAFLD, nonalcoholic fatty liver disease.

Qualitative analysis

The characteristics of the included articles are summarized in Table 1 (10–27). The studies were published between 2012 and 2020 and had sample sizes ranging from 102 to 3,202 patients. Participants were adult subjects with biopsy-proven NAFLD; 2 studies included patients with biopsy-proven NASH alone (13,21). The prevalence of AF ranged from 8% in the study by Demir et al. to 71% in the study by Anstee et al. (12,21). The FIB-4 and NFS performance with both the lower and the higher cutoffs was generally evaluated, the only exceptions being the articles by Lee et al. that assessed only the lower ones and by Yoneda et al., Marella et al., and Singh et al., which assessed the higher ones (11,13,25,27). Overall, 12,604 patients with biopsy-proven NAFLD were included; 4,289 were diagnosed with AF.

Table 1. Characteristics of the included studies and availability of data

Quantitative analysis

Performance of the FIB-4 and NFS with a single threshold.

The forest plot of the sensitivity and specificity of each NIT, interpreted according to the lower or the higher cutoff, in identifying AF in subjects with NAFLD is shown in Figure 2. When considering the lower cutoff, the pooled sensitivities ranged from 76% to 81% and specificities from 64% to 67%; PPVs and NPVs were estimated at 43% and 90%, respectively. When considering the higher cutoff, the pooled sensitivities ranged from 34% to 39%, specificities from 94% to 95%, PPVs from 63% to 67%, and NPVs from 82% to 84%. Because these summary operating points are influenced by the prevalence of the disease in the population tested, we also estimated the following parameters, which are independent of disease prevalence and are thus characteristic of the specific NIT. For the lower and the higher cutoffs, respectively, the pooled LR+ was estimated to be 2.3 and ranged from 5.9 to 7.9, LR− ranged from 0.3 to 0.4 and from 0.6 to 0.7, and DOR ranged from 6.4 to 7.5 and from 8.5 to 12.3 (Table 2). In addition, the HSROC AUCs ranged from 0.78 to 0.79 and from 0.80 to 0.86, respectively (see Figure, Supplementary Digital Content 2). High heterogeneity was found for all the end points (data not shown). We then made a head-to-head comparison of the accuracy of the 2 NITs. NFS showed a higher DOR for the lower cutoff and FIB-4 for the higher cutoff. No differences were found regarding LR+ or LR− according to the lower or the higher cutoff (see Table, Supplementary Digital Content 3).

Figure 2. Forest plot of the sensitivity and specificity of the FIB-4 and the NAFLD fibrosis score in identifying AF in subjects with NAFLD according to the lower and the higher cutoffs. AF, advanced fibrosis; CI, confidence interval; FIB-4, fibrosis-4 index; NAFLD, nonalcoholic fatty liver disease.

Table 2. Summary estimates of the accuracy of each noninvasive tool in identifying advanced fibrosis in subjects with NAFLD according to the lower and higher cutoffs

Performance of FIB-4 and NFS with dual thresholds.

The forest plot of the sensitivity and specificity of each NIT in identifying AF in subjects with NAFLD, interpreted according to the dual threshold approach, is shown in Figure 3. The pooled sensitivities ranged from 61% to 65%, specificities were estimated at 93%, PPVs ranged from 67% to 68%, and NPVs ranged from 89% to 90%. The pooled LR+ ranged from 9.1 to 9.4, LR− was estimated to be 0.4, and DOR ranged from 21.7 to 24.9 (Table 3). In addition, the HSROC AUC was estimated at 0.91 for both NITs (see Figure, Supplementary Digital Content 4). High heterogeneity was found for all the outcomes (data not shown). It is worth noting that 30%–35% of findings were classified as indeterminate because they scored between the lower and the higher cutoffs. We then made a head-to-head comparison of the accuracy of the 2 NITs. No difference was found regarding RDOR, LR+, or LR− between FIB-4 and NFS; however, FIB-4 was associated with a lower prevalence of indeterminate findings (OR = 0.73, 95% confidence interval 0.66–0.80) (see Table, Supplementary Digital Content 5).
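As a rough consistency check (ours, not part of the original analysis), the pooled dual threshold estimates can be related through the standard definitions. Because each metric was pooled separately and is reported rounded, only approximate agreement with the reported LR+ (9.1–9.4), LR− (0.4), and DOR (21.7–24.9) is expected.

```latex
\begin{align*}
\mathrm{LR}^{+} &= \frac{\text{sensitivity}}{1-\text{specificity}}
  \approx \frac{0.61}{0.07} \text{ to } \frac{0.65}{0.07}
  \approx 8.7 \text{ to } 9.3 \\[4pt]
\mathrm{LR}^{-} &= \frac{1-\text{sensitivity}}{\text{specificity}}
  \approx \frac{0.39}{0.93} \text{ to } \frac{0.35}{0.93}
  \approx 0.42 \text{ to } 0.38 \\[4pt]
\mathrm{DOR} &= \frac{\mathrm{LR}^{+}}{\mathrm{LR}^{-}}
  \approx \frac{9.1}{0.4} \text{ to } \frac{9.4}{0.4}
  \approx 22.8 \text{ to } 23.5
\end{align*}
```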

Figure 3. Forest plot of the sensitivity and specificity of the FIB-4 and the NAFLD fibrosis score in identifying AF in subjects with NAFLD with the dual threshold approach. AF, advanced fibrosis; CI, confidence interval; FIB-4, fibrosis-4 index; NAFLD, nonalcoholic fatty liver disease.

Table 3. Summary estimates of the accuracy of each of the 2 noninvasive tools in identifying advanced fibrosis in subjects with NAFLD using the dual threshold approach

Sensitivity analysis

Because 2 studies included subjects with biopsy-proven NASH only, we repeated the abovementioned analyses after excluding these articles (13,21). Results were generally in line with the main analysis. The only exception was the head-to-head comparison for the lower cutoff, for which the NFS and FIB-4 showed a similar performance (see Tables, Supplementary Digital Content 6 and Supplementary Digital Content 7).

Study quality assessment

The risk of bias of the included studies is summarized in Supplementary Digital Content 8 (see Table).


The aim of this meta-analysis was to identify the best available evidence of the diagnostic performance in identifying AF among subjects with biopsy-proven NAFLD of the 2 most common NITs. To our knowledge, this is the first meta-analysis in which a head-to-head comparison of the 2 NITs was made according to specifically developed cutoffs and based on independent summary operating measures, allowing studies evaluating populations with a different prevalence of AF to be interpreted together. Eighteen studies were found, evaluating the performance of both FIB-4 and NFS among 4,289 subjects with and 8,315 subjects without AF.

It is common knowledge that both the NITs studied in this work were developed to stratify the risk of fibrosis. Two cutoffs were reported: a lower one to rule out AF and a higher one to rule in this condition. Two different uses have been proposed accordingly. In a single threshold approach, subjects scoring below the lower cutoff are unlikely to be affected by AF and should be monitored every 2 years; conversely, subjects scoring higher than the higher cutoffs are likely to have AF (8,9). In a dual threshold approach, the risk of AF cannot be adequately stratified in those subjects scoring between the lower and the higher cutoffs (i.e., indeterminate); a liver biopsy may, therefore, be considered in these subjects only (33). In both circumstances, a significant number of liver biopsies would be spared. This meta-analysis challenged both approaches. First, when the single threshold strategy according to the lower cutoff was considered, the sensitivity was 76%–81%, NPV 90%, and LR− 0.3–0.4, providing only weak evidence of a discriminatory performance. Second, when the single threshold strategy according to the higher cutoff was considered, the specificity was 94%–95%, PPV 63%–67%, and LR+ 5.9–7.9, providing only moderate evidence of a discriminatory performance. Third, when the dual threshold strategy was considered, approximately 1 in 3 patients was classified as indeterminate, confirming the weak evidence of a discriminatory performance among negative findings (LR− = 0.4) and moderate evidence among positive ones (LR+ of 9.1–9.4). These findings were also confirmed in the sensitivity analysis, after the exclusion of 2 studies that enrolled subjects with biopsy-proven NASH only. Applying the results of our analyses to a hypothetical population of subjects with NAFLD, some considerations may be drawn. 
Specifically, if only subjects with a score higher than the lower cutoff were scheduled for further assessments, approximately 1 in 5 patients with AF would have been missed. Moreover, if subjects with a score higher than the higher cutoff were considered as affected by AF, a liver biopsy would have confirmed this diagnosis only in 2 of every 3 patients. Finally, if only subjects with a score between the lower and the higher cutoffs were scheduled for further assessments, the number of diagnostic referrals would have been reduced by 65%–70% depending on the NIT adopted, but the limitations of the single strategies would still apply. In short, our data do not support the view of NITs as reliable tools for use to diagnose or exclude AF (39,40). Rather, they should be considered as tools to stratify the risk of AF featuring only a modest performance, thus highlighting the need for better markers (35).
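The arithmetic behind these statements can be sketched for a hypothetical cohort. The cohort size is an illustrative assumption, and the AF prevalence is simply the overall proportion in the included studies (4,289 of 12,604), which is higher than in the general population; the sensitivity, PPV, and indeterminate rates are the pooled ranges reported in the Results.

```python
N = 1000                   # hypothetical cohort size (assumption)
PREVALENCE = 4289 / 12604  # AF proportion across the included studies

sens_lower = (0.76, 0.81)     # pooled sensitivity, lower cutoff
ppv_higher = (0.63, 0.67)     # pooled PPV, higher cutoff
indeterminate = (0.30, 0.35)  # share scoring between the two cutoffs

n_af = N * PREVALENCE

# Single lower cutoff: subjects with AF scoring below it are missed.
missed_share = [1 - s for s in sens_lower]      # about 0.24 and 0.19
missed_count = [n_af * m for m in missed_share]

# Single higher cutoff: of every 3 positives, how many have AF on biopsy.
confirmed_per_3 = [3 * p for p in ppv_higher]   # about 1.9 and 2.0

# Dual threshold: only indeterminate subjects are referred.
referral_reduction = [1 - i for i in indeterminate]  # about 0.70 and 0.65

print(f"AF cases in cohort: {n_af:.0f}")
print("missed with lower cutoff:",
      [f"{m:.0%} ({c:.0f} subjects)" for m, c in zip(missed_share, missed_count)])
print("confirmed AF per 3 positives:", [f"{c:.1f}" for c in confirmed_per_3])
print("reduction in referrals:", [f"{r:.0%}" for r in referral_reduction])
```

The missed share depends only on sensitivity, whereas the number of subjects missed and the PPV both shift with the prevalence of AF in the population actually tested.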

Two NITs were included in this meta-analysis, FIB-4 and NFS. We selected these NITs because they have been validated in different populations and their use is specifically endorsed by current guidelines (8,9,41). In particular, these documents recommend the use of NITs as first-line triage for the purpose of excluding AF in subjects with NAFLD, that is, according to the single lower cutoff approach (9,42). Given that the use of FIB-4 or NFS is recommended with the same strength of evidence, one may question whether one or the other should be preferentially used in clinical practice. A head-to-head meta-analysis was conducted accordingly. Compared with FIB-4, we found NFS to be associated with a higher DOR for the lower cutoff in the main analysis and with a similar performance in the sensitivity analysis. Because NFS was never associated with a worse performance in this setting, it should possibly be preferred.

In November 2017, a meta-analysis was published on the same topic (43). It included 59 studies enrolling 12,558 subjects with biopsy-proven NAFLD and assessing the performance of at least one of the following: the aspartate aminotransferase to platelet ratio index; the body mass index, aspartate aminotransferase/alanine aminotransferase ratio, and diabetes mellitus index; FIB-4; FibroScan; magnetic resonance elastography; NFS; or shear wave elastography. The authors concluded that, among the 4 blood models, FIB-4 and NFS offered the best diagnostic performance for detecting AF. It is worth noting that: (i) studies adopting different thresholds were pooled for estimating sensitivity, specificity, PPV, and NPV (e.g., from 1.24 to 1.45 for FIB-4) and (ii) separate sets of data estimated according to different thresholds in the same subjects from the same study were pooled to estimate DOR. The results of our meta-analysis were based on 18 studies specifically evaluating both FIB-4 and NFS, providing data on 12,604 subjects, on a specific data extraction performed to ensure that consistent cutoffs were used before pooling data, on separate analyses according to the single or dual threshold approach, and on a head-to-head comparison. This resulted in a more objective and accurate interpretation of the available evidence, yielding weak-to-moderate evidence of diagnostic performance overall, favoring FIB-4 for ruling in and NFS for ruling out AF.

Limitations of this study should be discussed. First, liver biopsy was selected as the reference standard to diagnose AF. This might have resulted in a selection bias toward more severe forms, as confirmed by the high prevalence of AF in included subjects compared with the general population (1,44,45). In addition, sampling error or the known limited concordance rates when interpreting liver biopsy might have led to diagnostic and staging misclassification (21). Second, the performance of NITs may vary according to the age of the subject assessed; different age-specific cutoffs have, in fact, been reported (17). This aspect was not taken into account in most of the included studies, nor, therefore, in this meta-analysis. Nevertheless, the need for different cutoffs indirectly supports our findings of inadequate performance of NITs in clinical practice.

In conclusion, both FIB-4 and NFS proved to be characterized by only a weak-to-moderate diagnostic performance in identifying AF among subjects with biopsy-proven NAFLD. Because they are recommended as first-line tools for risk stratification, the lower cutoff with a single threshold approach should be used, and subjects with scores above this threshold referred for further assessments. Compared with FIB-4, NFS was associated with higher performance in ruling out AF and may be, therefore, preferred for this purpose. However, given the still relatively limited performance, further studies are needed to fully assess the potential benefits and drawbacks of optimizing thresholds of existing tools vs defining new tools.


Guarantor of the article: Marco Castellana, MD.

Specific author contributions: Substantial contributions to the conception or design of the work: M.C., R.D., V.G., and F.P.; acquisition, analysis, or interpretation of data for the work: M.C., R.D., V.G., F.P., and F.R.; drafting the work or revising it critically for important intellectual content: M.C., R.Z., F.C., L.L., R.S., G.D.P., P.T., and G.G.; and final approval of the version to be published: all authors.

Financial support: None to report.

Potential competing interests: None to report.


We thank Masanori Atsukawa (Japan), Münevver Demir (Germany), Jacob George (Australia), Chan Wah Kheong (Malaysia), Takeshi Okanoue (Japan), Noam Peleg (Israel), Panyavee Pitisuttithum (Thailand), Toshihide Shima (Japan), Amir Shlomai (Israel), Sombat Treeprasertsuk (Thailand), and Ming-Hua Zheng (China) for providing the requested data and Mary V. C. Pragnell, BA, (Monopoli, Italy) for editing.


1. Younossi ZM, Koenig AB, Abdelatif D, et al. Global epidemiology of nonalcoholic fatty liver disease-Meta-analytic assessment of prevalence, incidence, and outcomes. Hepatology 2016;64:73–84.
2. Younossi ZM, Golabi P, de Avila L, et al. The global epidemiology of NAFLD and NASH in patients with type 2 diabetes: A systematic review and meta-analysis. J Hepatol 2019;71:793–801.
3. Younossi Z, Stepanova M, Ong JP, et al. Nonalcoholic steatohepatitis is the fastest growing cause of hepatocellular carcinoma in liver transplant candidates. Clin Gastroenterol Hepatol 2019;17:748–55.e3.
4. Parrish NF, Feurer ID, Matsuoka LK, et al. The changing face of liver transplantation in the United States: The effect of HCV antiviral eras on transplantation trends and outcomes. Transpl Direct 2019;5:e427.
5. Angulo P, Kleiner DE, Dam-Larsen S, et al. Liver fibrosis, but no other histologic features, is associated with long-term outcomes of patients with nonalcoholic fatty liver disease. Gastroenterology 2015;149:389–97.e10.
6. Ekstedt M, Hagström H, Nasr P, et al. Fibrosis stage is the strongest predictor for disease-specific mortality in NAFLD after up to 33 years of follow-up. Hepatology 2015;61:1547–54.
7. Hagström H, Nasr P, Ekstedt M, et al. Fibrosis stage but not NASH predicts mortality and time to development of severe liver disease in biopsy-proven NAFLD. J Hepatol 2017;67:1265–73.
8. Chalasani N, Younossi Z, Lavine JE, et al. The diagnosis and management of nonalcoholic fatty liver disease: Practice guidance from the American Association for the Study of Liver Diseases. Hepatology 2018;67:328–57.
9. European Association for the Study of the Liver (EASL); European Association for the Study of Diabetes (EASD); European Association for the Study of Obesity (EASO). EASL-EASD-EASO clinical practice guidelines for the management of non-alcoholic fatty liver disease. Diabetologia 2016;59:1121–40.
10. Xun YH, Fan JG, Zang GQ, et al. Suboptimal performance of simple noninvasive tests for advanced fibrosis in Chinese patients with nonalcoholic fatty liver disease. J Dig Dis 2012;13:588–95.
11. Yoneda M, Imajo K, Eguchi Y, et al. Noninvasive scoring systems in patients with nonalcoholic fatty liver disease with normal alanine aminotransferase levels. J Gastroenterol 2013;48:1051–60.
12. Demir M, Lang S, Schlattjan M, et al. Nikei: A new inexpensive and non-invasive scoring system to exclude advanced fibrosis in patients with NAFLD. PLoS One 2013;8:e58360.
13. Lee TH, Han SH, Yang JD, et al. Prediction of advanced fibrosis in nonalcoholic fatty liver disease: An enhanced model of BARD score. Gut Liver 2013;7:323–8.
14. Cui J, Ang B, Haufe W, et al. Comparative diagnostic accuracy of magnetic resonance elastography vs. eight clinical prediction rules for non-invasive diagnosis of advanced fibrosis in biopsy-proven non-alcoholic fatty liver disease: A prospective study. Aliment Pharmacol Ther 2015;41:1271–80.
15. Lykiardopoulos B, Hagström H, Fredrikson M, et al. Development of serum marker models to increase diagnostic accuracy of advanced fibrosis in nonalcoholic fatty liver disease: The new LINKI algorithm compared with established algorithms. PLoS One 2016;11:e0167776.
16. Joo SK, Kim W, Kim D, et al. Steatosis severity affects the diagnostic performances of noninvasive fibrosis tests in nonalcoholic fatty liver disease. Liver Int 2018;38:331–41.
17. McPherson S, Hardy T, Dufour JF, et al. Age as a confounding factor for the accurate non-invasive diagnosis of advanced NAFLD fibrosis. Am J Gastroenterol 2017;112:740–51.
18. Seki K, Shima T, Oya H, et al. Assessment of transient elastography in Japanese patients with non-alcoholic fatty liver disease. Hepatol Res 2017;47:882–9.
19. Peleg N, Sneh Arbib O, Issachar A, et al. Noninvasive scoring systems predict hepatic and extra-hepatic cancers in patients with nonalcoholic fatty liver disease. PLoS One 2018;13:e0202393.
20. Ampuero J, Pais R, Aller R, et al. Development and validation of hepamet fibrosis scoring system-A simple, noninvasive test to identify patients with nonalcoholic fatty liver disease with advanced fibrosis. Clin Gastroenterol Hepatol 2020;18:216–25.e5.
21. Anstee QM, Lawitz EJ, Alkhouri N, et al. Noninvasive tests accurately identify advanced fibrosis due to NASH: Baseline data from the STELLAR trials. Hepatology 2019;70:1521–30.
22. Arai T, Atsukawa M, Tsubota A, et al. Factors influencing subclinical atherosclerosis in patients with biopsy-proven nonalcoholic fatty liver disease. PLoS One 2019;14:e0224184.
23. Petta S, Wai-Sun Wong V, Bugianesi E, et al. Impact of obesity and alanine aminotransferase levels on the diagnostic accuracy for advanced liver fibrosis of noninvasive tools in patients with nonalcoholic fatty liver disease. Am J Gastroenterol 2019;114:916–28.
24. Kaya E, Bakir A, Kani HT, et al. Simple noninvasive scores are clinically useful to exclude, not predict, advanced fibrosis: A study in Turkish patients with biopsy-proven nonalcoholic fatty liver disease. Gut Liver 2020;14:486–91.
25. Marella HK, Reddy YK, Jiang Y, et al. Accuracy of noninvasive fibrosis scoring systems in African American and White patients with nonalcoholic fatty liver disease. Clin Transl Gastroenterol 2020;11:e00165.
26. Pitisuttithum P, Chan WK, Piyachaturawat P, et al. Predictors of advanced fibrosis in elderly patients with biopsy-confirmed nonalcoholic fatty liver disease: The GOASIA study. BMC Gastroenterol 2020;20:88.
27. Singh A, Gosai F, Siddiqui MT, et al. Accuracy of noninvasive fibrosis scores to detect advanced fibrosis in patients with type-2 diabetes with biopsy-proven nonalcoholic fatty liver disease. J Clin Gastroenterol 2020;54:891–7.
28. Patel YA, Gifford EJ, Glass LM, et al. Identifying nonalcoholic fatty liver disease advanced fibrosis in the veterans health administration. Dig Dis Sci 2018;63:2259–66.
29. Ooi GJ, Earnest A, Kemp WW, et al. Evaluating feasibility and accuracy of non-invasive tests for nonalcoholic fatty liver disease in severe and morbid obesity. Int J Obes (Lond) 2018;42:1900–11.
30. Eusebi P. Diagnostic accuracy measures. Cerebrovasc Dis 2013;36:267–72.
31. Glas AS, Lijmer JG, Prins MH, et al. The diagnostic odds ratio: A single indicator of test performance. J Clin Epidemiol 2003;56:1129–35.
32. McInnes MDF, Moher D, Thombs BD, et al. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: The PRISMA-DTA statement. JAMA 2018;319:388–96.
33. Angulo P, Hui JM, Marchesini G, et al. The NAFLD fibrosis score: A noninvasive system that identifies liver fibrosis in patients with NAFLD. Hepatology 2007;45:846–54.
34. Sterling RK, Lissen E, Clumeck N, et al. Development of a simple noninvasive index to predict significant fibrosis in patients with HIV/HCV coinfection. Hepatology 2006;43:1317–25.
35. Shah AG, Lydecker A, Murray K, et al. Comparison of noninvasive markers of fibrosis in patients with nonalcoholic fatty liver disease. Clin Gastroenterol Hepatol 2009;7:1104–12.
36. Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med ;155:529–36.
37. European Network for Health Technology Assessment. Meta-analysis of Diagnostic Test Accuracy Studies, 2014.
38. Bossuyt P, Davenport C, Deeks J, et al. Chapter 11: Interpreting results and drawing conclusions. In: Deeks JJ, Bossuyt PM, Gatsonis C (eds). Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 0.9. The Cochrane Collaboration, 2013.
39. Kim D, Kim W, Adejumo AC, et al. Race/ethnicity-based temporal changes in prevalence of NAFLD-related advanced fibrosis in the United States, 2005-2016. Hepatol Int 2019;13:205–13.
40. Schonmann Y, Yeshua H, Bentov I, et al. Liver fibrosis marker is an independent predictor of cardiovascular morbidity and mortality in the general population. Dig Liver Dis 2021;53:79–85.
41. Eslam M, Sarin SK, Wong VW, et al. The Asian Pacific Association for the Study of the Liver clinical practice guidelines for the diagnosis and management of metabolic associated fatty liver disease. Hepatol Int 2020;14:889–919.
42. Younossi ZM, Noureddin M, Bernstein D, et al. Role of noninvasive tests in clinical gastroenterology practices to identify patients with nonalcoholic steatohepatitis at high risk of adverse outcomes: Expert panel recommendations. Am J Gastroenterol 2020;116:254–62.
43. Xiao G, Zhu S, Xiao X, et al. Comparison of laboratory tests, ultrasound, or magnetic resonance elastography to detect fibrosis in patients with nonalcoholic fatty liver disease: A meta-analysis. Hepatology 2017;66:1486–501.
44. Petta S, Di Marco V, Pipitone RM, et al. Prevalence and severity of nonalcoholic fatty liver disease by transient elastography: Genetic and metabolic risk factors in a general population. Liver Int 2018;38:2060–8.
45. Caballería L, Pera G, Arteaga I, et al. High prevalence of liver fibrosis among European adults with unknown liver disease: A population-based study. Clin Gastroenterol Hepatol 2018;16:1138–45.e5.


© 2021 by The American College of Gastroenterology