Traditionally considered the hepatic manifestation of the metabolic syndrome, nonalcoholic fatty liver disease (NAFLD) has dramatically increased in concert with the epidemics of both obesity and type 2 diabetes. It currently represents the most common liver disease in Western countries (1), with a burden of NAFLD-related cirrhosis about twice as high as that caused by chronic hepatitis C, and is projected to become the principal cause of hepatocellular carcinoma (2) and liver transplantation (3) within the decade. Furthermore, it is associated with an increased risk of extrahepatic (mostly cardiovascular and cancer) morbidity and mortality (4).
Two retrospective cohort studies and a meta-analysis of the natural history of patients with NAFLD have clearly shown that the severity of liver fibrosis estimated by liver histology is the strongest predictor of not only liver-related complications but also important extrahepatic diseases, including cardiovascular disease and extrahepatic malignancy (5–7). Presently, liver biopsy is the gold standard for the assessment of liver fibrosis; however, because the procedure is invasive, painful, and carries potentially life-threatening complications, it cannot be performed in all individuals with NAFLD (8). Accordingly, different noninvasive scores/tools have been proposed to identify patients with NAFLD and advanced fibrosis.
In this heterogeneous landscape, NAFLD fibrosis score (NFS) and Fibrosis-4 (FIB-4), which are based on easy-to-obtain clinico-metabolic variables, and liver stiffness measurement (LSM) by FibroScan, which is safe, easy to perform, and widely available, are the most accessible and validated noninvasive tools used in clinical practice to assess fibrosis severity in patients with NAFLD (9). A recent meta-analysis reported that the diagnostic accuracy of LSM for advanced liver fibrosis is higher than that of NFS and FIB-4 (10). However, whereas the main limitation of both NFS and FIB-4 lies in the high proportion of patients falling in the gray area of the test, the most relevant clinical concern with LSM is the risk of misclassification, mostly due to false-positive results (11). Accordingly, the European Association for the Study of the Liver (EASL) guidelines suggest using LSM in patients with nondiagnostic noninvasive scores to improve the diagnostic accuracy (12).
Some studies suggested that obesity and severity of steatosis can affect the diagnostic accuracy of NFS and FIB-4 for advanced fibrosis in NAFLD (13). Similarly, there is evidence that obesity (14,15), severity of steatosis (16), and skin-to-capsule distance (17) interfere with the diagnostic accuracy of LSM for advanced fibrosis in NAFLD, as do increased transaminase levels in patients with liver diseases of other etiologies (18,19).
Given the above, it is plausible that the diagnostic performance of clinical tests/strategies could differ according to the patient profile. We therefore aimed to assess whether the diagnostic accuracy of LSM, FIB-4, and NFS, and of strategies based on these tools, is affected by obesity and/or ALT levels.
Data from 968 patients who fulfilled the inclusion criteria reported below and were prospectively recruited at the first diagnosis of biopsy-proven NAFLD were retrospectively reviewed and analyzed. The patients were recruited at the GI and Liver Unit of the University Hospital in Palermo (287 patients), at the University Hospital of Pessac in France (294 patients), at the Prince of Wales Hospital in Hong Kong (180 patients), at the Division of Gastroenterology, Department of Medical Sciences, University of Torino (142 patients), and at the Department of Pathophysiology and Transplantation, Ca' Granda IRCCS Foundation, Policlinico Hospital, University of Milan (65 patients), with complete biochemical data and reliable LSM values.
The patients underwent liver biopsy for assessment of liver damage after ultrasonographic evidence of fatty liver. The diagnosis of NAFLD was based on alcohol consumption in the last year of <20 g/d in women and <30 g/d in men and on steatosis (≥5% of hepatocytes) at histology, with or without necroinflammation and/or fibrosis. Exclusion criteria were as follows: (i) advanced cirrhosis (Child-Turcotte-Pugh B and C), (ii) hepatocellular carcinoma, (iii) other causes of liver disease or mixed etiologies (alcohol abuse, hepatitis C, hepatitis B, autoimmune liver disease, Wilson's disease, hemochromatosis, or α1-antitrypsin deficiency), (iv) human immunodeficiency virus infection, (v) previous treatment with immunosuppressive drugs and/or regular use of steatosis-inducing drugs evaluated by a questionnaire (e.g., corticosteroids, valproic acid, tamoxifen, amiodarone), or (vi) active intravenous drug addiction or use of cannabis.
The study was performed in accordance with the principles of the Declaration of Helsinki and its appendices and with local and national laws. Approval was obtained from the hospitals' internal review boards and ethics committees, and written informed consent was obtained from all patients.
Clinical and laboratory assessments and histology
Clinical and anthropometric data, including body mass index (BMI) and the presence of arterial hypertension and type 2 diabetes, were collected at the time of enrollment. On the day of liver biopsy, a blood sample was drawn after a 12-hour overnight fast to determine AST, ALT, GGT, platelet count (PLT), albumin, total and high-density lipoprotein cholesterol, triglycerides, and plasma glucose concentration.
In each center, one liver-dedicated expert pathologist, who was unaware of patients’ identity and history, coded and read histological slides. A biopsy specimen at least 15 mm long or containing at least 10 complete portal tracts was required (20). Steatosis was assessed as the percentage of hepatocytes containing fat droplets (minimum 5%) and as a categorical variable. The Kleiner classification (21) was used to stage fibrosis from 0 to 4.
Noninvasive fibrosis algorithms/tools
FIB-4 (age, AST, ALT, and PLT) and NFS (age, IFG/Diabetes, BMI, PLT, albumin, and AST/ALT) were calculated using the original reported formulas (22,23).
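As an illustration only, the two scores can be sketched in Python using the formulas as they are commonly published for FIB-4 and NFS; we assume these are the "original reported formulas" referred to above. The function names and the unit conventions (AST and ALT in IU/L, platelets in 10^9/L, albumin in g/dL) are assumptions of this sketch and are not taken from the study's analysis code.

```python
import math


def fib4(age_years, ast_iu_l, alt_iu_l, platelets_10e9_l):
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT)).

    Commonly published form; assumed here to match the formula used in the study.
    """
    return (age_years * ast_iu_l) / (platelets_10e9_l * math.sqrt(alt_iu_l))


def nafld_fibrosis_score(age_years, bmi_kg_m2, ifg_or_diabetes,
                         ast_iu_l, alt_iu_l, platelets_10e9_l, albumin_g_dl):
    """NFS as commonly published (albumin in g/dL, platelets in 10^9/L).

    ifg_or_diabetes is True when impaired fasting glucose or diabetes is present.
    """
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi_kg_m2
            + 1.13 * (1 if ifg_or_diabetes else 0)
            + 0.99 * (ast_iu_l / alt_iu_l)
            - 0.013 * platelets_10e9_l
            - 0.66 * albumin_g_dl)
```

For a hypothetical 50-year-old patient with AST 40 IU/L, ALT 64 IU/L, and platelets 200 × 10^9/L, `fib4(50, 40, 64, 200)` evaluates to 1.25.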
Transient elastography was performed with a FibroScan (Echosens, Paris, France) medical device using the M probe (also called the standard probe). In a subgroup of patients, both M and XL FibroScan probes were used because of their availability. In each center, LSM was assessed on the same day of liver biopsy, before the procedure and after an overnight fast, by a trained operator who had previously performed at least 300 determinations in patients with chronic liver disease. As recently reported in the literature (24), we classified all LSM examinations into 3 reliability categories: “very reliable” (IQR/M ≤ 0.10), “reliable” (0.10 < IQR/M ≤ 0.30, or IQR/M > 0.30 with LSM median <7.1 kPa), and “poorly reliable” (IQR/M > 0.30 with LSM median ≥7.1 kPa). Only patients with 10 valid measurements were included, and “poorly reliable” results were excluded from the analysis.
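The three reliability categories described above can be written as a small decision rule. The following Python sketch only restates the thresholds given in the text; the function name and argument names are hypothetical.

```python
def lsm_reliability(median_kpa, iqr_kpa):
    """Classify an LSM examination by its IQR/median ratio (IQR/M).

    Thresholds follow the three categories described in the text.
    """
    ratio = iqr_kpa / median_kpa
    if ratio <= 0.10:
        return "very reliable"
    # 0.10 < IQR/M <= 0.30, or IQR/M > 0.30 with a median below 7.1 kPa
    if ratio <= 0.30 or median_kpa < 7.1:
        return "reliable"
    # IQR/M > 0.30 with a median of 7.1 kPa or more
    return "poorly reliable"
```

Under this rule, an examination with a median of 12 kPa and an IQR of 4.8 kPa would be classified as "poorly reliable" and therefore excluded from the analysis.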
Continuous variables were summarized as mean ± s.d., and categorical variables as frequency and percentage.
The accuracy of each score for the detection of advanced fibrosis (F3-F4) was assessed using receiver operating characteristic curves and expressed as the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (95% CIs). A patient was classified as positive or negative according to whether the noninvasive marker value was above or below a given cutoff value. Each cutoff value is associated with a probability of a true positive (sensitivity) and a probability of a true negative (specificity). AUCs for both paired and unpaired curves were compared using the bootstrap method, with nonparametric resampling and the percentile method, as described by Carpenter and Bithell (25), with 2000 replicates, as recommended by the same authors. The cutoff points of LSM, NFS, and FIB-4 for the F3-F4 model were derived from the literature. Specifically, for LSM, the cutoffs of <7.9 kPa and ≥9.6 kPa for the M probe and of <5.7 kPa and ≥9.3 kPa for the XL probe were used to rule out and rule in, respectively, severe fibrosis (23,26); for NFS, the cutoffs of <−1.455 and >0.676 were used to rule out and rule in, respectively, severe fibrosis (20); and for FIB-4, the cutoffs of <1.30 and >2.67 were used to rule out and rule in, respectively, severe fibrosis (21). Accordingly, false-negative and false-positive rates of each single test and of their combination, as well as sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, positive predictive value (PPV), and negative predictive value (NPV), were calculated.
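To make the dual-cutoff logic concrete, the sketch below applies the M-probe LSM, NFS, and FIB-4 cutoffs quoted above and computes the standard diagnostic indices from a 2 × 2 table. It is a minimal illustration, not the study's code: the names `Cutoffs`, `classify`, and `diagnostic_metrics` are hypothetical, and how patients in the gray area entered each reported index is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cutoffs:
    rule_out: float            # values strictly below this rule out advanced fibrosis
    rule_in: float             # values above this rule in advanced fibrosis
    rule_in_inclusive: bool    # True when the threshold itself rules in (">=")


# Cutoffs as quoted in the text (LSM values are for the M probe, in kPa).
CUTOFFS = {
    "LSM_M": Cutoffs(7.9, 9.6, True),
    "NFS": Cutoffs(-1.455, 0.676, False),
    "FIB4": Cutoffs(1.30, 2.67, False),
}


def classify(test, value):
    """Return 'rule-out', 'rule-in', or 'indeterminate' (gray area)."""
    c = CUTOFFS[test]
    if value < c.rule_out:
        return "rule-out"
    ruled_in = value >= c.rule_in if c.rule_in_inclusive else value > c.rule_in
    return "rule-in" if ruled_in else "indeterminate"


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard indices from true/false positive and negative counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
    }
```

Note that an LSM of exactly 9.6 kPa rules in advanced fibrosis, whereas a FIB-4 of exactly 2.67 remains in the gray area, because the text gives "≥" for the former and ">" for the latter.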
Analysis was performed using R 3.5.1 (27).
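The analysis itself was run in R; purely to illustrate the idea of a percentile bootstrap comparison of two paired AUCs, a standard-library Python sketch is given below. It is an assumption-laden approximation rather than the procedure actually executed: the function names are hypothetical, only the paired case is shown, and significance is read from whether the percentile interval for the AUC difference excludes zero.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(scores_a, scores_b, labels,
                             n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC(a) - AUC(b) on paired data.

    Patients are resampled with replacement, keeping both scores and the
    label of each patient together; resamples lacking one class are redrawn.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:
            a = [scores_a[i] for i in idx]
            b = [scores_b[i] for i in idx]
            diffs.append(auc(a, y) - auc(b, y))
    diffs.sort()
    lower = diffs[round((alpha / 2) * n_boot)]
    upper = diffs[round((1 - alpha / 2) * n_boot) - 1]
    observed = auc(scores_a, labels) - auc(scores_b, labels)
    return observed, (lower, upper)
```

Resampling whole patients preserves the correlation between the two tests measured in the same individual, which is the reason paired curves are handled differently from unpaired ones.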
Patient characteristics and histology
The baseline characteristics of the 968 patients with NAFLD are shown in Table 1. Mean age was 50 years, with male preponderance (62.9%). Thirty-nine percent of the patients were obese, diabetes was present in 37% of the cases, and 20.7% of the patients had ALT > 100 IU.
At liver biopsy, 28.5% of patients had fibrosis ≥3 by Kleiner score.
Is LSM better than NFS or FIB-4 for the diagnosis of severe liver fibrosis in the entire cohort?
Figure 1 shows the accuracy, in terms of AUC, of the different noninvasive tools to detect fibrosis ≥F3 in the entire cohort of 968 patients after excluding 99 subjects with invalid LSM (age 54.7 ± 11.9 years, men 57.5%, BMI 36.1 ± 6.5 kg/m2, obese 84.8%, ALT 68 ± 50.3 IU, ALT > 100 IU 15.1%, advanced fibrosis 42.4%).
In the entire cohort, LSM performed better than both FIB-4 and NFS (P < 0.001 for both), with AUC values of 0.863, 0.777, and 0.765, respectively, whereas no difference was observed between NFS and FIB-4 (P = 0.32) (Table 2). LSM had the highest accuracy and the highest NPV (93.5%; 95% CI, 91.4%–95.7%), whereas FIB-4 and NFS had higher PPV (72.1%, 95% CI, 62.9%–81.8% for FIB-4; 67.2%, 95% CI, 56.1%–79.2% for NFS) (Table 3).
Do obesity and ALT levels affect the diagnostic performance of LSM, FIB-4, and NFS?
Figure 2 and Table 2 show the accuracy, in terms of AUC, of the different noninvasive tools to detect fibrosis ≥F3 in subgroups of patients according to obesity and/or ALT levels (>100 IU, third quartile). The prevalence of advanced fibrosis was 38.2% (144/377) in obese, 22.3% (132/591) in nonobese, 31.8% (64/201) in high ALT, and 27.6% (212/767) in low ALT patients.
The diagnostic performance of LSM for advanced fibrosis, in terms of AUC, was better in nonobese patients compared with obese patients (0.902 vs 0.786, P < 0.001) (Figure 2a and Table 2), and in subjects with ALT levels ≤100 IU compared with their counterparts (0.877 vs 0.811, P = 0.04) (Figure 2b and Table 2). Consistent with these data, PPV and NPV were higher (64.8% and 95.2%, respectively) in nonobese subjects compared with obese subjects (58.7% and 88.3%, respectively), and in patients with ALT levels ≤100 IU (62.4% and 94%, respectively) compared with those with ALT > 100 IU (57.7% and 90.7%, respectively) (Table 3). A similar trend was observed for accuracy (Table 3). When splitting the patients according to both BMI and ALT levels, the AUC for advanced fibrosis progressively increased from the worst performance, in patients with BMI > 30 and ALT > 100 IU (AUC 0.759), to the best, in patients with BMI ≤ 30 and ALT ≤ 100 IU (AUC 0.916) (Figure 2c and see Table 1, Supplementary Digital Content, http://links.lww.com/AJG/A65). As a consequence, PPV and NPV were better in the best class (66.7% and 96.4%, respectively) compared with the worst class (55% and 86.4%, respectively) (see Table 2, Supplementary Digital Content, http://links.lww.com/AJG/A65).
The lower accuracy of LSM in obese patients could raise the issue that this is related to the use of the M probe instead of the XL probe. To solve this concern, in a subgroup of 244 patients (55.7% men, mean age 54.3 ± 12.6 years, 50.8% obese, 40.1% with advanced fibrosis) with available and reliable LSM by both M (mean LSM 12 ± 8.9 kPa) and XL (mean LSM 9.2 ± 6.8 kPa) probes, we compared the diagnostic performance of LSM according to the probe and obesity. The AUC of LSM by the M and XL probe was similar in the entire cohort of 244 patients (0.815 vs 0.812, P = 0.86) (see Figure 1a, Supplementary Digital Content, http://links.lww.com/AJG/A64) as well as in obese patients (0.745 vs 0.760, P = 0.64) (see Figure 1b, Supplementary Digital Content, http://links.lww.com/AJG/A64) and nonobese patients (0.892 vs 0.865, P = 0.19) (see Figure 1c, Supplementary Digital Content, http://links.lww.com/AJG/A64), and similar results were observed for accuracy (see Table 3, Supplementary Digital Content, http://links.lww.com/AJG/A65). Consistently, when splitting the population according to BMI, the AUC was lower in obese patients compared with nonobese patients using both M (0.745 vs 0.892, P = 0.007) (see Figure 2a, Supplementary Digital Content, http://links.lww.com/AJG/A64) and XL probes (0.760 vs 0.865, P = 0.06) (see Figure 2b, Supplementary Digital Content, http://links.lww.com/AJG/A64), and similar results were observed for accuracy (see Table 3, Supplementary Digital Content, http://links.lww.com/AJG/A65).
The AUC of FIB-4 for advanced fibrosis was higher, although not significantly so, in nonobese patients than in obese patients (0.800 vs 0.742, P = 0.10) (Figure 2d and Table 2). When considering ALT levels, the AUCs were similar in patients with ALT ≤ 100 IU and in those with ALT > 100 IU (0.789 vs 0.737, P = 0.228) (Figure 2e and Table 2). PPV and NPV were better in nonobese patients (74.5% and 90%, respectively) compared with obese patients (69.2% and 77.9%, respectively), whereas they were similar in subjects with ALT ≤ 100 IU (73.1% and 86.4%, respectively) or >100 IU (68.4% and 81.8%, respectively) (Table 3). A similar trend was observed for accuracy (Table 3).
Similar to FIB-4, the AUC of NFS for advanced fibrosis was higher, although not significantly so, in nonobese patients than in obese patients (0.767 vs 0.718, P = 0.18) (Figure 2f and Table 2). When considering ALT levels, the AUCs were similar in patients with ALT ≤ 100 IU and in those with ALT > 100 IU (0.774 vs 0.753, P = 0.85) (Figure 2g and Table 2). Along this line, PPV and NPV were better in nonobese patients (76.5% and 89.7%, respectively) compared with obese patients (64% and 80.5%, respectively), whereas they were similar in subjects with ALT ≤ 100 IU (67.2% and 88.5%, respectively) or >100 IU (66.7% and 82.4%, respectively) (Table 3). A similar trend was observed for accuracy (Table 3).
When analyzing the 99 patients (84.8% obese) with invalid LSM, the observed diagnostic performance of noninvasive scores for advanced fibrosis (NFS: AUC 0.718 (0.616–0.820), PPV 68%, NPV 85%, accuracy 34%, uncertainty area 54.5%, wrong classification rate 11.1%; FIB-4: AUC 0.722 (0.623–0.820), PPV 100%, NPV 63.5%, accuracy 58%, uncertainty area 11.1%, wrong classification rate 31.3%) was similar to that observed in the previously reported obese group of patients with reliable liver stiffness.
Is the diagnostic performance of LSM better than that of FIB-4 and NFS according to BMI and ALT levels?
LSM was confirmed to be better in terms of AUC than both FIB-4 and NFS in patients without obesity (AUC 0.902, 0.800, 0.767, respectively; P < 0.001 for both) and in those with ALT ≤ 100 IU (AUC 0.877, 0.789, 0.774, respectively; P < 0.001 for both), but not in their counterparts, with the exception of obese patients, in whom LSM remained better than NFS (Figure 3, Table 2). Consistent with these data, and with lower PPV and higher NPV than both FIB-4 and NFS independent of obesity, LSM had better accuracy, uncertainty area, and wrong classification rate than FIB-4 and NFS in nonobese patients and was also superior to NFS in obese patients (Table 3). Similarly, with lower PPV and higher NPV than both FIB-4 and NFS independent of ALT levels, LSM had better accuracy, uncertainty area, and wrong classification rate in subjects with ALT ≤ 100 IU, whereas its performance was similar to that of the scores in the other group (Table 3).
Consistent with all the above, when splitting the patients according to both BMI and ALT levels, the AUC of LSM for advanced fibrosis was significantly higher than that of both FIB-4 and NFS in patients in the best class (BMI ≤ 30 and ALT ≤ 100 IU; AUC 0.916 for LSM, 0.816 for FIB-4, 0.783 for NFS; P < 0.001 for both), but not in those in the worst class (BMI > 30 and ALT > 100 IU, AUC 0.759 for LSM, 0.715 for FIB-4, 0.741 for NFS; P = non-significant for both) (Figures 3e, f; see Table 1, Supplementary Digital Content, http://links.lww.com/AJG/A65). As a consequence, with lower PPV and higher NPV with respect to both FIB-4 and NFS in all classes, in the best class, LSM had better accuracy, uncertainty area, and wrong classification rate than both FIB-4 and NFS, whereas no differences were observed in the worst class (Table 2, Supplementary Digital Content, http://links.lww.com/AJG/A65).
Is serial combination strategy of NFS or FIB-4 with LSM better than one-test strategy according to BMI and ALT levels?
In the entire cohort of patients with NAFLD, the AUC of the logistic model including LSM and FIB-4 was better than that of LSM and FIB-4 alone (0.877 vs 0.863 and vs 0.777, P = 0.01 and P < 0.001, respectively), whereas the AUC of the logistic model including LSM and NFS was similar to that of LSM alone but better than that of NFS alone (0.867 vs 0.863 and vs 0.765, P = 0.60 and P < 0.001, respectively). Consistently, the use of LSM in patients in the uncertainty area of FIB-4 generated higher accuracy and a smaller gray area than LSM or FIB-4 alone (Table 4). The algorithm also generated a PPV and an NPV of 60.7% and 91.0%, respectively (Table 4). These were similar to those obtained by LSM alone, whereas NPV was higher and PPV was lower than with FIB-4 alone (Table 4). Similarly, the use of LSM in patients in the uncertainty area of NFS generated higher accuracy and a smaller gray area than LSM or NFS alone (Table 4). The algorithm also generated a PPV and an NPV of 65.7% and 88.7%, respectively (Table 4). These were similar to those obtained by NFS alone, whereas NPV was lower and PPV was higher than with LSM alone (Table 4).
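The serial strategy evaluated here, in which LSM is applied only to patients falling in the uncertainty area of the first-line score, can be sketched as follows. This is a minimal illustration using the literature cutoffs quoted in the statistical methods (M probe for LSM); the function and variable names are hypothetical, and patients who remain in the gray area after both tests are simply returned as "indeterminate".

```python
def zone(value, rule_out, rule_in, inclusive_rule_in=False):
    """Place a test value in the rule-out, rule-in, or indeterminate zone."""
    if value < rule_out:
        return "rule-out"
    ruled_in = value >= rule_in if inclusive_rule_in else value > rule_in
    return "rule-in" if ruled_in else "indeterminate"


# First-line score cutoffs (rule-out, rule-in) as quoted in the text.
FIRST_LINE_CUTOFFS = {"FIB4": (1.30, 2.67), "NFS": (-1.455, 0.676)}


def serial_strategy(first_score, lsm_kpa, first_test="FIB4"):
    """Use the first-line score; fall back on LSM only in its gray area."""
    rule_out, rule_in = FIRST_LINE_CUTOFFS[first_test]
    first_zone = zone(first_score, rule_out, rule_in)
    if first_zone != "indeterminate":
        return first_zone  # the score alone is diagnostic; LSM is not used
    # M-probe LSM cutoffs: <7.9 kPa rules out, >=9.6 kPa rules in.
    return zone(lsm_kpa, 7.9, 9.6, inclusive_rule_in=True)
```

With this rule, a patient with FIB-4 of 2.0 and LSM of 12 kPa is classified as having advanced fibrosis on the basis of LSM, whereas a patient with FIB-4 of 1.0 is ruled out regardless of the LSM value.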
Because of the evidence that LSM works better than NFS and FIB-4 for the diagnosis of advanced fibrosis only in patients without obesity and/or in those with ALT ≤ 100 IU, we tested whether the serial combination strategy of FIB-4 or NFS with LSM is always better than FIB-4 or LSM alone.
When splitting the cohort according to obesity or ALT levels, we confirmed that in all subgroups, the serial combination strategy increased the diagnostic performance for advanced fibrosis to an extent similar to that in the entire population, even if the strategy worked better in nonobese patients (PPV 61.6% and 59.3%, NPV 93.1% and 94.7% for FIB-4 and NFS, respectively) compared with obese patients (PPV 59.8% and 71.5%, NPV 86.7% and 76.2% for FIB-4 and NFS, respectively). In these subgroups of patients, combination strategies generated PPV and NPV similar to those obtained by LSM alone, but higher NPV and lower PPV compared with FIB-4 or NFS alone, except for NFS in obese patients, in whom the combination strategy had lower PPV and NPV than NFS alone (Table 4). The diagnostic performance of serial combination strategies was better in nonobese patients with ALT ≤ 100 IU (PPV 67.0% and 65.5%, NPV 94.4% and 94.1% for FIB-4 and NFS, respectively) compared with obese patients with ALT > 100 IU (PPV 57.1% and 66.7%, NPV 80% and 82.6% for FIB-4 and NFS, respectively) (see Table 4, Supplementary Digital Content, http://links.lww.com/AJG/A65).
In this study on a large cohort of patients with histological diagnosis of NAFLD, we confirmed that LSM is better than FIB-4 and NFS for diagnosing advanced fibrosis, but this superiority is mostly observed in nonobese patients and/or in those with low ALT levels, whereas it is partially lost in obese subjects and/or in those with high ALT levels. We also demonstrated that a serial combination of FIB-4 or NFS with LSM improves the overall diagnostic performance for advanced fibrosis in all subgroups, with higher accuracy observed in nonobese patients and lower accuracy in obese patients.
LSM, FIB-4, and NFS are the most available, used, and validated noninvasive tools aimed at identifying patients with NAFLD and advanced fibrosis, that is, those at risk of hepatic and extrahepatic complications. In our cohort, we reported that LSM has a better diagnostic accuracy than both FIB-4 and NFS for the diagnosis of advanced fibrosis in patients with NAFLD; this result largely confirms what was reported in a recent meta-analysis (10).
In the present study, we found that LSM is more accurate for diagnosing advanced fibrosis in nonobese patients and/or in those with lower ALT levels, compared with obese and/or high ALT subjects, with the accuracy ranging from 50% to 76.9%. We previously reported that BMI could affect the performance of LSM for the diagnosis of advanced fibrosis by increasing the rate of false-positive results (15,28); along this line, Caussy et al. (16) identified BMI as the main reason for disagreement between LSM and magnetic resonance elastography for staging fibrosis in NAFLD. However, in 315 Asian patients with NAFLD, LSM showed consistent diagnostic performance for advanced fibrosis, regardless of obesity (13). Differences in baseline characteristics of populations and in the prevalence of liver disease severity could explain the discrepancies among the observed results, although the large number of patients enrolled in our study could help settle this issue. However, our results may raise the doubt that the lower accuracy of LSM in obese patients could be due to the use of the M probe instead of the XL probe. To address this question, we compared the accuracy of M and XL probes in a subgroup of patients, confirming that the 2 probes have similar accuracy and that both had worse diagnostic performance in obese patients compared with nonobese patients. Further studies assessing skin-to-capsule distance could add insights about this topic.
Evidence in chronic hepatitis B and C already demonstrated that high ALT levels affect the accuracy of LSM for fibrosis by overestimating liver damage (18,19). However, we are the first to confirm this in a population of patients with chronic liver disease due to NAFLD. We also showed that although ALT levels did not interfere with the diagnostic ability of either FIB-4 or NFS, except for a lower PPV with FIB-4, these scores performed less well in obese patients, with an overall accuracy ranging from 41.4% to 65.5%. These data agree with what was suggested in a smaller study on an Asian population of patients with NAFLD (13). Our evidence about lower PPV and NPV of FIB-4 and NFS in obese patients, as well as the lower PPV of FIB-4 in high ALT patients, is not merely an expression of a biological phenomenon; it can be explained by the fact that these variables, which are included in the scores, are associated with each other and with advanced fibrosis but are also present in the absence of advanced fibrosis, sometimes lowering the accuracy of noninvasive scores.
A relevant finding of our study is that LSM works better than both FIB-4 and NFS in nonobese and/or low ALT patients, providing a gain in accuracy of 10%–15%, mostly due to a reduction in the proportion of patients in the uncertainty area. Conversely, the diagnostic performance of LSM was similar to that of both NFS and FIB-4 in high ALT patients and similar to that of FIB-4 in obese patients, while retaining slightly superior performance to NFS in obese patients.
The EASL guidelines on NAFLD (12) recommend noninvasive assessment of liver fibrosis using a simple and available tool such as FIB-4 or NFS as the first test and then applying LSM only in patients whose first test falls in the gray area. In the present study, we confirmed the superiority of a serial combination strategy over the use of a single test, but because of the effect of obesity and/or ALT levels on the diagnostic performance of LSM in particular, we investigated whether the combination strategy is to be preferred regardless of obesity and ALT levels. Notably, we found that a serial combination strategy is always better than the use of only one test, providing an increase in accuracy of about 20%, mostly due to a reduction in the uncertainty area, and that it performs better in nonobese patients than in obese patients, with an accuracy ranging from 67% to 84%.
From a clinical point of view, our study in patients with NAFLD confirms that noninvasive tools are better at ruling out than at ruling in advanced fibrosis. Furthermore, it shows that, when using only one test for diagnosing advanced fibrosis, LSM should be preferred to both FIB-4 and NFS in nonobese and/or low ALT patients, although it does not provide any advantage in high ALT patients and provides a slight advantage in obese patients only when compared with NFS. In a setting where LSM is not available/reliable, clinicians should be aware that the overall proportion of patients in the uncertainty area will be higher than with LSM and that the diagnostic accuracy in terms of PPV and NPV will be worse in obese patients. A serial combination strategy, as recommended by the EASL guidelines, should be preferred in all categories of patients, bearing in mind its lower overall accuracy in obese patients (Figure 4). Notably, this strategy, in our cohort, led to the avoidance of 72.2% of liver biopsies by using FIB-4 as the first test in both obese and nonobese patients and 73.3% of liver biopsies by using NFS as the first test in nonobese patients and FIB-4 in obese patients.
This study has limitations. The first is the potentially limited validity of the results in different populations and settings. It is plausible that the performance of the proposed algorithms could change according to the characteristics of patients and the prevalence of obesity, high ALT levels, and advanced fibrosis. Criteria for biopsy selection and the lack of data about patients who underwent liver biopsy but without confirmation of NAFLD could further affect the interpretation of our results and the validity of the proposed algorithm. Furthermore, our strategies should be validated in the general population, where the prevalence of advanced fibrosis is low. In light of these criticisms, and to strengthen our study, we used published standardized cutoffs rather than cutoffs calculated from the data of our populations, and we used a bootstrap method for internal validation. Another limitation of our study is that the interobserver concordance of LSM examination was not assessed; this issue may have affected the interpretation of our results. However, all tests were performed by expert operators, following the same protocol and fulfilling the validity criteria. Moreover, several relevant studies assessing FibroScan in NAFLD were based on multicenter cohorts and/or on multiple operators (29,30). Finally, many studies reported good interobserver concordance for LSM (31,32). Another relevant limitation of our study is the lack of an assessment of interobserver agreement among pathologists for liver histology, which could strongly affect the clinical interpretation of our results because fibrosis by histology is the assumed gold standard for the outcome in our study.
Although data from the literature (21) and from our group (33) clearly showed that the overall interobserver agreement for staging severe fibrosis in NAFLD was good, and although several relevant studies assessing FibroScan in histologically defined NAFLD were likewise based on multicenter cohorts (11,29,30), we cannot exclude that the lack of central reading affected our results. Finally, excluding patients with unreliable liver stiffness could lead to overestimation of the diagnostic accuracy of the proposed algorithms, which is a significant limitation of the proposed approach.
In conclusion, we demonstrated that obesity and/or ALT levels affect the diagnostic accuracy of noninvasive tools, especially LSM, such that LSM is more accurate than both FIB-4 and NFS only in nonobese patients and/or patients with low ALT levels. We also observed that serial combination strategies are better than a single-tool strategy, regardless of obesity and ALT levels, although the accuracy was lower in obese patients.
CONFLICTS OF INTEREST
Article guarantor: S. Petta, MD, PhD.
Specific author contributions: S.P., V.W.W., E.B., A.L.F., C.C., J.-B.H., G.L.W., J.V., A.W.C., A.G., W.M., H.L.C., B.L., R.L., S.G., A.C., and V.d.L. had full control of the study design, data analysis and interpretation, and preparation of article. All authors were involved in planning the analysis and drafting the article. The final draft of the article was approved by all the authors.
Financial support: None to report.
Potential competing interests: V.W.-S.W., G.L.-H.W., and H.L.-Y.C. have received lecture fees from Echosens.
WHAT IS KNOWN
- ✓ NFS, FIB-4, and LSM by FibroScan are the most validated and best performing tools for the noninvasive diagnosis of advanced fibrosis in NAFLD.
- ✓ The sequential combination of NFS or FIB-4 with LSM is recommended due to the high uncertainty area of noninvasive scores and the high false-positive rate of LSM.
- ✓ The impact of obesity and ALT levels on the diagnostic accuracy for advanced fibrosis of NFS, FIB-4, and LSM in NAFLD remains to be determined.
WHAT IS NEW HERE
- ✓ LSM has a higher diagnostic accuracy for advanced fibrosis in nonobese patients than in obese patients with NAFLD and in patients with ALT ≤ 100 IU than in those with ALT > 100 IU. The accuracy of both NFS and FIB-4 is slightly negatively affected by obesity.
- ✓ LSM has a higher diagnostic accuracy for advanced fibrosis than both FIB-4 and NFS, which is mostly observed in nonobese patients with NAFLD.
- ✓ A serial combination of NFS or FIB-4 with LSM, compared with NFS or FIB-4 alone, increases the diagnostic accuracy by about 20% regardless of obesity and ALT levels, although performance is worse in obese patients with NAFLD.
1. Younossi ZM, Koenig AB, Abdelatif D, et al. Global epidemiology of non-alcoholic fatty liver disease-meta-analytic assessment of prevalence, incidence and outcomes. Hepatology 2016;64:73–84.
2. Dyson J, Jaques B, Chattopadyhay D, et al. Hepatocellular cancer: The impact of obesity, type 2 diabetes and a multidisciplinary team. J Hepatol 2014;60(1):110–7.
3. Wong RJ, Aguilar M, Cheung R, et al. Nonalcoholic steatohepatitis is the second leading etiology of liver disease among adults awaiting liver transplantation in the United States. Gastroenterology 2015;148(3):547–55.
4. Adams LA, Anstee QM, Tilg H, et al. Non-alcoholic fatty liver disease and its relationship with cardiovascular disease and other extrahepatic diseases. Gut 2017;66(6):1138–53.
5. Ekstedt M, Hagström H, Nasr P, et al. Fibrosis stage is the strongest predictor for disease-specific mortality in NAFLD after up to 33 years of follow-up. Hepatology 2015;61(5):1547–54.
6. Angulo P, Kleiner DE, Dam-Larsen S, et al. Liver fibrosis, but no other histologic features, associates with long-term outcomes of patients with nonalcoholic fatty liver disease. Gastroenterology 2015;149:389–97.
7. Dulai PS, Singh S, Patel J, et al. Increased risk of mortality by fibrosis stage in nonalcoholic fatty liver disease: Systematic review and meta-analysis. Hepatology 2017;65(5):1557–65.
8. Ratziu V, Charlotte F, Heurtier A, et al; LIDO Study Group. Sampling variability of liver biopsy in nonalcoholic fatty liver disease. Gastroenterology 2005;128(7):1898–906.
9. Maida M, Macaluso FS, Salomone F, et al. Non-invasive assessment of liver injury in non-alcoholic fatty liver disease: A review of literature. Curr Mol Med 2016;16(8):721–37.
10. Xiao G, Zhu S, Xiao X, et al. Comparison of laboratory tests, ultrasound, or magnetic resonance elastography to detect fibrosis in patients with nonalcoholic fatty liver disease: A meta-analysis. Hepatology 2017;66(5):1486–501.
11. Petta S, Wong VW, Cammà C, et al. Serial combination of non-invasive tools improves the diagnostic accuracy of severe liver fibrosis in patients with NAFLD. Aliment Pharmacol Ther 2017;46(6):617–27.
12. European Association for the Study of the Liver (EASL); European Association for the Study of Diabetes (EASD); European Association for the Study of Obesity (EASO). EASL-EASD-EASO Clinical Practice Guidelines for the management of non-alcoholic fatty liver disease. J Hepatol 2016;64(6):1388–402.
13. Joo SK, Kim W, Kim D, et al. Steatosis severity affects the diagnostic performances of noninvasive fibrosis tests in nonalcoholic fatty liver disease. Liver Int 2018;38(2):331–41.
14. Petta S, Maida M, Macaluso FS, et al. The severity of steatosis influences liver stiffness measurement in patients with nonalcoholic fatty liver disease. Hepatology 2015;62(4):1101–10.
15. Petta S, Di Marco V, Cammà C, et al. Reliability of liver stiffness measurement in non-alcoholic fatty liver disease: The effects of body mass index. Aliment Pharmacol Ther 2011;33(12):1350–60.
16. Caussy C, Chen J, Alquiraish MH, et al. Association between obesity and discordance in fibrosis stage determination by magnetic resonance vs transient elastography in patients with non-alcoholic liver disease. Clin Gastroenterol Hepatol 2018;16:1974–82.e7.
17. Shen F, Zheng RD, Shi JP, et al. Impact of skin capsular distance on the performance of controlled attenuation parameter in patients with chronic liver disease. Liver Int 2015;35(11):2392–400.
18. Tapper EB, Cohen EB, Patel K, et al. Levels of alanine aminotransferase confound use of transient elastography to diagnose fibrosis in patients with chronic hepatitis C virus infection. Clin Gastroenterol Hepatol 2012;10(8):932–7.e1.
19. Coco B, Oliveri F, Maina AM, et al. Transient elastography: A new surrogate marker of liver fibrosis influenced by major changes of transaminases. J Viral Hepat 2007;14(5):360–9.
20. Colloredo G, Guido M, Sonzogni A, et al. Impact of liver biopsy size on histological evaluation of chronic viral hepatitis: The smaller the sample, the milder the disease. J Hepatol 2003;39:239–44.
21. Kleiner DE, Brunt EM, Van Natta M, et al. Design and validation of a histological scoring system for nonalcoholic fatty liver disease. Hepatology 2005;41(6):1313–21.
22. McPherson S, Stewart SF, Henderson E, et al. Simple non-invasive fibrosis scoring systems can reliably exclude advanced fibrosis in patients with non-alcoholic fatty liver disease. Gut 2010;59(9):1265–9.
23. Angulo P, Hui JM, Marchesini G, et al. The NAFLD fibrosis score: A noninvasive system that identifies liver fibrosis in patients with NAFLD. Hepatology 2007;45(4):846–54.
24. Boursier J, Zarski JP, de Ledinghen V, et al; Multicentric Group from ANRS/HC/EP23 FIBROSTAR Studies. Determination of reliability criteria for liver stiffness evaluation by transient elastography. Hepatology 2013;57:1182–91.
25. Carpenter J, Bithell J. Bootstrap confidence intervals: When, which, what? A practical guide for medical statisticians. Stat Med 2000;19:1141–64.
26. Wong VW, Vergniol J, Wong GL, et al. Liver stiffness measurement using XL probe in patients with nonalcoholic fatty liver disease. Am J Gastroenterol 2012;107(12):1862–71.
27. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing: Vienna, Austria, 2018. (https://www.R-project.org/).
28. Wong GL, Chan HL, Choi PC, et al. Association between anthropometric parameters and measurements of liver stiffness by transient elastography. Clin Gastroenterol Hepatol 2013;11(3):295–302.
29. Wong VW, Vergniol J, Wong GL, et al. Diagnosis of fibrosis and cirrhosis using liver stiffness measurement in nonalcoholic fatty liver disease. Hepatology 2010;51(2):454–62.
30. Petta S, Wong VW, Cammà C, et al. Improved noninvasive prediction of liver fibrosis by liver stiffness measurement in patients with nonalcoholic fatty liver disease accounting for controlled attenuation parameter values. Hepatology 2017;65(4):1145–55.
31. Afdhal NH, Bacon BR, Patel K, et al. Accuracy of FibroScan, compared with histology, in analysis of liver fibrosis in patients with hepatitis B or C: A United States multicenter study. Clin Gastroenterol Hepatol 2015;13(4):772–9.e1–3.
32. Boursier J, Konate A, Guilluy M, et al. Learning curve and interobserver reproducibility evaluation of liver stiffness measurement by transient elastography. Eur J Gastroenterol Hepatol 2008;20(7):693–701.
33. Petta S, Valenti L, Marra F, et al. MERTK rs4374383 polymorphism affects the severity of fibrosis in non-alcoholic fatty liver disease. J Hepatol 2016;64(3):682–90.