A Comparison of Prognostic Scores (Mayo, UK-PBC, and GLOBE) in Primary Biliary Cholangitis

Goet, Jorn C. MD1; Murillo Perez, Carla F. MSc, PhD1,2; Harms, Maren H. MD, PhD1; Floreani, Annarosa MD, PhD3; Cazzagon, Nora MD, PhD3; Bruns, Tony MD, PhD4,5; Prechter, Florian MD4; Dalekos, George N. MD, PhD6; Verhelst, Xavier MD, PhD7; Gatselis, Nikolaos K. MD, PhD6; Lindor, Keith D. MD, PhD8; Lammers, Willem J. MD, PhD1; Gulamhusein, Aliya MD2; Reig, Anna MD9; Carbone, Marco MD, PhD10; Nevens, Frederik MD, PhD11; Hirschfield, Gideon M. MD, PhD2; van der Meer, Adriaan J. MD, PhD1; van Buuren, Henk R. MD, PhD1; Hansen, Bettina E. MSc, PhD2,12; Parés, Albert MD, PhD9; on behalf of the GLOBAL PBC Study Group

The American Journal of Gastroenterology: July 2021 - Volume 116 - Issue 7 - p 1514-1522
doi: 10.14309/ajg.0000000000001285



Primary biliary cholangitis (PBC) is a chronic cholestatic liver disease that predominantly affects middle-aged women (1,2). PBC is usually a slowly progressive disorder, potentially leading to cirrhosis, liver failure requiring liver transplantation (LT), or death (1,2). On an individual level, patients are nowadays often asymptomatic at diagnosis, whereas the clinical course and response to therapy vary greatly (3,4).

Over the past decades, several risk scores have been proposed in PBC that can estimate a patient's risk of adverse outcomes and that can aid in the process of patient counseling and medical management, in particular with respect to treatment decisions and timing of LT. The Mayo Risk Score (MRS) is a frequently used model to predict survival probability with an initial intended application in the selection and timing of LT. This score was originally developed in untreated patients with PBC to predict survival up to 7 years (5), later adapted to predict short-term survival at 2 years and for use at any point during follow-up (6), and eventually abbreviated to quickly estimate the risk score (7). Data regarding the prognostic performance of the MRS in ursodeoxycholic acid (UDCA)-treated patients are conflicting (8–12).

A more general model currently used to allocate patients for LT is the Model for End-stage Liver Disease (MELD). The MELD score was originally developed to predict survival in patients with cirrhosis who underwent placement of a transjugular intrahepatic portosystemic shunt (13) and later modified and validated for the prediction of short-term survival in patients with cirrhosis with varying disease severity and etiology, including PBC (14). To date, data on the appropriateness of the MELD score for risk stratification in the context of medical treatment in patients with PBC are lacking.

More recently, 2 new models were introduced. The UK-PBC group developed a new scoring system for long-term prediction of LT and liver-related death with the best fitting model comprising baseline albumin and platelet count, as well as bilirubin, transaminases, and alkaline phosphatase, after 12 months of UDCA (15). The GLOBE score comprises age, bilirubin, albumin, alkaline phosphatase, and platelet count as independent predictors of LT or death in UDCA-treated patients with PBC (16). The performance of the UK-PBC risk score and GLOBE score as compared to the MRS in UDCA-treated patients with PBC is not known.

In the current study, we aimed to assess and compare the performance of these prognostic scores developed for PBC in an international cohort of UDCA-treated patients with PBC, while also taking into consideration the MELD score.


Population and study design

Patients' data were derived from the GLOBAL PBC Study Group database (GPBCsg). Characteristics of the GPBCsg's cohort, comprising long-term follow-up data of 18 liver units across Europe and North America, have been described elsewhere (16). For the current study, patients' data were derived from 7 centers from the GPBCsg: Toronto Centre for Liver Disease, University of Toronto, Canada; University of Padua, Padua, Italy; University of Thessaly, Larissa, Greece; University of Jena, Jena, Germany; University of Barcelona, Barcelona, Spain; Ghent University Hospital, Ghent, Belgium; and Erasmus University Medical Center, Rotterdam, the Netherlands. UDCA-treated patients with an established diagnosis of PBC in accordance with internationally accepted guidelines were included (17,18). Patients were excluded if follow-up was shorter than 6 months and/or comprised fewer than 2 recorded visits, if the date of start of treatment or the date of major clinical events was unknown, or in the case of concomitant liver disease.

Data collection

The following clinical data were collected for the original cohort: sex, age, date of PBC diagnosis, liver histology, treatment (type of medication, dosage, and duration), last follow-up date, and clinical outcomes (death, cause of death, and LT). Previously collected laboratory data included baseline antimitochondrial antibody status and baseline and yearly laboratory values (serum alkaline phosphatase [ALP], total bilirubin, albumin, aspartate aminotransferase, alanine aminotransferase, and platelet count). Stage of disease was defined biochemically. Biochemical disease stage was classified according to Rotterdam criteria (11), namely mild (normal bilirubin and albumin), moderately advanced (abnormal bilirubin or albumin), and advanced disease (both abnormal bilirubin and albumin). Data in the original cohort were collected up to December 31, 2012 (19). For 3 centers (University of Jena, Jena, Germany; University of Thessaly, Larissa, Greece; and Ghent University Hospital, Ghent, Belgium), data were collected up to December 31, 2015. To enable calculation of all risk scores, additional information was collected on dialysis treatment, use of diuretics, presence of peripheral edema, serum creatinine, prothrombin time (PT), and international normalized ratio (INR). When a physical examination was documented without any mention of edema, we presumed that edema was absent.
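The Rotterdam biochemical staging described above reduces to counting how many of the two markers are abnormal. The following is a minimal sketch, assuming that "abnormal" means bilirubin above the upper limit of normal and albumin below the lower limit of normal; the function and parameter names are illustrative and do not come from the study.

```python
def rotterdam_stage(bilirubin, bilirubin_uln, albumin, albumin_lln):
    """Classify biochemical disease stage from bilirubin and albumin.

    Assumption: bilirubin is abnormal when above its upper limit of
    normal (ULN); albumin is abnormal when below its lower limit of
    normal (LLN). Values and limits must share the same units.
    """
    abnormal_bilirubin = bilirubin > bilirubin_uln
    abnormal_albumin = albumin < albumin_lln
    n_abnormal = int(abnormal_bilirubin) + int(abnormal_albumin)
    # 0 abnormal -> mild, 1 -> moderately advanced, 2 -> advanced
    return ("mild", "moderately advanced", "advanced")[n_abnormal]


if __name__ == "__main__":
    # Hypothetical patient: bilirubin 1.4 mg/dL (ULN 1.0), albumin 3.8 g/dL (LLN 3.5)
    print(rotterdam_stage(1.4, 1.0, 3.8, 3.5))
```

The limits are passed in explicitly because reference ranges differ between laboratories.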

Extensive efforts were made to ensure completeness and reliability of the data, including center visits for paper and electronic chart review. This study was conducted in accordance with the protocol and the principles of the Declaration of Helsinki. The protocol was approved by the Institutional Research Board of the corresponding center and at each participating center in accordance with local regulations.

Statistical analyses

Baseline was set at start of UDCA therapy. The primary end point was defined as a composite of either LT or death. Patients without documented events during follow-up were censored at their last follow-up visit. The 1989 MRS was calculated using the formula:

$$\mathrm{MRS}_{1989} = 0.0394 \times \text{age} + 0.8707 \times \ln(\text{bilirubin [mg/dL]}) + 2.380 \times \ln(\text{PT}) + 0.8592 \times \text{edema} - 2.533 \times \ln(\text{albumin [g/dL]})$$

The 1994 MRS was calculated with the formula:

$$\mathrm{MRS}_{1994} = 0.051 \times \text{age} + 1.209 \times \ln(\text{bilirubin [mg/dL]}) + 2.754 \times \ln(\text{PT}) + 0.675 \times \text{edema} - 3.304 \times \ln(\text{albumin [g/dL]})$$

In cases when PT was missing, 6.843 × ln(INR) was used instead of PT. Edema was coded as 0 for no edema and no diuretic therapy; 0.5 for edema present without diuretic therapy or edema resolved with diuretic therapy; and 1 for edema despite diuretic therapy. For comparative purposes, we included MELD and calculated the laboratory MELD score using the formula:

$$\mathrm{MELD} = 10 \times \left[ 0.957 \times \ln(\text{creatinine [mg/dL]}) + 0.378 \times \ln(\text{bilirubin [mg/dL]}) + 1.120 \times \ln(\text{INR}) + 0.643 \right]$$

Laboratory values less than 1.0 were set to 1.0 in the calculation; maximum serum creatinine in the equation was 4.0 mg/dL; laboratory MELD scores exceeding 40 were adjusted to 40 (20). The GLOBE score was calculated using the formula:

$$\mathrm{GLOBE} = 0.044378 \times \text{age}_{\text{start of UDCA}} + 0.335648 \times \ln\left(\frac{\text{ALP}_{\text{1 year UDCA}}}{\text{ULN}}\right) + 0.93982 \times \ln\left(\frac{\text{bilirubin}_{\text{1 year UDCA}}}{\text{ULN}}\right) - 0.002581 \times \text{platelet count}_{\text{1 year UDCA}}\ [10^{9}/\text{L}] - 2.266708 \times \frac{\text{albumin}_{\text{1 year UDCA}}}{\text{LLN}} + 1.216865$$

where ULN is the upper limit of normal and LLN is the lower limit of normal. The UK-PBC score was calculated as follows:

$$r = 0.0287854 \times (\text{ALP}_{12} - 1.722136304) - 0.0422873 \times \left[ \left( \frac{\text{altast}_{12}}{10} \right)^{-1} - 8.675729006 \right] + 1.4199 \times \left[ \ln\left( \frac{\text{bili}_{12}}{10} \right) + 2.709607778 \right] - 1.960303 \times (\text{alb}_{0} - 1.17673001) - 0.4161954 \times (\text{plt}_{0} - 1.873564875)$$

Here ALP, transaminases (altast), and bilirubin (bili) are the values after 12 months of UDCA expressed as multiples of the ULN, and albumin (alb) and platelet count (plt) are the baseline values expressed as multiples of the LLN.
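The score definitions in this paragraph can be written as plain functions. The following sketch is not the study's code. It makes three assumptions: the MELD expression is bracketed in its usual form, 10 × (sum of the four terms); the GLOBE intercept is 1.216865; and in the UK-PBC expression the "−1" is an exponent, with the 12-month values entered as multiples of the ULN and the baseline values as multiples of the LLN. All function and parameter names are illustrative. The substitution used when PT is missing is not implemented.

```python
import math


def edema_code(edema_present, on_diuretics):
    """Edema term coded as 0, 0.5, or 1, following the text."""
    if edema_present and on_diuretics:
        return 1.0  # edema despite diuretic therapy
    if edema_present or on_diuretics:
        # edema without diuretics, or (assumed) edema resolved on diuretics
        return 0.5
    return 0.0  # no edema and no diuretic therapy


def mrs_1989(age, bilirubin_mgdl, albumin_gdl, pt, edema):
    return (0.0394 * age + 0.8707 * math.log(bilirubin_mgdl)
            + 2.380 * math.log(pt) + 0.8592 * edema
            - 2.533 * math.log(albumin_gdl))


def mrs_1994(age, bilirubin_mgdl, albumin_gdl, pt, edema):
    return (0.051 * age + 1.209 * math.log(bilirubin_mgdl)
            + 2.754 * math.log(pt) + 0.675 * edema
            - 3.304 * math.log(albumin_gdl))


def meld(creatinine_mgdl, bilirubin_mgdl, inr):
    # Values below 1.0 are raised to 1.0; creatinine is capped at 4.0 mg/dL.
    creatinine = min(max(creatinine_mgdl, 1.0), 4.0)
    bilirubin = max(bilirubin_mgdl, 1.0)
    inr = max(inr, 1.0)
    score = 10 * (0.957 * math.log(creatinine) + 0.378 * math.log(bilirubin)
                  + 1.120 * math.log(inr) + 0.643)
    return min(score, 40.0)  # scores above 40 are set to 40


def globe(age_at_start, alp_xuln_1y, bilirubin_xuln_1y,
          platelets_1y, albumin_xlln_1y):
    # Platelet count in 10^9/L; other labs as multiples of ULN or LLN.
    return (0.044378 * age_at_start
            + 0.335648 * math.log(alp_xuln_1y)
            + 0.93982 * math.log(bilirubin_xuln_1y)
            - 0.002581 * platelets_1y
            - 2.266708 * albumin_xlln_1y
            + 1.216865)


def uk_pbc(alp12_xuln, altast12_xuln, bili12_xuln, alb0_xlln, plt0_xlln):
    return (0.0287854 * (alp12_xuln - 1.722136304)
            - 0.0422873 * ((altast12_xuln / 10) ** -1 - 8.675729006)
            + 1.4199 * (math.log(bili12_xuln / 10) + 2.709607778)
            - 1.960303 * (alb0_xlln - 1.17673001)
            - 0.4161954 * (plt0_xlln - 1.873564875))


if __name__ == "__main__":
    # Hypothetical patient, for illustration only.
    print(round(globe(50, 1.0, 1.0, 250, 1.2), 4))
    print(round(meld(0.8, 0.5, 0.9), 2))
```

Each function returns the score itself; converting a score into a survival probability requires the baseline survival estimates from the original publications, which are not reproduced here.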

These scores were calculated at yearly intervals up to 5 years after initiation of UDCA therapy. We used descriptive statistics, including boxplots, to visualize the various risk score indexes during follow-up in patients who would eventually have a composite end point of LT or death in comparison to patients alive at the end of follow-up.

Validity of the prediction models was assessed based on discrimination and calibration of the models. Discrimination is the ability to categorize those with and without the outcome of interest based on predictive values (21). Calibration is the measure of how accurately the predicted outcome matches the observed outcome (21). At yearly time points, Cox proportional hazards regressions were conducted, and the overall discriminative performance for the different scores was calculated with the concordance statistic (C-statistic). Cox regression analyses were performed to assess the additional value of combining risk prediction models in estimating the risk of LT or death with application at 1 year of UDCA. In addition, the C-statistic for various combinations of risk prediction models was assessed.
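To make the notion of discrimination concrete, the sketch below computes a concordance statistic for right-censored data in the style of Harrell's C. The text does not specify which estimator was used, so this is an illustrative assumption, and the function name is hypothetical. A pair of patients is comparable when the one with the shorter follow-up had an event; the pair is concordant when that patient also had the higher risk score.

```python
def concordance_statistic(times, events, scores):
    """Harrell-style C-statistic for right-censored data.

    times  : follow-up time per patient
    events : 1 if LT or death occurred, 0 if censored
    scores : risk score per patient (higher = higher predicted risk)
    Pairs tied on follow-up time are ignored in this simple version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: four patients, one censored at time 6.
    print(concordance_statistic([2, 4, 6, 8], [1, 1, 0, 1], [1.0, 2.0, 3.0, 0.5]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The quadratic loop is adequate for a cohort of about a thousand patients.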

Subanalyses of discriminative ability for the various risk prediction models were performed in patients with bilirubin ≤ 0.6 × ULN compared with those with bilirubin values > 0.6 × ULN at baseline and 1 year of UDCA, as this threshold was associated with an increased risk of LT and death (22). In addition, to assess the performance of the various risk prediction models in those with no or low fibrosis stage (stage 1 and 2) versus those with advanced fibrosis (stage 3 and 4), patients were stratified according to Fibrosis-4 (FIB-4) Index for Liver Fibrosis (23). Patients with a FIB-4 ≥ 1.8 were considered to have advanced (stage 3 and 4) fibrosis (24).
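The two stratification rules above can be sketched as follows. The FIB-4 formula itself is not stated in the text; the version used here is the commonly cited one, age × AST / (platelet count × √ALT), with AST and ALT in U/L and platelets in 10^9/L, and should be read as an assumption. Function names are illustrative.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 index (assumed formula: age x AST / (platelets x sqrt(ALT)))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def advanced_fibrosis(fib4_value, threshold=1.8):
    # Threshold from the text: FIB-4 >= 1.8 is treated as stage 3 or 4.
    return fib4_value >= threshold


def bilirubin_stratum(bilirubin, bilirubin_uln, threshold=0.6):
    # Threshold from the text: 0.6 x ULN.
    ratio = bilirubin / bilirubin_uln
    return "> 0.6 x ULN" if ratio > threshold else "<= 0.6 x ULN"


if __name__ == "__main__":
    # Hypothetical patient, for illustration only.
    value = fib4(50, 40, 25, 200)
    print(value, advanced_fibrosis(value), bilirubin_stratum(0.9, 1.0))
```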

Model calibration for the MRS, MELD, UK-PBC, and GLOBE score was assessed graphically by comparing observed transplant-free survival from Kaplan-Meier estimates with transplant-free survival predicted by the risk prediction models at 1 year of UDCA. The calibration for the UK-PBC survival estimates was not included in this analysis, as it relates to liver-related death survival rather than transplant-free survival.
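A graphical calibration check of this kind compares a Kaplan-Meier estimate of observed transplant-free survival with the survival predicted by a model at the same time point. The sketch below is a minimal pure-Python version under that reading. The helper names are hypothetical, and the use of the median of the individual predictions follows the wording of the Figure 3 description.

```python
import statistics


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve, horizon):
    """Observed survival at `horizon` (step function, 1.0 before first event)."""
    value = 1.0
    for t, s in curve:
        if t <= horizon:
            value = s
        else:
            break
    return value


def calibration_gap(times, events, predicted_survival, horizon):
    """Median predicted minus observed survival at `horizon`.

    A positive value means the model overestimates survival.
    """
    observed = survival_at(kaplan_meier(times, events), horizon)
    return statistics.median(predicted_survival) - observed


if __name__ == "__main__":
    # Toy data: five patients, event indicator 1 = LT or death.
    print(kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1]))
```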

Statistical analyses were performed with IBM SPSS Statistics version 22.0 (released 2013; IBM, Armonk, NY). To account for missing values, SAS version 9.4 (SAS Institute, Cary, NC, SAS Proc MI, MCMC method) was used to generate 10 imputed data sets of laboratory results at yearly time points between initiation of UDCA therapy and 5 years of follow-up, as described in a previous study (25–28). This method uses chained equations to simultaneously impute all missing values, drawing from the distribution of known values. Missing data were considered to be missing at random. Rubin rules were used for estimation of the parameters and the SE (28). The imputation model included baseline variables that were potentially predictive for outcomes in PBC (e.g., year of diagnosis and age) as well as the outcomes themselves. In cases in which PT was missing, we assumed normal PT and INR values when albumin and bilirubin were within the normal range. Subsequently, the missing PT and INR values were imputed by multiple imputation as previously described. Data are presented as median and interquartile range (IQR) for continuous variables.
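Rubin rules combine the estimates obtained from each imputed data set into a single estimate and standard error. The sketch below shows the standard pooling step: the pooled estimate is the mean across imputations, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name and the example numbers are illustrative and are not taken from the study.

```python
import math
import statistics


def pool_rubin(estimates, standard_errors):
    """Pool per-imputation estimates and SEs with Rubin's rules.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need matching lists from at least 2 imputations")
    pooled = statistics.fmean(estimates)
    within = statistics.fmean([se ** 2 for se in standard_errors])
    between = statistics.variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical log hazard ratios from 3 imputed data sets.
    print(pool_rubin([0.80, 0.86, 0.83], [0.16, 0.17, 0.16]))
```

The pooled standard error exceeds the average per-imputation standard error whenever the estimates differ between imputations, which is how the uncertainty due to missing data enters the final interval.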


Study population characteristics

A total of 1,100 UDCA-treated patients with PBC were included, with a mean age at start of follow-up of 53.6 (SD 12.0) years, of whom 1,003 (91%) were female. Clinical and biochemical patient characteristics at initiation of UDCA therapy are shown in Table 1. Median follow-up was 7.6 (IQR 4.1–11.7) years. During follow-up, a total of 169 patients experienced a clinical end point: 42 underwent LT and 127 died. In 86/127 (67.7%) patients, the cause of death was considered liver related. For the current study population, the 5-, 10-, and 15-year transplant-free survival rates were 93.4%, 83.8%, and 75.6%, respectively, as shown in Figure 1.

Table 1. Baseline cohort characteristics

Figure 1. Kaplan-Meier estimate of survival in this cohort.

At initiation of UDCA therapy, 215 (19.6%) patients had serum bilirubin values above the ULN, and 107 (9.7%) had albumin values below the LLN. The patient population consisted of 816 (74.2%) patients with biochemically early disease stage according to Rotterdam criteria (normal albumin and bilirubin), 241 (21.9%) had moderately advanced disease stage (abnormal albumin or bilirubin), and 43 (3.9%) had advanced disease stage (abnormal albumin and bilirubin).

At the start of UDCA therapy, the median (IQR) score for MRS (1989 model), MRS (1994 model), MELD, and GLOBE was 3.94 (3.38–4.58), 4.24 (3.50–5.05), 7.00 (6.00–9.00), and 0.02 (−0.64 to 0.75), respectively (Table 1). Median scores of the various risk score indexes at initiation of UDCA therapy and 5 years thereafter in patients who developed a clinical end point versus those who were still alive at the end of follow-up are shown in Figure 2.

Figure 2. Boxplots of the various risk prediction scores from initiation of UDCA therapy to 5 years according to whether patients experienced a clinical outcome by the end of follow-up. MELD, Model for End-stage Liver Disease; UDCA, ursodeoxycholic acid.

Discriminatory performance of the MRS, MELD, UK-PBC, and GLOBE scores

At baseline, the overall discriminatory performance of the GLOBE score, expressed as the C-statistic, for predicting the risk of death or LT was 0.78 (95% confidence interval [CI] 0.74–0.82) versus 0.77 (95% CI 0.73–0.81) for the MRS (1989 and 1994) and 0.68 (95% CI 0.65–0.71) for the MELD score (see Supplementary Table 1, Supplementary Digital Content 1). At 1 year of UDCA therapy, the C-statistic was 0.80 (95% CI 0.76–0.84) for the GLOBE score, 0.76 (95% CI 0.72–0.81) for the MRS (1989 and 1994), 0.68 (95% CI 0.64–0.72) for the MELD score, and 0.74 (95% CI 0.67–0.80) for the UK-PBC score. The performance of MELD, as assessed with C-statistics, was statistically significantly lower compared with the remaining scores. In the 5 years after initiation of UDCA therapy, the difference in discriminatory performance for the various risk prediction models remained comparable (Table 2 and see Supplementary Table 1, Supplementary Digital Content 1). Although the performance of the GLOBE score was statistically different from that of UK-PBC for the prediction of LT and death at 1 year (P = 0.02), there were no statistically significant differences between these scores for the prediction of liver-related death or LT at 1 year of UDCA therapy, which was 0.81 (95% CI 0.77–0.86) for the GLOBE score and 0.81 (95% CI 0.76–0.85) for the UK-PBC score (P = 0.45) (see Supplementary Table 1, Supplementary Digital Content 1).

Table 2. Discriminative performance of the various risk prediction scores calculated after 1, 3, and 5 years of UDCA therapy

Subanalyses of the discriminatory ability in patients with bilirubin values ≤ 0.6 × ULN and those with bilirubin values > 0.6 × ULN at baseline and 1 year of UDCA showed that, in general, all scores had better discriminative performance in patients with bilirubin values > 0.6 × ULN (Table 3). Subanalyses according to FIB-4 were also performed, in which a total of 387 (35.2%) patients had FIB-4 scores ≥ 1.8 at initiation of UDCA therapy, indicating advanced fibrosis. At 1 year of UDCA therapy, 253/905 (28.0%) patients met the threshold for advanced fibrosis. Discriminatory ability of the risk scores stratified according to FIB-4 demonstrated that the performance was higher in those with FIB-4 ≥ 1.8.

Table 3. Discriminative performance of the various risk prediction scores calculated at baseline and after 1 year of UDCA therapy stratified by bilirubin values and FIB-4

Combined performance of the MRS, MELD, UK-PBC, and GLOBE scores

In univariable Cox regression analyses, the prognostic indexes of all individual scores were significantly associated with death or LT (Table 4). In a multivariable analysis that included all respective scores with the exclusion of MRS 1989, only the GLOBE score (hazard ratio 2.36 [95% CI 1.71–3.27; P < 0.001]) and MRS 1994 (hazard ratio 1.28 [95% CI 1.06–1.55; P = 0.01]) remained significantly associated with death or LT.

Table 4. Multivariable analyses of risk prediction scores at 1 year of UDCA therapy (N = 905)

Addition of the MRS, MELD, or UK-PBC to the GLOBE score did not result in an increase in discriminatory performance, which remained at 0.80 (Table 5). Combining the UK-PBC score with the MRS, MELD, or GLOBE resulted in an increase in C-statistic ranging from 0.01 to 0.06, with the highest increase observed from the addition of the GLOBE score and lowest from MELD. For various combinations of the MRS with other scores, relatively smaller changes in C-statistic were observed with the highest being from the addition of the GLOBE score (+0.04) (Table 5). In contrast, the addition of all scores to the MELD score yielded an increase in C-statistic, ranging from 0.07 to 0.12.

Table 5. Cox regression analyses and combined discriminatory performance of prognostic scores at 1 year of UDCA therapy (N = 905)

Prediction accuracy (calibration) of the MRS, MELD, UK-PBC, and GLOBE scores

In Figure 3, the observed and median predicted survival for the various risk prediction models are shown. For all models, good calibration for short-term and long-term survival was observed. In the estimates of survival, both the GLOBE and MRS 1994 tended to overestimate transplant-free survival, with the greatest deviation from observed survival at 10 years for GLOBE (3.5%) and 2 years for MRS 1994 (2.9%). MRS 1989 demonstrated the best calibration, as the difference in predicted versus observed survival was generally less than 1% at yearly intervals up to 7 years (see Supplementary Table 2, Supplementary Digital Content 1).

Figure 3. Predicted versus observed liver transplant-free survival for the GLOBE score and MRS (1989 and 1994). The figure shows prediction accuracy (calibration) of the GLOBE score and MRS up to 15 years of follow-up after 1 year of UDCA therapy (N = 905). Solid line = actual observed transplant-free survival probabilities estimated by Kaplan-Meier analyses. Dashed lines = the predicted median transplant-free survival probabilities as predicted by the GLOBE score and MRS. MRS, Mayo Risk Score; UDCA, ursodeoxycholic acid.


In this large cohort of patients with PBC, we assessed the performance of various published risk prediction models. We demonstrate that in a cohort of mainly early biochemical disease stage PBC patients, all prognostic scores evaluated (GLOBE, UK-PBC, and MRS) have adequate discriminatory performance and good prediction accuracy. The discriminatory performance of these PBC-specific scores increased in those with bilirubin > 0.6 × ULN and advanced fibrosis. Not surprisingly, our data also show that the performance of the MELD score, which was neither developed for nor has previously shown promise as a prognostic tool in early or noncirrhotic liver disease, was clearly inferior to that of the PBC-specific scores.

The consistently high discriminative performance of the GLOBE score in our cohort suggests that more patients who experienced an event had a higher risk score and more patients without an event had a lower risk score than with the use of other scores. However, there were no significant differences in comparison to UK-PBC and MRS. In general, models with a C-statistic greater than 0.8 are considered good prognostic models, and the GLOBE score was the only score to consistently reach this threshold in the prediction of transplant-free survival at various time points (29). Second to the GLOBE score in discriminatory performance was the MRS (1989 and 1994). Although the MRS did not have a C-statistic above 0.8 at 1 year of UDCA therapy, the discriminatory performance increased when applied at other time points during prolonged UDCA treatment. Although the MRS is the traditional risk prediction model in patients with PBC, its clinical utility may be hampered by the use of peripheral edema as a subjective parameter. It should be noted that the MELD score and MRS were derived in patients with end-stage liver disease and our cohort mainly comprised patients with biochemically early disease stage. In addition, although the MRS was developed in untreated patients with PBC, the current study included UDCA-treated patients. The prognostic value of MRS has been demonstrated in UDCA-treated patients to be associated with transplant-free survival as it stratifies patients into high-risk and low-risk groups using the original thresholds (9,10). Given the adequate discriminatory performance and good prediction accuracy of these scores, the GLOBE and MRS can be implemented to predict overall transplant-free survival, whereas the clinical utility of the UK-PBC score can be aimed at predicting LT and liver-related death.

Not surprisingly, subgroup analyses showed that all risk prediction scores tended to have improved discriminatory performance in patients with bilirubin values > 0.6 × ULN compared with those with bilirubin values ≤ 0.6 × ULN. Bilirubin is one of the most robustly validated markers of disease progression in PBC and is included in all risk prediction models for PBC (19,30,31). Bilirubin is mostly considered a late biomarker, i.e., elevations are seen only in late stages of the disease and increase shortly before a clinical event, and therefore may be considered less discriminatory for early detection of progression of disease and clinical outcome (30,32). However, a recent study by the Global PBC study group showed that bilirubin values within the normal range, both at baseline and after 1 year of UDCA therapy, were predictive of transplant-free survival, suggesting that even increases in bilirubin values within the normal range should prompt reconsideration for second-line therapies and optimal management (22). The threshold of 0.6 × ULN used in the current article has been shown to be associated with the lowest risk for LT or death, after which the risk increases (22). Akin to the results observed for patients with bilirubin > 0.6 × ULN, the various risk prediction models had better performance in those with FIB-4 levels above 1.8, which was the threshold best associated with advanced fibrosis (24). These subgroup analyses suggest that current risk stratification tools are less accurate when used to risk stratify patients in earlier stages of disease.

Interestingly, combination of the indexes of various risk prediction models in the estimation of death or LT, although statistically significant, did not result in a numerical increase in C-statistic, particularly for the GLOBE score. This suggests that although it is not feasible to calculate multiple risk scores in clinical practice, there may be some additional value in considering scores such as the MRS in addition to GLOBE. Various studies in UDCA-treated patients have reported that the MRS may underestimate survival (8,11,12). In our study, we demonstrate that the MRS has good prediction accuracy and adequate performance and may therefore be of value in UDCA-treated patients. Theoretically, the added value of the MRS in discriminatory performance may be driven by PT and edema. However, because our cohort mainly comprises patients with early-stage PBC in whom PT will be within the normal range and edema will be absent, this seems unlikely.

A strength of our study is the inclusion of a well-characterized large study population from multiple centers. Some limitations need to be considered. First, because of the retrospective nature of the current study, a proportion of data were missing (see Supplementary Table 3, Supplementary Digital Content 1). To overcome this problem, multiple imputation techniques were used (26). Second, although some of the patients in this study were included in the derivation cohort of the GLOBE score, a substantial proportion (∼25%) of patients were not originally used in the derivation of the GLOBE score. However, sensitivity analyses of the discriminative performance of the various scores in the 25% of patients not included in the derivation of the GLOBE score yielded similar results (see Supplementary Table 4, Supplementary Digital Content 1). Third, our cohort mainly comprised patients with early-stage disease. Although our study population is representative of the majority of current PBC patients, as most patients nowadays present at early stages of disease (33), comparison of the various risk prediction models in more advanced stages of disease would be of additional value. Last, although the UK-PBC risk score was developed to predict a different end point composed of liver-related death and LT, the discriminatory performance was also assessed for this end point and yielded similar results.

In conclusion, in this large cohort of mainly early disease stage PBC patients, we show that all prognostic scores developed for PBC (GLOBE, UK-PBC, and MRS) have comparable performance in the prediction of clinical outcomes. Although the discriminating performance for LT or death of the GLOBE score was superior, this difference was not statistically significant compared with the other scores (MRS and UK-PBC). This is true for various time points during UDCA treatment as well as in subgroups stratified according to biochemical and fibrosis disease stage. This suggests that implementation ought to be based on clinical context.


Guarantor of the article: Bettina E. Hansen, MSc, PhD, and Albert Parés, MD, PhD.

Specific author contributions: Bettina E. Hansen, MSc, PhD, and Albert Parés, MD, PhD, contributed equally to this work. Study concept and design: J.C.G., H.R.v.B., B.H., and A.P. Acquisition of data: all authors. Analysis and interpretation of data: J.C.G., C.F.M.P., and B.E.H. Drafting of the manuscript: J.C.G., C.F.M.P., M.H.H., M.C., F.N., H.R.v.B., B.E.H., and A.P. Critical revision of the manuscript for important intellectual content: all authors. Statistical analysis: J.C.G., C.F.M.P., and B.E.H. Obtained funding: G.M.H., H.R.v.B., and B.E.H. Study supervision: H.R.v.B., B.E.H., and A.P.

Financial support: This investigator-initiated study was supported by unrestricted grants from CymaBay Therapeutics Inc, Intercept Pharmaceuticals, and previously from Zambon Nederland BV and was funded by the Toronto General & Western Hospital Foundation (a not-for-profit foundation) in Toronto, Canada, and the Foundation for Liver and Gastrointestinal Research (a not-for-profit foundation) in Rotterdam, the Netherlands. The supporting parties had no influence on the study design, data collection and analyses, writing of the manuscript, or on the decision to submit the manuscript for publication.

Potential competing interests: The following authors declared that they have no conflicts of interest: J.C.G., G.N.D., N.K.G., A.R., and F.P. C.F.M.P. reports speaker fee from Merck & Co. M.H.H. reports speaker fees from Zambon Nederland B.V. A.F. reports consulting activities for Intercept Pharmaceuticals. N.C. reports consulting services for Intercept Pharmaceuticals. T.B. has received honoraria from Intercept Pharmaceuticals, Falk Foundation, AbbVie, and Norgine and travel expenses from Gilead. X.V. received grants from Gilead, AbbVie, Dr. Falk Pharma, and MSD and acted as a consultant for Gilead, AbbVie, and MSD. K.D.L. reports that he is an unpaid advisor for Intercept Pharmaceuticals and Shire. W.J.L. reports consulting services for Intercept Pharmaceuticals. A.G. reports advisory for Intercept Pharmaceuticals and AbbVie. M.C. reports advisory for Intercept Pharmaceuticals and lecture fees from Perspectum Diagnostics. F.N. reports advisory boards for Astellas, Janssen-Cilag, AbbVie, Gilead, CAF, Intercept, Gore, BMS, Novartis, MSD, Promethera Biosciences, Ono Pharma, Durect, Roche, and Ferring and research grants from Roche, Ferring, and Novartis. G.M.H. reports advisory services for Intercept Pharmaceuticals, Novartis, CymaBay Therapeutics, Genfit, and GlaxoSmithKline Pharmaceuticals. A.J.v.d.M. reports speaker fees from Gilead Sciences, AbbVie Pharmaceuticals, and Zambon Nederland B.V., received an unrestricted grant from Gilead Sciences, and reports travel expenses covered by Dr. Falk Pharma. H.R.v.B. is a consultant for Intercept Pharma Benelux and received unrestricted research grants from Intercept Pharmaceuticals and from Zambon Nederland B.V. B.E.H. reports grants from Intercept Pharmaceuticals, CymaBay Therapeutics, and Zambon Nederland B.V. and consulting work for Intercept Pharmaceuticals, CymaBay Therapeutics, Albireo AB, and Novartis. A.P. reports consulting services for Intercept Pharmaceuticals and Novartis Pharma.

Study Highlights


  • ✓ Mayo Risk Score (MRS), UK-PBC score, and GLOBE score predict clinical outcomes in patients with primary biliary cholangitis.
  • ✓ These scores were developed in varying patient populations and with varying treatment status.


  • ✓ Prediction of clinical outcomes by MRS, UK-PBC, and GLOBE is equivalent in ursodeoxycholic acid-treated patients.
  • ✓ Implementation of risk scores in primary biliary cholangitis should be based on clinical context.


This study was performed on behalf of the Global PBC Study Group and was supported by Intercept Pharmaceuticals, CymaBay Therapeutics, and the Foundation for Liver and Gastrointestinal Research in Rotterdam, the Netherlands (a not-for-profit organization).


1. Kaplan MM, Gershwin ME. Primary biliary cirrhosis. N Engl J Med 2005;353:1261–73.
2. Poupon R. Primary biliary cirrhosis: A 2010 update. J Hepatol 2010;52:745–58.
3. Prince MI, Chetwynd A, Craig WL, et al. Asymptomatic primary biliary cirrhosis: Clinical features, prognosis, and symptom progression in a large population based cohort. Gut 2004;53:865–70.
4. Floreani A, Caroli D, Variola A, et al. A 35-year follow-up of a large cohort of patients with primary biliary cirrhosis seen at a single centre. Liver Int 2011;31:361–8.
5. Dickson ER, Grambsch PM, Fleming TR, et al. Prognosis in primary biliary cirrhosis: Model for decision making. Hepatology 1989;10:1–7.
6. Murtaugh PA, Dickson ER, Van Dam GM, et al. Primary biliary cirrhosis: Prediction of short-term survival based on repeated patient visits. Hepatology 1994;20:126–34.
7. Kim WR, Wiesner RH, Poterucha JJ, et al. Adaptation of the Mayo primary biliary cirrhosis natural history model for application in liver transplant candidates. Liver Transpl 2000;6:489–94.
8. Parés A, Caballería L, Rodés J. Excellent long-term survival in patients with primary biliary cirrhosis and biochemical response to ursodeoxycholic acid. Gastroenterology 2006;130:715–20.
9. Kilmurry MR, Heathcote EJ, Cauch-Dudek K, et al. Is the Mayo model for predicting survival useful after the introduction of ursodeoxycholic acid treatment for primary biliary cirrhosis? Hepatology 1996;23:1148–53.
10. Angulo P, Lindor KD, Therneau TM, et al. Utilization of the Mayo risk score in patients with primary biliary cirrhosis receiving ursodeoxycholic acid. Liver 1999;19:115–21.
11. ter Borg PCJ, Schalm SW, Hansen BE, et al. Prognosis of ursodeoxycholic acid-treated patients with primary biliary cirrhosis. Results of a 10-yr cohort study involving 297 patients. Am J Gastroenterol 2006;101:2044–50.
12. Poupon RE, Bonnand AM, Chrétien Y, et al. Ten-year survival in ursodeoxycholic acid-treated patients with primary biliary cirrhosis. The UDCA-PBC Study Group. Hepatology 1999;29:1668–71.
13. Malinchoc M, Kamath PS, Gordon FD, et al. A model to predict poor survival in patients undergoing transjugular intrahepatic portosystemic shunts. Hepatology 2000;31:864–71.
14. Kamath PS, Wiesner RH, Malinchoc M, et al. A model to predict survival in patients with end-stage liver disease. Hepatology 2001;33:464–70.
15. Carbone M, Sharp SJ, Flack S, et al. The UK-PBC risk scores: Derivation and validation of a scoring system for long-term prediction of end-stage liver disease in primary biliary cholangitis. Hepatology 2016;63:930–50.
16. Lammers WJ, Hirschfield GM, Corpechot C, et al. Development and validation of a scoring system to predict outcomes of patients with primary biliary cirrhosis receiving ursodeoxycholic acid therapy. Gastroenterology 2015;149:1804–12.e4.
17. Hirschfield GM, Beuers U, Corpechot C, et al. EASL Clinical Practice Guidelines: The diagnosis and management of patients with primary biliary cholangitis. J Hepatol 2017;67:145–72.
18. Lindor KD, Gershwin ME, Poupon R, et al. Primary biliary cirrhosis. Hepatology 2009;50:291–308.
19. Lammers WJ, van Buuren HR, Hirschfield GM, et al. Levels of alkaline phosphatase and bilirubin are surrogate end points of outcomes of patients with primary biliary cirrhosis: An international follow-up study. Gastroenterology 2014;147:1338–49.e5.
20. Freeman RBJ, Wiesner RH, Harper A, et al. The new liver allocation system: Moving toward evidence-based transplantation policy. Liver Transpl 2002;8:851–8.
21. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: A framework for traditional and novel measures. Epidemiology 2010;21:128–38.
22. Murillo Perez CF, Harms MH, Lindor KD, et al. Goals of treatment for improved survival in primary biliary cholangitis: Treatment target should be bilirubin within the normal range and normalization of alkaline phosphatase. Am J Gastroenterol 2020;115:1066–74.
23. Sterling RK, Lissen E, Clumeck N, et al. Development of a simple noninvasive index to predict significant fibrosis in patients with HIV/HCV coinfection. Hepatology 2006;43:1317–25.
24. Murillo Perez CF, Hirschfield GM, Corpechot C, et al. Fibrosis stage is an independent predictor of outcome in primary biliary cholangitis despite biochemical treatment response. Aliment Pharmacol Ther 2019;50:1127–36.
25. Harms MH, Lammers WJ, Thorburn D, et al. Major hepatic complications in ursodeoxycholic acid-treated patients with primary biliary cholangitis: Risk factors and time trends in incidence and outcome. Am J Gastroenterol 2018;113:254–64.
26. Sterne JA, White IR, Carlin JB, et al. Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ 2009;338:b2393.
27. Rubin DB. Multiple imputation after 18+ years. J Am Stat Assoc 1996;91:473–89.
28. Little RJA, Rubin DB. Statistical Analysis with Missing Data. John Wiley & Sons: New York, 1987.
29. Royston P, Altman DG. External validation of a Cox prognostic model: Principles and methods. BMC Med Res Methodol 2013;13:33.
30. Shapiro JM, Smith H, Schaffner F. Serum bilirubin: A prognostic factor in primary biliary cirrhosis. Gut 1979;20:137–40.
31. Bonnand AM, Heathcote EJ, Lindor KD, et al. Clinical significance of serum bilirubin levels under ursodeoxycholic acid therapy in patients with primary biliary cirrhosis. Hepatology 1999;29:39–43.
32. Harms MH, Pares A, Mason AL, et al. Behavioral patterns of total serum bilirubin prior to major clinical endpoints in 3529 patients with primary biliary cholangitis. J Hepatol 2016;64:S633–S634.
33. Murillo Perez CF, Goet JC, Lammers WJ, et al. Milder disease stage in patients with primary biliary cholangitis over a 44-year period: A changing natural history. Hepatology 2018;67:1920–30.


© 2021 by The American College of Gastroenterology