Where Are We Now?
Before the current study of mortality in patients with primary bone sarcomas [3], Forsberg and colleagues [1, 2] demonstrated that the relatively unfamiliar Bayesian belief network model could usefully predict 1-year survival in patients with skeletal metastases. To help explain the Bayesian belief network, I will run through each part of its name.
Bayesian refers to Bayes’ Theorem, which relates the total probability of an event occurring to the conditional probability of it occurring, given that another event has happened. For example, sensitivity, a concept familiar to all clinicians, is technically defined as the conditional probability of a positive test result given that the patient has a disease. Positive predictive value is the corresponding probability of the patient having the disease given a positive test result. Bayes’ Theorem states that the ratio of positive predictive value to sensitivity equals the ratio of the total probability of having the disease (the prevalence) to the total probability of having a positive result (which can be calculated from sensitivity, specificity, and prevalence). A low prevalence explains why the positive predictive value of a test can be low even if the sensitivity is high.
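The relationship described above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function name and the test characteristics (95% sensitivity, 90% specificity) are hypothetical values chosen for illustration and are not taken from any study cited here.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive test), computed with Bayes' Theorem."""
    # Total probability of a positive result:
    # true positives among the diseased plus false positives among the healthy.
    p_positive = (sensitivity * prevalence
                  + (1.0 - specificity) * (1.0 - prevalence))
    return sensitivity * prevalence / p_positive


# Hypothetical test characteristics (assumed for illustration only).
SENSITIVITY = 0.95
SPECIFICITY = 0.90

# Same test, two different prevalences.
ppv_rare = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence=0.01)
ppv_common = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence=0.30)

print(f"PPV at  1% prevalence: {ppv_rare:.3f}")
print(f"PPV at 30% prevalence: {ppv_common:.3f}")
```

Under these assumed values, the positive predictive value is roughly 9% when the disease is rare and roughly 80% when it is common, even though the sensitivity is the same in both cases.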
The Belief refers to the Bayesian view of probability. In most of the statistical analyses published in journals, probability is assumed to describe the observable frequency of an event or set of events, and so this approach to probability is called “frequentist.” In Bayesian methods, probability is seen as an initial belief (called the prior), which is updated as evidence accumulates. This can be explained through the clinician's own experience. If a patient presents to the office with fevers and a draining sinus, the physician will estimate a “pre-test probability,” also known as the Bayesian prior, of the patient having an infection. This pre-test probability might be the overall prevalence of infection or it might be based on the initial assessment of the patient. After diagnostic tests (such as white blood cell counts or bone scans) are performed, a “post-test probability” can be calculated based on the pre-test probability, the test results, and the known reliability of the tests. One of the arguments in favor of the Bayesian approach to statistics is that it more closely mirrors actual experience.
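The updating of a pre-test probability into a post-test probability can be sketched as follows. This is an illustration only: the pre-test probability and the sensitivity and specificity assigned to each test are hypothetical, and chaining the two updates assumes that the test results are independent of each other given the patient's true infection status.

```python
def update(pre_test_probability, sensitivity, specificity, positive):
    """Return the post-test probability after a single test result."""
    # Likelihood ratio: how much more likely this result is in a patient
    # with the condition than in a patient without it.
    if positive:
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        likelihood_ratio = (1.0 - sensitivity) / specificity
    # Bayes' Theorem in odds form: post-test odds = pre-test odds x LR.
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical pre-test probability of infection (the prior).
prior = 0.30

# Hypothetical test characteristics; both tests come back positive.
after_wbc = update(prior, sensitivity=0.70, specificity=0.80, positive=True)
# The post-test probability from the first test is the prior for the second.
after_scan = update(after_wbc, sensitivity=0.90, specificity=0.85, positive=True)

print(f"After white blood cell count: {after_wbc:.2f}")
print(f"After bone scan:              {after_scan:.2f}")
```

With these assumed numbers, the estimated probability of infection rises from 0.30 to about 0.60 after the first positive test and to about 0.90 after the second.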
The Network element is shown in Figure 1 of the current study [3], where each block may be directly or indirectly associated with the outcome of 1-year survival. In a regression model, each of the blocks would directly affect 1-year survival, and the resulting diagram would have predictors distributed like spokes around a hub. A regression model may explicitly allow for interaction terms in which the effect of one predictor depends on the level of another predictor. For example, the odds ratio for age could have one value for men and another for women; such an interaction must be explicitly defined by the analyst. In contrast, the Bayesian belief network would express it as different probabilities of mortality conditional on age and gender groups, which may be a more flexible arrangement. The graphical network can be predetermined by the researcher, or can be determined by an algorithm, as was done in this study.
Putting it all together, the Bayesian belief network estimates conditional probabilities at given nodes of the network. For a specific set of predictors, the estimated probability of a patient dying at 1 year is calculated from the combination of the conditional probabilities. Forsberg and colleagues [1, 2] suggest that the Bayesian belief network's representation of conditional probabilities is more robust to missing data than alternatives such as artificial neural networks or logistic regression. Although the math can be complicated, the resulting probabilities make sense, as can be seen in Table 4 of the current study [3].
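To make the combination of conditional probabilities concrete, the following is a minimal sketch of a three-predictor network written in plain Python. It is not the network from the current study: the node names (high_grade, older, metastasis), the structure (tumor grade influences metastasis; metastasis and age influence death), and every probability in the tables are hypothetical values chosen for illustration.

```python
# Hypothetical structure: high_grade -> metastasis -> death <- older

# P(metastasis = True | high_grade)
P_METASTASIS = {True: 0.40, False: 0.10}

# P(death within 1 year = True | metastasis, older)
P_DEATH = {
    (True, True): 0.60,
    (True, False): 0.45,
    (False, True): 0.20,
    (False, False): 0.10,
}


def p_death_1yr(high_grade, older, metastasis=None):
    """Estimated probability of death at 1 year for one patient.

    If metastasis status is missing (None), it is not imputed; the
    calculation instead averages over both possibilities, weighted by
    their probability given the tumor grade.
    """
    if metastasis is not None:
        # With metastasis observed, grade adds no further information
        # in this particular (assumed) structure.
        return P_DEATH[(metastasis, older)]
    p_met = P_METASTASIS[high_grade]
    return (p_met * P_DEATH[(True, older)]
            + (1.0 - p_met) * P_DEATH[(False, older)])


# Younger patient, high-grade tumor, metastasis status unknown.
print(f"{p_death_1yr(high_grade=True, older=False):.2f}")
# Same patient once metastasis has been confirmed.
print(f"{p_death_1yr(high_grade=True, older=False, metastasis=True):.2f}")
```

In this sketch, tumor grade affects the outcome only indirectly, through its association with metastasis, and a prediction is still available when the metastasis status has not been recorded. Real Bayesian belief network software handles larger structures and learns the tables from data, but the underlying calculation is of this kind.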
Indeed, other tools may outperform the Bayesian belief network, but it has the advantage that it can derive and display relationships between predictors. Logistic regressions do not allow for that sort of complexity, and artificial neural networks produce relatively opaque models without easily interpreted coefficients. Another advantage concerns missing data: with logistic regression or neural networks, a missing input is either left absent or has to be filled in by an imputed value, whereas the Bayesian belief network can still estimate a probability from the predictors that are available.
Where Do We Need To Go?
Although I agree with Nandra and colleagues [3] that an individual clinician may be inaccurate, I'm not sure I would defer prognosis to an algorithm. Statistical expertise, no matter how sophisticated, cannot and should not substitute for clinical knowledge. However, the Bayesian belief network (or any good predictive model) would provide an objective reference point that can be used as part of the information available to the clinician facing a patient in the office. Similarly, a model generated by a naïve algorithm could suggest interesting and testable relationships between predictors affecting outcome.
Parallel to the growth of “Big Data” and the spread of electronic health records has been the development of potential models to analyze data. As of this writing, there are 10,391 individual packages available for the statistical platform R, including four packages for generating Bayesian networks. It may be that a definitive model is not to be expected; as the British statistician George Box once famously observed, “all models are wrong but some are useful.” However, available statistical tools still allow findings that can impact clinical practice, and it strengthens a study to show that a result persists across multiple analysis approaches.
How Do We Get There?
The search for improved predictive models is ongoing, and such models can serve a wide range of needs. Depending on the situation, the clinician may want a model that accurately assesses the risk of an outcome, or that classifies a patient definitively as having a particular diagnosis. Hospitals and insurers want to predict effects in patient populations: how will demand for services change, or how many complications should a surgeon see based on a case mix? Predicting the next advance in predictive models is ambitious. Also, most if not all of these models must extrapolate from known history, and so cannot account for an unforeseeable event such as a new medical breakthrough or the emergence of a new pathogen. In the near term, the obligation remains with clinical researchers to ensure that they are selecting the most appropriate models for their particular research question, and that the tools are being applied and interpreted properly.
1. Forsberg JA, Eberhardt J, Boland PJ, Wedin R, Healey JH. Estimating survival in patients with operable skeletal metastases: an application of a Bayesian belief network. PLOS One.
2. Forsberg JA, Sjoberg D, Chen Q-R, Vickers A, Healey JH. Treating metastatic disease: which survival model is best suited for the clinic? Clin Orthop Relat Res.
3. Nandra R, Parry M, Forsberg J, Grimer R. Can a Bayesian belief network be used to estimate 1-year survival in patients with bone sarcomas? Clin Orthop Relat Res. [Published online ahead of print April 10, 2017]. DOI: 10.1007/s11999-017-5346-1.