Decision Making

Bayes' Theorem and the Physical Examination: Probability Assessment and Diagnostic Decision Making

Herrle, Scott R., MD, MS; Corbett, Eugene C. Jr., MD; Fagan, Mark J., MD; Moore, Charity G., PhD, MSPH; Elnicki, D. Michael, MD

doi: 10.1097/ACM.0b013e318212eb00

Abstract

Despite the key role that the physical examination occupies in patient care,1–8 the decline in examination skills has been well documented.9–20 In a 1996 commentary, Mangione and Peitzman6 argued that one way to improve examination skills is for “teachers of physical diagnosis [to] separate wheat from chaff and discard signs or maneuvers of little value.” This approach requires examination findings to be viewed as diagnostic tests, each with its own test characteristics.21–25 Fortunately, the recent focus on the principles of evidence-based medicine has led to the determination of these characteristics for a number of findings.21–25 However, the quality of the available data is variable,24,25 and the extent to which educators have incorporated this literature into their teaching is not known.

Bayes' theorem (see the Appendix) allows clinicians to apply the published test characteristics of examination findings to their probability assessment.26–30 When using a Bayesian approach, clinicians develop an initial probability that a patient has a disorder. This probability is then sequentially revised using information obtained from the history, physical examination, and diagnostic testing to arrive at a final probability estimate.

Although the frequency with which clinicians employ a Bayesian approach in their decision making is not known,29 research dating back to the 1970s has highlighted several commonly made mistakes.26,29,31–35 First, clinicians often inaccurately form their initial probability estimates by overestimating the prevalence of rare conditions and underestimating the prevalence of common conditions.26,31,35 Second, clinicians often do not revise their initial probability estimate as much as would be suggested by Bayes' theorem.26,31 This observation may be due to “anchoring,” in which an individual's final probability estimate is highly sensitive to the probability at which he or she starts.26,31 Third, clinicians tend to give more weight to items encountered later on in a patient interaction than to those items encountered earlier on.32,33 Finally, physicians tend to overemphasize the importance of diagnostic testing, perhaps because they believe that diagnostic tests are more accurate than historical items and examination findings.34

Currently, little is known about how clinicians apply examination findings to their probability assessments. In this cross-sectional, multiinstitutional, Web-based survey study, we sought to determine and compare how third- and fourth-year medical students, internal medicine residents, and academic general internists apply examination findings to their probability assessments and to determine the impact that these findings have on the ordering of diagnostic tests.

Method

Study sites and participants

In 2008, our study recruited participants from three U.S. medical schools—the University of Pittsburgh School of Medicine, Alpert Medical School of Brown University, and the University of Virginia School of Medicine—and their affiliated residency programs. The target population was third-year and fourth-year medical students, internal medicine residents, and academic general internists. No training on Bayesian principles or diagnostic test characteristics was provided to any participant prior to study participation. The study was approved by the institutional review boards at each institution.

Survey design and content

The survey was written, reviewed, and edited by the study authors and pilot-tested by seven individuals who were not study participants but who represented the three levels of training included in the study.

The survey began with a series of sociodemographic questions about age, sex, current training or employment status, medical school attended, and, if applicable, date of medical school graduation, name of residency program attended, and date of completion. Next, it asked participants whether they had received formal teaching about the physical examination and about the principles of evidence-based medicine during medical school, residency, or both. Finally, it presented four cases and asked participants questions about condition probability and diagnostic strategy.

We decided to limit the survey to four conditions to avoid overburdening the participants. To provide content validity, we selected the conditions from cases highlighted in the Rational Clinical Examination Series (RCES), a series that has been appearing in JAMA since 1992 and is now available in book form.21,25 To determine which conditions to include, we ranked the conditions according to their perceived relevance and importance and asked five academic general internists to do the same. We then included the four highest-ranked conditions—ascites, heart failure, group A beta-hemolytic streptococcal pharyngitis, and acute anterior cruciate ligament (ACL) tear—in the survey.

Each of the four cases consisted of the following: a brief history, a preexamination probability (pre-EP) that the condition was present based on the prevalence of the condition and information contained in the history, two separate examination scenarios containing information about whether specific findings were present or absent, and questions about the participant's estimated postexamination probability (post-EP) and choice of diagnostic strategy, given the findings. The key historical items and exam findings included in each case are detailed in Table 1. Participants were instructed to assume that the pre-EPs were accurate, that the exam findings were based on examinations conducted by competent clinicians, and that there were no barriers to performing any tests if so needed to arrive at a diagnosis. Pre-EPs were provided because we wanted to start each participant at the same probability, as clinicians vary widely in their initial assessments of condition probability.35

Table 1: Description of Case Histories and Associated Examination Scenarios Presented in a Survey of 684 Medical Students, Internal Medicine Residents, and Academic General Internists at Three U.S. Medical Schools, 2008

For each of the four conditions, the survey presented two examination scenarios: one consisting of an examination of positive findings, making the condition more likely, and a second examination consisting largely of negative findings, making the condition less likely. For each scenario, the survey asked participants to provide a post-EP, expressed in terms of a whole number percentage ranging from 0 to 100. It also asked participants to select one of three diagnostic options: tell the patient that he or she has the condition in question, tell the patient that further testing is required for diagnosis, or tell the patient that he or she does not have the condition in question. Participants were not asked to provide specific treatment regimens or to choose specific diagnostic tests.

We neither encouraged nor discouraged participants from using outside resources. An unlimited amount of time was given to complete the survey, although, once started, the survey had to be completed in a single sitting.

Primary outcomes and sample size calculations

The primary outcome measures were (1) participants' mean post-EPs using inverse transformed logit (ITL) values, (2) comparison of participants' ITL mean post-EPs with corresponding literature-derived post-EPs, and (3) diagnostic option selected by participants for each of the eight scenarios.

We determined a priori to sample sufficient numbers of participants to allow for meaningful comparison across groups. For sample size calculations, we estimated response rates of 70% for the student and resident groups and 50% for the faculty group. For calculations involving post-EPs, we determined that we would have 90% power to detect an effect size of 0.15 (defined as the average group deviation divided by the within-group standard deviation) for comparing mean post-EPs across the three groups using an analysis of variance (ANOVA) with an α of .05 and an average sample size of 195 per group.36 For calculations involving diagnostic strategy, we determined that a total sample size of 590 (175 students, 290 residents, and 125 faculty) would achieve 90% power to detect an effect size of 0.16 using the chi-square test of independence.36

Study recruitment

After we identified 906 eligible individuals, we sent each an e-mail invitation to complete the survey during an eight-week period in the fall of 2008. To increase the response rate, we sent weekly e-mail reminders. We also indicated that we would give a $15 gift card to students and residents who completed the survey and would enter all participants into a drawing for three prizes.

Analysis of data

We used descriptive statistics to characterize participants in terms of sociodemographic data.

Pre-EPs and literature-derived post-EPs were calculated by using data obtained from the RCES.25,37–40 Calculation of pre-EPs began with the authors' estimate of condition prevalence. This initial probability estimate was then sequentially revised for each historical item using published likelihood ratios in a step-by-step manner by making the posterior probability of the first item the initial probability of the second item, and so on. A similar approach was followed to calculate literature-derived post-EP point estimates and their associated 95% confidence intervals by using the pre-EP as the initial probability then adding in the examination findings. To account for the possibility that exam findings with similar pathophysiologic mechanisms may not be conditionally independent, we also calculated “adjusted” literature-derived post-EPs as follows: For groups of findings with a theoretically similar pathophysiologic basis (i.e., bulging flanks, shifting dullness, and fluid wave), we used only the finding with the most extreme likelihood ratio and ignored the other findings. Table 2 details the calculation of unadjusted and adjusted literature-derived post-EPs.
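The step-by-step revision described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names are invented here, and the pre-EP and likelihood ratios are hypothetical placeholders rather than the values in Table 2. The "adjusted" variant assumes that "most extreme" means farthest from 1 on the log scale.

```python
from math import log

def prob_to_odds(p):
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)

def odds_to_prob(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)

def sequential_update(prior_prob, likelihood_ratios):
    """Revise a probability one finding at a time.

    The posterior probability after each finding becomes the
    prior probability for the next finding.
    """
    prob = prior_prob
    for lr in likelihood_ratios:
        prob = odds_to_prob(prob_to_odds(prob) * lr)
    return prob

def most_extreme_lr(lrs):
    """Return the LR farthest from 1 on the log scale (assumed definition)."""
    return max(lrs, key=lambda lr: abs(log(lr)))

# Hypothetical inputs, NOT taken from Table 2.
pre_ep = 0.40
related_findings = [2.0, 3.0, 5.0]   # findings assumed to share a mechanism
other_findings = [1.5]               # finding assumed to be independent

# Unadjusted: treat every finding as conditionally independent.
unadjusted = sequential_update(pre_ep, related_findings + other_findings)

# Adjusted: keep only the most extreme LR from the related group.
adjusted = sequential_update(
    pre_ep, [most_extreme_lr(related_findings)] + other_findings
)

print(f"unadjusted post-EP = {unadjusted:.3f}, adjusted post-EP = {adjusted:.3f}")
```

With these placeholder inputs, the adjusted estimate sits closer to the pre-EP than the unadjusted one, which is the conservative behavior the adjustment is meant to produce.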

Table 2: Calculation of Literature-Derived Postexam Probabilities for Four Scenarios Presented as Part of a Survey of 684 Medical Students, Internal Medicine Residents, and Academic General Internists at Three U.S. Medical Schools, 2008

For each of the eight exam scenarios, we calculated the participants' ITL mean post-EP for the total sample and for each group. ITL values were used instead of raw responses in order to account for the fact that the probability scale behaves differently at the extremes than it does in the middle range. We converted the participants' raw responses to logit values using the following formula: ln(p / (1 − p)), where ln = natural logarithm and p = probability value. After calculating means and 95% confidence intervals for the logit values, we then converted these values back to traditional probabilities using the following formula: e^x / (1 + e^x), where e = the base of the natural logarithm and x = the logit value, to arrive at the ITL values. We then used ANOVA and t tests to determine the degree of concordance among the study groups and between the study-derived values and the literature-derived values. When overall differences were detected between groups, we used the Scheffé method to test for pairwise differences.
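As a rough illustration of the ITL calculation, the sketch below averages responses on the logit scale and converts the mean back to a probability. The function names and the sample responses are hypothetical. The clamping of 0% and 100% responses (the `eps` parameter) is an assumption added here so that the logit is defined; the text does not say how such responses were handled.

```python
from math import exp, log
from statistics import mean

def logit(p):
    """Logit transform: ln(p / (1 - p)), defined for 0 < p < 1."""
    return log(p / (1.0 - p))

def inverse_logit(x):
    """Inverse logit: e^x / (1 + e^x), mapping a logit back to a probability."""
    return exp(x) / (1.0 + exp(x))

def itl_mean(probabilities, eps=0.005):
    """Inverse-transformed logit (ITL) mean of a list of probabilities.

    Responses of exactly 0 or 1 are clamped to eps and 1 - eps.
    This clamping is an assumption, not a step described in the study.
    """
    clamped = [min(max(p, eps), 1.0 - eps) for p in probabilities]
    return inverse_logit(mean(logit(p) for p in clamped))

# Hypothetical post-EP responses, expressed as proportions.
responses = [0.50, 0.80, 0.95]
print(f"raw mean = {mean(responses):.3f}, ITL mean = {itl_mean(responses):.3f}")
```

Because the logit stretches the scale near 0 and 1, the ITL mean generally differs from the raw arithmetic mean whenever responses are spread asymmetrically around 50%.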

For the diagnostic strategy outcome, we calculated the frequencies of options chosen for each scenario by the total sample and each group. We used chi-square tests, Fisher exact tests, and 3 × 3 contingency tables to compare distributions of choices and tests of marginal homogeneity to analyze changes in the choice of diagnosis option made by individuals responding to the two scenarios for each case (within-person changes).

All statistical analyses were performed using STATA 10.0 (StataCorp, College Station, Texas).

Results

Response rates and characteristics of participants

Of 906 individuals invited to participate, 684 (75%) completed the survey. The total sample of participants consisted of 255 students, 264 residents, and 165 academic general internists. The response rate was highest among students (80%) and lowest among faculty (72%). Each institution had at least a 65% response rate.

As shown in Table 3, participation in a formal physical examination curriculum was nearly universal during medical school (91% of total sample) but less common during residency (24%). Training in evidence-based medicine was also more common during medical school (70% of total sample) than during residency (59%), but the faculty group was less likely than the other groups to have had this training.

Table 3: Response Rates and Sociodemographic Characteristics of 684 Medical Students, Internal Medicine Residents, and Academic General Internists at Three U.S. Medical Schools Who Responded to the Survey, 2008

Conditional probabilities

Table 4 provides a summary of the conditional probabilities for each case. For all eight examination scenarios, the participants' ITL mean post-EPs were significantly different from the unadjusted literature-derived post-EP point estimates (P < .001 for each). When comparing the participants' ITL mean post-EPs with corresponding adjusted literature-derived post-EP values, comparisons in five of eight scenarios revealed significant differences (P < .001 for each).

Table 4: Conditional Probabilities for Cases and Associated Examination Scenarios Presented as Part of a Survey of 684 Medical Students, Internal Medicine Residents, and Academic General Internists at Three U.S. Medical Schools, 2008

In all four scenarios consisting of positive findings, the participants' ITL mean post-EPs were significantly lower than the unadjusted literature-derived post-EP point estimates (P < .001 for each). When comparing the participants' ITL mean post-EPs with corresponding adjusted literature-derived post-EP values, comparisons in two of four scenarios were significantly lower (P < .001 for each).

In all four scenarios consisting mainly of negative findings, the participants' ITL mean post-EPs were significantly higher than the unadjusted literature-derived post-EP point estimates (P < .001 for each). When comparing the participants' ITL mean post-EPs with corresponding adjusted literature-derived post-EP values, comparisons in three of four scenarios were significantly higher (P < .001 for each).

Findings for all eight scenarios were consistent across groups, with only small differences seen (the mean absolute difference in post-EP estimates between groups was 2.6%).

Diagnostic strategy options

Table 5 shows the diagnostic options selected for each scenario by the three groups and the total sample. In four of the eight scenarios (positive and negative scenarios for ascites, and positive scenarios for streptococcal pharyngitis and acute ACL tear), there were significant differences between groups in terms of the frequencies at which each diagnostic option was chosen. However, in all eight scenarios, the relative ordering of diagnostic options (i.e., from most frequently chosen to least frequently chosen) was consistent for all groups.

Table 5: Diagnostic Option Selected by Participants for Each of Four Examination Scenarios Presented as Part of a Survey of 684 Medical Students, Internal Medicine Residents, and Academic General Internists at Three U.S. Medical Schools, 2008

In the four scenarios with positive findings, most participants (range, 62%–83%) chose to tell the patient that he or she had the condition and treat accordingly. Interestingly, 17% to 38% of participants ordered additional testing even when the literature indicated a >85% probability that the condition was present. In the four scenarios with largely negative findings, the majority of participants (range, 70%–85%) chose to order diagnostic tests to further refine diagnostic uncertainty.

For all four conditions, a significant proportion of participants (P < .001) changed their diagnostic decision when the scenario changed from positive to negative findings: 82% regarding ascites, 85% regarding heart failure, 67% regarding streptococcal pharyngitis, and 50% regarding an acute ACL tear. For each condition, most participants changed from telling the patient that he or she had the condition in the scenario with positive findings to ordering tests to confirm the diagnosis in the scenario with largely negative findings.

Discussion

In this multiinstitutional study examining how medical students, residents, and faculty estimate conditional probabilities and choose diagnostic options for four commonly encountered conditions, we found that these groups tended to similarly undervalue physical examination findings and that they tended to undervalue negative findings to an even greater extent than they undervalued positive findings.

There are several possible explanations for our findings. First, these results may in fact represent a true undervaluing of the physical examination. This could support the previous finding that physicians value diagnostic testing more than findings from the history and physical examination regardless of their diagnostic test characteristics.34 Second, these results may reflect the phenomenon of anchoring that was discussed previously.26,31 Third, although there is an increasing body of evidence available to help clinicians clarify the value of examination findings for a number of common conditions, it is not clear how widely this body of evidence is being used by clinicians. As a result, our findings could reflect that clinicians are unaware of this evidence and, hence, do not apply it to their decision making. Finally, although the instructions for our study indicated that all examinations were performed by competent clinicians, the participants may have lacked confidence in their own ability to perform physical examinations and may have factored in their own skills when they selected a post-EP or diagnostic option.

Interestingly, we observed only small and clinically insignificant differences between estimates of post-EPs provided by the students, residents, and faculty in our study. This suggests that physicians with more experience in performing examinations do not assign a greater value to examination findings than do trainees. This may also reflect the effect of faculty modeling on residents and students, leading each group to think and perform similarly and to undervalue findings. In the process, it may hinder the teaching of examination skills beyond the basic level.

We also found that large numbers of participants chose to order additional diagnostic testing even when available data suggest that the estimated condition probability is low. Although we did not ask participants to provide the threshold probabilities above and below which they would accept that no further workup is required, our results help to illustrate that physicians differ significantly in their comfort levels in dealing with uncertainty. Unfortunately, there is not a consensus amongst physicians about what threshold probabilities should be used for “ruling in” and “ruling out” individual disorders. This is not unexpected, because thresholds could be expected to vary among conditions of different severity and impact.

Responses to the questions about training in our study indicate that although the formal teaching of examination skills is nearly universal in medical schools, it is much less common in residency programs. With most trainees lacking exposure to a formal curriculum during residency, it is not surprising that a widespread decline in clinical skills has been reported.9–20

Although our study had excellent response rates, it had several limitations. First, although it focused on four common conditions with high face validity (i.e., ascites, heart failure, streptococcal pharyngitis, and acute ACL tear), its findings may not be generalizable to other conditions. Second, some might argue that the use of the condition probability outcome is artificial because most physicians probably do not explicitly calculate condition probabilities for their patients. We attempted to address this concern by also including a more clinically important outcome—that is, choice of diagnostic strategy option. Third, in calculating probabilities, we used published likelihood ratios, but these ratios are of varying quality and precision, and some are based on research conducted many years ago when the gold standards for diagnosing conditions may have differed from those used today. Also, these published likelihood ratios were developed in specific patient populations and may not be generalizable to other populations. Fourth, when calculating probabilities, we assumed that each item was conditionally independent from the others used. To address the concern that this may not be true, we calculated adjusted probabilities that were more conservative and made the magnitude of our results smaller. Fifth, because we wanted participants to focus on examination findings rather than history items, we provided pre-EPs for all scenarios. Even though we instructed the participants to view the pre-EPs as accurate, it is possible that some of them ignored this instruction.

Conclusions

In this study, trainees and experienced physicians similarly underestimated the impact of examination findings when estimating condition probabilities and, as a consequence, often chose to order additional diagnostic testing to reduce diagnostic uncertainty. A better understanding of when and how physicians apply examination findings in their assessment of condition probability may provide the foundation for improving the way physicians use these observations in everyday clinical practice. This, in turn, may reduce the unnecessary use of expensive and potentially risky testing in today's increasingly cost-conscious and patient-safety-oriented environment.

Acknowledgments:

The authors thank Rosanne Granieri, MD, and Kevin Kraemer, MD, MSc, for their thoughtful input and support of the study.

Funding/Support:

This study was made possible by Grant Number UL1 RR024153 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH), and NIH Roadmap for Medical Research. Its contents are solely the responsibility of the authors and do not necessarily represent the official view of NCRR or NIH. Additional funding for this study was provided by the Shadyside Hospital Foundation. The Shadyside Hospital Foundation had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.

Other disclosures:

None.

Ethical approval:

The institutional review boards (IRBs) at the University of Pittsburgh School of Medicine and the Alpert Medical School of Brown University gave expedited approval for the study; the IRB at the University of Virginia School of Medicine exempted the study from review.

Previous presentations:

This study was presented in part in poster format in 2009 at the 32nd annual meeting of the Society of General Internal Medicine (SGIM) in Miami, Florida.

References

1. Sandler G. The importance of the history in the medical clinic and the cost of unnecessary tests. Am Heart J. 1980;100(pt 1):928–931.
2. Kern DC, Parrino TA, Korst DR. The lasting value of clinical skills. JAMA. 1985;254:70–76.
3. Sackett DL, Rennie D. The science of the art of the clinical examination. JAMA. 1992;267:2650–2652.
4. Peterson MC, Holbrook JH, Von Hales D, Smith NL, Staker LV. Contributions of the history, physical examination, and laboratory investigation in making medical diagnoses. West J Med. 1992;156:163–165.
5. Kravitz RL, Cope DW, Bhrany V, Leak B. Internal medicine patients' expectations for care during office visits. J Gen Intern Med. 1994;9:75–81.
6. Mangione S, Peitzman SJ. Physical diagnosis in the 1990s: Art or artifact? J Gen Intern Med. 1996;11:490–493.
7. Reilly BM. Physical examination in the care of medical inpatients: An observational study. Lancet. 2003;362:1100–1105.
8. Smith MA, Burton WB, Mackay M. Development, impact, and measurement of enhanced physical diagnosis skills. Adv Health Sci Educ. 2009;14:547–556.
9. Wiener S, Nathanson M. Physical examination: Frequently observed errors. JAMA. 1976;236:852–855.
10. Wray N, Friedland J. Detection and correction of housestaff errors in physical diagnosis. JAMA. 1983;249:1035–1037.
11. Johnson J, Carpenter J. Medical house staff performance in physical examination. Arch Intern Med. 1986;146:937–941.
12. Li J. Assessment of basic physical examination skills of internal medicine residents. Acad Med. 1994;69:296–299. http://journals.lww.com/academicmedicine/Abstract/1994/04000/Assessment_of_basic_physical_examination_skills_of.13.aspx. Accessed February 11, 2011.
13. Paauw DS, Wenrich MD, Curtis JR, Carline JD, Ramsey PG. Ability of primary care physicians to recognize physical findings associated with HIV infection. JAMA. 1995;274:1380–1382.
14. Mangione S, Nieman LZ. Cardiac auscultation skills of internal medicine and family practice trainees. JAMA. 1997;278:717–722.
15. Mangione S, Nieman LZ. Pulmonary auscultatory skills during training in internal medicine and family practice. Am J Respir Crit Care Med. 1999;159:1119–1124.
16. Ozuah PO, Dinkevich E. Physical examination skills of US and international medical graduates. JAMA. 2001;286:1021.
17. Peixoto AJ. Birth, death, and resurrection of the physical examination: Clinical and academic perspectives on bedside diagnosis. Yale J Biol Med. 2001;74:221–228.
18. Jauhar S. The demise of the physical exam. N Engl J Med. 2006;354:548–551.
19. Verghese A. Culture shock—Patient as icon, icon as patient. N Engl J Med. 2008;359:2748–2751.
20. Fred HL, Grais IM. Bedside skills: An exchange between dinosaurs. Tex Heart Inst J. 2010;37:205–207.
21. Sackett DL. A primer on the precision and accuracy of the clinical examination. JAMA. 1992;267:2638–2644.
22. Holleman DR, Simel DL. Quantitative assessments from the clinical examination: How should clinicians integrate the numerous results? J Gen Intern Med. 1997;12:165–171.
23. Hatala R, Smieja M, Kane SL, Cook DJ, Meade MO, Nishikawa J. An evidence-based approach to the clinical examination. J Gen Intern Med. 1997;12:182–187.
24. McGee S. Evidence-Based Physical Diagnosis. 2nd ed. St. Louis, Mo: Saunders Elsevier; 2007.
25. Simel D, Rennie D. The Rational Clinical Examination: Evidence-Based Clinical Diagnosis. New York, NY: McGraw-Hill; 2008.
26. Elstein AS. Heuristics and biases: Selected errors in clinical reasoning. Acad Med. 1999;74:791–794. http://journals.lww.com/academicmedicine/Abstract/1999/07000/Heuristics_and_biases__selected_errors_in_clinical.12.aspx. Accessed February 11, 2011.
27. Goodman SN. Toward evidence-based medicine statistics. 2: The Bayes factor. Ann Intern Med. 1999;130:1005–1013.
28. Lurie JD, Sox HC. Principles of medical decision making. Spine. 1999;24:493–498.
29. Elstein AS, Schwartz A. Evidence base of clinical diagnosis: Clinical problem solving and diagnostic decision making: Selective review of the cognitive literature. BMJ. 2002;324:729–732.
30. Gill CJ, Sabin L, Schmid CH. Why clinicians are natural bayesians. BMJ. 2005;330:1080–1083.
31. Tversky A, Kahneman D. Judgment under uncertainty: Heuristics and biases. Science. 1974;185:1124–1131.
32. Bergus GR, Chapman GB, Gjerde C, Elstein AS. Clinical reasoning about new symptoms in the face of pre-existing disease: Sources of error and order effects. Fam Med. 1995;27:314–320.
33. Chapman GB, Bergus GR, Elstein AS. Order of information affects clinical judgment. J Behav Decis Making. 1996;9:201–211.
34. Halkin A, Reichman J, Schwaber M, Paltiel O, Brezis M. Likelihood ratios: Getting diagnostic testing into perspective. Q J Med. 1998;91:247–258.
35. Phelps MA, Levitt A. Pretest probability estimates: A pitfall to the clinical utility of evidence-based medicine? Acad Emerg Med. 2004;11:692–694.
36. NCSS, PASS, and GESS [software]. www.ncss.com. Accessed January 20, 2011.
37. Williams JW Jr, Simel DL. The rational clinical examination. Does this patient have ascites? How to divine fluid in the abdomen. JAMA. 1992;267:2645–2648.
38. Wang CS, FitzGerald JM, Schulzer M, Mak E, Ayas NT. Does this dyspneic patient in the emergency department have congestive heart failure? JAMA. 2005;294:1944–1956.
39. Ebell MH, Smith MA, Barry HC, Ives K, Carey M. Does this patient have strep throat? JAMA. 2000;284:2912–2918.
40. Solomon DH, Simel DL, Bates DW, Katz JN, Schaffer JL. Does this patient have a torn meniscus or ligament of the knee? Value of the physical examination. JAMA. 2001;286:1610–1620.

References cited only in the Appendix

41. Jaynes ET. Probability Theory: The Logic of Science. 3rd ed. New York, NY: Cambridge University Press; 2003.
42. Straus SE, Richardson WS, Glasziou P, Haynes RB. Evidence-Based Medicine. 3rd ed. New York, NY: Churchill Livingstone; 2005.

Appendix

Bayes' Theorem

Bayes' theorem was first developed by Thomas Bayes, an 18th-century English minister and amateur mathematician.30 When employing a Bayesian approach to probability assessment, one starts with an initial probability estimate that is based on one's knowledge of disease prevalence or from one's previous experiences.35 This initial probability estimate, termed the prior probability, is then sequentially modified on the basis of each piece of additional evidence encountered to form new probabilities, termed posterior probabilities.22,30

In mathematical terms, Bayes' theorem can be stated as follows:

P(x|A) = P(A|x) × P(x) / P(A)

In this theorem,

P(x) = the probability of condition x being present.

P(A) = the probability of A being present.

P(x|A) = the probability of condition x being present given the presence of A.

P(A|x) = the probability of A being present given the presence of condition x.

Thus, in this formula,41 P(x|A) is the posterior probability and P(x) is the prior probability.

Bayes' theorem is most commonly expressed using likelihood ratios (LRs). Fortunately, the LRs for many physical examination findings are now available, although the data are of varying quality.24,25 An LR is the likelihood that a given finding would be expected in a patient with a particular disorder, P(A|x), compared with the likelihood that the same finding would be expected in a patient without that disorder, P(A|no x).30,34,42 An LR > 1 for a finding means that the condition is more likely given the finding and results in a posterior probability that is greater than the prior probability. An LR = 1 for a finding does not change the probability of the condition being present and results in a posterior probability that is the same as the prior probability. An LR < 1 for a finding means that the condition is less likely given the finding and results in a posterior probability that is less than the prior probability.24,25,34,40,42

When using Bayes' theorem with likelihood ratios, one must convert from probabilities to odds (probability/(1 − probability)) and then back to probabilities.34,42 One starts by calculating the prior odds from the prior probability by using the following formula: prior odds = prior probability/(1 − prior probability).42 Next, the posterior odds are calculated using the LR for the finding and the prior odds as follows: posterior odds = LR * prior odds.34 In this formula, the LR represents the weight of new evidence encountered. The posterior odds can then be converted back to a probability as follows: posterior probability = posterior odds/(posterior odds + 1).34,42 The task is made easier by use of one of many published nomograms.34,42
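The odds form used here follows from Bayes' theorem itself. The short derivation below is a sketch in the notation of this Appendix, with "no x" denoting absence of the condition. Applying the theorem once to x and once to no x, and then dividing, cancels the common term P(A):

```latex
\begin{align*}
P(x \mid A) &= \frac{P(A \mid x)\,P(x)}{P(A)}, \qquad
P(\text{no } x \mid A) = \frac{P(A \mid \text{no } x)\,P(\text{no } x)}{P(A)} \\[4pt]
\frac{P(x \mid A)}{P(\text{no } x \mid A)}
  &= \frac{P(A \mid x)}{P(A \mid \text{no } x)} \times \frac{P(x)}{P(\text{no } x)} \\[4pt]
\text{posterior odds} &= \text{LR} \times \text{prior odds}
\end{align*}
```

Because P(no x) = 1 − P(x), the last factor is the prior odds defined above, and the left-hand side is the posterior odds.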

As an example, suppose that you are seeing a 24-year-old man who complains of two days of right knee pain that began while he was playing football. He reports hearing a “pop” after being hit by another player. He feels like the knee is going to buckle. Based on the information that you obtain as part of the history and your clinical experience, your probability estimate for him having suffered an anterior cruciate ligament (ACL) tear is 50%. Next, you examine the knee. As part of your knee examination, you perform a Lachman test, which you find to be positive. You recall that a positive Lachman test has a likelihood ratio of 42 (95% CI, 2.7–651).40

First, you determine the prior odds:

Prior odds of ACL tear = prior probability/(1 − prior probability) = 0.50/(1 − 0.50) = 1

Then, you calculate the posterior odds using the Bayes' theorem formula that includes likelihood ratios:

Posterior odds of ACL tear = LR * prior odds = 42 * 1 = 42

Finally, you convert the posterior odds to a posterior probability:

Posterior probability of ACL tear = posterior odds/(1+posterior odds) = 42/(1+42) = 0.977

Thus, you determine that the posterior probability that this patient has an ACL tear given a prior probability of 50% and the presence of a positive Lachman test is approximately 98%.
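The three steps of this example can be reproduced with a short Python sketch. The function name is invented for illustration. Passing the two confidence limits of the likelihood ratio (2.7 and 651) through the same formula is an addition made here, not a calculation from the example itself, and it assumes that the 50% prior probability is exact.

```python
def posterior_probability(prior_prob, lr):
    """Apply one likelihood ratio to a prior probability via the odds form."""
    prior_odds = prior_prob / (1.0 - prior_prob)      # probability -> odds
    posterior_odds = lr * prior_odds                  # weight the new evidence
    return posterior_odds / (1.0 + posterior_odds)    # odds -> probability

prior = 0.50  # prior probability of ACL tear from the history

point = posterior_probability(prior, 42)    # LR point estimate for a positive Lachman test
low = posterior_probability(prior, 2.7)     # lower 95% confidence limit of the LR
high = posterior_probability(prior, 651)    # upper 95% confidence limit of the LR

print(f"{point:.3f} ({low:.3f} to {high:.3f})")  # prints "0.977 (0.730 to 0.998)"
```

The wide interval around the likelihood ratio translates into a posterior probability anywhere from about 73% to nearly 100%, which shows why the precision of published likelihood ratios matters.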

© 2011 Association of American Medical Colleges