Consider the case of a 55-year-old patient presenting to your clinic with isolated medial compartment pain and radiographic findings of medial compartment osteoarthritis (OA). On physical examination, the patient has varus alignment, a passively correctable deformity, and a clinically intact anterior cruciate ligament. The two existing trials [21, 23] that directly compare the effectiveness of high tibial osteotomy (HTO) with unicompartmental knee arthroplasty (UKA), two treatment options available to these patients, provide weak evidence for the superiority of one or the other surgical technique. As a result, there are at least two groups of surgeons: those who believe these patients should be treated with HTO and those who believe UKA is the best option. How would you treat this patient? What research evidence supports your decision?
Randomized controlled trials (RCTs) are recognized as the most valid method of evaluating the effectiveness of interventions. In the conventional RCT design, patients are randomized to treatment groups and surgeons perform both interventions depending on the treatment arm to which the patient was assigned (Fig. 1A). Although surgeons who wish to participate must be willing to perform both techniques, lack of expertise or belief in one of the interventions under evaluation may undermine the validity and applicability of the results. It also may create an ethical challenge for the participating surgeon if that surgeon believes one treatment is likely better, presenting a barrier to recruitment [18, 22]. This may explain in part the resistance of surgeons to fully buy in to trial participation.
An alternative RCT design, originally coined the nonrandomized surgeon design and more recently referred to as the expertise-based RCT, may offer a practical solution to these barriers. In this type of trial, a surgeon with expertise in one of the procedures being evaluated is paired with a surgeon with expertise in the other procedure (ideally at the same institution). Subjects then are randomized to a surgeon, who performs only one of the interventions (ie, the procedure that he or she has expertise and belief in) throughout the course of the trial (Fig. 1B).
The expertise-based design has several potential advantages over the conventional RCT. For example, it may decrease the likelihood of procedural crossovers and enhance validity because, unlike the conventional RCT, there is a low likelihood of differential expertise bias. Several authors have explored the advantages and limitations of the expertise-based RCT design [4, 18, 22].
Despite the empiric evidence and theoretical arguments favoring the expertise-based RCT over the conventional RCT, if surgeons do not also prefer this design, then using the expertise-based design in future trials may present an even bigger barrier to recruitment than currently is encountered with the conventional RCT design. We assessed orthopaedic surgeons' willingness and preference for participating in an expertise-based versus a conventional RCT. We then explored the relationship between surgeon expertise, RCT recruiting experience and research education, and surgeon willingness and preference for participating in an expertise-based versus a conventional RCT. We also asked surgeons to comment on barriers to RCT participation in general.
Materials and Methods
We performed a cross-sectional survey of the 767 surgeons on the 2005 Canadian Orthopaedic Association members' list. Three hundred thirty-one surgeons did not respond to the survey, 129 declined participation, and 31 could not be contacted (no e-mail, fax, or telephone contact information was available) or were not eligible (participated in the pilot phase of the survey). Two hundred seventy-six surgeons responded, giving a response rate of 37.5%. The Research Ethics Board at the University of Western Ontario approved this study.
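The reported response rate can be reproduced from the counts above. The short sketch below assumes the denominator excludes the 31 surgeons who could not be contacted or were not eligible; the variable names are illustrative only.

```python
# Counts as reported in the text.
total_on_list = 767
no_response = 331
declined = 129
unreachable_or_ineligible = 31
responded = 276

# The four groups should account for everyone on the members' list.
assert no_response + declined + unreachable_or_ineligible + responded == total_on_list

# Assumption: surgeons who could not be contacted or were ineligible
# are removed from the denominator.
eligible = total_on_list - unreachable_or_ineligible
response_rate = 100 * responded / eligible

print(f"{responded}/{eligible} = {response_rate:.1f}%")
```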
We created our survey using Survey Monkey (©2005 SurveyMonkey.com), which allowed electronic self-administration and facilitated data collection and followup by preventing multiple entries from the same individual. To ensure proper functioning of images and links, and to check for clarity and content validity of our questionnaire, we pilot-tested the survey on 4 local orthopaedic surgeons and 5 administrative staff and incorporated their comments and suggestions into the final version of the survey before launching the study.
Surgeons with an active e-mail address were sent an information letter, which outlined the purpose of our study and identified the investigators. The e-mail contained a link to the survey. Surgeons without electronic access were faxed a hard copy of the survey (Appendix 1). Surgeons were informed the survey would take approximately 10 minutes to complete. They also were informed individual responses would be identified in our database by a unique identification number accessible only to the investigator. Surgeons were informed survey completion was voluntary and were given the option to decline with no consequence. No incentives were offered for survey completion. Surgeons who expressed no desire to complete the survey, by indicating their choice in a return e-mail or fax, were removed from the contact list and not contacted further. Consent was implied if surgeons completed and submitted the survey.
Followup e-mails were sent 2 weeks apart, followed by personal telephone calls and faxed hard copies of the survey to encourage participation. Each new e-mail wave was sent to the nonresponders of the previous wave for a total of six e-mail waves and one fax wave. Among the 201 respondents included in the analysis, the number (and percentage) responding in each wave was: Wave 1: n = 32 (15.9%); Wave 2: n = 17 (8.5%); Wave 3: n = 18 (9.0%); Wave 4: n = 22 (10.9%); Wave 5: n = 22 (10.9%); Wave 6: n = 52 (25.9%); fax wave: n = 38 (18.9%).
We collected data between August and November 2005. The electronic survey was administered in a predetermined sequence over eight pages, with the number of questions ranging from one to six per page. The faxed survey was six pages long, with the number of questions ranging from one to seven items per page. We used adaptive questioning to ensure participants answered only the questions that pertained to them. The electronic version of the survey was designed not to allow blank responses to ensure all questions were answered. Participants were given the option to go back to review and/or change their responses during the electronic survey; the final response submitted was used in the analysis. Owing to the nature of the faxed survey, we were not able to control for completeness and review of responses in the faxed wave.
We obtained information regarding surgeons' type of practice, whether their clinical practice includes patients with OA, their subspecialty(ies), gender, research experience, research education, hospital affiliation (including city and province), and the number of years since completion of orthopaedic training. Surgeons who indicated their practice involves patients with medial compartment OA were asked to indicate the number of UKAs and HTOs they had performed throughout their career and during the previous year, and their opinion regarding the relative superiority of HTO versus UKA for treatment of medial compartment OA in middle-aged patients.
Next, we provided surgeons with an unbiased description of the conventional and expertise-based RCT designs and included a diagram to aid comprehension of each design (Fig. 1). Surgeons were asked to indicate their willingness to participate in a conventional RCT and in an expertise-based RCT to compare outcomes of patients with medial compartment OA who underwent either an HTO or UKA. Finally, we asked surgeons to specify in which of the two types of RCT design they would prefer to participate. The questionnaire framed response options as Likert-type scales (seven choices per question) with open-ended options to offer surgeons the opportunity to provide a more detailed explanation for opinions and preferences, if desired.
To assess the likelihood of response bias (ie, the extent that responses in successive survey waves [sampling from nonresponders to prior waves] are similar to those in the first wave), we created forest plots (Fig. 2) to provide an illustration of the proportion of surgeons in successive waves who prefer the expertise-based RCT and the proportion of surgeons willing to participate in expertise-based RCTs. We used the Cochran chi square test to formally assess whether responses were similar and consistent across survey waves (ie, test for heterogeneity). If response patterns across survey waves are consistent (ie, those who required several reminders before responding are similar to those who responded immediately), we can be more confident survey responders are representative of nonresponders. If response patterns are inconsistent across waves, then we would be uncertain as to the likely similarity between responders and nonresponders [14, 15]. The test for heterogeneity on our data regarding surgeons' preference for (p = 0.21) and willingness to (p = 0.34) participate in a conventional or expertise-based RCT design indicated similar response patterns across e-mail and fax waves.
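To illustrate the idea of testing for consistency across waves, the sketch below runs a Pearson chi square test of homogeneity on a waves-by-response table. This is a simpler stand-in for the heterogeneity test named above, not a reproduction of our analysis. The wave sizes are those reported for the seven waves, but the per-wave counts of surgeons preferring the expertise-based design are hypothetical, as are the function names.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi square variable, computed from the
    series expansion of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= h / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-h + a * math.log(h) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def homogeneity_test(successes, totals):
    """Pearson chi square test that the proportion of 'successes'
    is the same in every wave."""
    pooled = sum(successes) / sum(totals)
    stat = 0.0
    for s, n in zip(successes, totals):
        expected_yes = n * pooled
        expected_no = n * (1 - pooled)
        stat += (s - expected_yes) ** 2 / expected_yes
        stat += ((n - s) - expected_no) ** 2 / expected_no
    df = len(totals) - 1
    return stat, df, chi2_sf(stat, df)

# Wave sizes as reported (six e-mail waves, then the fax wave).
wave_sizes = [32, 17, 18, 22, 22, 52, 38]
# HYPOTHETICAL counts preferring the expertise-based design in each wave.
prefer_expertise = [18, 9, 11, 12, 14, 30, 20]

stat, df, p = homogeneity_test(prefer_expertise, wave_sizes)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p:.3f}")
```

A large p value in such a test is compatible with similar response patterns across waves; it does not prove that nonresponders would have answered the same way.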
Our analysis of the survey results included only surgeons whose current practice included patients with knee OA (n = 201). In general, we summarized responses for categorical and dichotomous variables using proportions and conducted a chi square test to make comparisons between groups. A p value < .05 was considered significant. Missing data were excluded from the analysis. All analyses were calculated using SPSS 11.0 (SPSS Inc, Chicago, IL). Tables and figures were generated using Microsoft® Office Excel (Microsoft Inc, Redmond, WA).
Results
Two hundred seventy-six surgeons responded to our survey. Almost two-thirds of eligible respondents (63.8% [n = 166]) were from Ontario, British Columbia, and Alberta; most were male (83.8% [n = 218]). Seventy-three percent of surgeons [n = 201] saw patients with knee OA in their clinical practice, and nearly all worked in a full-time practice (95.5% [n = 192]) at the time of our survey. In this group, 63 surgeons (31.3%) had greater than 20 years clinical experience and another 60 surgeons (29.8%) had practiced for 10 to 20 years. Surgeons were able to select more than one subspecialty and the most common areas were arthroplasty (54.7% [n = 110]), sports injury (34.8% [n = 70]), and trauma (24.4% [n = 24]). One hundred eighteen surgeons (58.7%) had RCT recruiting experience (defined as having participated in recruiting and/or entering patients into RCTs and/or planning and implementing RCTs), whereas only 26 surgeons (12.9%) had received formal research education (defined as having completed a thesis-based MSc and/or PhD) (Table 1).
Of the 198 surgeons who specified the number of UKAs they have performed, 100 (49.8%) performed zero UKAs, 51 (25.4%) performed one to nine, 29 (14.4%) performed 10 to 20, and 18 (9.0%) performed more than 20 in the year before this study (range, 0-250). Seventy-two surgeons (35.8%) indicated they had performed zero UKAs throughout their career, 99 (49.3%) performed one to 50, and 27 (13.4%) performed more than 50 throughout their career (range, 0-1400).
Of the 195 surgeons who specified the number of HTOs they have performed, 101 (50.2%) performed zero procedures, 77 (38.3%) performed one to nine, 15 (7.5%) performed 10 to 20, and two (1.0%) performed more than 20 in the year before this study (range, 0-30). Thirty-four surgeons (16.9%) performed zero HTOs, 129 (64.2%) performed one to 50, and 32 (15.9%) performed more than 50 throughout their career (range, 0-300).
When we arbitrarily defined expertise as having performed 10 or more of the procedure (either HTO or UKA) during the past year, 68.6% (n = 138) of surgeons had no expertise in either intervention, 21.4% (n = 43) had expertise only in performing UKA, 6.5% (n = 13) had expertise only in performing HTO, and 2.0% (n = 4) had expertise in performing both interventions. We were unable to classify three of the surgeons (1.5%) because they did not provide the number of each procedure they had performed during the past year.
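The classification rule described above can be written as a small function. This is a minimal sketch assuming the threshold of 10 procedures in the past year applies to each procedure separately and that a missing count leaves the surgeon unclassified; the function name and category labels are ours, chosen for illustration.

```python
def classify_expertise(uka_last_year, hto_last_year, threshold=10):
    """Assign an expertise category from the number of UKAs and HTOs
    performed during the past year (None means the count was not given)."""
    if uka_last_year is None or hto_last_year is None:
        return "unclassified"
    uka_expert = uka_last_year >= threshold
    hto_expert = hto_last_year >= threshold
    if uka_expert and hto_expert:
        return "both"
    if uka_expert:
        return "UKA only"
    if hto_expert:
        return "HTO only"
    return "neither"

# Category counts as reported in the text; they should cover all 201 surgeons.
reported = {"neither": 138, "UKA only": 43, "HTO only": 13,
            "both": 4, "unclassified": 3}
print(sum(reported.values()))

# Illustrative (made-up) surgeons.
print(classify_expertise(12, 0))
print(classify_expertise(None, 3))
```

Changing `threshold` shows how sensitive the group sizes would be to the definition of expertise, provided the individual procedure counts are available.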
One hundred two surgeons (53.4%) were willing to participate in an expertise-based RCT compared with 35 surgeons who were willing to participate in a conventional RCT (18.3%) (p < 0.001). Ten (5.0%) of 201 surgeons did not respond to this question. Ninety-seven surgeons (52.4%) strongly or moderately preferred the expertise-based RCT design, 25 (13.5%) strongly or moderately preferred the conventional RCT, six (3.2%) mildly preferred the conventional RCT, 10 (5.4%) mildly preferred the expertise-based RCT, and 47 (25.4%) had no preference for either trial design (p < 0.001) (Fig. 3). Sixteen (8.0%) of 201 surgeons did not respond to this question.
Surgeons with expertise in only one intervention were more likely (p < 0.001) to select that particular intervention as markedly or definitely superior. All surgeons with expertise in both procedures had an opinion (lack of individual equipoise) about which procedure was superior (p < 0.001), with three surgeons indicating UKA is probably superior and one surgeon indicating HTO is probably superior (Fig. 4).
Surgeons' level of expertise with an intervention influenced (p = 0.001) the type of RCT they were willing to participate in with 38.4% (n = 73) of surgeons willing to participate in an expertise-based RCT only (49 surgeons with no expertise in either procedure; 18 surgeons with expertise in UKA only; five surgeons with expertise in HTO only; one surgeon with expertise in both procedures) and 3.2% (n = 6) of surgeons willing to participate in a conventional RCT only (four surgeons with no expertise in either procedure; one surgeon with expertise in HTO only; one surgeon with expertise in both procedures). Twenty-nine (15.3%) surgeons were willing to participate in both types of RCTs (12 surgeons with no expertise in either procedure; 13 surgeons with expertise in UKA only; two surgeons with expertise in HTO only; two surgeons with expertise in both procedures). Eighty-two (43.2%) surgeons were not willing to participate in either RCT design (66 surgeons had no expertise in either procedure; 11 surgeons had expertise in UKA only; five surgeons had expertise in HTO only).
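The breakdown above is easier to check when laid out as a cross-tabulation of willingness by expertise. The sketch below tabulates the counts given in the text; cells the text does not mention are assumed to be zero, and the labels are illustrative.

```python
# Rows: willingness category; columns: expertise category.
# Cells not mentioned in the text are ASSUMED to be zero.
columns = ["neither", "UKA only", "HTO only", "both"]
table = {
    "expertise-based only": [49, 18, 5, 1],
    "conventional only":    [4, 0, 1, 1],
    "both designs":         [12, 13, 2, 2],
    "neither design":       [66, 11, 5, 0],
}

row_totals = {name: sum(cells) for name, cells in table.items()}
grand_total = sum(row_totals.values())

for name, total in row_totals.items():
    share = 100 * total / grand_total
    print(f"{name:22s} n = {total:3d} ({share:.1f}%)")

# Marginal willingness for each design, combining the relevant rows.
willing_expertise = row_totals["expertise-based only"] + row_totals["both designs"]
willing_conventional = row_totals["conventional only"] + row_totals["both designs"]
print(willing_expertise, willing_conventional, grand_total)
```

The marginal totals agree with the overall willingness figures reported earlier in the Results (102 and 35 surgeons).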
Of the 82 surgeons unwilling to participate in either type of RCT, 40 (48.8%) provided a reason for their choice. Eleven surgeons (27.5%) indicated a lack of expertise in either procedure, 10 (25.0%) implied they were in favor of the expertise-based design with no reason provided for not wanting to participate, five (12.5%) indicated their low patient volume made study participation difficult, four (10.0%) were not interested in research, two (5.0%) believed the expertise-based design falls short by not generalizing beyond experts, one surgeon cited difficulties with feasibility of the expertise-based design, and one surgeon believed the research question already had been answered. The remaining surgeons did not believe in either procedure (n = 1), had no preference for either design (n = 1), or provided a noninterpretable response (n = 4).
Surgeons' level of expertise also influenced their preference for participating in RCTs. Of the 53 surgeons with expertise in only one intervention, 32 (60.4%) strongly or moderately preferred the expertise-based RCT design, eight (15.1%) strongly or moderately preferred the conventional design, and eight (15.1%) had no preference. One surgeon (with expertise in HTO) mildly preferred the conventional RCT design, and four (all with expertise in UKA) mildly preferred the expertise-based RCT design. Surgeons with expertise in both procedures either preferred the conventional RCT (n = 2) or had no preference (n = 1); one surgeon did not indicate his preference. Neither research experience nor formal research education influenced the type of RCT design in which surgeons were more willing to participate (p = 1.00, p = 0.52, respectively) or which RCT design they preferred (p = 0.33, p = 0.96, respectively).
Discussion
The merits of the expertise-based RCT design have been discussed in several reports [4, 18, 22]. The objectives of our study included exploring surgeons' willingness and preference for the expertise-based design compared with the conventional RCT design. Among the surgeons who stated their willingness to participate in an RCT (approximately 57%), we found 94.4% were willing to participate in an expertise-based RCT whereas only 32.4% were willing to participate in a conventional RCT (26.9% were willing to participate in both designs). In terms of preference, we found 57.8% of surgeons preferred the expertise-based design compared with 16.8% who preferred the conventional design and 25.4% who had no preference.
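These summary percentages can be derived from the counts given in the Results. The sketch below assumes the denominators are the 190 surgeons classified by willingness and the 185 surgeons who answered the preference question; computed values may differ from the printed figures in the last digit because of rounding.

```python
# Willingness counts from the Results.
expertise_only, conventional_only, both_designs, neither_design = 73, 6, 29, 82
answered = expertise_only + conventional_only + both_designs + neither_design
willing_any = expertise_only + conventional_only + both_designs

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

print("willing to join any RCT:", pct(willing_any, answered))
print("of those, expertise-based:", pct(expertise_only + both_designs, willing_any))
print("of those, conventional:", pct(conventional_only + both_designs, willing_any))
print("of those, both designs:", pct(both_designs, willing_any))

# Preference counts from the Results (strong/moderate plus mild preference).
prefer_expertise = 97 + 10
prefer_conventional = 25 + 6
no_preference = 47
stated = prefer_expertise + prefer_conventional + no_preference

print("prefer expertise-based:", pct(prefer_expertise, stated))
print("prefer conventional:", pct(prefer_conventional, stated))
print("no preference:", pct(no_preference, stated))
```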
One limitation of this study is the low response rate, leaving some uncertainty regarding whether the opinions of responders will generalize to nonresponders. Montori et al. [15] reported that procedures assessing inconsistency in meta-analyses also can be used to assess the likelihood of response bias in multiwave surveys (eg, to evaluate the validity of surveys with low response rates). Using their approach, we found no evidence that nonresponders would have responded in such a way as to substantially alter the findings of this survey.
A second limitation may be our definition of expert. Although we defined an expert as having performed the procedure at least 10 times during the past year, there is no accepted definition of what constitutes an expert. The literature contains a broad spectrum of definitions from the liberal (eg, those who are a certain number of years post-training, those who have completed a certain number of cases) to conservative (eg, those with a certain success rate, those who have documented their expertise at the plateau of the learning curve, etc). It is possible that had we defined expert differently, we may not have observed the positive relationship between expertise and willingness to participate in or preference for the expertise-based RCT design.
There are several issues with feasibility of implementing an expertise-based design that this survey does not address. First, an expertise-based design requires at least one expert in each of the interventions being compared at each participating center or some convenient way in which to shuffle patients from one center to another. Second, to avoid biasing the patient, it may be necessary for potential patients in a center to have their initial consultation with a neutral party (eg, a fellow or other qualified health professional) who will determine study eligibility, explain the study and the need for its unique design (ie, uncertainty in the surgical community regarding the superiority of either intervention or community equipoise versus individual surgeon equipoise), and obtain consent before randomization to surgeon occurs. Third, an expertise-based design presents unique challenges in acute settings where each participating facility needs to have two specialists on call, one with expertise in each of the two procedures being investigated. Despite these perceived challenges, we are aware of five published expertise-based RCTs (four in orthopaedics [6, 16, 24, 25]; one in cardiac surgery [13]) and several currently being conducted (two orthopaedic; one vascular surgery).
It has been said that equipoise, “a state of balance or equilibrium between two alternative therapies” such that “there is no preference [on the part of the clinician] between treatments…” is the state of mind under which a randomized trial is ethical. More recently, however, the notion of equipoise has been abandoned as theoretical, describing true equipoise on the part of a clinician as unlikely. Instead, this concept has been replaced with the term uncertainty, which reflects a more common state of mind; the clinician has an idea that a certain treatment is probably superior to the current standard but is uncertain whether he or she is correct. A clinical trial is initiated when there is clinical equipoise, or a “genuine uncertainty on the part of the expert clinical community about the comparative merits of two or more treatments for a defined group of patients or population”. The notion of the expertise-based RCT may be a more attractive and valid alternative study design when individual uncertainty does not exist but community equipoise is present.
Although UKA and HTO have been proposed for treatment of medial compartment OA of the knee and currently are used in clinical practice, we found the majority of surgeons we surveyed have experience performing either UKA or HTO, but not both. In addition, the majority of surgeons surveyed believe in the superiority of only one procedure, which indicates a lack of individual uncertainty. All four surgeons who fit our definition of having expertise in both procedures expressed an opinion regarding the superiority of one of the procedures.
It is possible that the lack of individual surgeon uncertainty may influence (consciously or unconsciously) how the surgeon performs the operation and the postoperative rehabilitation protocol that he or she prescribes. Surgeons may unknowingly perform their preferred procedure more meticulously or follow up with these patients differently compared with patients who underwent the procedure with which the surgeon has less experience, thereby biasing the results in favor of their preferred procedure. All of these opportunities for potential biases bring into question the validity of implementing a conventional RCT to address this particular research question.
However, one must recognize how interpretation of results of an expertise-based trial differs from interpretation of results of a conventional RCT. In a conventional RCT in which differential expertise bias is not present, the results of the trial inform which intervention is superior. If an imbalance in expertise exists or there is a lack of individual equipoise about the superiority of one of the interventions being tested, one cannot be sure whether the results reflect the superiority of the intervention, biases introduced by differential expertise, or both. The positive results of an expertise-based RCT inform the question of which intervention is superior in the hands of a surgeon with expertise in the intervention of interest. Surgeon experience predicts patient outcomes [1-3, 8-11, 14, 17, 19]. Simply put, surgeons who do not have expertise in the surgical technique deemed superior through an expertise-based design must first develop expertise in performing the superior procedure to expect to experience outcomes similar to those observed in the trial.
Finally, we cannot ignore that 82 surgeons (43.2%) who responded to our survey were unwilling to participate in an RCT to address the research question we presented (HTO versus UKA for middle-aged adults with medial compartment knee OA) regardless of design. The most common reasons provided were a self-perceived lack of expertise in either or both techniques and an insufficient number of patients with the condition of interest to warrant study participation. Although not reported in our survey, others have reported additional barriers to participation including subjugation of individual patient care to resolve clinical uncertainty for the common good of future patients, the belief that the permanence of surgery negates a patient's ability to withdraw from the intervention should things go wrong, and difficulty generating enthusiasm for an RCT once the technique is established by experience, a necessary step before any skill-based intervention is tested using an RCT. In addition, the length of followup necessary to determine outcome is much longer for surgical than for medical interventions, generating logistic and feasibility issues.
Given many surgeons hold a strong opinion about the relative superiority of interventions for which they have expertise, investigators should determine surgeons' uncertainty before initiating a research trial. When there is clinical equipoise regarding the superiority of two surgeries but there is a lack of individual surgeon uncertainty, or the majority of participating surgeons have expertise in one but not both procedures being compared, then the results of a conventional trial likely are biased. In the example we present, we believe the expertise-based design will enhance the validity of the trial results; our limited data suggest the expertise-based design also may be more feasible as the orthopaedic surgeons who participated in this survey preferred this design. We recommend researchers investigating the superiority of skill-based interventions consider using the expertise-based RCT design.
Expertise-Based Working Group: Elzbieta Bednarska, Mohit Bhandari, Dianne Bryant, Jason Busse, Claudio Cina, Deborah Cook, P.J. Devereaux, Gordon Guyatt, Brian Haynes, Diane Heels-Ansdell, Brad Johnston, Mary Law, Joy MacDermid, Tara Mastracci, Ed Mills, Victor Montori, David Sackett, Holger Schünemann, Stephen Walter, Salim Yusuf, and Qi Zhou.
References
1. Begg CB, Riedel ER, Bach PB, Kattan MW, Schrag D, Warren JL, Scardino PT. Variations in morbidity after radical prostatectomy. N Engl J Med
2. Birkmeyer JD, Stukel TA, Siewers AE, Goodney PP, Wennberg DE, Lucas FL. Surgeon volume and operative mortality in the United States. N Engl J Med
3. Bridgewater B, Grayson AD, Au J, Hassan R, Dihmis WC, Munsch C, Waterworth P. Improving mortality of coronary surgery over first four years of independent practice: retrospective examination of prospectively collected data from 15 surgeons. BMJ
4. Devereaux PJ, Bhandari M, Clarke M, Montori VM, Cook DJ, Yusuf S, Sackett DL, Cina CS, Walter SD, Haynes B, Schunemann HJ, Norman GR, Guyatt GH. Need for expertise based randomised controlled trials. BMJ
5. Dillman DA. Mail and Telephone Surveys: The Total Design Method. Toronto, Canada: John Wiley & Sons; 1978.
6. Finkemeier CG, Schmidt AH, Kyle RF, Templeman DC, Varecka TF. A prospective, randomized study of intramedullary nails inserted with and without reaming for the treatment of open and closed fractures of the tibial shaft. J Orthop Trauma
7. Freedman B. Equipoise and the ethics of clinical research. N Engl J Med
8. Habib M, Mandal K, Bunce CV, Fraser SG. The relation of volume with outcome in phacoemulsification surgery. Br J Ophthalmol
9. Hannan EL, O'Donnell JF, Kilburn H Jr, Bernard HR, Yazici A. Investigation of the relationship between volume and mortality for surgical procedures performed in New York State hospitals. JAMA
10. Hannan EL, Racz M, Kavey RE, Quaegebeur JM, Williams R. Pediatric cardiac surgery: the effect of hospital and surgeon volume on in-hospital mortality. Pediatrics
11. Hervey SL, Purves HR, Guller U, Toth AP, Vail TP, Pietrobon R. Provider volume of total knee arthroplasties and patient outcomes in the HCUP-nationwide inpatient sample. J Bone Joint Surg Am
12. Lilford RJ, Jackson J. Equipoise and the ethics of randomization. J R Soc Med
13. Machler HE, Bergmann P, Anelli-Monti M, Dacar D, Rehak P, Knez I, Salaymeh L, Mahla E, Rigler B. Minimally invasive versus conventional aortic valve operations: a prospective study in 120 patients. Ann Thorac Surg
14. Montori VM, Leung TW, Devereaux PJ, Schunemann HJ, Akl EA, Gafni A, Guyatt GH. Can contraindications compromise evidence-based, patient-centered clinical practice? Can J Clin Pharmacol
15. Montori VM, Leung TW, Walter SD, Guyatt GH. Procedures that assess inconsistency in meta-analyses can assess the likelihood of response bias in multiwave surveys. J Clin Epidemiol
16. Phillips WA, Schwartz HS, Keller CS, Woodward HR, Rudd WS, Spiegel PG, Laros GS. A prospective, randomized study of the management of severe ankle fractures. J Bone Joint Surg Am
17. Prystowsky JB, Bordage G, Feinglass JM. Patient outcomes for segmental colon resection according to surgeon's training, certification, and experience. Surgery
18. Rudicel S, Esdaile J. The randomized clinical trial in orthopaedics: obligation or option? J Bone Joint Surg Am
19. Sainsbury R, Haward B, Rider L, Johnston C, Round C. Influence of clinician workload and patterns of treatment on survival from breast cancer. Lancet
20. Singer PA, Lantos JD, Whitington PF, Broelsch CE, Siegler M. Equipoise and the ethics of segmental liver transplantation. Clin Res
21. Stukenborg-Colsman C, Wirth CJ, Lazovic D, Wefer A. High tibial osteotomy versus unicompartmental joint replacement in unicompartmental knee joint osteoarthritis: 7-10-year follow-up prospective randomised study. Knee
22. van der Linden W. Pitfalls in randomized surgical trials. Surgery
23. Weidenhielm L, Olsson E, Brostrom LA, Borjesson-Hederstrom M, Mattsson E. Improvement in gait one year after surgery for knee osteoarthrosis: a comparison between high tibial osteotomy and prosthetic replacement in a prospective randomized study. Scand J Rehabil Med
24. Wihlborg O. Fixation of femoral neck fractures: a four-flanged nail versus threaded pins in 200 cases. Acta Orthop Scand
25. Wyrsch B, McFerran MA, McAndrew M, Limbird TJ, Harper MC, Johnson KD, Schwartz HS. Operative treatment of fractures of the tibial plafond: a randomized, prospective study. J Bone Joint Surg Am
Appendix 1. Surgeons' Survey