We performed a quasi-experimental study using an evaluation of 4 readings (i.e., articles) used in an operating room (OR) management course.1–4 The 4 readings provided for a balanced quasi-experimental design because (a) 2 articles contained data with specific examples of application for health systems and 2 did not and (b) 2 articles contained appendices of formulas and 2 did not. We assessed the influence of these 2 factors on trust in the articles’ content, including its quality, usefulness, and reliability.
The study was motivated by the observation that many economic problems in OR and anesthesia group management involve tasks with optimal decisions, yet experienced personnel make decisions that are worse than or no better than random chance.5–12 Such decisions include staffing, staff scheduling, and case scheduling.5–12 Such tasks are intellective and generally not highly demonstrable (i.e., personnel can generally only understand and apply the decision-making process if they have attained knowledge through an educational course or research).13–18 In such settings, the quality of leadership decision-making is better when following an autocratic style rather than a participative group style.18 Autocratic decision-making calls for managers to solicit and consider feedback from stakeholders in the decision outcome and then to make the decision themselves.18,19 Such a model is practical, because typically only a handful of anesthesiologists are involved in staff scheduling, programming information systems, etc.20,21
To make good decisions, the manager needs expert advice.17,18,22 Searching for relevant article(s) in PubMed is ineffective for many OR management decisions because the searcher often needs to know the precise vocabulary before searching.17 Generally it is easier to ask an expert advisor essentially “what paper should I read” rather than trying to find the relevant paper through a database search.17
The expert can provide the autocratic decision-maker with answers to intellective problems via different communication channels22: face to face, video conference, telephone, live electronic chat, discussion group, and e-mail. Asynchronous written communication (i.e., e-mail) is currently the dominant channel. We recently reviewed experimental and observational studies and concluded that this dominance is appropriate.22 E-mail provides improved time management via asynchronicity, low cognitive load, the ability to hide undesirable and irrelevant cues, the appropriateness of adding desirable cues (e.g., titles and degrees), the opportunity to provide written expression of confidence, the ability to demonstrate answers for the decision-maker, and the ability to get answers from people whom the decision-maker knows only remotely.22 Although richer communication channels facilitate the development of trust, this is unimportant for this application because managers e-mail advisors whose competence the manager trusts.22
However, our review was limited to “e-mail using clients providing…reading of attachments with [formulas] (when appropriate), highlighting of published papers, etc.”22 How attached articles influence managers’ trust in the recommendations is unknown. “Trust indicates a positive belief about the perceived reliability of, dependability of, and confidence in a process.”23 Education increases trust in recommendations and skill at evaluating when a recommendation may be based on incomplete information.5,15,16
These observations are not sufficiently precise as to whether and how articles that would be included with e-mail influence trust in the content. The presence of formulas, even when complex and difficult to understand, may serve as a cue that the information presented in the article is legitimate and scientifically supported. On the other hand, formulas may seem esoteric and theoretical rather than practical and supportive of scientific validity.
The University of Iowa institutional review board declared that this investigation did not meet the regulatory definition of human subject research. The OR management course starts with approximately 15 hours of statistics review and reading and learning the course articles before class (Table 1).16,17 The objective of reading the articles ahead is to learn the vocabulary.17 An annotated dictionary is provided to facilitate the reading.17 Then, there are 35 hours of class time for lectures and many cases.15 These are completed over 3.5 days while participants work in teams, typically of 3 participants. The course schedule, cases, readings, and lectures are at www.FranklinDexter.net/education.htm. Each team answers questions in its Excel file (Microsoft Inc., Redmond, WA). Adaptive feedback is provided until each question is answered correctly by the use of >10,000 binary statements.16 An example of what participants see is shown in Figure 2 of reference 16 and how it is produced using Excel formulas is shown in the appendix of that article. By the end of the course, each participant has the knowledge to evaluate his or her trust in each article.
For our study, we used 3 scales, each assessing what would potentially be a different facet of trust (quality, usefulness, and reliability). However, from “Data Available from Previous Year’s Course Evaluations and Prior Studies,” we expected the 9 items to provide a unidimensional construct.
Information quality was taken from the study by Wixom and Todd24 (Table 2). They used a 7-point Likert-type scale, ranging from 1 (strongly disagree) to 7 (strongly agree). We used the same scale (Table 2). Their observed Cronbach α for the 3 items was 0.94.
Reliability of information content was measured using Shen et al.’s25 adaptation of Cheung and Lee’s scale.26 Prior Cronbach α values for the 3 items were 0.95 and 0.93, respectively.25,26
Information usefulness was adapted from Sussman and Siegal’s 3-item scale.25,27 Their observed Cronbach α was 0.87. However, they assessed usefulness for content of e-mail. We changed the phrase “in this e-mail” at the end of each item to “for my work.” To choose that wording, on January 6, 2015, we searched Google Scholar for all 30 potentially relevant phrases we could identify. Among these, 3 phrases each had at least 10 results: “helpful for my work,” 164; “valuable for my work,” 89; and “helpful for my job,” 16. The first 2 phrases were both contained within our evaluation’s items (Table 2).
Power analysis is in Appendix 1, section “Study Evaluation Form and Protocol.” As participants finished the course, usually around 11:45 am of the course’s fourth day, each was handed a 1-page, 2-sided anonymous (no identifier) evaluation form with 18 questions per page. There also was a 1-page cover form explaining that the evaluation was voluntary and listing multiple options for returning the evaluation form, including mail, fax, scanning, or taking a mobile phone picture and e-mailing it to an assistant. A reminder request to complete the evaluation form was e-mailed to each course participant 1 week later. We stopped the study after we had not received an evaluation form for 1 month.
To print each evaluation form, 5 uniform random numbers were generated. The first random number determined whether the 4 articles were presented in either ascending or descending sequence of the content in the course (Table 3). The instructions were the same for each article: “Look again at [title]. Answer the questions after you have scrolled through each page, [listed].” The second random number determined the sequence of each of the 3 scales (Table 3). The third, fourth, and fifth random numbers determined the sequence of the 3 items within each of the 3 scales (data not shown). The second through fifth random choices were applied to each of the 4 readings (i.e., each subject had the same sequence of the 9 items for each of the 4 readings). Statistical analysis was performed using a mixed-effects model with random intercept for each subject and 2 factors: data 0/1 and formulas 0/1.
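The randomization of each printed form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`pick`, `make_form`), the mapping of each uniform random number to an equally likely choice, and the assignment of the third, fourth, and fifth numbers to the quality, usefulness, and reliability scales in that order are hypothetical and are not taken from the study's actual procedure.

```python
import random
from itertools import permutations

SCALES = ("quality", "usefulness", "reliability")
ITEMS_PER_SCALE = 3


def pick(options, u):
    """Map a uniform random number u in [0, 1) to one element of options."""
    return options[int(u * len(options))]


def make_form(rng):
    """Build one form layout from 5 uniform random numbers (hypothetical mapping)."""
    u = [rng.random() for _ in range(5)]

    # 1st number: articles in ascending or descending order of course content.
    article_order = pick(("ascending", "descending"), u[0])

    # 2nd number: one of the 6 possible sequences of the 3 scales.
    scale_order = pick(list(permutations(SCALES)), u[1])

    # 3rd to 5th numbers: one of the 6 item sequences within each scale
    # (assumed assignment: u[2] -> quality, u[3] -> usefulness, u[4] -> reliability).
    item_perms = list(permutations(range(1, ITEMS_PER_SCALE + 1)))
    item_order = {s: pick(item_perms, u[2 + i]) for i, s in enumerate(SCALES)}

    # The same 9-item sequence is reused for each of the 4 readings.
    items = [(s, item) for s in scale_order for item in item_order[s]]
    return {"article_order": article_order, "items": items}


form = make_form(random.Random(2015))  # the seed is arbitrary
print(form["article_order"])
print(form["items"])
```

Because the second through fifth choices are made once per form, the sketch returns a single 9-item sequence that would be printed under each of the 4 readings.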
All 17 subjects completed the 9 items for each article (i.e., answered 36 questions). As designed, distributions of subjects among courses, sequence of articles, and sequence of scales were random (Table 3). Cronbach α of the 9 items was large, 0.94 (95% confidence interval, 0.92–0.96; Table 2).28,29
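For readers unfamiliar with Cronbach α, the following Python sketch shows the standard calculation from a subjects-by-items table of ratings. The function name and the ratings are hypothetical and purely illustrative; they are not the study's data, and the sketch does not reproduce how the repeated ratings of the 4 articles were handled or how the confidence interval was obtained.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach alpha for a list of subjects, each a list of k item ratings."""
    k = len(rows[0])
    # Sample variance of each item across subjects.
    item_vars = [variance(col) for col in zip(*rows)]
    # Sample variance of each subject's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 7-point ratings from 5 subjects on 3 items (not study data).
ratings = [
    [7, 6, 7],
    [5, 5, 6],
    [6, 6, 6],
    [4, 5, 4],
    [7, 7, 6],
]
print(round(cronbach_alpha(ratings), 3))
```

Values near 1 indicate that the items vary together across subjects, which is the sense in which the 9 items are described as a unidimensional construct.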
Formulas in the articles significantly increased trust in the information (P = 0.0019; Table 4). Presence of data did not significantly influence trust (P = 0.15) as shown in Tables 4 and 5.
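One way to write the mixed-effects model described in the methods is sketched below. The notation is ours, not the authors': we assume a single trust score $y_{ij}$ for subject $i$ and article $j$ (for example, the mean of the 9 items), main effects only with no interaction term, and normally distributed random intercepts and residuals.

```latex
y_{ij} = \beta_0 + \beta_D\,\mathrm{Data}_j + \beta_F\,\mathrm{Formulas}_j + u_i + \varepsilon_{ij},
\qquad u_i \sim N(0,\sigma_u^2),\quad \varepsilon_{ij} \sim N(0,\sigma^2),
```

where $i = 1,\dots,17$ indexes subjects, $j = 1,\dots,4$ indexes articles, and $\mathrm{Data}_j$ and $\mathrm{Formulas}_j$ are the 0/1 factors. Under this reading, the reported P values correspond to tests of $\beta_F = 0$ and $\beta_D = 0$.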
Our results have implications for the OR management course. Excluding formulas from articles may be counterproductive if the goal is to increase use of the principles and if the formulas will not actually be used by the reader but by an analyst. Formulas in appendices likely serve a role, even for the reader who does not understand them. Furthermore, after the course, when an expert e-mails a participant who has asked a question, choosing a paper with formulas to attach has no disadvantage and may be advantageous. However, we had substantial heterogeneity (SDs) among subjects in the perceived value of the formulas (Table 5). Managers and experts work with (e-mail) 1 person at a time. Although the average manager with some education had greater trust in readings with formulas, based on the standard deviation (Table 2), we should expect some managers to perceive otherwise. We also do not know the extent to which our results are limited to the specific OR management course and to having an author of the articles known to the course participants as the course instructor (Appendix 1, section “Prior Studies and Gender”).
The formulas may have increased trust by serving as a cue to legitimacy. First, it is unlikely the subjects had prior exposure to the content, which, if they had, might have influenced their trust in the content.30,31 Review of iteratively solving a single equation with a single unknown was added to the statistics review for the course because many course participants struggled with this problem, although it is at the middle school (sixth grade) level. In contrast, the formulas in the reading for lecture #5 involved a series of integrals for stochastic optimization using the Lagrange method. Second, it is unlikely that the subjects planned to use the formulas, because the formulas in the reading for lecture #5 are useful in showing that they do not need to be used routinely. The formulas show that what is taught in the reading for lecture #5, cases, and discussion (i.e., tactical planning of OR capacity using financial criteria) has negligible influence on anesthesia and OR nursing productivity provided the organization acts rationally within a few days of surgery. Third, none of the formulas in the 2 articles were used in the lectures or cases. Therefore, it is unlikely that use contributed to the greater trust.
Dr. Ruth E. Wachtel questioned narrative review articles for the course content, motivating the study. Ms. Jennifer Espy of the University of Iowa’s Department of Anesthesia received scanned evaluation forms and edited parts of the article.
Dr. Franklin Dexter is the Statistical Editor for Anesthesia & Analgesia. This manuscript was handled by Dr. Alan Schwartz, Editor-in-Chief of A&A Case Reports, and Dr. Dexter was not involved in any way with the editorial process or decision.
SELECTION OF THE 4 ARTICLES USED IN THE STUDY
The course curriculum provides the knowledge and problem-solving skills needed for participation in projects that satisfy the Accreditation Council for Graduate Medical Education’s competency in systems-based practice.15 There are different hierarchical levels of knowledge and problem-solving skills.15 For example, to “apply” is to carry out or use a procedure in a given situation. The reading for lecture #7 is used by the course participants as they “apply methods of perioperative managerial cost accounting for budgetary decisions.”4 The next highest hierarchical level of knowledge and problem-solving skills is “explain,” which is to “construct a cause and effect model of a system and predict how a change in one part of a system will affect another.”15 The reading for lecture #5 is used by the course participants as they “explain when tactical decisions to increase OR utilization and contribution margin result in differential increases in block time for subspecialties” and “explain appropriate increases in block time for surgeons based on the impact of operational decisions on tactical decisions.”3 These hierarchical levels of knowledge and problem-solving skills could be achieved even if the readings for lectures #5 and #7 were combined into a single review article. Therefore, to assure that we learned whether future courses could have a single combined article, the readings for lectures #5 and #7 were included in the study.
There were data included in both the reading for lecture #5 and that for lecture #7.3,4 Therefore, our comparison readings needed to be at least 2 articles without data to achieve a balanced design. The reading for lectures #3 and #6 contained data,32 and so did the reading33 for lecture #10. Thus, these 2 articles were excluded. Among the 3 remaining articles,1,2,34 only the reading2 for lecture #4 did not have formulas, and thus, that article was included (Table 1). There were 2 remaining readings1,34 for a potential total of 4 or 5 articles. Because a balanced design could not be achieved with an odd number (i.e., 5 articles), the study included 4 articles.
Both the reading for lectures #1 and #2 and the reading for lecture #9 contained formulas.1,34 However, the formulas in the reading for lectures #1 and #2 were in the appendix and complicated, matching those of the reading for lecture #5. In contrast, the formulas for lecture #9 were basic arithmetic. This was not as good a match as was the reading for lectures #1 and #2 (i.e., the preference was to use the latter reading).
The journal in which each article was published could potentially be a cue influencing trust in the article. Anesthesiology and Anesthesia & Analgesia had similar formats from 2004 to 2008, when the articles were published. Still, “information quality is shaped” by factors including the “user’s perception of how well the information is presented,”24 and, thus, we wanted balance in design. Among the 3 articles so far to be included,2–4 there was 1 published in Anesthesiology4 and 2 published in Anesthesia & Analgesia.2,3 To achieve balance in design, we therefore preferred that the fourth article be published in Anesthesiology. The reading for lectures #1 and #2 was published in Anesthesiology.1 In contrast, the reading for lecture #9 was published in Anesthesia & Analgesia.34 Therefore, we used the reading for lectures #1 and #2 (Table 1) to achieve a balanced design with 4 articles.
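The resulting design can be checked with a short Python sketch. The factor coding below is assembled from the text; that the reading for lecture #7 has no formulas is inferred from the statement that 2 of the 4 articles contained formulas, and the function name `is_balanced` is ours.

```python
from collections import Counter
from itertools import combinations

# Factor coding per reading: data 0/1, formulas 0/1, and journal.
ARTICLES = {
    "lectures #1 and #2": {"data": 0, "formulas": 1, "journal": "Anesthesiology"},
    "lecture #4": {"data": 0, "formulas": 0, "journal": "Anesth Analg"},
    "lecture #5": {"data": 1, "formulas": 1, "journal": "Anesth Analg"},
    "lecture #7": {"data": 1, "formulas": 0, "journal": "Anesthesiology"},
}


def is_balanced(articles, factor_a, factor_b):
    """True if every combination of the two factors occurs equally often."""
    counts = Counter((a[factor_a], a[factor_b]) for a in articles.values())
    return len(counts) == 4 and len(set(counts.values())) == 1


for a, b in combinations(("data", "formulas", "journal"), 2):
    print(a, "x", b, "balanced:", is_balanced(ARTICLES, a, b))
```

Each pair of factors fills all 4 cells exactly once, which is what choosing the reading for lectures #1 and #2 over the reading for lecture #9 achieves.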
Another potential cue influencing trust was the number of references. When undergraduate and graduate students provided thematic statements of importance about trust in Wikipedia articles, the most commonly mentioned theme was the number of references.23 Articles with more references were trusted more.23 In quantitative assessment of factors that would influence the students’ use of the content in an assignment, the number of references was considered the most important factor.23 Therefore, although the number of references was not 1 of the 4 variables balanced among the 4 articles (Table 1), we included the count of references in our summary of the results (Table 6).
DATA AVAILABLE FROM THE PREVIOUS YEAR’S COURSE EVALUATIONS AND PRIOR STUDIES
Although the curriculum of the course has not changed since validated,15 the prerequisite review to be completed by participants before class was updated in 2013 to focus on vocabulary based on the 2013 paper from the course.17 In addition, the discussions of OR management leadership were revised in 2013 to focus on autocratic decision-making based on the results of the other 2013 paper motivated from course discussions.18 Consequently, the prior data used for designing the analysis of readings in 2015 were limited to the 14 course evaluations from 2014 (i.e., earlier years were not used). These participants had distributions of backgrounds typical for the course: 3 Master’s/PhD business analysts, 2 Master’s in Business Administration, and 9 physicians. In 2014 and 2015, among 117 participants, 54.7% were physicians.
There was the same instructor (FD) for both 2014 and 2015. He was the first or second author for each article and the corresponding author for each course article (www.FranklinDexter.net/education.htm). Every (14 of 14) response from the evaluations in 2014 was “strongly agreed” (5 on the 5-point scale) to “Instructor(s) demonstrated thorough knowledge of the subject area.” Therefore, the instructor, per se, could not be a studied covariate. There also were no data supporting the addition of related questions to the study evaluation form.
The 3 features of perceived trustworthiness are ability, benevolence, and integrity.35 Ability relates to the course instructor (above) and the articles (i.e., data or no data and formula or no formulas as being studied). Benevolence does not apply.36 However, integrity could apply. All 4 articles had the same financial disclosure, matching those of the current article. In addition, 14 of 14 “strongly agreed” that “lectures were balanced and free of commercial bias.” Therefore, it seemed unlikely that integrity would be a covariate for the study’s results, and so we did not add such items to the study evaluation form.
Trust in content has different facets, including quality, usefulness, and reliability. However, from the evaluations of 2014, there were 3 of 14 “agreed” and 11 of 14 “strongly agreed” that the “course increased my trust in applying evidence-based statistical methods and analytic reports in health care management decisions.”15,16 That observation suggested that the 3 facets would be observed as unidimensional. Furthermore, a study of trust in content was published in 2013; we were (and are) unaware of any other such study. Undergraduate students assessed Wikipedia in general (i.e., any content on any topic).25 Pearson correlation coefficients were 0.67 between quality and usefulness of content, 0.84 between quality and reliability of content, and 0.77 between usefulness and reliability. From that study, we would also expect a large Cronbach α among questions assessing the 3 facets of trust. Thus, we used 3 scales, each assessing what would potentially be a different facet of trust (quality, usefulness, and reliability), with the understanding that prior data suggested the 9 items would provide a unidimensional construct.
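As a rough illustration of why those correlations imply a large Cronbach α, the standardized form of α can be applied to the 3 scale scores. This calculation is ours and is not reported in the cited study; it treats each scale as a single component ($k = 3$) and uses the mean of the 3 pairwise correlations.

```latex
\bar{r} = \frac{0.67 + 0.84 + 0.77}{3} = 0.76,
\qquad
\alpha_{\mathrm{std}} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
= \frac{3(0.76)}{1 + 2(0.76)} = \frac{2.28}{2.52} \approx 0.90 .
```

With 9 items rather than 3 scale scores, the same formula would give a larger value for a comparable mean inter-item correlation, consistent with the expectation stated above.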
STUDY EVALUATION FORM AND PROTOCOL
For the statistical power analysis, the prior information we had was the preceding (2013) study of the association between university “students’ perception of the degree to which information in Wikipedia [was] well-presented” and “the extent to which [the same] students believe[d] that adopting the information in Wikipedia [would] enhance academic performance.”25 The effect size reported was a t-statistic of 3.13. The OR management course was given 4 times in the first half of 2015. We had essentially no information on how many participants there would be in these 4 courses. Because there had been 14 total for 2 courses in 2014, we doubled the estimate to 28. If 50% would complete the evaluation (N = 14 subjects), then with t = 3.13, P = 0.008, which would be suitable. On the basis of this analysis, we proceeded with the study.
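The P value quoted above can be reproduced approximately with the Python sketch below. It assumes a 2-sided test using Student's t distribution with N − 1 = 13 degrees of freedom; the degrees of freedom are our assumption, because the text gives only t = 3.13 and N = 14. The tail area is obtained by Simpson's rule so that only the standard library is needed.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided P value, integrating the density on [0, |t|] by Simpson's rule."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # probability that |T| <= |t|
    return 1 - central_area


# Assumed df = N - 1 = 13 for N = 14 subjects.
print(round(two_sided_p(3.13, 13), 3))
```

Under these assumptions the result rounds to the 0.008 given in the text.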
PRIOR STUDIES AND GENDER
We are aware of only 2 potentially relevant prior studies to influence our study design, both of which used mock jurors to assess the persuasiveness of expert testimony.37–39 The elaboration likelihood model predicts that, in this situation, the reader’s trust in the information is influenced by cues peripheral to the argument such as the credibility of the source, complexity of the content, and presence of formulas.37 Although the 2 studies included complex (versus not complex) legal testimony,38,39 rather than formulas (vs no formulas), both factors increased the complexity of information and thus lessened the ability of the subject to process the information.37 The 2 juror studies38,39 and our study were unique scientifically in that the complexity of the content served both as the source of limitation in understanding the content and as the peripheral cue being studied. When viewing videotapes of an expert with substantial credentials, greater complexity of testimony was more persuasive than less complex testimony.38 In contrast, in another study, when reading testimony from an expert with substantial credentials (and doing so in a quiet environment matching the preceding study and our conditions), the opposite result was found for experts with male names, and there were no differences between complex and less complex testimony for experts with female names.39 Our 2 articles with formulas both had the course instructor (male) as the first author. The 2 articles without formulas both had Ruth E. Wachtel (female) as the first author. As described earlier, this was not changeable for the course. When complex testimony was read in a quiet environment, testimony authored by a male name was less persuasive than that by a female name.39 (This would be the opposite direction of our finding for formulas.) Thus, the gender of the instructor (author) may be an uncontrolled covariate specific to the studied course.
1. Dexter F, Epstein RH, Traub RD, Xiao Y. Making management decisions on the day of surgery based on operating room efficiency and patient waiting times. Anesthesiology 2004;101:1444–53.
2. Wachtel RE, Dexter F. Tactical increases in operating room block time for capacity planning should not be based on utilization. Anesth Analg 2008;106:215–26.
3. Dexter F, Ledolter J, Wachtel RE. Tactical decision making for selective expansion of operating room resources incorporating financial criteria and uncertainty in subspecialties’ future workloads. Anesth Analg 2005;100:1425–32.
4. Wachtel RE, Dexter F, Lubarsky DA. Financial implications of a hospital’s specialization in rare physiologically complex surgical procedures. Anesthesiology 2005;103:161–7.
5. Dexter F, Willemsen-Dunlap A, Lee JD. Operating room managerial decision-making on the day of surgery with and without computer recommendations and status displays. Anesth Analg 2007;105:419–29.
6. Dexter F, Lee JD, Dow AJ, Lubarsky DA. A psychological basis for anesthesiologists’ operating room managerial decision-making on the day of surgery. Anesth Analg 2007;105:430–4.
7. Dexter F, Xiao Y, Dow AJ, Strader MM, Ho D, Wachtel RE. Coordination of appointments for anesthesia care outside of operating rooms using an enterprise-wide scheduling system. Anesth Analg 2007;105:1701–10.
8. Stepaniak PS, Mannaerts GH, de Quelerij M, de Vries G. The effect of the Operating Room Coordinator’s risk appreciation on operating room efficiency. Anesth Analg 2009;108:1249–56.
9. Dexter EU, Dexter F, Masursky D, Garver MP, Nussmeier NA. Both bias and lack of knowledge influence organizational focus on first case of the day starts. Anesth Analg 2009;108:1257–61.
10. Wachtel RE, Dexter F. Review article: review of behavioral operations experimental studies of newsvendor problems for operating room management. Anesth Analg 2010;110:1698–710.
11. Ledolter J, Dexter F, Wachtel RE. Control chart monitoring of the numbers of cases waiting when anesthesiologists do not bring in members of call team. Anesth Analg 2010;111:196–203.
12. Wang J, Dexter F, Yang K. A behavioral study of daily mean turnover times and first case of the day start tardiness. Anesth Analg 2013;116:1333–41.
13. Forsyth DR. Group Dynamics. 5th ed. Belmont, CA: Wadsworth Cengage Learning, 2009:303.
14. Laughlin PR, Ellis AL. Demonstrability and social combination processes on mathematical intellective tasks. J Exp Soc Psychol 1986;22:177–89.
15. Wachtel RE, Dexter F. Curriculum providing cognitive knowledge and problem-solving skills for anesthesia systems-based practice. J Grad Med Educ 2010;2:624–32.
16. Dexter F, Masursky D, Wachtel RE, Nussmeier NA. Application of an online reference for reviewing basic statistical principles of operating room management. J Stat Educ 2010;18(3).
17. Wachtel RE, Dexter F. Difficulties and challenges associated with literature searches in operating room management, complete with recommendations. Anesth Analg 2013;117:1460–79.
18. Prahl A, Dexter F, Braun MT, Van Swol L. Review of experimental studies in social psychology of small groups when an optimal choice exists and application to operating room management decision-making. Anesth Analg 2013;117:1221–9.
19. Vroom VH, Yetton PW. Leadership and Decision-Making. Pittsburgh, PA: University of Pittsburgh Press, 1973:104.
20. Dexter F, Wachtel RE. Strategies for net cost reductions with the expanded role and expertise of anesthesiologists in the perioperative surgical home. Anesth Analg 2014;118:1062–71.
21. Dexter F, Wachtel RE, Todd MM, Hindman BJ. The ‘fourth mission’: the time commitment of anesthesiology faculty for management is comparable to their time commitments to education, research, and indirect patient care. AA Case Rep 2015;5:206–11.
22. Prahl A, Dexter F, Swol LV, Braun MT, Epstein RH. E-mail as the appropriate method of communication for the decision-maker when soliciting advice for an intellective decision task. Anesth Analg 2015;121:669–77.
23. Rowley J, Johnson F. Understanding trust formation in digital information sources: The case of Wikipedia. J Inform Sci 2013;39:494–508.
24. Wixom BH, Todd PA. A theoretical integration of user satisfaction and technology acceptance. Inform Syst Res 2005;16:85–102.
25. Shen XL, Cheung CMK, Lee MKO. What leads students to adopt information from Wikipedia? An empirical investigation into the role of trust and information usefulness. Br J Educ Technol 2013;44:502–17.
26. Cheung CMK, Lee MKO. Understanding consumer trust in Internet shopping: a multidisciplinary approach. J Am Soc Inform Sci Technol 2006;57:479–92.
27. Sussman SW, Siegal WS. Informational influence in organizations: an integrated approach to knowledge adoption. Inform Syst Res 2003;14:47–65.
28. Feldt LS, Woodruff DJ, Salih FA. Statistical inference for coefficient alpha. Appl Psychol Meas 1987;11:93–103.
29. Lance CE, Butts MM, Michels LC. The sources of four commonly reported cutoff criteria: what did they really say? Organ Res Meth 2006;9:202–20.
30. Hasher L, Goldstein D, Toppino T. Frequency and the conference of referential validity. J Verbal Learning Verbal Behav 1977;16:107–12.
31. Schwartz M. Repetition and rated truth value of statements. Am J Psychol 1982;95:393–407.
32. McIntosh C, Dexter F, Epstein RH. The impact of service-specific staffing, case scheduling, turnovers, and first-case starts on anesthesia group and operating room productivity: a tutorial using data from an Australian hospital. Anesth Analg 2006;103:1499–516.
33. Wachtel RE, Dexter F. Differentiating among hospitals performing physiologically complex operative procedures in the elderly. Anesthesiology 2004;100:1552–61.
34. Dexter F, Epstein RH. Calculating institutional support that benefits both the anesthesia group and hospital. Anesth Analg 2008;106:544–53.
35. Mayer RC, Davis JH, Schoorman FD. An integrative model of organizational trust. Acad Manag Rev 1995;20:709–34.
36. Van Swol LM. Forecasting another’s enjoyment versus giving the right answer: trust, shared values, task effects, and confidence in improving the acceptance of advice. Int J Forecast 2011;27:103–20.
37. Petty RE, Cacioppo JT. The elaboration likelihood model of persuasion. Adv Exp Soc Psychol 1986;19:123–205.
38. Cooper J, Bennett EA, Sukel HL. Complex scientific testimony: how do jurors make decisions? Law Hum Behav 1996;20:379–94.
39. McKimmie BM, Newton SA, Schuller RA, Terry DJ. It’s not what she says, it’s how she says it: the influence of language complexity and cognitive load on the persuasiveness of expert testimony. Psychiatr Psychol Law 2013;20:578–89.