Construct Validation of a Laparoscopic Surgical Simulator

Mathis, Kellie L. MD; Wiegmann, Douglas A. PhD

Simulation In Healthcare: The Journal of the Society for Simulation in Healthcare: October 2007 - Volume 2 - Issue 3 - p 178-182
doi: 10.1097/SIH.0b013e318137aba1
Empirical Investigations

Background: Laparoscopic simulators are increasingly used to train and evaluate surgical skill, and validating laparoscopic simulators for these purposes is paramount. Our goal was to determine if the SurgicalSIM laparoscopic surgical simulator can discriminate between novices and experts and to assess learning curves among novices.

Methods: Twenty novices and five experts performed five repetitions on the following modules: place arrow, retract, dissect, and traverse tube. For each module, median baseline performance was calculated. Novices performed 35 additional repetitions to assess learning with practice.

Results: Experts outperformed novices at baseline for time to completion on the dissect, place arrow, and traverse tube modules, as well as for error frequency on the traverse tube and retract modules. Novices' performance improved significantly with practice, approaching the experts' baseline in all modules.

Conclusion: The SurgicalSIM laparoscopic simulator exhibits construct validity on three of four basic-skills modules when considering completion time and on two modules when considering error frequency. Among novices, learning occurred with additional repetitions. Whether acquired skills transfer to the actual surgical environment has yet to be determined.

From the Department of Surgery, Mayo Clinic College of Medicine, Rochester, MN.

Reprints: Kellie L. Mathis, MD, Department of Surgery, Mayo Clinic, 200 First Street, SW, Rochester, MN 55905 (e-mail:

The authors have indicated they have no conflict of interest to disclose.

Over the last two decades, minimally invasive surgery (MIS) has become the standard approach to many operations, including cholecystectomy. The introduction of MIS has required surgeons to learn a new set of skills.1 The two-dimensional display on the monitor alters depth perception and requires the surgeon to look away from the patient. The instruments are longer than those used in open surgery, which reduces tactile feedback. Range of motion is limited by the abdominal wall trocars, and the fulcrum effect creates a paradoxical motion such that moving the hand down moves the instrument up.2–4

Laparoscopic simulators are increasingly used to train these and other surgical skills within surgical residency programs. Simulation offers many advantages over the traditional apprenticeship model. It is a safe way for learners to obtain new motor skills without putting patients at risk and allows a learner to practice skills when it is convenient. Virtual reality simulators offer immediate and objective feedback without the need for an instructor to be present. Furthermore, integrating simulation into the curriculum provides a means of standardizing skills training among residents.

Many commercially available laparoscopic simulators are used in residency programs, and many have undergone validity testing, including the LapSim5 and the Minimally Invasive Surgical Trainer Virtual Reality (MIST-VR).6–8 Such construct testing has shown that these simulators are sensitive to performance differences between expert and novice surgeons and that novices' performance improves over repeated trials. A few studies have also demonstrated that surgical skills acquired via simulators transfer to animal models and human patients in the actual operating room.8–10

The Medical Education Technologies, Inc. SurgicalSIM laparoscopic simulator is also widely marketed and is relatively popular because of its low cost and portability. However, the SurgicalSIM has yet to undergo validity testing. Although the simulator is somewhat similar to others previously tested, the content of its training modules and the manner in which information is visualized differ from those of other devices. Therefore, it is important that the simulator and its associated performance metrics be validated before it is used to teach and assess surgical skills.

The objective of the present study was to take the first step in validating the SurgicalSIM laparoscopic surgical simulator by determining whether it could discriminate between the performance of experts and novices and whether performance of novices would change over trials. If the simulator has construct validity, experts should perform better than novices at baseline, and novices should improve with continued training.

Methods

Twenty-five participants were recruited, including 20 novices and 5 experts. Novices were first through third year medical students. Experts were general surgeons who had performed a minimum of 200 laparoscopic procedures. The Institutional Review Board reviewed the protocol and marked it exempt; informed consent was obtained from each participant.

The SurgicalSIM from Medical Education Technologies, Inc. is a virtual reality laparoscopic simulator that uses software run on a personal computer using Windows XP. The user watches a 17-inch monitor while manipulating two laparoscopic instrument handles, part of a virtual laparoscopic interface (SurgicalSIM Education Platform, SimSurgery). Additional product information and images of the simulator can be obtained from

The SurgicalSIM offers six basic tasks that mimic components of laparoscopic procedures used in the “real world.” Only four of these simulated tasks were used in the present study (place arrow, retract, dissect, and traverse tube). The two basic tasks that were not included were the gallbladder dissection and clip application modules. The gallbladder dissection module requires the integration of several surgical skills and therefore did not allow the acquisition of any individual skill to be assessed. Limitations with the metrics for the clip application module restricted data collection for this task. The four tasks that were used are described briefly below.

The “place arrow” module involves the grasping of two ends of a virtual arrow and then moving that arrow into the same simulated three-dimensional plane as another virtual arrow which is randomly placed in the viewing screen. A total of five arrows must be properly moved. Errors include dropping, stretching or compressing the arrow and grasping the arrow in the incorrect position.

For the “retract” module, the participant uses alternating hands to properly retract five pieces of tissue. Errors include excessive traction, unsteady hold, and dropping the tissue.

The “dissect” module involves the participant retracting a piece of tissue with one hand while using the cautery (by pressing on a foot pedal) to separate the tissue from the underlying surface. Ten areas must be dissected, and instruments are alternated between hands. Errors include excessive traction, unsteady hold, dropping tissue, and cauterizing on nontarget or dealigned tissue.

For the “traverse tube” module, the subject uses two graspers to alternately grip a tube in specified regions. The subjects must “walk” their way up and down the tube five times. Errors include grabbing outside the target and dropping the tube.

Each subject was given a brief scripted introduction to the SurgicalSIM simulator and to the assigned task(s); no practice was allowed before the start of data collection. To establish baseline performance of experts on the simulator, the five expert surgeons completed five trials for each task. The twenty novices were randomly assigned to one of the four task conditions (n = 5 in each task group). Novice participants completed five trials of their assigned task to establish baseline performance and then completed an additional 35 trials to assess the potential impact of practice. Novices were assigned to only one task because of time constraints (forty trials of testing took roughly 2 hours), as well as to avoid any carry-over effects (practice and/or fatigue) across modules.

Statistical Analysis

The dependent variables collected in this study were trial completion time (sec) and number of errors made per trial. Experts and novices were compared for differences in median baseline performance (first five repetitions) using the Kruskal-Wallis (KW) nonparametric test. Learning curves of novices across the additional testing trials were also examined, and asymptotic performance levels were compared with the baseline performance of experts. A nonparametric approach was used because some dependent variables (eg, errors) were skewed frequency counts. Furthermore, the KW test is essentially equivalent to a t test for continuous variables when assumptions of normality are met. Across all tests, a P value of ≤0.05 is implied when an effect is referred to as significant.
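The baseline comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function name and the completion times are hypothetical, and the P value uses the standard chi-square approximation with one degree of freedom, which the source does not confirm was the method used.

```python
import math


def kruskal_wallis_two_groups(a, b):
    """Kruskal-Wallis H and approximate P value for two independent groups.

    Ties receive midranks and H is divided by the usual tie correction.
    The P value is the upper tail of a chi-square distribution with 1 df,
    which equals erfc(sqrt(H / 2)).
    """
    # Pool the observations, remembering which group each came from.
    pooled = sorted((value, group) for group, values in enumerate((a, b))
                    for value in values)
    n = len(pooled)
    rank_sums = [0.0, 0.0]
    tie_term = 0
    i = 0
    while i < n:
        # Find the run of equal values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    if tie_term == n ** 3 - n:
        return 0.0, 1.0  # every observation is identical
    h = (12 / (n * (n + 1))
         * (rank_sums[0] ** 2 / len(a) + rank_sums[1] ** 2 / len(b))
         - 3 * (n + 1))
    h = max(h, 0.0) / (1 - tie_term / (n ** 3 - n))
    p = math.erfc(math.sqrt(h / 2))
    return h, p


# Hypothetical completion times in seconds (NOT data from this study).
novice_times = [101, 112, 120, 131, 150]
expert_times = [45, 50, 62, 70, 80]
h, p = kruskal_wallis_two_groups(novice_times, expert_times)
print(f"H = {h:.2f}, P = {p:.3f}")
```

With two groups of five and no overlap between them, as in this hypothetical example, the statistic is about 6.82 and the approximate P value about 0.009, the smallest value this approximation can return for groups of this size. That appears consistent with the P = 0.01 reported for the completion-time comparisons in which the novice and expert ranges do not overlap.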

Results

Baseline Performance

Time to Completion

Figure 1 shows the baseline performance (first five repetitions) for novices and experts for time to completion for all four modules. The figure indicates that completion times were generally shorter for experts across all tasks except the retract task. After the third repetition of each task, performance was relatively stable for the experts but continued to improve in the novice group. Group differences were significant for the place arrow task (novices: median = 133 seconds, range 109–158; experts: median = 59 seconds, range 39–75; P = 0.01), the dissect task (novices: median = 180 seconds, range 147–230; experts: median = 121 seconds, range 73–145; P = 0.01), and the traverse tube task (novices: median = 272 seconds, range 190–299; experts: median = 94 seconds, range 43–155; P = 0.01), but not for the retract task (novices: median = 68 seconds, range 57–97; experts: median = 70 seconds, range 49–84; P = 0.83).

Figure 1.

Error Frequency

Figure 2 shows the median baseline performance (first five repetitions) for novices and experts for error frequency for all four modules. As with completion times, experts generally made fewer errors than novices across all modules. However, differences in median error scores were significant only for the retract task (novice: median = 1, range 0.4–1.6; expert: median = 0.4, range 0.0–0.8; P = 0.04) and the traverse tube task (novice: median = 6, range 2.4–6.6; expert: median = 1.4, range 0.2–2.6; P = 0.03), but not for the place arrow task (novice: median = 1.8, range 0.2–3.8; expert: median = 0.4, range 0.2–1.6; P = 0.12) or the dissect task (novice: median = 10.4, range 3.0–17.2; expert: median = 5.6, range 1.8–6.2; P = 0.06).

Figure 2.

Learning Curves

Time to Completion

Figure 1 shows the novices' time to completion curves for each of the modules. For all four tasks, the novices improved with practice and reached a level of performance equal to the baseline performance of experts. The rate of learning appeared to vary somewhat across modules, with novices reaching the experts' baseline either quickly (retract and dissect) or slowly (traverse tube). After 10 trials, the differences between novices and experts were not significant on any task. For the retract task (on which there were no differences in baseline performance between novices and experts), novices were able to exceed the performance of experts; however, this difference was not significant.

Error Frequency

Figure 2 shows the novices' learning curves for each of the modules concerning error frequency. In contrast to completion times, error frequency continued to vary considerably across trials and no consistent improvement was apparent despite repeated practice.

Discussion

Despite low power due to small sample size, the SurgicalSIM laparoscopic surgical simulator was able to detect differences between the performance of experts and novices on three of four tasks with regard to time to completion, and on two tasks when measuring error frequency. These findings provide partial support concerning the validity of the SurgicalSIM simulator, in terms of the motor skills it teaches and the metrics it uses to measure performance.

Not all of the data, however, support this conclusion. For example, on the retraction module, novices completed the task slightly faster than experts at baseline, and novices' completion times tended to drop below those of experts with training (although the differences were not significant). The reason for this is not clear. One possible explanation is the speed versus accuracy tradeoff; novices may sacrifice accuracy to improve their completion times, while experts recognize that avoidance of errors is more important than speed. The present study does support this theory; although the experts were slightly slower than the novices on the retraction module (70 versus 68 seconds), they committed fewer errors (0.4 versus 1 error/trial). Ro et al. also found that experts did not always outperform novices when measuring the validity of a different laparoscopic simulator. The authors postulated that the experts in their study may have been misled by the lack of haptic feedback, while the novices did not have enough experience with laparoscopic surgery to know the difference.11 The relatively low fidelity of the machine may have resulted in a lack of face validity that had a more apparent effect on the experts.11,12 During actual laparoscopic surgery, proper retraction requires that the operator maintain a fine balance of adequate tension on the tissues to facilitate dissection without excessive traction that may inadvertently damage the structures; this is accomplished through the visual and tactile cues that the operator sees and feels during the procedure. The lack of haptic feedback on the SurgicalSIM simulator may indeed have hindered the experts on this task. It may also reduce the transferability of acquired skills to the real operating room environment. It should be noted that although the simulator does not offer tactile feedback, it does provide some visual cues to the trainee; for example, the tissues transform when they are touched with the instruments.

The novices' learning curves demonstrate an improvement in time to completion with practice for all of the modules, attaining or at least approaching the experts' baseline performance. Some studies have attempted to determine the number of repetitions needed to master a given module on the MIST-VR13,14 and other simulators,15 ranging from 2 to 32 repetitions. Brunner et al.16 found that novices in their study continued to improve their scores after even thirty repetitions. They concluded that it is better to base expected performance standards on the needs and abilities of each individual learner. Setting arbitrary numbers of repetitions to be performed will result in over-training some and under-training others. Expert scores may be a better standard than number of repetitions for this reason, and it may motivate the learner as well.16–18
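The idea of training to an expert-derived standard rather than to a fixed number of repetitions can be sketched as follows. This is only an illustration under stated assumptions: the function name, the requirement of two consecutive qualifying trials, and the learner's times are all hypothetical. Only the criterion of 59 seconds is taken from this study (the experts' median baseline on the place arrow task).

```python
def trials_to_criterion(times, criterion, consecutive=2):
    """Return the 1-based trial on which the learner first completes
    `consecutive` trials in a row at or below `criterion`, or None if
    the criterion is never met.

    The consecutive-trials rule is an assumption made for this sketch;
    the source does not define a proficiency rule.
    """
    run = 0
    for trial, t in enumerate(times, start=1):
        run = run + 1 if t <= criterion else 0
        if run == consecutive:
            return trial
    return None


# Hypothetical completion times in seconds for one learner (NOT study data).
learner_times = [140, 120, 95, 58, 70, 57, 55, 60]
expert_median = 59  # experts' median baseline, place arrow task
print(trials_to_criterion(learner_times, expert_median))
```

Under such a rule, each learner stops when the standard is met, so a fast learner is not over-trained and a slow learner is not cut off at an arbitrary repetition count.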

We found that despite performing forty repetitions of a given module, the number of errors committed by the novices did not consistently improve with practice. Some authors have suggested that error rates may be more of a reflection of innate ability than an acquired skill of the learner.19,20 However, Aggarwal has suggested that the real problem may lie in the design of the error metrics. There is often considerable difficulty in clearly defining exactly what constitutes a surgical error and the metrics may reflect that difficulty.21 Although the novices in this study were encouraged to take their time and look over their simulator-generated results after each repetition, they were not given any direct feedback or instruction from the researchers (authors) as they progressed through the 40 repetitions. This may have contributed to the finding that novices' error frequency did not steadily improve with practice. Educators agree that feedback is one of the most valuable features of simulation-based education and is necessary for effective learning.22

In this study, we used the average value for the first five repetitions of a given module to measure baseline performance. Although some authors have considered only the first one or two repetitions to be the baseline performance, we agree with Maithel23 that even expert participants need more than one repetition to get oriented to the simulator and to “warm up” to their actual baseline. This is depicted in Figures 1 and 2. Experts' performance improved over the first few repetitions; however, performance appeared to plateau for some tasks before the completion of the five repetitions. Whether the experts would have continued to improve their completion times with additional practice while keeping the number of errors to a minimum is unknown. It should also be noted that while each of our five experts met our criteria for participation (having performed a minimum of 200 laparoscopic procedures), there was a great deal of variability in their performance times and the number of errors they committed, and this may reflect differences in skill level among them.
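One way to read the baseline measure is as a two-step summary: each participant's first five repetitions are averaged, and the group is then described by the median and range of those averages. The sketch below assumes that reading, which the source does not spell out, although the reported error values (all multiples of 0.2) are consistent with averages over five trials. The function name and the error counts are hypothetical.

```python
from statistics import mean, median


def baseline_summary(trials_by_participant, n_baseline=5):
    """Summarize baseline performance for one group.

    Assumed procedure: average each participant's first `n_baseline`
    trials, then report the median, minimum, and maximum of those
    per-participant averages.
    """
    per_participant = [mean(trials[:n_baseline])
                       for trials in trials_by_participant]
    return median(per_participant), min(per_participant), max(per_participant)


# Hypothetical error counts per trial for five participants (NOT study data).
errors = [
    [1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1],
    [2, 2, 1, 1, 1],
    [0, 1, 1, 0, 1],
    [2, 1, 1, 1, 0],
]
group_median, lowest, highest = baseline_summary(errors)
print(f"median = {group_median}, range {lowest}-{highest}")
```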

This study is not without its limitations. The small size of the study groups limits the power to find statistically significant differences between the experts and the novices. This is most obvious in Figure 2, where despite large median differences in error frequency between the two groups on all tasks, the differences were statistically significant on only two of the tasks. However, the sample size used in this study was comparable to many in the published literature.24 In addition, novices in this study were medical students who responded to a request for volunteers. As such, they may have been more motivated than the average student who is inexperienced with laparoscopy. It should be noted, however, that medical students were not informed that their performance would be compared with that of expert surgeons, potentially reducing the competitive “John Henry” effect, in which members of a control group perform beyond expectations because they perceive that they are in competition with an experimental group. Finally, for novices, all repetitions of the tasks were performed in a single massed session rather than a series of sessions. One report suggests that intermittent practice may be more beneficial than massed practice.10 Therefore, a significant reduction in errors over trials might be achieved with spaced rather than massed practice. However, additional research is needed to explore this issue.

It is important to note that the transferability of the skills learned on the SurgicalSIM to the operating room has not yet been determined. Practice on other commercially available laparoscopic simulators has been correlated with improvement in operating room performance as measured by objective data9 (decreased errors and time needed to complete a laparoscopic cholecystectomy), as well as subjective performance with a global assessment scale score.8,25 Additional studies are necessary to assess this aspect of validity of the SurgicalSIM surgical simulator.

References

1. Hunter J: Advanced laparoscopic surgery. Am J Surg 1997;173:14–18.
2. Gallagher A, McClure N: An ergonomic analysis of the “fulcrum effect” in the acquisition of endoscopic skills. Endoscopy 1998;30:617–620.
3. Jones D, Brewer J: The influence of three-dimensional video systems on laparoscopic task performance. Surg Laparosc Endosc 1996;6:191–197.
4. Crothers I, Gallagher A, McClure N et al.: Experienced laparoscopic surgeons are automated to the “fulcrum effect”: an ergonomic demonstration. Endoscopy 1999;31:365–369.
5. Woodrum D, Pamela B, Yellamanchilli R et al.: Construct validity of the LapSim laparoscopic surgical simulator. Am J Surg 2006;191:28–32.
6. Gallagher A, Satava R: Virtual reality as a metric for the assessment of laparoscopic psychomotor skills: learning curves and reliability measures. Surg Endosc 2002;16:1746–1752.
7. Taffinder N, Sutton C: Validation of virtual reality to teach and assess psychomotor skills in laparoscopic surgery: results from randomized controlled studies using the MIST VR laparoscopic simulator. Stud Health Technol Inform 1998;50:124–130.
8. Grantcharov T, Kristiansen V, Bendix J et al.: Randomized clinical trial of virtual reality simulation for laparoscopic skills training. Br J Surg 2004;91:146–150.
9. Seymour N, Gallagher A, Roman S et al.: Virtual Reality Training Improves Operating Room Performance. Ann Surg 2002;236:458–464.
10. Mackay S, Morgan P, Datta V et al.: Practice distribution in procedural skills training. Surg Endosc 2002;16:957–961.
11. Ro C, Toumpoulis I: The LapSim: a learning environment for both experts and novices. Stud Health Technol Inform 2005;111:414–417.
12. Aggarwal R, Grantcharov T, Moorthy K et al.: A competency-based virtual reality training curriculum for the acquisition of laparoscopic psychomotor skill. Am J Surg 2006;191:128–133.
13. Gallagher A, McClure N, McGuigan J et al.: Virtual reality training in laparoscopic surgery: a preliminary assessment of minimally invasive surgical trainer virtual reality (MIST VR). Endoscopy 1999;31:310–313.
14. Grantcharov T, Bardram L, Funch-Jensen P, Rosenberg J: Learning curves and impact of previous operative experience on performance on a virtual reality simulator to test laparoscopic surgical skills. Am J Surg 2003;185:146–149.
15. Scott D, Young W, Tesfay S et al.: Laparoscopic skills training. Am J Surg 2001;182:137–142.
16. Brunner W, Korndorffer J, Sierra R et al.: Laparoscopic virtual reality training: Are 30 repetitions enough? J Surg Res 2004;122:150–156.
17. Brunner W, Korndorffer J, Sierra R et al.: Determining standards for laparoscopic proficiency using virtual reality. Am Surg 2005;71:29–35.
18. Korndorffer J, Scott D, Sierra R et al.: Developing and testing competency levels for laparoscopic skills training. Arch Surg 2005;140:80–84.
19. Valentine R, Rege R: Integrating technical competency into the surgical curriculum: doing more with less. Surg Clin North Am 2004;84:1647–1667.
20. Macmillan A, Cuschieri A: Assessment of innate ability and skills for endoscopic manipulations by the Advanced Dundee Endoscopic Psychomotor Tester: predictive and concurrent validity. Am J Surg 1999;177:274–277.
21. Aggarwal R, Grantcharov T, Eriksen J et al.: An evidence-based virtual reality training program for novice laparoscopic surgeons. Ann Surg 2006;244:310–314.
22. Issenberg S, Mcgaghie W, Petrusa E et al.: Features and uses of high fidelity medical simulations that lead to effective learning: a BEME systematic review. Med Teacher 2005;27:10–28.
23. Maithel S, Sierra R, Korndorffer J et al.: Construct and face validity of MIST-VR, Endotower, and CELTS. Surg Endosc 2006;20:104–112.
24. Sutherland L, Middleton P, Anthony A et al.: Surgical simulation: a systematic review. Ann Surg 2006;243:291–300.
25. Scott D, Bergen P, Rege R et al.: Laparoscopic training on bench models: better and more cost effective than operating room experience? J Am Coll Surg 2000;191:272–283.
© 2007 Society for Simulation in Healthcare