Success in ophthalmologic surgery depends partly on the surgeon’s acquired microsurgical competence and breadth of experience. To ensure safe, high-quality treatment, it is generally agreed that surgical training should be structured and that skills should be assessed at regular intervals.1–3 Training in the operating room under the apprenticeship model is protracted and potentially precarious.4 Moreover, factors such as medico-legal considerations, reduced resident working hours, and reduced operating room time have created a need for other types of operative training.5 Therefore, many training programs have established coursework that precedes training in the operating room.
Although surgical wet-labs are critical aids, their usefulness is limited by practical considerations such as logistical setup, inadequate tissue modeling, and specimen acquisition and disposal. Furthermore, surgical skills are difficult to assess objectively in vivo and in the laboratory because reliability and validity are hard to measure. It is also challenging to obtain the necessary number of repetitions of training situations. The Accreditation Council for Graduate Medical Education sets the minimum operative requirement for ophthalmology residents as primary cataract surgeon at 86 cases.
Although current ophthalmology residents graduate with ∼120 to 150 microsurgical cases to their credit, some reports suggest that surgical maturity is not reached until the surgeon has attained 400 or even 1000 cases.6,7 Surgical simulators offer a theoretical potential for acquiring, improving, maintaining, and measuring these skills.8–16
The commercially available EyeSi virtual reality (VR) intraocular surgical simulator allows residents to practice and develop intraocular techniques in a controlled, nonstressful environment, and gives experienced ophthalmologists the opportunity to maintain those skills.5,16 Virtual simulators have been developed in surgical subspecialties other than ophthalmology, such as otolaryngologic and laparoscopic gastrointestinal surgery, and have been shown not only to be objective skills assessment tools and effective training tools, but also to decrease morbidity and the risk of iatrogenic trauma.1,12,17 No ophthalmic simulator system has yet been widely used as an aid in residency or training programs, partly because the applicability of such systems to the learning process of ophthalmologic surgical training has not been established.10,15,16
VR surgical simulators require validation before implementation into a curriculum, particularly to determine whether they can differentiate between levels of skill. Although an early prototype was tested successfully and demonstrated its potential as a skills assessment tool, we assessed the EyeSi simulator to establish its validity as an educational and assessment apparatus by comparing scores between experts and novices in intraocular surgery.14 Our hypothesis is that performance scores obtained on the EyeSi navigation task by expert intraocular surgeons are significantly higher than the performance scores of novice intraocular surgeons.
This prospective study was conducted at Madigan Army Medical Center. The protocol and Health Insurance Portability and Accountability Act-compliant informed consent forms were approved by the Madigan Army Medical Center institutional review board. Each subject gave written informed consent to participate in the study. A group of 25 ophthalmology residents and attending physicians from the Departments of Ophthalmology at the University of Washington and at Madigan Army Medical Center were enrolled in this study. Each of the candidates performed standardized navigational tasks on the EyeSi microsurgical simulator. At the time that this study was conducted, the manufacturer provided only software for basic navigational tasks. To reduce familiarity bias, no prior training was permitted; however, all the participants were familiarized with the controls of the simulator, its features, the task to be performed, and the possible errors of the procedure. To gain hands-on practice in the assignment before the scoring runs, all participants, novices and experts alike, performed a trial run consisting of three repetitions of only the first level of navigation training.
Microsurgical staff ophthalmologists, with a median of 9.7 (range 6.0–13.4) years microsurgical experience, comprised the “expert” group (n = 7), whereas interns, residents, and nonmicrosurgical staff ophthalmologists with a median of 0.2 (range 0–0.6) years of current, contiguous microsurgical experience were selected for the “novice” group (n = 18). Each staff ophthalmologist included in the novice group had been inactive as a microsurgeon for at least 6 (median 8.25, range 6–10) years.
The EyeSi simulator (VRMagic, Germany) is a VR simulator developed for training in intraocular surgical skills. It consists of a model surgical microscope, a computer processor, and a model head to simulate the experience of real intraocular surgery, represented by stereovision through the microscope and reactive virtual tissue (Fig. 1). When performing virtual eye surgery, the surgeon looks into a viewer wired to show simulated images and uses tools inserted into an artificial eye beneath the viewer, all positioned to mimic real surgery. Through the viewer, one sees exactly what one would see through a microscope during real surgery; the handheld instruments seem to be the appropriate tools.
Using a basic microdexterity module provided with the simulator, subjects completed five iterations of a four-level navigational task (Navigational Training, levels 1–4; Fig. 2). Each candidate was asked to bimanually introduce an intraocular light source and probe, and then impale several spheres floating in the vitreous. At the first level, the subject located and impaled spheres arranged in a simple helical pattern. The complexity of the task increased with each increment in level so that by the fourth level, the spheres were smaller, closer to the retina, and randomly scattered. Manufacturer-provided default performance thresholds served as performance gates for successfully advancing to the next level of difficulty. Testing was terminated when each participant had successfully and consecutively completed five iterations of each cycle of the four levels of difficulty, not necessarily at a single sitting.
Throughout the task, several parameters were measured by the simulator, including time to complete the task, total path length traveled by tool tip during task, and surgical error, which comprised retinal contacts, phototoxicity, and lens contacts. All parameters were calculated by the simulator as points deducted. The “time error” score equaled the points deducted for not achieving the minimum time to completion possible for each level. The “odometer error” score equaled the points deducted for not achieving the minimum distance traveled possible for each level. The “other error” score equaled the sum of the points deducted for each retinal contact, retinal phototoxicity, or lens contact. The “total error” score represented the sum of all error scores, including time error, odometer error, and other error. Data of the five consecutive runs were averaged for each individual and for each separate parameter. Results were compared between the expert and novice surgeons.
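The scoring scheme described above can be illustrated with a minimal Python sketch. It assumes only what the text states: each run yields three deduction scores, the total error is their sum, and each parameter is averaged over a participant's five scored runs. The `Run` class, the `average_scores` helper, and all numeric values are hypothetical and are not taken from the simulator software or the study data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    """Points deducted on one scored run (hypothetical container)."""
    time_error: float      # deducted for exceeding the minimum possible time
    odometer_error: float  # deducted for exceeding the minimum path length
    other_error: float     # retinal contacts + phototoxicity + lens contacts

    @property
    def total_error(self) -> float:
        # Total error is the sum of the three component error scores.
        return self.time_error + self.odometer_error + self.other_error


def average_scores(runs: list[Run]) -> dict[str, float]:
    """Average each parameter over one participant's scored runs."""
    return {
        "time_error": mean(r.time_error for r in runs),
        "odometer_error": mean(r.odometer_error for r in runs),
        "other_error": mean(r.other_error for r in runs),
        "total_error": mean(r.total_error for r in runs),
    }


# Five made-up runs for a single participant (illustrative values only).
runs = [
    Run(20.0, 10.0, 4.0),
    Run(16.0, 8.0, 2.0),
    Run(12.0, 6.0, 2.0),
    Run(10.0, 6.0, 0.0),
    Run(12.0, 5.0, 2.0),
]
averages = average_scores(runs)
print(averages)
```

Because the total is a plain sum, the averaged total error equals the sum of the three averaged component scores, which is a convenient consistency check on tabulated results.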
The means, standard deviation (SD), and 95% confidence intervals were computed based on the recorded scores from each task. Mean time error score, odometer error score, other error score, and total error score for each group were then plotted for each iteration of the task and the curves were compared. Data were analyzed using StatView (version 4.57). The repeated measures analysis of variance test was used to compare the differences between the baseline and final performance of the expert and novice group over all levels by iteration. Post hoc analysis was performed using Student unpaired t test at each iteration to analyze the differences between group performances. Proportions were compared using χ2 analysis (or Fisher exact test, where appropriate). For all tests, a P value of less than 0.05 was considered significant.
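As a rough illustration of the descriptive statistics and the post hoc comparison named above, the following Python sketch computes a mean, sample SD, and 95% confidence interval for one group, and the pooled-variance (Student) unpaired t statistic for two groups. The analysis in the study was run in StatView; this sketch is not that software's procedure. The function names and the scores are invented for illustration, and the t critical value is passed in by the caller (2.776 is the standard two-sided 95% value for 4 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def summarize(scores, t_crit):
    """Return mean, sample SD, and a 95% CI from a supplied t critical value."""
    m, sd, n = mean(scores), stdev(scores), len(scores)
    half_width = t_crit * sd / sqrt(n)
    return m, sd, (m - half_width, m + half_width)


def student_t(a, b):
    """Unpaired Student t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Made-up error scores at a single iteration (not study data).
expert = [10.0, 12.0, 11.0, 9.0, 13.0]
novice = [20.0, 24.0, 18.0, 22.0, 26.0, 16.0]

expert_mean, expert_sd, expert_ci = summarize(expert, t_crit=2.776)  # df = 4
t_stat, df = student_t(expert, novice)
print(expert_mean, expert_sd, expert_ci)
print(t_stat, df)
```

The P value would then be read from a t distribution with `df` degrees of freedom, for example with a statistics package; that step is omitted here to keep the sketch dependency-free.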
Twenty-five surgeons and surgical residents/interns participated in this study. Table 1 shows the demographics of the participants. The mean age of the surgeons in the expert group was 41 years (range 33 to 48 years). The mean age of the participants in the novice group was 32.6 years (range 27 to 50 years). The age distribution therefore differed significantly between groups (t test, P = 0.008). The median microsurgical experience was 9.7 years in the expert group and 0.2 years in the novice group, a significant difference (t test, P = 0.0005). The expert group was 14.3% women and the novice group 22.2% women; sex was thus relatively equally represented between groups (Fisher exact test, P > 0.9999). Handedness was also relatively equally represented, with 85.7% right-handedness in the expert group and 94.4% in the novice group (Fisher exact test, P = 0.49).
Expert intraocular surgeons completed each task with significantly fewer attempts (1.029 versus 1.506 over all levels) than the novice surgeon cohort (t test, P = 0.0096).
Table 2 shows the error scores, as calculated across each task. Using the repeated measures analysis of variance test, a difference in the time error score was found between the two groups from baseline to final across all iterations (P < 0.05). Post hoc analysis showed that this difference could be attributed to a significant difference between the mean scores of the expert and novice groups at each iteration.
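The Fisher exact comparisons of sex and handedness can be sketched in Python as follows. The cell counts are an assumption: they are reconstructed here from the reported percentages and group sizes (1 of 7 experts and 4 of 18 novices were women; 6 of 7 experts and 17 of 18 novices were right-handed) rather than taken from Table 1. The function name is hypothetical, and the two-sided P value uses the common convention of summing the probabilities of all tables no more likely than the observed one, which may differ from the convention StatView applies.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one
    # (the small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Rows: expert, novice. Counts reconstructed from the reported percentages.
p_sex = fisher_exact_two_sided(1, 6, 4, 14)    # women, men
p_hand = fisher_exact_two_sided(6, 1, 17, 1)   # right-handed, left-handed
print(round(p_sex, 4), round(p_hand, 2))
```

Under these assumed counts the sketch gives P values of about 1.0 for sex and 0.49 for handedness, which agrees with the values reported in the text.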
In terms of the odometer error score, a statistically significant difference between groups across all iterations was seen. However, post hoc analysis for each iteration revealed that the statistically significant difference between groups exists only at baseline and is lost by the second iteration.
The other error score was neither significantly different between the groups across all iterations, nor at any particular iteration.
The expert group had a smaller SD for time error, odometer error, and total error for the task. Additionally, the novice group took 2.5 times more attempts to pass the first level of the task than the expert group, a statistically significant difference (Student t test, P = 0.0204). As both the expert group and the novice group performed more iterations of the task, the SD decreased (Table 2).
Plotting the error score versus iteration produces a learning curve for each group. For time error score results, there is a significant difference between the groups throughout all iterations of the task (Fig. 3). As the number of iterations increases, the curve for the expert group remains relatively flat with minimal improvement in scores, whereas in comparison the novice group shows a steep decline and reaches a relative plateau around the third iteration.
The curve for the odometer error score results shows a significant difference between the expert and novice group at baseline (Fig. 4). The curve for the expert group remains flat throughout the iterations, whereas the curve for the novice group rapidly improves, attaining the performance level of the experts.
The performance of a subset of staff ophthalmologists (n = 4) who had been inactive as microsurgeons for a median of 8.25 (range 6–10) years was analyzed separately, and this subset was labeled the “intermediate” group. Intermediate surgeons were then compared with novices (residents or interns) and surgical experts (practicing ophthalmic microsurgeons). At the first iteration, intermediate surgeons outperformed novice surgeons, with statistically significantly lower error scores. Moreover, at the first iteration, intermediate surgeons performed at expert surgeon level; there was no statistically significant difference between intermediate and expert group performance. On the basis of the final performance, however, intermediate surgeons performed more similarly to novice surgeons. By the final iteration there was no significant difference between intermediate and novice group performance.
When tested using the EyeSi surgical simulator, expert surgeons showed a greater facility with microsurgical tasks, but with repeated practice, novice surgeons showed significant improvement in all performance scores. Expert intraocular surgeons performed significantly faster and successfully completed the task with fewer attempts than the novice surgeon cohort. After several iterations, novices showed significantly shorter task completion times, shorter instrument path lengths, and a number of attempts to complete the task equivalent to that of experts. The main improvements occurred after the first 2 iterations, indicating rapid familiarization and a training effect.
On the task, the expert group performed significantly better than the novice group, at least after the first iteration for most measured parameters. Given that expert surgeons had significantly more years of intraocular surgical experience than novice surgeons, the results indicate that the EyeSi is able to discern between differing levels of experience and is therefore a valid measure of basic skill. One could conclude that, with practice, the EyeSi could improve novice performance to expert levels.
Additionally, Table 2 shows, from baseline to final iteration of the task, an overall decrease in variance around the mean error scores. This pattern of improvement suggests that the simulator objectively increased microsurgical performance. However, this relationship was more apparent for the novice group than the expert group, as the variance of expert performance remained relatively constant with a minimal amount of variation. One could foresee constructing a curriculum in which novices attempt to match expert performance not only by raw score, but also by matching the expert’s low pattern of variance.
The results demonstrate that time performance is a valid and predictable measure of improvement. Time error scores improved significantly from baseline to final iteration in both expert and novice groups; however, this significant improvement may not necessarily indicate an improvement in the quality of surgical skills. Throughout the study, subjects quickly grasped how to hasten the completion of the module, thereby improving performance scores, using schemes such as delaying the start of the simulator’s internal stopwatch and bypassing the initial step of focusing the scope, working instead when out of focus, without penalty. In reality, these methods may produce only an apparent improvement in performance, and may correlate more strongly with a lack of attention to precision than with efficient surgical practice. Moreover, these tactics may account for the small improvement in expert surgical performance seen in the graphic depiction of their relatively flat learning curve. On the basis of these observations, the manufacturer of the EyeSi has modified the module’s software to require more critical focusing before a sphere can be impaled, and to make “failed to maintain critical focus” a separate error score. Additionally, the clock now begins immediately when the simulation commences. Nevertheless, time error is a valid factor for measuring surgical skill, though it should not be the only measure of it.
The odometer error score, on the other hand, evaluates intraocular instrument path length while finishing the task and is meant to directly represent the participant’s efficiency and purposefulness of movements. Represented by a steep learning curve, the results show that only the novice group demonstrated a significant improvement in odometer error score from baseline to final iteration, and may indicate that the novice group rapidly comprehends the importance of economy of movement as their performance nears the expert benchmark. The flat learning curve of the expert group possibly reflects a mastered skill. Economy of movement, reflecting surgical efficiency that comes from purposeful, controlled, and precise movements, is a difficult concept to teach and attain; however, it seems that this skill is rapidly acquired using the EyeSi.
Moreover, a statistically significant difference in odometer error scores between the two groups throughout all iterations would be expected. However, a significant difference in odometer error scores between the two groups existed only at baseline, which indicates that odometer error scores, overall, are not a valid marker of skill. Possibly, the subjects were more focused on successfully completing the task in a timely fashion and less on meticulousness with the instrument tip. Additionally, it is possible that the manufacturer-provided default performance gates for advancing to the next level of difficulty in the simple navigational task used in this study were too easily mastered by the novice surgeons. Perhaps a more complicated task would highlight the differences between the novice and expert surgeons, thereby making the odometer error score a more valid indicator of performance. Further refinements of software modules and the addition of procedure-based software modules may enhance the validity of this platform.
On the task, other errors, which consisted of the sum of the points deducted from each retinal contact, retinal phototoxicity, or lens contact, were statistically lower for the expert group, at least after the first iteration (Figs. 5 and 6). Although the expert outperformed the novice group on most other measured parameters, there was a statistically insignificant trend by the final iterations that the novice group performed better with fewer errors, despite the expert group scoring a significantly lower other error score initially. This difference is possibly a consequence of a time accuracy tradeoff, where perhaps the expert group forfeited meticulousness for speed, rather than an effect of a limited validity of these metrics.
Intermediate surgeons, those who had been inactive as microsurgeons for a median of 8.25 years, tended to outperform novice surgeons and to perform at expert surgeon level in most tasks, at least initially; however, they performed more similarly to novice surgeons in terms of final performance and learning curve. The intermediate surgeons’ performance pattern suggests that a particular skill will be lost if not used on a regular basis. Further studies on the EyeSi may elucidate the rate at which a particular skill can be regained.
The EyeSi VR simulator provides a useful and valid objective assessment tool for differentiating microsurgical skills. Data indicate differences between expert and novice surgeon performance and show improvement in performance with training. These distinctions between expert and novice skill level could potentially lead to the construction of a surgical curriculum so that novice surgeons are challenged to attain expert performance benchmarks. Manufacturers of microsurgical simulators are continually making refinements in their equipment which should improve the ability of this type of simulation to accurately assess skill level. Additional research is required to confirm or deny the similarity between actual and simulated surgical tasks and the relevance of VR surgical simulation to surgical skill assessment and training.
1. Aggarwal R, Darzi A. Surgical education and training in the new millennium. Surg Endosc
2. Emkin JL, McDougall EM, Clayman RV. Training and assessment of laparoscopic skills. J Soc Laparoendosc Surg
3. Villegas L, Schneider BE, Callery MP, et al. Laparoscopic skills training. Surg Endosc
4. Bridges M, Diamond DL. The financial impact of teaching surgical residents. Arch Surg
5. Haluck RS, Krummel TM. Computers and virtual reality for surgical education in the 21st century. Arch Surg
6. Martin KR, Burton RL. The phacoemulsification learning curve: per-operative complications in the first 3000 cases of an experienced surgeon. Eye
7. Ng DT, Rowe NA, Francis IC, et al. Intraoperative complications of 1000 phacoemulsification procedures: a prospective study. J Cataract Refract Surg
8. Laurell CG, Soderberg P, Nordh L, et al. Computer-simulated phacoemulsification. Ophthalmology
9. Helveston EM, Coffey AC. Surgical practice kit: ophthalmic suture simulator. Arch Ophthalmol
10. Hikichi T, Yoshida A, Igarashi S, et al. Vitreous surgery simulator. Arch Ophthalmol
11. Hunter LW, Jones LA, Sagar MA, et al. Ophthalmic microsurgical robot and associated virtual environment. Comput Biol Med
12. Kuppersmith RB, Johnston R, Jones SB, et al. Virtual reality surgical simulation and otolaryngology. Arch Otolaryngol Head Neck Surg
13. Peugenet F, Dubois P, Rouland JF. Virtual reality versus conventional training in retinal photocoagulation: a first clinical assessment. Comput Aided Surg
14. Rossi JV, Verma D, Fujii GY, et al. Virtual vitreoretinal surgical simulator as a training tool. Retina
15. Sinclair MJ, Peifer JW, Haleblian R, et al. Computer-simulated eye surgery: a novel teaching method for residents and practitioners. Ophthalmology
16. Verma D, Wills D, Verma M. Virtual reality simulator for vitreoretinal surgery. Eye
17. Schijven M, Jakimowicz J. Face-, expert, and referent validity of the Xitact LS500 laparoscopy simulator. Surg Endosc