Accurate three-dimensional (3D) computer anatomical models have been proliferating as a consequence of the availability of digital resources derived either from mapping the surface of real objects (such as the human skeleton) or from tracing out regions of interest from images of serial slices of the human body. Such models would appear to have obvious educational advantages over the standard three-view book presentation, since the learner can control the position of the object just as if it were a real 3D object in his or her hands.
Surprisingly, these advantages may be more imagined than real. Implicitly, these models assume that the learner can accurately assimilate and remember spatial information from multiple view-points. However, some evidence suggests that humans actually synthesize spatial information presented in oblique orientations by first rotating back to a standard (top, side, front) view, then learning the visual information.1 If so, presenting information in multiple orientations may place a heavy load on the individual's ability to rotate the figure. This load will be seen most acutely in those with relatively poorer spatial ability, since mental rotation is a critical element of spatial ability.2 Consequently, studying projections of a 3D object in an orientation in which an object is best visualized and least obscured (called canonical or key viewpoints) may be all that is necessary. Once learned, the object in memory can be mentally rotated to enable recognition of new objects.
This hypothesis was borne out in the first of a series of studies,3 which contrasted one group where wrist-bone anatomy was presented in multiple views rotating at 15° increments every 20 seconds (the multiple-view group) with a second group that saw only the palmar and dorsal, posterior/anterior (p/a) views for the same time (the key-view group). The test was based on presentations of a skin-covered wrist in rotated positions, with an arrow pointing to a particular bone that was to be identified. All the rotated positions had been seen by the multiple-view group during learning. The multiple-view presentation had a small benefit for learners with good spatial ability, but it substantially handicapped learners with poor spatial ability. Anecdotal evidence from this study showed that participants in the multiple-view group were using a strategy of memorizing key views; 88% reported that they first mentally rotated the image to a standard position, then memorized the names.
To confirm the self-reported strategy of mental rotation and memorizing key views empirically, a second study4 allowed students in one group to control the presentation through multiple views, while those in the control group were restricted to controlling the object in either of the p/a views. Consistent with the post-hoc observations of the first study, learners spent most of their time examining the anterior and posterior views, with small variation around these key views. In this study, there was a small advantage for the multiple-view presentation, although the strategy adopted by the group amounted to spending most of the time on key views, with a significant but small variation around the 0° and 180° presentations. Although this might be interpreted as evidence of the superiority of active learning, an alternative explanation is that, consistent with theory, participants extracted most information from the key views, but deliberately induced a small “wiggle” around the key view to gain some sense of the third dimension.
Both these studies are notable in that they controlled for many of the confounding variables that plague studies comparing media. All presentations and tests were done on the computer for both groups, time of presentation was controlled, adjustment was made for spatial ability, and the test was consistent with the instruction. They lead to two conclusions: (1) When presented in a fixed sequence, multiple views had a small advantage over p/a views for high-spatial-ability students, but a disadvantage for low-ability students. This conclusion is consistent with the self-reports that, when presented with an oblique orientation, students mentally rotate back to the canonical form, with a consequent increase in cognitive load and reduction of performance. (2) When learners can control the presentation, they consistently perform better than do students who have only p/a views, but actually spend most of the time at or near the standard orientations.
What is unclear from the superficially opposite conclusions of the two studies is whether the superiority of the multiple-views/active-control group in the second study derives from active learning, or whether students actually gained information from “wiggling” the views around the canonical presentations to detect the depths of the objects, while not losing the straight-on presentation. The present study addressed this question by permitting both groups to have active control over the orientation, but in one group, control was restricted to ±10° around the key views; in the other, it was unconstrained.
The design was a two-group design with three repeated measures (performance on the test at three occasions) and multiple covariates (described below).
Eighty-seven first-year medical students were randomized in a single-blind fashion to either a multiple-view (MV) group, where students could study the model by positioning it in any of 36 possible angles of rotation, or a key-view and wiggle (KV+W) group, in which they could control six views near 0° and 180° on computer workstations. The study was conducted during orientation week in a two-hour period. All students gave informed consent.
The carpal bones were chosen as the object of study because they represent a spatially complex area, and understanding their relationships to the skin is important in the examination and treatment of fractures. The MV and KV+W models were identical in all respects except in the number of viewing angles from which the model could be seen. The MV model was capable of rotating horizontally under subject control in a complete circle at 10° intervals, for a possible 36 different viewing angles. The KV+W model under subject control rotated only 10° on either side of the 0° and 180° views, for a possible six different viewing angles. The hand model was created in 3D modeling and rendering software, using commercial wireframes from Viewpoint Data Labs and Curious Labs.
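The two viewing conditions can be expressed as sets of permitted rotation angles. The following is a minimal Python sketch, not the software used in the study; the function name `allowed_angles` and the use of the strings "MV" and "KV+W" as condition labels are illustrative assumptions.

```python
def allowed_angles(condition, step=10):
    """Return the sorted viewing angles (degrees) available in a condition.

    Hypothetical helper: "MV" allows a full circle in `step`-degree
    increments; "KV+W" allows each key view (0 and 180 degrees) plus one
    step on either side of it.
    """
    if condition == "MV":
        return list(range(0, 360, step))
    if condition == "KV+W":
        key_views = (0, 180)
        offsets = (-step, 0, step)
        # Wrap negative angles around the circle, e.g. -10 becomes 350.
        return sorted({(k + d) % 360 for k in key_views for d in offsets})
    raise ValueError(f"unknown condition: {condition!r}")


print(len(allowed_angles("MV")))   # 36
print(allowed_angles("KV+W"))      # [0, 10, 170, 180, 190, 350]
```

Every KV+W angle is also an MV angle, which matches the description of the models as identical except for the number of viewing angles.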
Students viewed the skeletal hand on three successive occasions (set 1, set 2, set 3) for three minutes each in a single session. Students in the MV group had complete control over the horizontal orientation of the model using the mouse; the computer recorded each increment so that the amount of time spent on each view was recorded. Students in the KV+W group had control of the six orientations around 0° and 180°.
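The time-per-view record described above could be derived from a log of rotation events. This sketch assumes a hypothetical log format, a time-ordered list of `(timestamp_seconds, angle_degrees)` pairs, and a hypothetical function name `dwell_times`; the study's actual logging format is not described.

```python
from collections import defaultdict


def dwell_times(events, session_end):
    """Total seconds spent at each angle.

    `events` is a time-ordered list of (timestamp_seconds, angle_degrees)
    pairs, each marking the moment the model was turned to that angle.
    `session_end` is the timestamp at which the viewing period stopped.
    """
    totals = defaultdict(float)
    # Pair every event with the start time of the next one (or session end).
    following = events[1:] + [(session_end, None)]
    for (start, angle), (stop, _) in zip(events, following):
        totals[angle] += stop - start
    return dict(totals)


# Illustrative log for one three-minute (180 s) viewing period.
log = [(0, 0), (30, 10), (45, 0), (120, 180)]
print(dwell_times(log, session_end=180))
```

Summing the returned values recovers the full length of the viewing period, which is a convenient sanity check on such a log.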
A number of learner characteristics were documented: Medical College Admission Test percentile results (MCAT Verbal Reasoning, Biological Science Knowledge, Physical Science Knowledge, and Writing Sample), a standardized test of spatial ability (mental block rotation; internal consistency alpha coefficient .88, test-retest reliability .70),8 prior computer use, use of 3D computer games, gender, and age.
Carpal spatial knowledge was measured by 50 multiple-choice questions, which required determination of carpal bone intersections by a pin from various viewing angles. A subset of the questions was administered each time the models were used: ten questions after set 1, and 20 questions after each of sets 2 and 3. In order to determine whether spatial understanding was achieved for the object as a whole, the questions were classified as “near” or “far” based on proximity to key views. Near questions depicted the hand from the palmar (anterior) aspect (0°) or dorsal (posterior) aspect (180°) ±10° and were identical to viewpoints present in the key-view model. Far questions depicted the hand from many different perspectives present in the multiple-view model only. The entire carpal spatial knowledge measure had previously been validated and had an internal consistency alpha coefficient of .88.1 A post-test survey was administered to measure the processes by which spatial learning was occurring.
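The near/far distinction amounts to an angular-distance rule. As a hedged illustration, the sketch below labels a question by its viewing angle; the function name `question_type` and the `tolerance` parameter are assumptions introduced here, with the 10° default taken from the ±10° criterion above.

```python
def question_type(angle, tolerance=10):
    """Label a question "near" or "far" from its viewing angle in degrees.

    A view counts as "near" when it lies within `tolerance` degrees of a
    key view (0 = palmar, 180 = dorsal), measured around the circle.
    """
    a = angle % 360
    for key in (0, 180):
        diff = abs(a - key)
        diff = min(diff, 360 - diff)  # shortest way around the circle
        if diff <= tolerance:
            return "near"
    return "far"


for view in (0, 350, 190, 20, 90):
    print(view, question_type(view))
```

Measuring the distance around the circle matters for views such as 350°, which is only 10° from the palmar key view.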
The data were initially analyzed with a repeated-measures ANCOVA, with one grouping factor (KV+W or MV) and six repeated measures (three test occasions, each split into “near” and “far” questions). Initially, three covariates (play 3D computer games, spatial ability, writing ability) that were significant in univariate tests were included; however, only the spatial ability test was significant in the multivariate analysis. Similarly, although there were main effects of near/far and test occasion (1, 2, 3), there were no interactions, so the analysis was repeated with the total test score across all occasions using the spatial ability test as covariate.
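For readers less familiar with the method, the final analysis (total test score with spatial ability as the single covariate) corresponds to a standard one-way ANCOVA model. The notation below is an assumption introduced for illustration and does not appear in the study itself.

```latex
Y_{ij} = \mu + \tau_i + \beta\,(X_{ij} - \bar{X}) + \varepsilon_{ij},
\qquad i \in \{\mathrm{MV},\ \mathrm{KV{+}W}\}
```

Here $Y_{ij}$ is the total test score of student $j$ in group $i$, $\mu$ is the overall mean, $\tau_i$ is the group effect, $X_{ij}$ is that student's spatial ability score with grand mean $\bar{X}$, $\beta$ is a slope assumed common to both groups, and $\varepsilon_{ij}$ is the residual error. The test of the experimental manipulation is the test of $H_0\colon \tau_{\mathrm{MV}} = \tau_{\mathrm{KV+W}}$ after adjustment for $X$.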
Significant differences in several critical baseline variables were found. Students in the MV group were more likely to play 3D computer games, have higher spatial ability, and have lower writing ability (all p < .05) compared with the KV+W group.
In the unadjusted data, the MV group showed a better spatial understanding of the carpal bones than did the KV+W group (see Figure 1, mean difference 6.6%). This difference appeared to be evident for the test overall, on far questions, on near questions, and on all three occasions. From the ANCOVA, there were main effects of near/far (F = 10.82, p < .001) and of test occasion (1,2,3) (F = 13.72, p < .0001). There was a significant effect of spatial ability as a covariate (F = 10.39, p < .002). However, upon controlling for spatial ability, the effect of KV+W versus MV was not significant (F = 1.55, n.s.), suggesting that the observed difference was due to differences in the baseline spatial ability, rather than the experimental manipulation.
In the post-test pictorial questionnaire, most students (88%) claimed that they remembered a key view and rotated this mental image to answer the questions. In the MV group, more time was spent studying viewpoints that were similar to the key views, compared with oblique views (see Figure 2). Moreover, the time spent studying key viewpoints increased over the three times the model was used (paired t-test sets 1 and 3, p = .001).
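The paired comparison of sets 1 and 3 rests on the per-student difference in time spent on key views. The sketch below computes the paired t statistic using only the Python standard library; the function name `paired_t` is an assumption, and the numbers in the usage lines are made-up placeholders, not study data.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for matched samples.

    `first` and `second` hold one value per participant, in the same order.
    The statistic is the mean difference divided by its standard error.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(first, second)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Placeholder values only: seconds on key views in set 1 and set 3.
set1 = [60, 75, 80, 90]
set3 = [85, 95, 110, 100]
t, df = paired_t(set1, set3)
print(f"t = {t:.2f}, df = {df}")
```

The p value then follows from the t distribution with the returned degrees of freedom, which a statistics package would normally supply.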
Discussion and Conclusions
As in the previous study, students who had control over the entire range of rotations chose to spend the majority of their time examining the key (p/a) views of the wrist, with some small variation around these views, presumably to get some sense of the third (depth) dimension. Moreover, when this strategy was contrasted with a strategy that included only the key views with small variation, there was no advantage of the multiple views (once spatial ability was taken into account).
These results support the conclusion that certain key or canonical viewpoints of an object are critically important for spatial learning. Learner control of multiple orientations provided no particular advantage over access to orientations close to the key views. The results also confirm previous findings that a learner's spatial ability is an important predictor of success in learning anatomy.
There is one important potential limitation to these studies, which relates to the nature of the stimulus. It is possible that the findings are constrained by the stimulus materials, since the wrist bones fall naturally into two planes. The research should be replicated in other domains where the materials are more volumetric.
Taking the three studies together, our conclusion is that the potential for dynamic display of multiple orientations provided by computer-based anatomy software may offer minimal advantage to some learners and, based on previous research, may disadvantage learners with poorer spatial ability. The studies consistently show that most spatial information is gathered from the key views, whether under active or passive control, and time spent seeing the display in other orientations is time removed from essential learning. This conclusion is consistent with theories of mental representation of spatial objects, which suggest that objects are remembered in a canonical or familiar orientation and that unfamiliar orientations are recognized by mental rotation from these key views. Since spatial ability is strongly related to the ability to perform mental rotations, it follows that presenting information in oblique views may handicap, rather than aid, learners with poorer spatial ability.
One other empirical finding from this program of research should be noted. The second and third studies suggest that the opportunity to slightly change or wiggle the key-view orientation to permit some interpretation of the third dimension, without distracting from the canonical view, may be an optimal strategy for learning 3D objects.
1. Jolicoeur P, Milliken BW. Identification of disoriented objects: effect of context and prior presentation. J Exp Psychol: Learn Mem Cogn. 1989;15:200–10.
2. Shepard RN, Metzler J. Mental rotation of three-dimensional objects. Science. 1971;171:701–3.
3. Garg A, Norman GR, Spero L, Maheshwari P. Do virtual computer models hinder anatomy learning? Acad Med. 1999;74(10 suppl):S87–S89.
4. Garg A, Norman GR, Spero L. How medical students learn spatial anatomy. Lancet. 2001;357:363–4.
5. Vandenberg SG, Kuse AR. Mental rotations: a group test of three-dimensional spatial visualization. Percept Motor Skills. 1978;47:599–604.