Fixation disparity is a small ocular deviation in minutes of arc within the Panum fusional area of one eye or of both eyes during normal binocular vision.1–3 Fixation disparity can be measured by presenting two nonius lines dichoptically, and their positions can be changed by adding lenses or prisms. The smallest amount of prism needed to correct the fixation disparity is called the associated phoria.4–6 Associated phorias can be measured clinically by a variety of tests including the Mallett Test, the American Optical (AO) Vectographic Slide, the Wesson Card, the Sheedy Disparometer, and the Saladin Card.6–8
The Polatest is another test that measures associated phorias. It is used primarily in European countries to prescribe prismatic corrections. The interpretation of the results is referred to as measuring and correcting methodology (MKH-Haase method).9–13 Fig. 1 shows the distance MKH-Haase associated phoria charts. The near charts are scaled versions of the distance charts.
One of the factors used in determining whether any training or optical prescription has changed the associated phoria is whether the difference in the value after intervention exceeds the repeatability of the measurement on separate days without any treatment. Repeatability data for the associated phoria tests are limited. The repeatability of the horizontal associated phoria measured using the near Mallett Unit has been reported to range from good to perfect and the repeatability for the Disparometer was reported to range from good to bad, depending on how familiar the subjects were with the tests.14 The test-retest correlation for the horizontal associated phoria measured with the Saladin Card was also reported to be good.15 To our knowledge, test-retest repeatability for the MKH-Haase charts or other common associated phoria tests has not been studied.
The objective of this study was to investigate the test-retest repeatability of horizontal and vertical associated phoria measurements using a variety of tests. This information will help determine the precision of the test and provide guidance in determining whether the condition has changed with time or treatment.
The i.Polatest (version 1.2 by Carl Zeiss Vision GmbH, Aalen, Germany) has a number of targets for measuring associated phorias at distance (6 m) and near (40 cm). They are the Cross, Pointer, Double Pointer, and Rectangle tests (Fig. 1). The Cross and Double Pointer tests measured both horizontal and vertical associated phoria, whereas the Pointer Test measured only the horizontal value and the Rectangle Test measured only the vertical value.
Distance horizontal and vertical associated phorias were also measured with the Mallett Unit (Imperial Optical Co, Mississauga, ON) and the AO Vectographic Slide (Stereo Optical Co, Inc, Chicago, IL). The near associated phorias were measured with the Mallett Unit (Imperial Optical Co); the Near AO Vectographic Card (Optometric Research Institute Inc, Memphis, TN); the Saladin Near Point Balance Card, version 1 (Michigan College of Optometry, Big Rapids, MI); the Disparometer (Vision Analysis, Columbus, OH); and the Wesson Card, 5th ed. (Bernell, Mishawaka, IN).
Subjects were recruited through University of Waterloo electronic and print media. All subjects were naïve about the clinical procedures used in the study. Ages ranged from 18 to 35 years with a mean of 26 years. Most of the subjects were not optometry students. The few optometry students who did participate were in their first year and unfamiliar with the tests or procedures. Subjects were divided into two groups, asymptomatic and symptomatic; those who answered yes to three or more of the questions shown in Table 1 were classified as symptomatic. This questionnaire was not validated but was used to ensure that the history was consistent. Thirty-four symptomatic subjects and 40 asymptomatic subjects participated in this project; however, only 30 subjects in each group completed all of the tests. The remaining subjects were not tested with the Wesson Card because the test was unavailable at the beginning of the study.
Additional inclusion criteria for both groups were corrected visual acuity in each eye of at least 6/6, absence of ocular diseases, nonstrabismic at both 6 m and 40 cm, and stereoacuity of less than or equal to 60 seconds of arc using the Randot Circles Test (Stereo Optical Co, Inc). The study was approved by University of Waterloo’s Office of Research Ethics.
All tests were administered by the first author. The initial session included the questionnaire and a partial vision assessment. In addition to determining the inclusion criteria, fusional vergences and accommodative functions were also assessed for later analyses. Subjects who met the inclusion criteria were asked to return after a minimum of 2 hours. This break allowed them to rest from the initial assessment.
Distance associated phorias were measured before near. The test sequence at distance and near was determined by random block design. However, the MKH-Haase charts were presented in the recommended sequence of the Cross Test, the Pointer Test, the Double Pointer Test, and finally the Rectangle Test.16
The MKH-Haase protocol requires two measurements for each chart. For the first measurement, view 1, the Polaroid axes are 45 and 135 degrees for the right and left eyes, respectively. For the second, view 2, the axes are flipped 90 degrees. View 1 was always presented first. Which part of the target is viewed by each eye with the two configurations depends on the chart. Fig. 1 shows the view 1 presentations. At distance, the views could be switched by either flipping the axes of each polarized lens in the trial frame or reversing the polarization of the targets on the MKH-Haase distance charts. This latter option was unavailable for the near charts.
To measure the horizontal associated phoria, the subject reported whether a vertical line was aligned with the central fixation lock or bisected the break in the horizontal line of the Cross Test. If not, prisms (supplied with the Polatest) were inserted in steps of 0.25Δ to align the targets. If alignment occurred for multiple prisms, then the minimum value was selected as the associated phoria. If there was no alignment before the reversal, then the prism that produced the reversal was recorded as the associated phoria. The prisms were removed and the procedure was repeated for view 2. Vertical associated phorias were measured next if the target was present on the chart using the same procedure.
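The selection rule described above can be sketched in code. This is a minimal illustration under stated assumptions, not the procedure as implemented in the study: the function name `select_associated_phoria`, the response labels ("misaligned", "aligned", "reversed"), and the list-based input are all hypothetical. It assumes the prism powers are tried in order of increasing magnitude, starting from no prism.

```python
from typing import List, Optional


def select_associated_phoria(prisms: List[float],
                             reports: List[str]) -> Optional[float]:
    """Return the associated phoria from a sequence of prism trials.

    prisms  -- prism powers tried, in increasing magnitude
               (hypothetical input; e.g. 0.0, 0.25, 0.50, ... prism diopters)
    reports -- subject's report at each prism power (hypothetical labels):
               "misaligned", "aligned", or "reversed"
    """
    for prism, report in zip(prisms, reports):
        if report == "aligned":
            # Several prisms may align the targets; the first one reached
            # in increasing order is the minimum aligning value.
            return prism
        if report == "reversed":
            # No alignment occurred before the direction of misalignment
            # flipped, so the prism that produced the reversal is recorded.
            return prism
    # Neither alignment nor reversal was reached with the prisms tried.
    return None


# Alignment is reported at both 0.50 and 0.75; the minimum is selected.
example = select_associated_phoria(
    [0.0, 0.25, 0.50, 0.75],
    ["misaligned", "misaligned", "aligned", "aligned"],
)
```

Returning `None` when neither outcome is reached is a design choice of this sketch; the text does not describe that case.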
This procedure varied from the recommended MKH-Haase method. The recommended protocol is to start the associated phoria measurements with the prism from the previous chart in place and alter the value if necessary. We started with no prism for each chart to determine the repeatability with minimal bias from the previous measurement.
Associated phoria measurements on the other tests followed the same procedure, except that the axes of the Polaroid lenses were always at 45 and 135 degrees for the right and left eyes, respectively. When possible, the prismatic power was balanced between the two eyes. Horizontal and vertical associated phorias were measured by the Saladin Card, the Disparometer, and the Wesson Card using the trial frame without generating the fixation disparity curve.
The tests were repeated within 10 to 15 days after the first session by the same examiner. The testing sequence for the second visit was identical to the subject’s first session.
We first compared the view 1 and view 2 results. The mean difference, using the paired t test, for both groups was never larger than 0.05Δ for any test and the differences were not significant (p > 0.05). Because there were no significant differences, the values for the two presentations were averaged for further analysis. This is different from the MKH-Haase procedure. The recommended procedure is to measure associated phoria for view 1 and then present view 2 with any prism in place. The power would be modified as needed. We started each view without any prism to be consistent with measuring the associated phorias across charts.
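The view 1 versus view 2 comparison and the subsequent averaging could be reproduced along the following lines. This is a sketch only: the study used Sigma Plot, whereas the code below uses the Python standard library, and the function names and the sample values are hypothetical. It computes the paired t statistic and its degrees of freedom; the p value would then be read from a t table or a statistics package.

```python
import math
import statistics as st
from typing import List, Tuple


def paired_t(view1: List[float], view2: List[float]) -> Tuple[float, float, int]:
    """Return (mean difference, t statistic, degrees of freedom)
    for paired view 1 / view 2 measurements from the same subjects."""
    diffs = [b - a for a, b in zip(view1, view2)]
    n = len(diffs)
    mean_d = st.mean(diffs)
    se = st.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    if se == 0:
        # All differences are identical; t is undefined unless the mean is zero.
        t = 0.0 if mean_d == 0 else math.copysign(math.inf, mean_d)
    else:
        t = mean_d / se
    return mean_d, t, n - 1


def average_views(view1: List[float], view2: List[float]) -> List[float]:
    """Average the two presentations per subject for further analysis."""
    return [(a + b) / 2 for a, b in zip(view1, view2)]


# Hypothetical associated phorias (prism diopters) for four subjects.
v1 = [0.0, 0.25, 0.5, 0.25]
v2 = [0.25, 0.25, 0.25, 0.25]
mean_d, t_stat, df = paired_t(v1, v2)
averaged = average_views(v1, v2)
```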
The repeatability between sessions was examined by determining the mean difference between session 1 and session 2. The 95% confidence interval for the difference of means was used to determine whether the differences were statistically different from zero. The second method was a linear regression between session 1 and session 2. This analysis determined whether any between-session results varied as a function of the associated phoria magnitude. If the linear regression was statistically significant based on the rejection level of p less than or equal to 0.05, a Deming linear regression (assuming equal variance) was performed to estimate the y-intercept and slope because both session results are subject to random errors.17 The coefficient of repeatability (COR) was the third index used to evaluate between-session repeatability. This value is 1.96 times the SD of the between-session differences and defines the 95% limits of agreement for repeatability.18,19 Sigma Plot (version 12.5, Systat, Chicago, IL) was used to analyze the data.
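The three repeatability indices can be illustrated with a short standard-library sketch. It is not the Sigma Plot analysis used in the study. The function names are hypothetical, the critical t value is supplied by the caller (for example, 2.045 for 29 degrees of freedom), and the limits of agreement are assumed to be the mean difference plus or minus the COR, following the Bland-Altman convention. The Deming fit assumes equal error variance in the two sessions, as stated in the text.

```python
import math
import statistics as st
from typing import Dict, List, Tuple


def repeatability(session1: List[float], session2: List[float],
                  t_crit: float) -> Dict[str, object]:
    """Mean between-session difference, its 95% CI, COR, and limits of agreement.

    t_crit -- two-sided 95% critical t value for n - 1 degrees of freedom,
              supplied by the caller (assumption of this sketch).
    """
    diffs = [b - a for a, b in zip(session1, session2)]
    n = len(diffs)
    mean_d = st.mean(diffs)
    sd_d = st.stdev(diffs)                   # SD of between-session differences
    half_ci = t_crit * sd_d / math.sqrt(n)   # half-width of the 95% CI
    cor = 1.96 * sd_d                        # coefficient of repeatability
    return {
        "mean_diff": mean_d,
        "ci95": (mean_d - half_ci, mean_d + half_ci),
        "cor": cor,
        "limits": (mean_d - cor, mean_d + cor),
    }


def deming_equal_variance(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Deming regression slope and intercept with an error-variance ratio of 1.

    Requires a nonzero covariance between x and y.
    """
    xm, ym = st.mean(x), st.mean(y)
    sxx = sum((a - xm) ** 2 for a in x)
    syy = sum((b - ym) ** 2 for b in y)
    sxy = sum((a - xm) * (b - ym) for a, b in zip(x, y))
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, ym - slope * xm


# Hypothetical data: exactly linear, so the fit recovers slope 2, intercept 1.
slope, intercept = deming_equal_variance([0, 1, 2, 3], [1, 3, 5, 7])
```

A 95% confidence interval that contains zero corresponds to a mean difference that is not statistically different from zero, and a Deming slope below 1.0 corresponds to larger first-session values shrinking at the second session.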
Repeatability of Horizontal Associated Phoria Tests at Distance
Table 2 summarizes the results at distance for both groups. None of the mean differences between sessions were statistically different from zero. The correlation coefficients were strong and significant for all tests. The y-intercepts were not statistically different from zero for any test. However, the slopes of most of the tests were statistically different from 1.0 (values in boldface in Table 2). Slopes for most tests were less than 1.0 except for the Double Pointer Test for the asymptomatic group.
Fig. 2 shows the between-session linear regression for the AO distance horizontal associated phoria. These results are illustrative of the other tests where the slope was significantly less than 1.0. The shallower slope resulted from a reduction in magnitude of the higher associated phorias at the second session. The slope of the Double Pointer Test was greater than 1.0 because one participant had a higher exo associated phoria value (by 1.00Δ) at the second session than at the first session.
The limits of agreement for the symptomatic group were larger than those for the asymptomatic group for all the tests. The limits were within ±0.875Δ for the asymptomatic group and within ±1.375Δ for nearly all the tests of the symptomatic group. The exception was the AO Slide, where their limits were within ±1.875Δ (Table 2).
Repeatability of Horizontal Associated Phoria Tests at Near
Table 3 summarizes the results for the near tests. None of the symptomatic group’s mean differences were statistically different from zero. The correlation coefficients were statistically significant and greater than 0.6, except for the Disparometer. None of the y-intercepts were statistically different from zero. The slopes for most functions were statistically identical to 1.0 with the exception of the Mallett Unit, where the slope was statistically greater than 1.0 (value in boldface in Table 3). The steeper slope for the Mallett Unit was attributed to one participant whose exo value increased by 1.50Δ at the second session.
Fig. 3A shows the linear regression of the Disparometer symptomatic group. The main reason behind the low correlation was the results of the two subjects shown in the lower right corner. If these two subjects are excluded, then the correlation coefficient increases to 0.72 and is statistically significant (p < 0.001). The slope changes to 0.5, which is significantly less than 1.0. The flatter slope for the remaining subjects was because of a reduction in the second session associated phoria for some subjects with larger values at the first session. The y-intercept was not significantly different from zero.
The limits of agreement for the symptomatic group were ±2.00Δ for most tests. The exception was the Disparometer, where the limits were −5.75 to 4.25Δ with all subjects included.
For the asymptomatic group, the mean differences between sessions were also statistically identical to zero for most of the tests. The Disparometer was the exception. Its mean phoria value at the second session was significantly less eso (more exo) by 0.36Δ. The between-session correlations for the asymptomatic group on the MKH-Haase charts, the Mallett Unit, and the Wesson Card were good and statistically significant. The correlation coefficient of 0.32 for the Cross Test was statistically significant, but this value was low relative to the other tests. The y-intercepts and slopes for these tests were not statistically different from zero and 1.0, respectively.
There was no significant between-session correlation for the AO Card, the Saladin Card, and the Disparometer. Fig. 3B shows the linear regression for the Disparometer associated phorias. Similar to the symptomatic group, there were two subjects who had an eso value at the first session but exo values at the second visit. If they are excluded, then the correlation coefficient increases to 0.6 and is statistically significant (p < 0.001). The slope changes to 0.6, which is significantly lower than 1.0. The y-intercept is not significantly different from zero.
Fig. 4 shows the linear regression for the Saladin Card for the asymptomatic group. It shows that most of the participants (i.e., 92%) had a zero associated phoria for both sessions, with only a few of them having any associated phoria at either visit. The Saladin Card results are illustrative of the Cross Test, where the correlation was low, and the AO Card, where the correlation was not significant.
The asymptomatic group’s 95% limits of agreement were within ±1.00Δ for the Mallett Unit, the AO Card, and the Saladin Card and within ±2.00Δ for most of the other tests. The exception was the Disparometer, where the limits of agreement were −2.375 to 1.75Δ. The asymptomatic group’s limits were lower than those for the symptomatic group for all tests but the Wesson Card. The reason for the larger limits of agreement for the asymptomatic group on the Wesson Card was that four subjects had relatively large differences (e.g., 1.50Δ) between sessions.
Repeatability of Vertical Associated Phoria Tests at Distance and Near
Tables 4 and 5 summarize the results of vertical tests at distance and near. None of the mean differences were statistically different from zero. There was no significant correlation between sessions for most of the tests. The lack of significant correlations occurred because nearly everyone had a vertical associated phoria within ±0.25Δ of zero for both sessions.
There were a few tests where the linear correlations were statistically significant. For these tests, the range of values was ±1.00Δ, which was likely sufficient for achieving a significant correlation. Although the y-intercepts were statistically identical to zero, the slopes were significantly lower than 1.0 for most of these tests (values in boldface in Tables 4 and 5) because the associated phoria values at the second session were generally lower in magnitude than at the first session. The exception was the distance Rectangle Test for the asymptomatic group, where the slope was greater than 1.0.
The slopes for the distance AO Slide and the near Mallett Unit linear regressions were negative because a few subjects with no associated phoria at the first session had a right hypo associated phoria at the second session. This resulted in a negative slope because right hypo values were negative.
Similar to the horizontal associated phorias, the asymptomatic group’s vertical limits of agreement and COR were lower on all tests. At distance, the limits of agreement were within ±0.50Δ for the asymptomatic group and ±0.875Δ for the symptomatic subjects. At near, the limits of agreement were within ±0.25Δ for the asymptomatic group and within ±0.50Δ for the symptomatic group.
Repeatability was determined using the mean between-session difference, linear regression, and COR. Each parameter has advantages and disadvantages in specifying repeatability and visualizing the results,18,19 and thus all three analyses are presented.
The distance horizontal associated phorias showed good repeatability for most tests based on all three parameters. Nevertheless, there were many tests where the slope of the regression was less than 1.0. This result was attributed to a decrease in the larger associated phorias at the second session for a minority of the subjects. This finding suggests that repeated measures on separate days may be required to establish a baseline for subjects with larger distance horizontal associated phorias.
The limits of agreement were generally within ±1.375Δ for the symptomatic group and within ±0.50Δ for the asymptomatic group. The Cross Test and AO Slide were exceptions. The lack of a central fusion lock in the Cross Test could be responsible for the wider limits in the asymptomatic group. However, the symptomatic group results did not support this hypothesis. Their Cross Test limits of agreement were not larger than those of the other tests. Either a central lock is a factor for repeatability in only asymptomatic patients or there may be another unknown factor influencing the asymptomatic group’s repeatability. The AO Slide target had a central fusion lock and hence this was not a factor contributing to its wider limits in both groups. A possible explanation could be the contrast of the peripheral fusion lock. All other distance targets had high-contrast peripheral contours, whereas the projected AO Slide contours were lower in contrast. Fusion may have been more unstable with the lower-contrast edges.
Most of the near horizontal associated phoria tests also showed good repeatability. The exception was the Disparometer. There was a shift in the exo direction at the second session for both groups that reached statistical significance for the asymptomatic group. In addition, the linear correlations between sessions were not significant for either group. The lack of statistically significant correlation was attributed to a change from a relatively large eso associated phoria to an exo associated phoria at the second session for a small number of subjects. Even if these subjects were excluded, the larger associated phorias would still decrease at the second session. This decrease was similar to Pickwell et al.’s report14 that the Disparometer associated phoria values decreased with repetition in naïve subjects. Training effects may have been the reason for Pickwell et al.’s finding that the repeatability of the Disparometer was good for experienced subjects.
The lower repeatability of the Disparometer could be attributed to the test design. There are two features that may interact with each other to increase between-session variability. One factor is the lack of a central fusion lock and the other is the location of the nonius lines.20 The effect of no central fusion lock on the Disparometer repetition is uncertain. Previous studies have reported that fixation disparities with central fusion targets were less variable and less in magnitude than tests without central fusion locks.21 Wildsoet and Cameron20 confirmed that a central lock reduced the between-subject variability in the Disparometer fixation disparity measures, but the central target actually increased the intersubject variability for the associated phoria. The reason for the increase in intersubject associated phoria variability is unclear, but it may be related to the type of fixation disparity curve.22
The effect of the nonius lines being located behind the fixation plane on the repeatability is also uncertain. It is possible that the two options for focusing produced the higher between-session variability. The slight exo shift that occurred at the second session suggests that some subjects may have focused on the white background lines at the first session20 but chose the nonius lines as the fixation plane at the second visit. An interaction between the physical location of the nonius lines and the lack of a central fusion lock could explain why the other associated phoria test without a central lock (e.g., the Cross Test) did not have a similar degree of between-session variability.
Some of our repeatability findings for the Saladin Card horizontal associated phoria were similar to the results of the study by Corbett and Maples.15 They also reported that the mean difference between sessions was not statistically different from zero. They found, however, a moderate (∼0.5) correlation between sessions using optometry students as subjects. The correlation in our study was only significant for the symptomatic group. Their limits of agreement were about 1.50Δ wider than the ±2.00Δ found in our study. The different prismatic step sizes used in the two studies are the most likely explanation for the difference in the limits. We used a step size of 0.25Δ, whereas Corbett and Maples used a 1Δ step. The coarser step size would be expected to produce larger limits of agreement. The larger prismatic step combined with more experienced subjects in Corbett and Maples’ study could explain why their nominally asymptomatic group had a significant correlation between sessions.
Our finding that the mean difference between horizontal associated phorias measured with the near Mallett Unit was not significantly different from zero agreed with Pickwell et al.’s14 findings of no significant differences between the Mallett horizontal associated phorias measured over three visits for either experienced or inexperienced subjects.
Vertical associated phoria tests at both distances showed good repeatability in terms of the COR. The linear correlations were nonsignificant for most tests because the values for both sessions were zero for nearly everyone. The Rectangle Test, the Mallett Unit, the Saladin Card, and the Wesson Card were exceptions. The one common factor in these exceptions was that the range of vertical associated phorias was greater for these tests (e.g., ±1.00Δ).
The COR and the limits of agreement for most horizontal and vertical associated phoria tests were lower for the asymptomatic group relative to the symptomatic group’s values. This finding might be expected because the asymptomatic group was more likely to have associated phorias near zero at both sessions, so the between-session differences would be smaller. The exception was the Wesson Card horizontal associated phoria. The limits of agreement were similar for both groups because of the large between-session differences for four participants in the asymptomatic group. The reason for the large differences for these subjects is uncertain.
Intuitively, clinicians might expect to find larger limits of agreement and lower repeatability in patients with nonstrabismic binocular vision problems. However, our subject groups are based on symptoms and not clinical findings, although it is possible that patients with “normal” clinical findings may be symptomatic. Because the subject classifications were not based on clinical tests, we divided the symptomatic group into two subgroups to determine whether the generally lower repeatability was primarily attributable to subjects with abnormal binocular vision based on clinical tests. The first subgroup included those participants who had at least one abnormal clinical binocular vision test result based on Morgan’s classification.6 The other subgroup included those symptomatic participants who did not have any abnormal clinical binocular vision values. There were 15 participants in the first subgroup and 20 in the second subgroup. Six participants in the second subgroup had at least one clinical accommodative problem such as accommodative infacility or accommodative insufficiency. The COR and limits of agreement were generally smaller for the subgroup with the abnormal clinical test results than for the second subgroup without abnormal clinical findings. The exception was the Sheedy Disparometer limits of agreement. The subgroup with the abnormal clinical results showed a much wider range than the second subgroup because the first subgroup contained the two participants who had the largest between-session differences.
Based on the repeatability results of this study, most of the horizontal associated phoria tests, including the MKH-Haase charts, showed good repeatability for both subject groups at distance and near. Of the horizontal distance tests, the AO Slide was the least repeatable test. The lower repeatability was probably attributable to the lower contrast of the peripheral fusion locks. The exception to the good repeatability at near was the Disparometer. Its relatively low repeatability may have been caused by an interaction between the physical locations of the nonius lines relative to the fixation plane and the lack of a central fusion target. The repeatability of the vertical associated phoria measures was similar across tests. This result occurred because most subjects had no vertical associated phoria on most tests at either session. Additional studies using subjects who have vertical binocular vision problems would be useful in examining the repeatability of the vertical associated phorias.
Department of Ophthalmology and Visual Science
Eye, Ear, Nose, and Throat Hospital
Shanghai Medical College, Fudan University
83 Fenyang Rd
We thank Dr. Natalie Hutchings for providing the Polatest. Funding for the project was provided by King Saud University in Riyadh, Saudi Arabia; the Cultural Bureau of Saudi Arabia, Ottawa; and the Canadian Optometric Education Trust Fund.
Received December 14, 2014; accepted April 8, 2015.
1. Ogle KN, Mussey F, Prangen AD. Fixation disparity and the fusional processes in binocular single vision. Am J Ophthalmol 1949; 32: 1069–87.
2. Ogle KN. Researches in Binocular Vision. Philadelphia, PA: Saunders; 1950.
3. Ogle KN. Fixation disparity and oculomotor imbalance. Am Orthopt J 1958; 8: 21–36.
4. Brownlee GA, Goss DA. Comparisons of commercially available devices for the measurement of fixation disparity and associated phorias. J Am Optom Assoc 1988; 59: 451–60.
5. Hofstetter HW. Dictionary of Visual Science and Related Clinical Terms, 5th ed. Boston, MA: Butterworth-Heinemann; 2000.
6. Scheiman M, Wick B. Clinical Management of Binocular Vision: Heterophoric, Accommodative, and Eye Movement Disorders, 3rd ed. Philadelphia, PA: Wolters Kluwer Health/Lippincott Williams & Wilkins; 2008.
7. Eskridge JB, Amos JF, Bartlett JD. Clinical Procedures in Optometry. Philadelphia, PA: Lippincott; 1991.
8. Rutstein RP. Anomalies of Binocular Vision: Diagnosis & Management. St. Louis, MO: Mosby; 1998.
9. Haase H. Binocular testing and distance correction with the Berlin Polatest. J Am Optom Assoc 1962; 34: 115–25.
10. Kommerell G, Gerling J, Ball M, de Paz H, Bach M. Heterophoria and fixation disparity: a review. Strabismus 2000; 8: 127–34.
11. Gerling J, de Paz H, Schroth V, Bach M, Kommerell G. Can fixation disparity be detected by the measurement and correctional techniques H.J. Haase (MKH)? Klinische Monatsblatter Fur Augenheilkunde 2000; 216: 401–11.
12. Brautaset RL, Jennings JA. Associated phoria and the measuring and correcting methodology after H.-J. Haase (MKH). Strabismus 2001; 9: 165–76.
13. London R, Crelier RS. Fixation disparity analysis: sensory and motor approaches. Optometry 2006; 77: 590–608.
14. Pickwell LD, Gilchrist JM, Hesler J. Comparison of associated heterophoria measurements using the Mallett test for near vision and the Sheedy Disparometer. Ophthalmic Physiol Opt 1988; 8: 19–25.
15. Corbett A, Maples WC. Test-retest reliability of the Saladin card. Optometry 2004; 75: 629–39.
16. Schroth V. Binocular Correction: Aligning Prisms According to the Haase Approach. De Groot Drukkerij, Netherlands: Zijdar Book; 2012.
17. Cornbleet PJ, Gochman N. Incorrect least-squares regression coefficients in method-comparison analysis. Clin Chem 1979; 25: 432–8.
18. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986; 1: 307–10.
19. Bland JM, Altman DG. Comparing methods of measurement: why plotting difference against standard method is misleading. Lancet 1995; 346: 1085–7.
20. Wildsoet CF, Cameron KD. The effect of illumination and foveal fusion lock on clinical fixation disparity measurements with the Sheedy Disparometer. Ophthalmic Physiol Opt 1985; 5: 171–8.
21. Ukwade MT. Effects of nonius line and fusion lock parameters on fixation disparity. Optom Vis Sci 2000; 77: 309–20.
22. Carter DB. Fixation disparity with and without foveal fusion contours. Am J Optom Arch Am Acad Optom 1964; 41: 729–36.
© 2015 American Academy of Optometry
Keywords: Polatest; MKH-Haase Binocular Vision Charts; associated phoria; fixation disparity; symptomatic binocular vision