# Evaluation of Image Quality Metrics for the Prediction of Subjective Best Focus

Purpose. Seven existing and three new image quality metrics were evaluated in terms of their effectiveness in predicting subjective cycloplegic refraction.

Methods. Monochromatic wavefront aberrations (WA) were measured in 70 eyes using a Shack-Hartmann based device (Complete Ophthalmic Analysis System; Wavefront Sciences). Subjective cycloplegic spherocylindrical correction was obtained using a standard manifest refraction procedure. The dioptric amount required to optimize each metric was calculated and compared with the subjective refraction result. Metrics included monochromatic and polychromatic variants, as well as variants taking into consideration the Stiles and Crawford effect (SCE). WA measurements were performed using infrared light and converted to visible before all calculations.

Results. The mean difference between subjective cycloplegic and WA-derived spherical refraction ranged from 0.17 to 0.36 diopters (D), while paraxial curvature resulted in a difference of 0.68 D. Monochromatic metrics exhibited smaller mean differences between subjective cycloplegic and objective refraction. Consideration of the SCE reduced the standard deviation (SD) of the difference between subjective and objective refraction.

Conclusions. All metrics exhibited similar performance in terms of accuracy and precision. We hypothesize that errors pertaining to the conversion between infrared and visible wavelengths rather than calculation method may be the limiting factor in determining objective best focus from near infrared WA measurements.

*MSc

^{†}PhD, MD

^{‡}PhD

Institute of Vision and Optics, University of Crete, Heraklion, Crete, Greece.

Received April 23, 2009; accepted November 17, 2009.

It is well documented that the second order aberrations (i.e., defocus and astigmatism) represent the main aberrations present in the average healthy eye.^{1,2} The contribution of higher order aberrations becomes apparent only when defocus and astigmatism are compensated by an appropriate spherocylindrical correction. In clinical practice, subjective refraction is considered the “gold standard” for the determination of the most appropriate spherocylindrical prescription. In principle, however, an appropriate spherocylindrical correction that minimizes wavefront aberration (WA) can be calculated when WA is known for a given eye. Nonetheless, it has been demonstrated in several reports^{3–5} that, in the general case, the spherocylindrical correction that minimizes the root mean square (RMS) WA does not correspond to the correction that is perceived by subjects as optimal, in the sense that it is not strongly correlated to the refraction that maximizes visual acuity.^{3} More advanced and elaborate metrics than the RMS WA have been evaluated in terms of their effectiveness in predicting best focus^{5} and visual acuity.^{3} These metrics are based on either the flatness of the wavefront (e.g., the fraction of the pupil that is within λ/4) or the image plane properties of the optical system, such as the Strehl ratio (and variations of it), and the spatial characteristics of the point spread function (PSF).^{4}

Still, the calculation of the optimal spherocylindrical correction based on the WA of an eye remains a challenging problem, because even metrics of increased complexity fail to give accurate predictions when tested against experimental data.^{5} Furthermore, the impact of the Stiles and Crawford effect (SCE), the longitudinal chromatic aberration of the eye, and the task performed during subjective refraction (optimization of acuity, i.e., small letter recognition) on the prediction of subjective spherocylindrical corrections by optimizing WA metrics has not been investigated. For example, a weighting modulation transfer function (MTF) centered at higher spatial frequencies than the peak of the mean contrast sensitivity function could presumably account for the task performed during subjective refraction. Moreover, we introduced a metric that preferentially evaluates the vertical and horizontal orientations of the optical transfer function to take into account the “oblique effect.” The term oblique effect refers to the decline in contrast sensitivity at oblique, in contrast to vertical and horizontal, orientations.^{6,7} The purpose of this study was to introduce such a new set of image quality metrics and to evaluate their accuracy and precision in predicting the spherical correction.

## MATERIALS AND METHODS

### Subjects

Of all patients who were referred for refractive surgery for a period of 5 months, we selected 41 (82 eyes) that had visual acuity with cycloplegia better than −0.10 logMAR, no previous ocular or systemic pathology, and cycloplegic pupil diameter in the range of 6 to 6.5 mm.

The rationale for selecting patients with cycloplegic pupil diameter in the range mentioned above is the low repeatability of wavefront measurements with our instrument for pupil diameters larger than 7 mm. Contact lens wearers had discontinued contact lens use at least 2 weeks before the examination. The study was performed under board approval of the Institute of Vision and Optics of the University of Crete.

### Procedure

Fifteen minutes before all measurements, the pupil was dilated using 1 drop of tropixal (0.5%) and 1 drop of phenylephrine (10%), while accommodation was paralyzed using 1 drop of cyclogyl (1%), according to a standard preoperative evaluation of refractive surgery patients. Pupil diameter was measured using an infrared pupillometer. Subjective cycloplegic refraction that optimized visual acuity with maximum plus refractive correction in place was measured by an experienced clinician. To this end, an eye chart located at 4 m and illuminated with white light was used. Trial lenses in steps of 0.25 D were used, and the astigmatic correction was determined by the cross-cylinder method. The accuracy of the astigmatic axis was 5 degrees. The subjective refractions were corrected for the effectivity of the spectacle lenses at a vertex distance of 12 mm.

Immediately after subjective cycloplegic refraction, the monochromatic WA of each eye was measured by means of a Shack-Hartmann wavefront analyzer (Complete Ophthalmic Analysis System; Wavefront Sciences). In short, an infrared super luminescent light-emitting diode (840 nm) was focused on the retina after proper alignment of the subject's line of sight with the instrument, and the emerging wavefront was sampled by a square lenslet array (33 × 44, 1452 lenslets). The diameter of each lenslet was 144 μm. The variability of the measurements with the particular device was assessed in a previous study.^{8}

For each eye, 30 consecutive Shack-Hartmann images were acquired and analyzed for a 6 mm pupil diameter. For each of the 30 images, the WA was fitted up to fourth-order Zernike polynomials, expressed in the notation recommended by the Optical Society of America.^{9} Although infrared light was used, the aberrations were calculated with respect to the peak spectral sensitivity of the eye (550 nm) by using an appropriate correction formula, incorporating both the longitudinal chromatic aberration of the eye and the retinal thickness at the fovea.^{8} The average of each individual Zernike term was used to reconstruct the WA that was subsequently used to calculate the (previously described) image quality metrics^{10} summarized in Table 1 and the new metrics described in the following sections. Before the comparative analysis with the metric-based refractions, all cycloplegic spherical refractions were shifted by 0.25 D to account for the chart distance (−0.25 D).

### Estimation of Objective Refraction by Optimization of Optical Metrics

For the average of 30 wavefront measurements, an initial estimate of the power vector components was made using the paraxial curvature approximation^{5} by using the following equations:
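The equations themselves are not reproduced in this version of the text. As a hedged sketch, the paraxial curvature formulas described by Thibos et al.,^{5} truncated here at the fourth order to match the Zernike fit used in this study, take the following form. The truncation, the symbol names, and the units (Zernike coefficients c in micrometers, pupil radius r in millimeters, results in diopters) are assumptions for illustration and are not taken from this article.

```latex
M      = \frac{-4\sqrt{3}\,c_{2}^{0} + 12\sqrt{5}\,c_{4}^{0}}{r^{2}}, \qquad
J_{0}  = \frac{-2\sqrt{6}\,c_{2}^{2} + 6\sqrt{10}\,c_{4}^{2}}{r^{2}}, \qquad
J_{45} = \frac{-2\sqrt{6}\,c_{2}^{-2} + 6\sqrt{10}\,c_{4}^{-2}}{r^{2}}
```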

Subsequently, a spherocylindrical prescription was calculated by using the following equations^{11}:
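The conversion equations are not reproduced in this version of the text. A minimal sketch of the standard power-vector conversion to a negative-cylinder prescription is given below; the function name and the choice of the negative-cylinder convention are assumptions made for illustration, not details confirmed by the article.

```python
import math

def power_vector_to_spherocylinder(M, J0, J45):
    """Convert a power vector (M, J0, J45), in diopters, to a
    negative-cylinder prescription (sphere, cylinder, axis in degrees).

    Assumed convention: M = S + C/2, J0 = -(C/2) cos(2a), J45 = -(C/2) sin(2a).
    """
    cylinder = -2.0 * math.hypot(J0, J45)        # always <= 0
    sphere = M - cylinder / 2.0
    axis = (0.5 * math.degrees(math.atan2(J45, J0))) % 180.0
    return sphere, cylinder, axis

# Example: M = -3.50 D with 0.50 D of against-the-rule J0
sphere, cylinder, axis = power_vector_to_spherocylinder(-3.5, -0.5, 0.0)
```

In this example the power vector corresponds to a prescription of −3.00 / −1.00 × 90.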

Based on these values, a spherocylindrical wavefront was calculated and added to the measured wavefront to provide an initial estimate for the spherocylindrical correction. After this step, the value of each of the metrics was calculated for an additional spherical correction ranging from S −0.75 to S +0.75 (in steps of 0.10 D), essentially “scanning” a range of 1.5 D about the initial estimate. To quantify optical quality, we calculated 10 metrics for the above range of spherical correction.
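The scanning step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Strehl ratio (computed from an FFT-based PSF) stands in for any of the metrics, the defocus sign convention follows the paraxial relation M = −4√3 c/r², and all function names, the grid size, and the padding factor are hypothetical.

```python
import numpy as np

def defocus_coefficient_um(power_d, pupil_radius_mm):
    """Zernike defocus coefficient (micrometers) for a spherical power in
    diopters, from M = -4*sqrt(3)*c / r^2 (sign convention assumed)."""
    return -power_d * pupil_radius_mm**2 / (4.0 * np.sqrt(3.0))

def strehl_ratio(wavefront_um, pupil_mask, wavelength_um=0.55, pad=2):
    """Peak of the aberrated PSF divided by the peak of the
    diffraction-limited PSF for the same pupil."""
    size = pad * wavefront_um.shape[0]
    phase = 2.0 * np.pi * wavefront_um / wavelength_um
    pupil = pupil_mask * np.exp(1j * phase)
    psf = np.abs(np.fft.fft2(pupil, s=(size, size)))**2
    psf_dl = np.abs(np.fft.fft2(pupil_mask.astype(float), s=(size, size)))**2
    return psf.max() / psf_dl.max()

def scan_sphere(wavefront_um, z20, pupil_mask, pupil_radius_mm=3.0):
    """Evaluate the metric for added sphere from -0.75 to +0.75 D in
    0.10 D steps and return the offset that maximizes it."""
    offsets_d = np.linspace(-0.75, 0.75, 16)
    values = np.array([
        strehl_ratio(
            wavefront_um + defocus_coefficient_um(s, pupil_radius_mm) * z20,
            pupil_mask)
        for s in offsets_d])
    return offsets_d[int(np.argmax(values))], values

# Demo: a synthetic wavefront containing only 0.35 D of residual defocus.
n = 128
y, x = np.mgrid[-1:1:1j * n, -1:1:1j * n]
rho2 = x**2 + y**2
mask = rho2 <= 1.0
z20 = np.sqrt(3.0) * (2.0 * rho2 - 1.0)      # Zernike defocus polynomial
residual = defocus_coefficient_um(0.35, 3.0) * z20
best_offset, metric_values = scan_sphere(residual, z20, mask)
```

For the synthetic wavefront in the demo, the scan selects the offset that cancels the residual defocus, where the Strehl ratio reaches 1.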

In addition to the seven metrics summarized in Table 1, three new metrics were used:

- Weighted MTF (weightedMTF) at 18 cycles per degree (cpd): This metric is similar to the volume under the MTF in visually relevant spatial frequencies, differing only in that the weighting function was centered at 18 cpd. This metric was introduced in a previous report in an attempt to predict accommodative response.^{12} The weighting function for weightedMTF was defined with the parameters W_{o} = 0.8, α = 0.0072, and f_{o} = 18.
- Very high frequency MTF (vhfMTF): This metric was defined for the evaluation of the volume under the MTF at the range of spatial frequencies characterizing the two-dimensional Fourier transformations of 6/6 Snellen letters. A 6/6 Snellen letter is created by lines and gaps that have angular dimensions of 1 arc min (corresponding to half a period); the fundamental frequency of such a square-wave modulation therefore lies at 30 cpd (2.5 periods in 5 arc min). The vhfMTF metric is the weighted integral under the MTF, where the weighting function is centered at 36 cpd (ranging from 18 to 48). It was defined on the basis of spatial frequencies characterizing the letters used in subjective optimization of refraction. The weighting function for vhfMTF was defined with the parameters W_{o} = 0.8, α = 0.0017, and f_{o} = 36.
- OrientMTF: This metric is essentially identical to the vhfMTF in the sense that it is a weighted sum under the MTF at the same (high) spatial frequencies. In addition, the MTF was weighted with an angular function that preferentially takes into account the horizontal and vertical orientations of the MTF. The angular weighting function was (1 + Z^{4}_{4}), where Z^{4}_{4} is the Zernike polynomial of fourth radial and fourth angular order. The radial parameter in this case was the spatial frequency normalized with respect to its maximum. A similar angular weighting function was introduced by Watson and Ahumada^{13} to approximate a standard model for foveal detection of spatial contrast.
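The angular weighting used for the orientMTF can be illustrated with a short sketch. It assumes the unnormalized form of the polynomial, ρ⁴cos(4θ), so that the weight stays between 0 and 2; the article does not state which normalization was used, and the function and argument names are hypothetical.

```python
import numpy as np

def orientation_weight(fx, fy, f_max):
    """Angular weight 1 + rho^4 * cos(4*theta) on the frequency plane.

    rho is the radial spatial frequency normalized by f_max, theta the
    orientation. The OSA normalization factor of the Zernike polynomial
    is omitted here (assumption) to keep the weight non-negative.
    """
    rho = np.hypot(fx, fy) / f_max
    theta = np.arctan2(fy, fx)
    return 1.0 + rho**4 * np.cos(4.0 * theta)

f = 36.0
w_horizontal = orientation_weight(f, 0.0, f)                        # along the x axis
w_vertical = orientation_weight(0.0, f, f)                          # along the y axis
w_oblique = orientation_weight(f / np.sqrt(2), f / np.sqrt(2), f)   # 45 degrees
```

With this form, horizontal and vertical frequencies at the maximum radius receive twice the baseline weight, whereas frequencies at 45 degrees receive essentially none, which reflects the oblique effect the metric is meant to capture.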

The polychromatic metrics were derived from their monochromatic counterparts by calculating the polychromatic PSF as the weighted sum of the monochromatic PSFs at various wavelengths. The spectral sensitivity of the eye V(λ) was used as a weighting function, while the longitudinal chromatic aberration of the eye was used to calculate the defocus at the various wavelengths.^{14}
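A hedged sketch of this construction is shown below. It assumes that the longitudinal chromatic aberration follows the reduced chromatic-eye formula of Thibos et al., with the constants as commonly quoted for that model; the article does not list the constants, the function names are hypothetical, and the V(λ) weights are left to the caller rather than tabulated here.

```python
import numpy as np

def chromatic_refraction_d(wavelength_um):
    """Chromatic difference of refraction (diopters) from the reduced
    chromatic-eye formula; constants are assumed, not taken from the text."""
    return 1.68524 - 0.63346 / (wavelength_um - 0.21410)

def chromatic_defocus_d(wavelength_um, reference_um=0.55):
    """Defocus at a wavelength relative to the reference wavelength."""
    return chromatic_refraction_d(wavelength_um) - chromatic_refraction_d(reference_um)

def polychromatic_psf(mono_psfs, weights):
    """Weighted sum of monochromatic PSFs (stacked along the first axis);
    the weights (e.g., V(lambda) samples) are normalized to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    stack = np.asarray(mono_psfs, dtype=float)
    return np.tensordot(weights / weights.sum(), stack, axes=1)

# Defocus between the 840 nm measurement beam and the 550 nm reference.
lca_840_vs_550 = chromatic_defocus_d(0.84)
```

Under these assumed constants, the chromatic defocus between 840 nm and 550 nm comes out at slightly under 0.9 D, which gives a sense of the size of the infrared-to-visible correction discussed in this study.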

Finally, to compensate for the visual effectiveness of rays entering at different distances from the pupil's center,^{15} the SCE was additionally incorporated in all single valued metrics besides the paraxial curvature, the RMS, the PFSc, and the PFSt. The SCE was incorporated as a Gaussian apodizing filter in the pupil function:

where r is the radial position in the pupil and the parameter β was assumed to be 0.12/mm^{2}.^{16,17}
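The apodization equation is not reproduced in this version of the text. A minimal sketch of a Gaussian SCE filter is given below, under the assumption that β describes the intensity weighting exp(−βr²), so that the amplitude applied to the pupil function is its square root. Whether the authors applied β to amplitude or to intensity cannot be determined from the text, and the function name is hypothetical.

```python
import numpy as np

def sce_amplitude(r_mm, beta_per_mm2=0.12):
    """Gaussian Stiles-Crawford apodization of the pupil amplitude.

    Assumes an intensity weighting exp(-beta * r^2); the amplitude
    transmission is therefore exp(-beta * r^2 / 2).
    """
    r_mm = np.asarray(r_mm, dtype=float)
    return np.exp(-0.5 * beta_per_mm2 * r_mm**2)

center = sce_amplitude(0.0)       # pupil center
edge = sce_amplitude(3.0)         # margin of a 6 mm pupil
edge_intensity = edge**2
```

Under this assumption, light entering at the margin of a 6 mm pupil is weighted at roughly one third of the intensity of light entering through the center.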

### Statistical Analysis

To assess the effectiveness of the single valued metrics in predicting the subjective cycloplegic spherical correction, each calculated refraction was compared with the corresponding subjective cycloplegic refraction. The agreement between the subjective cycloplegic spherical refraction and the objective spherical refraction was estimated by correlation analysis and Bland-Altman plots.^{18,19} The correlation coefficient defines the strength of a relation between two variables, but it is not an indicator of their congruity. On the other hand, Bland-Altman plots provide a simple but informative analysis of the degree of agreement between two methods of clinical measurement and permit the investigation of any possible relationship between the measurement error and the true value. The lack of agreement can be summarized by evaluating the bias, i.e., the value determined by the subjective method minus the value determined by the objective methods. Finally, a comparison between the monochromatic metrics and their polychromatic counterparts, as well as between the single valued metrics incorporating the SCE and their counterparts not including the SCE, was performed using the Wilcoxon rank sum test. A p-value of <0.05 was considered statistically significant.
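The bias and spread used in a Bland-Altman analysis can be computed as in the following sketch. The function name and the example values are illustrative only and are not data from this study; the 1.96 × SD limits of agreement are the conventional choice and are assumed here.

```python
import numpy as np

def bland_altman(subjective_d, objective_d):
    """Return the bias (mean of subjective minus objective), the SD of the
    differences, and the 95% limits of agreement (bias +/- 1.96 SD)."""
    diff = np.asarray(subjective_d, dtype=float) - np.asarray(objective_d, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)                 # sample standard deviation
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)

# Illustrative values only (diopters).
bias, sd, limits = bland_altman([-1.0, -2.0, -3.0], [-1.1, -2.2, -3.3])
```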

## RESULTS

Seventy eyes of the 82 eyes were included in the study (12 eyes were excluded because of obviously problematic wavefront maps, for example, several missing points either in the center or in the periphery); mean age was 29.5 ± 7.7 years (range, 19 to 55 years). Subjective spherical refraction ranged from 0 D to −9.75 D (mean value ± SD, −3.78 ± 2.27 D) and astigmatic error from 0 D to −2.5 D (mean value ± SD, −0.49 ± 0.46 D).

The measurement error (bias) in predicting spherical refraction based on the single valued metrics varied. All metrics predicted subjective cycloplegic spherical refraction with a bias of 0.36 D or less, except paraxial curvature, which predicted subjective cycloplegic spherical refraction with a bias of −0.68 D. Table 2 presents the bias in predicting subjective cycloplegic spherical refraction by the various single valued metrics. The polychromatic single valued metrics exhibited a bias that was significantly greater when compared with their monochromatic counterparts (p < 0.01). Moreover, single valued metrics incorporating the SCE exhibited a statistically significant improvement in predicting subjective cycloplegic spherical correction compared with their non-SCE counterparts (p < 0.01), except for the Strehl ratio, the orientMTF, and the vhfMTF. Finally, the mean difference between the subjective cycloplegic and objective spherical refraction as predicted by the RMS, the PFSc, and the PFSt was 0.36 D, 0.24 D, and 0.28 D, respectively. These three metrics, along with the paraxial curvature, were the weakest predictors of the subjective cycloplegic spherical refraction, meaning that they showed the largest amount of bias when compared with the other monochromatic single valued metrics that incorporated the SCE.

With respect to the monochromatic metrics incorporating the SCE, the Strehl ratio exhibited the smallest bias (0.18 D), followed by the weightedMTF and the vhfMTF (0.19 D), the orientMTF and the Strehl ratio computed in frequency domain (0.20 D), and finally the visual Strehl ratio computed in frequency domain (0.24 D; Fig. 1; Table 2).

Despite the bias observed in all metrics, predicted spherical refraction showed a statistically significant correlation with subjective cycloplegic spherical refraction for all single valued metrics (coefficient of determination ≥0.96, p < 0.05; Fig. 2).

With regard to the SD of differences, all metrics exhibited similar behavior (∼0.50 D). The SD of the differences between the subjective cycloplegic and the objective spherical refraction ranged from 0.44 to 0.53 D. Fig. 3 shows the Bland-Altman plot for one metric (monochromatic Strehl ratio computed in the Fourier domain considering the SCE).

## DISCUSSION

The appropriate definition of a metric that is optimized at the subjective best focus for each eye is not merely a task of specific practical interest. Identification of the mathematical conditions that describe retinal imagery at best focus could lead to a better understanding of the link between quality of the optics and quality of vision in the human eye. The fact that visual image quality is a subjective quantity, in the sense that it may be regarded in a different manner among different individuals, has led to the argument that a global metric of visual image quality may not exist. Our results, as well as previous reports,^{3–5,10,20,21} suggest that none of the metrics we evaluated predicted subjective best focus with an SD smaller than, or comparable with, the combined error margins of clinical refraction and wavefront aberrometry. Although these SDs seem to limit the applicability of wavefront-based calculations of refraction for clinical practice, it must be kept in mind that they are comparable with the intraobserver agreement for subjective refraction.^{22} The gold standard in this study for comparing the metrics' predictions to subjective best focus was a single measurement of clinical refraction performed by one clinician. Although possibly more accurate estimates of the subjective best focus would be achieved if multiple subjective refractions were averaged, the methodology we applied evaluates the level of agreement between refractions calculated by wavefront aberrometry and refractions taken in a typical clinical setting. Our calculations were performed using Zernike polynomials up to the fourth order based on the assumption that higher orders (because of their small magnitude) do not contribute significantly to the calculated refraction. To validate this assumption, we evaluated the difference in the calculated refraction when fifth and sixth order terms were included in the WA. The fifth and sixth order terms were simulated using the statistical model of Thibos et al.^{2} These calculations confirmed that including these terms would not affect the results.

Our findings on the SD of the error were similar to values reported in previous studies.^{5} Moreover, in accordance with previous studies,^{4,5} the optimization of the RMS WA was found to be a poor predictor of subjective refraction. Our results demonstrated that although the paraxial curvature exhibited the smallest SD of the differences between subjective and objective refraction (0.44 D), it was characterized by a marked bias (−0.68 D). It is reasonable to expect that refraction based on paraxial curvature would be less accurate than that predicted by the rest of the metrics, because this metric neglects higher order aberrations (other than spherical aberration) present in the system.^{4} Although orders higher than the fourth may be of small significance, this may not necessarily be true for the third-order aberrations (e.g., coma). On the accuracy of refraction based on the paraxial curvature, there is conflicting evidence in the literature. Thibos et al.^{5} found that the error of refraction based on the paraxial curvature was surprisingly small, whereas Guirao and Williams^{4} reported an error similar to the error reported herein.

Previous investigators^{5} have suggested that the implementation of polychromatic metrics and incorporation of the SCE might improve the accuracy and the precision of the results. In this study, we evaluated more elaborate (and presumably more accurate) metrics, such as polychromatic metrics and metrics where the SCE is incorporated, without a dramatic improvement of the results. The best results were observed with monochromatic metrics incorporating the SCE. However, the improvement of the accuracy was marginal. In other words, the refraction predicted by the SCE metrics correlated better with the refraction predicted by their non-SCE counterparts than with the subjective refraction.

In addition to the incorporation of the SCE in existing metrics, we defined and evaluated metrics that were, in a way, modeling the task used to optimize subjective focus, i.e., subjective optimization of angular resolution. These metrics (such as the vhfMTF and the orientMTF) are weighted integrals under the MTF at high spatial frequencies centered at 36 cpd, matching the Fourier power spectra of letters having angular dimensions equal to 5 arc min. The underlying hypothesis was that because the subject was able to identify these letters, the subject's optical system should be able to transfer efficiently the characteristic frequencies in the letters' spectrum. We hypothesized that keeping the above-mentioned portion of the MTF above a certain threshold would be a mathematical argument of relevance to the task performed. Moreover, given that the MTF at high spatial frequencies is very sensitive to defocus, these metrics would define defocus within a narrow range. Although not necessarily describing optimal image quality in a broad sense, the refraction predicted by these metrics would reflect the refraction necessary for the optical system to transfer the spatial frequencies of the resolved Snellen letters.

As mentioned in the “Methods” section, all subjects had best spectacle-corrected visual acuity better than −0.1 logMAR (mean 0.009, SD 0.0396). Intriguingly, for some of the subjects, the difference between the calculated and the manifest refraction is so great that it should be examined on the basis of measurement and/or interpretation error. The variability of the WA measurement itself could not lead to errors of this magnitude. The most significant source of error seems to lie in the conversion of the WA measured in the infrared to the equivalent WA at the center of the visible spectrum. In this conversion, an extrapolation of the longitudinal chromatic aberration for the human eye was involved,^{14} as well as an apparent increase of the axial length of the eye because of the increased relative reflectance of the retinal pigment epithelial (RPE) layer compared with the photoreceptor layer.^{5,23} A thickness of 125 μm was assumed between the photoreceptor layer and the RPE layer, presumably representing the stronger reflectance in the near infrared.^{24} However, intersubject variability of fundus pigmentation may have resulted in different contributions to total reflectance from the photoreceptor layer and the RPE layer. Moreover, there is conflicting evidence as to whether such an axial correction is necessary. Llorente et al.^{25} found that the Thibos et al.^{14} model works well in giving a correction, suggesting that the distance correction is unnecessary, while Williams et al.^{26} have suggested that the photoreceptors act as waveguides on the return path so that the distance between them and the reflecting layer is irrelevant. Nonetheless, other reports have illustrated a decrease in cone directionality and an increase of the diffuse background reflection with wavelength, suggesting a change in the average overall reflectance depth.^{27,28} In any case, this apparent axial length difference (125 μm between photoreceptor and RPE layers) corresponds to a refractive difference of approximately 0.35 D. The error associated with the above-mentioned assumptions may need to be further investigated in future studies. Moreover, it is not clear that the WA should be converted to the wavelength of 550 nm for all subjects. Given that luminance contrast created by the L and M cones is responsible for structure (i.e., letter) identification using foveal vision, it is possible that refraction should be optimized for a wavelength between the peaks of the spectral sensitivity of the L and the M cones, depending also on the L/M ratio of each subject. If this hypothesis is true, it could explain another (approximately) 0.1 D of intersubject variability originating from the longitudinal chromatic aberration between best focus for L and M cones.
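The order of magnitude of the refractive difference associated with a 125 μm axial shift can be checked with a reduced-eye estimate. The total power F = 60 D and the refractive index n = 1.336 used below are assumed textbook values, not parameters taken from this study, so the result is only a rough consistency check against the approximately 0.35 D figure.

```latex
F = \frac{n}{L}
\quad\Rightarrow\quad
|\Delta F| \approx \frac{n\,\Delta L}{L^{2}} = \frac{F^{2}\,\Delta L}{n}
= \frac{(60\ \mathrm{D})^{2} \times 0.125 \times 10^{-3}\ \mathrm{m}}{1.336}
\approx 0.34\ \mathrm{D}
```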

Another issue that has not been mathematically treated in this or previous similar studies is depth of focus. All metrics resulting from WA measurements are optimized using an appropriate spherical correction to place the far point of each given eye to infinity. On the other hand, subjective cycloplegic refraction as performed in this study intends to make the eye less myopic by one-half of its depth of focus. This fact may lead all metrics to optimize with a systematic bias toward more negative refractions than observed clinically, thus decreasing the bias value. This is not easy to overcome, because depth of focus in vision is another subjective quantity, possibly more difficult to describe mathematically than best focus. This might also explain the difference in accuracy compared with a previous study,^{5} in which wavefront measurements were taken after correcting the spherocylindrical error using trial lenses to emphasize the effect of high-order aberrations.

Finally, a cylindrical wavefront was added to the original wave aberration of the eye to correct for the astigmatic component. The cylindrical wavefront approximation was based on the second- and fourth-order astigmatic Zernike coefficients and the power vector analysis that has been shown to provide a reliable prediction of manifest cylinder correction.^{11} However, residual astigmatism may influence the calculation of the spherical correction, as the metric-optimizing sphere might be shifted to compensate for the actual rather than the calculated astigmatism. Ideally, sphere, cylinder, and axis should be optimized simultaneously for each metric. However, optimization of all three variables for all subjects and all eyes could not be performed within a practical time frame using the available computing power.

## CONCLUSIONS

All metrics resulted in effectively similar performance, with no striking superiority of any metric in either accuracy or precision. In accordance with previous studies, our results indicate that it is not possible to define a global metric capable of predicting subjective best focus by simple processing of WA data recorded in the infrared. This finding is more likely attributable to measurement and interpretation error (namely, conversion between measurement and visually relevant wavelengths) than to calculation or definition errors.

## ACKNOWLEDGMENTS

The project is co-funded by the European Social Fund and National resources.

Harilaos S. Ginis

Institute of Vision and Optics

University of Crete

GR-71003 Voutes Heraklion

Crete, Greece

e-mail: ginis@ivo.gr

## REFERENCES

**Keywords:**

single valued metrics; spherical refraction; wavefront; aberrations; Stiles Crawford effect; polychromatic