Optical parameters of the eye and vision tests are routinely and easily measured in clinical practice; examples include aberrometry and visual acuity, respectively. However, such tests fail to capture the full extent of how patients perceive their own vision in everyday life. Two individuals may have identical visual acuity yet markedly different perceptions of their vision. This perception is a complex composite of the many optical features of the eye coupled with psychological features. This led to the development of the 10-item Quality of Vision (QoV) questionnaire in 2010.1
The QoV questionnaire was developed to provide a patient-reported measure of quality of vision across a range of patient groups, including those with refractive correction (refractive surgery, spectacles, and contact lenses) and cataract. The result was a 10-item questionnaire with three rating scales. The questionnaire was developed, validated, and scored using Rasch analysis, with scores ranging from 0 to 100 (higher scores indicate worse quality of vision).1 It has become a valuable tool for clinical trials and studies, particularly in refractive surgery (intraocular lens [IOL] and laser platform evaluation), cataract surgery, and contact lenses.2–4 Subjective patient-reported measures are now required by many funding bodies and regulatory authorities in clinical trials and new product development.
The questionnaire uses photographs to simulate visual symptoms (e.g., glare and halos), with patients reporting how often they experience the symptom (Frequency), how severe the symptom is (Severity), and how bothered they are by the symptom (Bothersome). The purpose of this article is to evaluate the interchangeability of the three rating scales of the questionnaire: Frequency, Severity, and Bothersome. Questionnaires should be short enough to minimize respondent burden yet long enough to provide useful information; longer questionnaires are associated with a lower response rate.5 This evaluation will indicate whether any of the three rating scales is predictive of another and whether respondents need to complete all three. The finding also has implications for symptom measurement in general and is not exclusive to the QoV questionnaire. To achieve this, we pool data from published studies that used the QoV questionnaire.1–4
Responses from 1930 completed questionnaires, drawn from four studies, were analyzed as follows:
Study 1 (900 questionnaires): 150 subjects were spectacle wearers, 150 were contact lens wearers, 300 had undergone laser refractive surgery (consisting of laser in situ keratomileusis, laser-assisted subepithelial keratectomy, and photorefractive keratectomy surgeries for various refractive errors), 150 had cataract, and 150 had undergone lens implantation surgery (consisting of monofocal, multifocal, and pseudoaccommodative IOLs). The mean age was 34 years, and the age range was 21 to 78 years.1
Study 2 (500 questionnaires): 100 subjects who had bilateral laser-assisted subepithelial keratectomy for myopia or hyperopia completed the QoV questionnaire preoperatively and at postoperative intervals of 5 days, 2 weeks, 1 month, and 3 months. Of the 100 subjects enrolled in the study, 68 had surgery for myopia (mean age, 32.8 years; range, 22–44 years), and 32 had surgery for hyperopia (mean age, 28.4 years; range, 21–41 years).2
Study 3 (418 questionnaires): 209 subjects who were undergoing cataract surgery completed the QoV questionnaire preoperatively and 3 months postoperatively (mean age, 74.2 years; range, 48–93 years). Subjects consisted of patients undergoing first and second eye cataract surgery, with and without ocular comorbidities. All subjects had monofocal IOL implantation after cataract surgery.3
Study 4 (112 questionnaires): 112 patients with cataract (mean age, 68.1 years; range, 33–90 years) and no ocular comorbidities completed the questionnaire.4
The data from the four studies were pooled by listing the QoV scores consecutively, and the Bland-Altman limits of agreement (LoA) method was used to assess the interchangeability of the three rating scales of the QoV questionnaire.6,7 If one plots a scatter graph of two comparative measurements (e.g., Frequency data vs. Severity data), perfect agreement would exist if all the data points lay on the line of equality (y = x). The LoA technique describes by how much the two measurements differ; if this difference is small enough, the measurements may be used interchangeably. In brief, the technique is performed by determining the mean difference (d) between the two measurements and the standard deviation (s) of their differences. Provided the differences follow a normal distribution, about 95% of them would be expected to lie between d − 1.96s and d + 1.96s. If differences within d ± 1.96s are not clinically important, the two measurements could be used interchangeably.7 All data were managed and statistically tested, and graphs were produced, using MedCalc (MedCalc version 188.8.131.52; Acacialaan 22, B-8400, Ostend, Belgium).
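The calculation described above can be illustrated with a minimal Python sketch. It is not the MedCalc procedure used in this study; the function name `limits_of_agreement` and the five paired scores are hypothetical and serve only to show how d, s, and the limits d ± 1.96s are obtained from paired measurements.

```python
from statistics import mean, stdev


def limits_of_agreement(a, b, z=1.96):
    """Return (d, s, lower, upper) for paired measurements a and b.

    d is the mean of the paired differences (a - b), s is their sample
    standard deviation, and the limits are d - z*s and d + z*s.
    """
    if len(a) != len(b):
        raise ValueError("paired measurements must have equal length")
    if len(a) < 2:
        raise ValueError("need at least two pairs to estimate s")
    diffs = [x - y for x, y in zip(a, b)]
    d = mean(diffs)
    s = stdev(diffs)  # sample SD (n - 1 denominator)
    return d, s, d - z * s, d + z * s


# Hypothetical scores for illustration only; not data from the pooled studies.
frequency = [10.0, 25.0, 40.0, 55.0, 70.0]
severity = [8.0, 20.0, 41.0, 50.0, 66.0]

d, s, lower, upper = limits_of_agreement(frequency, severity)
print(f"mean difference = {d:.2f}, SD = {s:.2f}, LoA = {lower:.2f} to {upper:.2f}")
```

Whether the resulting interval is narrow enough for the two scales to be interchangeable remains a clinical judgment; the arithmetic only supplies the interval.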
The mean difference, standard deviation of the differences, and the LoA with their associated 95% confidence intervals were calculated for the three rating scales of the QoV questionnaire for the pooled data, and the results are displayed in Table 1. The standard graphical representations of the LoA (Bland-Altman plots) between the three rating scales of the QoV questionnaire are displayed in Figs. 1–3.
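For readers unfamiliar with how confidence intervals for the LoA can be obtained, the following sketch uses the approximation given by Bland and Altman, in which the standard error of each limit is about √(3s²/n). This is an assumption about method, not a description of the MedCalc computation used here. The function name and the summary values (d = 3.0, s = 2.0, n = 100) are hypothetical, and a normal quantile is used in place of the t quantile, which is reasonable only for large n.

```python
from math import sqrt
from statistics import NormalDist


def loa_confidence_intervals(d, s, n, level=0.95):
    """Approximate confidence intervals for d and for both limits.

    d: mean difference, s: SD of the differences, n: number of pairs.
    Uses a normal quantile (large-sample assumption) and the
    approximation SE(limit) = sqrt(3 * s**2 / n).
    """
    if n < 2:
        raise ValueError("need at least two paired measurements")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    se_mean = s / sqrt(n)                      # standard error of d
    se_limit = sqrt(3.0 * s * s / n)           # approximate SE of each limit
    lower, upper = d - 1.96 * s, d + 1.96 * s

    def interval(centre, se):
        return (centre - z * se, centre + z * se)

    return {
        "mean_difference": interval(d, se_mean),
        "lower_limit": interval(lower, se_limit),
        "upper_limit": interval(upper, se_limit),
    }


# Hypothetical summary statistics for illustration only.
example = loa_confidence_intervals(d=3.0, s=2.0, n=100)
for name, (lo, hi) in example.items():
    print(f"{name}: {lo:.3f} to {hi:.3f}")
```

With small samples, the t distribution with n − 1 degrees of freedom would customarily replace the normal quantile.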
All three comparisons yielded wide LoA. Similar differences and agreement limits were found for the Frequency versus Severity and the Severity versus Bothersome comparisons. The least agreement and largest differences were found between the Frequency and Bothersome rating scales.
This study included pooled data from four independent studies encompassing a wide spectrum of clinical presentations, providing a broad spread of QoV scores for different patient groups. The wide LoA found in this study indicate that the three rating scales of the QoV questionnaire (Frequency, Severity, and Bothersome) are in fact measuring different aspects of the latent trait, quality of vision. Users should continue to use all three rating scales of the questionnaire to achieve a comprehensive assessment of subjective quality of vision. If one wishes to assess only the frequency of visual symptoms, then the Frequency rating scale may be used alone, with the understanding that it will not indicate the severity of the symptoms or how bothered the patient is by them. By way of example, patients may experience symptoms only at specific times, such as glare and halos when driving at night, yet these may be very severe and bothersome to the patient, even to the extent that they cease night driving. Hence, the use of the Frequency rating scale alone would underestimate the impact of symptoms on this patient’s overall quality of vision. Similar scenarios can be conceived with the opposite effect: symptoms that are very frequent but not very severe or bothersome.
The narrowest LoA were between the Frequency and Severity rating scales (−10.4397 to 16.1537). To put these numbers into context, in a recent study assessing QoV scores before and after laser-assisted subepithelial keratectomy refractive surgery, the preoperative Frequency rating scale QoV score was 6.78, and the postoperative scores at 5 days, 2 weeks, 1 month, and 3 months were 70.71, 50.55, 7.15, and 2.45, respectively. All these scores differed significantly from one another except for the preoperative score and the score at 1 month (6.78 vs. 7.15, p = 0.819).2 Score differences of fewer than 5 points (e.g., 6.78 vs. 2.45) were therefore statistically significant, whereas even the narrowest LoA span more than 26 points. As can be seen from Table 1 and Fig. 2, the widest LoA were found between the Frequency and Bothersome rating scales (−19.1831 to 30.1179).
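As a worked illustration of the two intervals quoted above, the sketch below back-calculates the mean difference and the standard deviation of the differences implied by each pair of limits. It assumes the limits were computed symmetrically as d ± 1.96s, as described in the methods; the function name is hypothetical, and the direction of subtraction between scales is not stated in the text. Table 1 remains the authoritative source for these values.

```python
def implied_bias_and_sd(lower, upper, z=1.96):
    """Recover d and s from limits of agreement, assuming limits = d +/- z*s."""
    d = (lower + upper) / 2
    s = (upper - lower) / (2 * z)
    return d, s


# Limits quoted in the text.
freq_sev = implied_bias_and_sd(-10.4397, 16.1537)   # Frequency vs. Severity
freq_both = implied_bias_and_sd(-19.1831, 30.1179)  # Frequency vs. Bothersome

width_sev = 16.1537 - (-10.4397)
width_both = 30.1179 - (-19.1831)

print(f"Frequency vs. Severity:   d = {freq_sev[0]:.3f}, s = {freq_sev[1]:.3f}, width = {width_sev:.2f}")
print(f"Frequency vs. Bothersome: d = {freq_both[0]:.3f}, s = {freq_both[1]:.3f}, width = {width_both:.2f}")
```

Under that assumption, the Frequency versus Severity interval is about 26.6 points wide and the Frequency versus Bothersome interval about 49.3 points wide, on a scale that runs from 0 to 100.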
These findings have important implications for symptom measurement in general, including questionnaires that have similar symptom subscales.8–12 Questionnaire developers should appreciate that symptom frequency, severity, and bothersomeness behave differently, and this should be considered in the design of new questionnaires. Although latent traits in other areas of healthcare may behave differently from quality of vision, it would be advisable to empirically test the LoA between the rating scales of other questionnaires or item banks. Part of the reason for this study was to establish whether it is necessary to measure all three aspects of quality of vision. In general, a short questionnaire, but not too short, is likely to provide a more precise measurement estimate than a long, cumbersome questionnaire that will be influenced more by the effects of fatigue and diminishing concentration levels.13,14 Questionnaire development involves striking a balance between having enough items to provide precision in the measurement estimate and not so many as to cause respondent burden. One must also consider the advantages and disadvantages of using pooled data in this study. The advantages include a large sample size with a wide spectrum of patient groups, such as cataract and postrefractive surgery. The disadvantages include differences in location, populations, and format of delivery, such as interview versus self-report.
In conclusion, this study highlights the need to use all three rating scales of the QoV questionnaire to provide a comprehensive assessment of subjective quality of vision. Users wishing to assess only the frequency of symptoms should exercise caution because this may significantly underestimate the severity of the symptom or how bothersome it is to the patient.
School of Medicine
Flinders Medical Centre and Flinders University
Adelaide, South Australia
Received January 5, 2013; accepted April 4, 2013.
1. McAlinden C, Pesudovs K, Moore JE. The development of an instrument to measure quality of vision: the Quality of Vision (QoV) questionnaire. Invest Ophthalmol Vis Sci 2010; 51: 5537–45.
2. McAlinden C, Skiadaresi E, Pesudovs K, Moore JE. Quality of vision after myopic and hyperopic laser-assisted subepithelial keratectomy. J Cataract Refract Surg 2011; 37: 1097–100.
3. Skiadaresi E, McAlinden C, Pesudovs K, Polizzi S, Khadka J, Ravalico G. Subjective quality of vision before and after cataract surgery. Arch Ophthalmol 2012; 130: 1377–82.
4. Cabot F, Saad A, McAlinden C, Haddad NM, Grise Dulac A, Gatinel D. Objective assessment of crystalline lens opacities level by measuring ocular light scattering with a double-pass system. Am J Ophthalmol 2013; 155: 629–35.
5. Edwards P, Roberts I, Sandercock P, Frost C. Follow-up by mail in clinical trials: does questionnaire length matter? Control Clin Trials 2004; 25: 31–52.
6. Altman DG, Bland JM. Measurement in medicine: the analysis of method comparison studies. The Statistician 1983; 32: 307–17.
7. McAlinden C, Khadka J, Pesudovs K. Statistical methods for conducting agreement (comparison of clinical tests) and precision (repeatability or reproducibility) studies in optometry and ophthalmology. Ophthalmic Physiol Opt 2011; 31: 330–8.
8. Browall M, Sarenmalm EK, Nasic S, Wengström Y, Gaston-Johansson F. Validity and reliability of the Swedish version of the Memorial Symptom Assessment Scale (MSAS): an instrument for the evaluation of symptom prevalence, characteristics, and distress. J Pain Symptom Manage 2012; 44 (ePub ahead of print, November 29, 2012). doi:10.1016/j.jpainsymman.2012.07.023.
9. Shumaker SA, Wyman JF, Uebersax JS, McClish D, Fantl JA. Health-related quality of life measures for women with urinary incontinence: the Incontinence Impact Questionnaire and the Urogenital Distress Inventory. Continence Program in Women (CPW) Research Group. Qual Life Res 1994; 3: 291–306.
10. Cohen HA, Rozen J, Kristal H, Laks Y, Berkovitch M, Uziel Y, Kozer E, Pomeranz A, Efrat H. Effect of honey on nocturnal cough and sleep quality: a double-blind, randomized, placebo-controlled study. Pediatrics 2012; 130: 465–71.
11. Jansen ME, Begley CG, Himebaugh NH, Port NL. Effect of contact lens wear and a near task on tear film break-up. Optom Vis Sci 2010; 87: 350–7.
12. Eggleston A, Farup C, Meier R. The Domestic/International Gastroenterology Surveillance Study (DIGEST): design, subjects and methods. Scand J Gastroenterol Suppl 1999; 231: 9–14.
13. Pesudovs K, Burr JM, Harley C, Elliott DB. The development, assessment, and selection of questionnaires. Optom Vis Sci 2007; 84: 663–74.
14. McAlinden C, Gothwal VK, Khadka J, Wright TA, Lamoureux EL, Pesudovs K. A head-to-head comparison of 16 cataract surgery outcome questionnaires. Ophthalmology 2011; 118: 2374–81.
Keywords: quality of vision (QoV) questionnaire; Rasch analysis; patient-reported outcomes; refractive surgery; limits of agreement
© 2013 American Academy of Optometry