Plastic & Reconstructive Surgery:
Reply: The Rasch Model: “Litmus Test” de rigueur for Rating Scales?
Cano, Stefan J. Ph.D.; Klassen, Anne F. D.Phil.; Scott, Amie B.Sc.; Cordeiro, Peter G. M.D.; Pusic, Andrea L. M.D., M.H.S.
Peninsula College of Medicine and Dentistry, Plymouth, United Kingdom (Cano)
McMaster University, Hamilton, Ontario, Canada (Klassen)
Memorial Sloan-Kettering Cancer Center, New York, N.Y. (Scott, Cordeiro, Pusic)
Correspondence to Dr. Pusic, Plastic and Reconstructive Surgery Service, Department of Surgery, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, N.Y. 10021 firstname.lastname@example.org
We would like to thank Dr. Sunil Otiv for his letter1 in response to our article “The BREAST-Q: Further Validation in Independent Clinical Samples.”2 We agree that Rasch measurement theory has much to offer clinical outcomes assessment in plastic surgery research, not only in the development and validation of patient-reported outcome instruments, but also for clinician-reported and observer-reported outcomes.3,4
We would also like to thank Dr. Otiv for his explanation of how the Rasch model compares to classical test theory. This elaboration is particularly significant, as direct comparisons of these approaches are sparse,5,6 probably because they use different methods, produce different information, and apply different criteria for success and failure. Unlike classical test theory, the aim of Rasch measurement theory is to determine the extent to which observed rating scale data satisfy the measurement model. When the data do not fit the model, we examine the data carefully to try to explain the misfit. This central tenet distinguishes the Rasch measurement theory diagnostic paradigm from other psychometric methods that are based on statistical modeling.4 In this regard, we feel this approach has much in common with clinical practice, except, of course, that as psychometricians we are concerned with items and response options as opposed to signs and symptoms.
To take Dr. Otiv's point a little further, we would propose that Rasch measurement theory offers tangible clinical benefits (Table 1), including (when data fit the Rasch model) (1) the ability to construct linear measurements from ordinal-level data, thereby addressing a major concern of using patient-reported outcome instruments as outcome measures7; (2) providing item estimates that are free from the sample distribution and person estimates that are free from the scale distribution, thus allowing for greater flexibility in situations where different samples or scales are used8; and (3) enabling estimates suitable for individual person analyses rather than only for group comparison studies, which speaks directly to surgeons, as the patient is the central unit of interest.9
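Benefits (1) and (2) can be illustrated with a minimal sketch of the simplest (dichotomous) Rasch model. This is an illustration only: the BREAST-Q uses polytomous response options, and the function names and item locations below are hypothetical. When data fit the model, the log-odds of a response are linear in the difference between the person location and the item location, so the comparison between two items does not depend on where the person sits on the scale.

```python
import math


def rasch_probability(theta: float, delta: float) -> float:
    """Probability of affirming a dichotomous item.

    theta: person location in logits; delta: item location in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def log_odds(p: float) -> float:
    """Convert a probability to log-odds (logits)."""
    return math.log(p / (1.0 - p))


# Hypothetical item locations in logits (not taken from the BREAST-Q).
easy_item, hard_item = -1.0, 1.5

# The log-odds gap between the two items is the same at every person location,
# which is the sense in which item comparisons are free of the sample.
for theta in (-2.0, 0.0, 2.0):
    gap = (log_odds(rasch_probability(theta, easy_item))
           - log_odds(rasch_probability(theta, hard_item)))
    print(f"theta={theta:+.1f}  log-odds gap between items = {gap:.2f}")
```

The gap equals the difference between the two item locations regardless of theta; with real data this invariance holds only to the extent that the data fit the model.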
We would also agree with Dr. Otiv that patient-reported outcome instrument validity is key and requires considerable attention. The current U.S. Food and Drug Administration's scientific requirements for patient-reported outcomes in clinical trials10,11 highlight the importance of establishing validity. In particular, the U.S. Food and Drug Administration emphasizes appropriate conceptual frameworks and definitions as being fundamental. These are best achieved using detailed qualitative assessments, which should include evaluating the extent to which a scale's items represent the construct to be measured; establishing the most appropriate item phrasing, structuring, and context; and ensuring consistency in meaning by cognitive debriefing.10 However, traditionally, new scales are developed through the generation of a large pool of items, followed by grouping the items into potential scales, and then (either statistically or thematically) decisions are made as to what construct each group seems to measure, with the subsequent removal of unwanted or irrelevant items. The limitation of this approach is that the scale content, rather than the construct intended for measurement, defines what the scale measures. This makes interpreting its scores in a clinically meaningful way very difficult.12
In developing the BREAST-Q, we selected a range of qualitative methods, including in-depth patient and clinician interviews, literature review, panel meetings, and cognitive debriefing.13,14 However, in addition to these methods, we also strove to develop explicit descriptions of each BREAST-Q scale, to maximize their utility as clinically interpretable tools. As such, the BREAST-Q was developed “bottom-up” (from a construct definition) rather than “top-down” (from a method of grouping items) to ensure that substantive, clinically grounded hypotheses determined scale content. This involved several rounds of iterative qualitative inquiry using the methods described above to establish clinical validity. This approach provides the optimal foundations to fully understand the measurement performance of each of the new scales.15,16
Using detailed qualitative inquiry together with Rasch measurement theory to develop the content of the BREAST-Q means that we have a good understanding of the empirical item order across each scale. Thus, we know which items are associated with each and every possible scale score. For example, we previously used the BREAST-Q Reconstruction: Satisfaction with Breasts scale in a multicenter, cross-sectional study of 672 postmastectomy women. We found that women's satisfaction with their breasts was significantly greater among those who received silicone implants (mean score, 64) compared with those who received saline implants (mean score, 57).17 We are able to translate these scores as follows: women in the silicone group scored higher up the scale and therefore typically were satisfied with the “look” and “feel” of their reconstructed breasts, whereas women in the saline group scored toward the middle of the scale, and were satisfied with the “size” and “look” of their breasts but not with how well they “match” or “feel natural.” The ability to provide qualitative statements for each BREAST-Q scale score begins to make their meaning concrete and thus provides a clear base for clinical interpretation.
In relation to Dr. Otiv's four questions about the study,2 we make the following remarks. In his first question, Dr. Otiv highlights the mismatch in the BREAST-Q Augmentation Module: Physical Well-Being scale regarding the classical test theory (Cronbach α) and Rasch measurement theory (Person Separation Index) reliability statistics (0.83 and 0.34, respectively). In fact, the Person Separation Index is sensitive to scale-to-sample mistargeting. In this instance, we interpret this result as reflecting that physical well-being is very high in this surgical group and there is a ceiling effect. As we allude to in the article, there are ways to further build on this finding. For example, one route forward would be to expand the content of this scale to attempt to overcome the issue, although, clinically speaking, this may be counterintuitive, because low physical morbidity would be expected in this group. Dr. Otiv's second question also relates to the reliability statistic, and he provides some interpretation based on the Winsteps program. However, in our study, we used RUMM 2030,18 which uses the Person Separation Index, whose values range from 0 to 1, is analogous to the Cronbach α, and can be handled interpretatively in a similar way, bearing in mind the importance of targeting.
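The sensitivity of the Person Separation Index to mistargeting can be sketched as follows. The sketch assumes the commonly cited form of the index (observed variance of the person estimates minus the mean squared standard error, divided by the observed variance); it is not confirmed to be the exact computation performed by RUMM 2030, and all person estimates and standard errors are hypothetical. Cronbach α is included for comparison and is computed from raw item scores, so it takes no account of person-level standard errors.

```python
from statistics import mean, variance


def cronbach_alpha(item_scores):
    """Cronbach alpha from raw scores: one inner list of item scores per person."""
    k = len(item_scores[0])
    item_vars = [variance([person[i] for person in item_scores]) for i in range(k)]
    total_var = variance([sum(person) for person in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def person_separation_index(estimates, standard_errors):
    """Assumed PSI form: (observed variance - mean squared SE) / observed variance."""
    observed_var = variance(estimates)
    error_var = mean([se ** 2 for se in standard_errors])
    return (observed_var - error_var) / observed_var


# Hypothetical well-targeted sample: persons spread around the item locations,
# so standard errors are small.
targeted_estimates = [-1.5, -0.5, 0.0, 0.5, 1.5]
targeted_ses = [0.45, 0.40, 0.38, 0.40, 0.45]

# Hypothetical sample near the ceiling: persons sit above most items,
# so standard errors are inflated.
ceiling_estimates = [1.8, 2.6, 3.2, 3.9, 4.8]
ceiling_ses = [0.70, 0.80, 0.90, 1.00, 1.10]

print("PSI, well targeted:", round(person_separation_index(targeted_estimates, targeted_ses), 2))
print("PSI, near ceiling: ", round(person_separation_index(ceiling_estimates, ceiling_ses), 2))
```

Under these assumptions the two samples have a similar spread of person estimates, yet the index is much lower near the ceiling because the error variance consumes more of the observed variance.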
In his third question, Dr. Otiv asks about person and item standard errors. We did not report the latter because of space restrictions, but this information is available from the authors on request. As for the former (person standard errors), these can be generated through the Q-Score package, which is freely available with the BREAST-Q (http://webcore.mskcc.org/breastq/scoreBQ.html).
Dr. Otiv's final question relates to Dr. Cano's views relating to the relative benefits of Rasch measurement theory and classical test theory. We hope that our position as stated in this letter clears up that issue. Dr. Cano has also expanded on his views elsewhere.12,19 In short, as a research group, we strongly advocate the use of Rasch measurement theory because of its clear clinical benefits over other psychometric methods. As such, we also support Dr. Otiv's four key areas for future debate and expansion surrounding the use of Rasch measurement theory in the development and validation of rating scales in plastic surgery,1 and we believe journals such as Plastic and Reconstructive Surgery are ideally placed to hold such debates. This is because plastic surgeons are key stakeholders in high-stakes clinical outcomes research. As such, they increasingly rely on rating scales to deliver high-quality, reliable, valid, and interpretable measurement.
Stefan J. Cano, Ph.D.
Peninsula College of Medicine and Dentistry, Plymouth, United Kingdom
Anne F. Klassen, D.Phil.
McMaster University, Hamilton, Ontario, Canada
Amie Scott, B.Sc.
Peter G. Cordeiro, M.D.
Andrea L. Pusic, M.D., M.H.S.
Memorial Sloan-Kettering Cancer Center, New York, N.Y.
The BREAST-Q is owned by Memorial Sloan-Kettering Cancer Center and the University of British Columbia. Drs. Cano, Klassen, and Pusic are co-developers of the BREAST-Q and receive a portion of the revenues generated when the BREAST-Q is used in industry-sponsored clinical trials.
1. Otiv S. The Rasch model: “Litmus test” de rigueur for rating scales? Plast Reconstr Surg. 2013;131:283e–285e.
2. Cano SJ, Klassen AF, Scott AM, Cordeiro PG, Pusic AL. The BREAST-Q: Further validation in independent clinical samples. Plast Reconstr Surg. 2012;129:293–302.
3. Hobart JC, Cano SJ, Zajicek JP, Thompson AJ. Rating scales as outcome measures for clinical trials in neurology: Problems, solutions, and recommendations. Lancet Neurol. 2007;6:1094–1105.
4. Andrich D. Rating scales and Rasch measurement. Expert Rev Pharmacoecon Outcomes Res. 2011;11:571–585.
5. Cano SJ, Warner TT, Thompson AJ, Bhatia KP, Fitzpatrick R, Hobart JC. The cervical dystonia impact profile (CDIP-58): Can a Rasch developed patient reported outcome measure satisfy traditional psychometric criteria? Health Qual Life Outcomes 2008;6:58.
6. Cano SJ, Barrett LE, Zajicek JP, Hobart JC. Beyond the reach of traditional analyses: Using Rasch to evaluate the DASH in people with multiple sclerosis. Mult Scler. 2011;17:214–222.
7. Grimby G, Tennant A, Tesio L. The use of raw scores from ordinal scales: Time to end malpractice? J Rehabil Med. 2012;44:97–98.
8. Wright B. Solving measurement problems with the Rasch model. J Educ Meas. 1977;14:97–116.
9. Andrich D. Rasch Models for Measurement. Beverly Hills, Calif: Sage Publications; 1988.
12. Cano SJ, Hobart JC. The problem with health measurement. Patient Prefer Adherence 2011;5:279–290.
13. Klassen AF, Pusic AL, Scott A, Klok J, Cano SJ. Satisfaction and quality of life in women who undergo breast surgery: A qualitative study. BMC Womens Health 2009;9:11–18.
14. Pusic AL, Klassen AF, Scott AM, Klok JA, Cordeiro PG, Cano SJ. Development of a new patient-reported outcome measure for breast surgery: The BREAST-Q. Plast Reconstr Surg. 2009;124:345–353.
15. Stenner AJ, Smith M. Testing construct theories. Perceptual Motor Skills 1982;55:415–426.
16. Fisher W, Stenner A. Integrating qualitative and quantitative research approaches via the phenomenological method. Int J Multiple Res Approaches 2011;5:89–103.
17. McCarthy CM, Klassen AF, Cano SJ, et al. Patient satisfaction with postmastectomy breast reconstruction: A comparison of saline and silicone implants. Cancer 2010;116:5584–5591.
18. Andrich D, Sheridan B. RUMM 2030. Perth, Western Australia, Australia: RUMM Laboratory Pty Ltd; 1997–2012.
19. Hobart J, Cano S. Improving the evaluation of therapeutic intervention in MS: The role of new psychometric methods. In: Monograph for the UK Health Technology Assessment Programme. Southampton, UK: NIHR Evaluation, Trials and Studies Coordinating Centre; 2009:1–200.
©2013 American Society of Plastic Surgeons