Body contouring procedures, such as liposuction and lifts to the upper arm, abdomen, thigh, and lower body, are increasingly popular forms of plastic surgery. The number of body contouring procedures in the United States was 694,318 in 2014, comprising 39.3% of all aesthetic procedures and up from 239,832 in 1997.1 Although most of these procedures were performed for cosmetic reasons, many patients seek body contouring to remove excess skin after massive weight loss. Whether performed for cosmetic purposes or after weight loss, body contouring has the potential to improve a patient’s body image and health-related quality of life (HRQL).2–7
In many countries, body contouring procedures are considered cosmetic and patients are expected to pay out-of-pocket for treatments. In some countries, body contouring to remove excess skin after massive weight loss is considered reconstructive and access to treatment is provided to patients who meet specified criteria.8 Given the profound impact that excess skin after weight loss can have on appearance and HRQL, evidence-based information about patient-centered outcomes of body contouring after massive weight loss is needed. Patient-centered information is also needed to ensure that cosmetic body contouring procedures, including nonsurgical treatments, are safe and effective.
A limitation in the pursuit of such data is the lack of a rigorously developed patient-reported outcome (PRO) instrument designed to measure HRQL and other concerns common to both weight loss and body contouring patients.9 Research into weight loss and body contouring that has used a PRO instrument has tended to use instruments that are generic (eg, SF-3610), obesity specific (eg, Moorehead-Ardelt Questionnaire11), or body contouring specific (eg, BREAST-Q reduction module12–14).
To address the lack of a PRO instrument for patients undergoing weight loss and body contouring, our team followed international recommendations15–19 to develop the BODY-Q. We previously described the development of the BODY-Q conceptual framework and set of scales, which involved a literature review, 63 patient interviews, 22 cognitive patient interviews, and input from 9 experts (phase 1).20–22 The BODY-Q measures 3 domains (appearance, HRQL, and experience of healthcare) via 18 independently functioning scales. In this article, we describe the psychometric results for each scale based on an international field test.
In Canada, research ethics approval was obtained at McMaster University (Hamilton Integrated Research Ethics Board) and the University of British Columbia (Behavioural Research Ethics Board). In the United States, ethics approval was obtained through the IRB Company Incorporated (Buena Park, Calif.). In the United Kingdom, National Health Service permission was obtained, with study sponsorship provided by the Royal Free London NHS Foundation Trust.
The following recruitment strategies were used.
St Joseph’s Healthcare Bariatric Program, Hamilton (Canada)
Patients exploring or seeking bariatric surgery and post-bariatric surgery patients were recruited between November 2013 and July 2014. Data were collected using either a handheld tablet, with data entered directly into a secure web-based application (ie, Research Electronic Data Capture, REDCap23), or a questionnaire booklet. All participants completed the appearance and HRQL scales, and, to ensure some familiarity with clinic staff, only patients who were post-bariatric surgery were asked to complete the experience scales. Participants were also invited to provide an e-mail to participate in 2 additional study components as follows: a test–retest (TRT) survey sent after 1 week; and a 6-month follow-up. Those who agreed to either component were sent an e-mail at the appropriate time, with the URL link to access the survey via REDCap. Up to 2 e-mailed reminders, spaced by 2 weeks, were sent to nonrespondents. The 6-month follow-up group also received a phone call reminder.
St George’s University Hospital, London (England)
Patients who had body contouring between February 2004 and May 2014 were sent an information letter and questionnaire booklet composed of the appearance and HRQL scales. The experience scales were excluded due to potential recall bias given the length of time elapsed since surgery for many patients. Nonrespondents were sent up to 2 postal reminders and 1 phone call as necessary. Excluded from the denominator were patients whose questionnaire was returned to sender, patients with a terminal illness, and patients who had died.
Cosmetic Surgery Clinics, Hamilton, Vancouver, Mississauga (Canada), and Atlanta (United States)
Patients exploring or seeking body contouring surgery, and patients who had had body contouring, were recruited in clinics in Hamilton, Vancouver, and Atlanta between December 2013 and December 2014. Data were collected using either an iPad directly into REDCap, or using a questionnaire booklet. All participants completed the experience scales and the Body scale plus any appearance scale relevant to their body contouring procedure and/or areas of the body with excess skin (eg, Arms scale for brachioplasty patients and/or patients with excess skin on upper arms). Only patients who previously had bariatric surgery were asked to complete the Physical scale and Symptoms checklist. The remaining HRQL scales were completed by all participants.
The Atlanta and Mississauga clinics also invited former body contouring patients to participate. The Atlanta clinic staff sent an e-mail invitation with URL link to all patients treated in the previous three years. One reminder was sent 1 month later. For the Mississauga clinic, patients who participated in the earlier cognitive interview phase of the study22 were sent an e-mail invitation with URL link. Two reminders were sent spaced by 1 week.
Private Medical Clinic, Aberdeen (Scotland)
Patients attending a weight loss program that included a collagen stimulation treatment (nonsurgical body contouring) were recruited between October 2014 and February 2015. Participants completed the body, abdomen, skin, body image, psychological and social scales. At the end of the survey, participants were invited to provide an e-mail address if they were willing to complete the experience scales (2-week follow-up). Those who complied with this request were invited to complete the initial set of scales again (6-week follow-up). For both follow-ups, 2 e-mail reminders were sent spaced by 1 week.
We used Rasch Measurement Theory (RMT) analysis24 to identify the subset of items for each scale that represented the best indicators of outcome. Decisions about which items to retain were based on the following set of statistical and graphical tests, explained in more detail elsewhere25:
- Thresholds for item response options: For each scale, we examined thresholds between response options (eg, between very satisfied and somewhat satisfied) to determine if a scale’s response categories scored with successive integer scores increased as intended.
- Item fit statistics: We examined 3 indicators of fit to determine, for each scale, if the items worked together to map out a clinically important construct in the form of a hierarchy: (1) log residuals (item–person interaction); (2) χ2 values (item–trait interaction); and (3) item characteristic curves. Fit residuals should be between −2.5 and +2.5, and χ2 values should be nonsignificant after Bonferroni adjustment. We interpreted fit statistics together and in relation to clinical usefulness.
- Dependency: Residual correlations between pairs of items were inspected to identify any that were 0.30 or higher as high residual correlations can artificially inflate reliability.26
- Targeting: For each scale, we examined person and item locations to determine if items were evenly spread over a reasonable range that matched the range of the construct reported by the sample.
- Stability: Differential item functioning (DIF) was examined to determine if items in a scale worked the same across subgroups within the sample. Subgroups that were examined included age group (<40, 40–49, 50–59, ≥60 years), sex, patient type (weight-loss and body contouring), country (Canada, United States, and United Kingdom), and method of data collection (online and paper). Chi-square values significant after Bonferroni adjustment were used to indicate items with potential DIF.
- Person separation index (PSI): We computed the PSI for each scale. PSI measures error associated with the measurement of people in a sample and is comparable to Cronbach α.27 Higher values indicate greater reliability.
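The item fit criteria listed above (fit residuals within ±2.5 and χ2 P values nonsignificant after Bonferroni adjustment) can be illustrated with a minimal sketch. The item names, the values, and the overall significance level of 0.05 are assumptions for illustration and are not taken from the study; in practice the fit statistics come from Rasch analysis software, and flagged items are reviewed alongside clinical usefulness rather than removed automatically.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level after Bonferroni adjustment."""
    return alpha / n_tests


def flag_items(items, alpha=0.05, resid_limit=2.5):
    """Return the names of items that violate either fit criterion.

    items: list of (name, fit_residual, chi_square_p_value) tuples.
    alpha: overall significance level (0.05 is an assumption here).
    """
    threshold = bonferroni_alpha(alpha, len(items))
    flagged = []
    for name, residual, p_value in items:
        residual_misfit = abs(residual) > resid_limit
        chi_square_misfit = p_value < threshold
        if residual_misfit or chi_square_misfit:
            flagged.append(name)
    return flagged


# Hypothetical values for illustration only (not data from the study).
example_items = [
    ("item_1", 0.8, 0.40),
    ("item_2", -3.1, 0.20),   # residual outside +/-2.5
    ("item_3", 1.9, 0.001),   # P below the adjusted level of 0.05 / 4
    ("item_4", 2.4, 0.03),    # P above the adjusted level, so not flagged
]
print(flag_items(example_items))
```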
In addition to the RMT analyses, for each scale, we computed Cronbach α27 and, for the TRT data, intraclass correlation coefficients.28 We also examined the proportion of participants with scale-level missing data and with scores at the floor and ceiling. For the Symptom checklist, we computed the proportion of participants who chose each response option for the scale’s 10 obesity-related symptoms.
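As a rough illustration of two of these statistics, the sketch below computes Cronbach α from complete item responses and the proportion of scores at the floor and ceiling. The function names and example data are hypothetical, and the sketch assumes no missing item responses.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach alpha for complete data.

    responses: list of per-person lists, one score per item.
    """
    k = len(responses[0])
    # Variance of each item across people.
    item_variances = [
        variance([person[i] for person in responses]) for i in range(k)
    ]
    # Variance of the summed score across people.
    total_variance = variance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def floor_ceiling(scores, lowest=0, highest=100):
    """Proportion of scores at the lowest and at the highest possible value."""
    n = len(scores)
    at_floor = sum(1 for s in scores if s == lowest) / n
    at_ceiling = sum(1 for s in scores if s == highest) / n
    return at_floor, at_ceiling


# Hypothetical responses from 4 people to a 3-item scale.
example = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 4]]
print(round(cronbach_alpha(example), 3))
print(floor_ceiling([0, 50, 100, 100]))
```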
The Rasch logit scores were transformed into 0 (worse) to 100 (best) scores. These scores were used to conduct the following tests of construct validity:
- For each item on the obesity-specific Symptom checklist, mean body mass index (BMI) would be incrementally higher according to frequency of symptoms reported (eg, lowest for never, …, highest for all the time).
- Scales measuring similar constructs (eg, appearance scales) would correlate more strongly with each other than with scales measuring dissimilar constructs. Correlations between BMI and number of symptoms experienced would be stronger with appearance and HRQL scales than with patient experience scales. Patient characteristics (age, sex, and ethnicity) would correlate weakly with BODY-Q scale scores.
- BODY-Q scores for appearance and HRQL would vary across clinical groups in the sample. BODY-Q scores would be lowest (worse) for patients who had not started their weight loss journey (ie, prebariatric surgery) and would be highest (best) for cosmetic patients who had had body contouring surgery.
- Scores for appearance and HRQL would be incrementally lower in participants who report having more versus less excess skin.
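The conversion of Rasch logit scores to the 0 to 100 metric used in these tests can be sketched as a simple linear rescaling. The text does not specify the exact method, so the linear form and the logit range below are assumptions for illustration.

```python
def rescale_logit(logit, logit_min, logit_max):
    """Linearly map a logit onto a 0 (worse) to 100 (best) scale.

    logit_min and logit_max are the lowest and highest logits the scale
    can produce (hypothetical values in the example below).
    """
    return 100 * (logit - logit_min) / (logit_max - logit_min)


# Hypothetical logit range of -4 to +6 for one scale.
for logit in (-4, 1, 6):
    print(logit, rescale_logit(logit, -4, 6))
```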
Finally, to examine responsiveness, paired t tests and effect sizes (ie, [mean at time 1 − mean at time 2]/standard deviation at time 1) were computed to determine the statistical and clinical significance of change in scores for participants who completed BODY-Q scales on 2 occasions.
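A minimal sketch of the responsiveness calculations follows, using the effect size formula as stated (difference in means divided by the standard deviation at time 1) and the standard paired t statistic. The scores are hypothetical. With the time 1 minus time 2 ordering, an improvement on a scale where higher is better yields a negative effect size.

```python
from math import sqrt
from statistics import mean, stdev


def effect_size(time1, time2):
    """(mean at time 1 - mean at time 2) / standard deviation at time 1."""
    return (mean(time1) - mean(time2)) / stdev(time1)


def paired_t_statistic(time1, time2):
    """Paired t statistic on the within-person differences (time 2 - time 1)."""
    differences = [after - before for before, after in zip(time1, time2)]
    standard_error = stdev(differences) / sqrt(len(differences))
    return mean(differences) / standard_error


# Hypothetical 0 to 100 scores for 4 participants on 2 occasions.
baseline = [40, 50, 60, 50]
follow_up = [50, 55, 70, 65]
print(round(effect_size(baseline, follow_up), 2))
print(round(paired_t_statistic(baseline, follow_up), 2))
```

The t statistic would then be compared against a t distribution with n − 1 degrees of freedom to obtain a P value.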
Table 1 shows details about the scales. Each scale has 10 or fewer items and Flesch-Kincaid grade levels under 6 (range, 0–5.3). The response rate (Table 2) varied by method of recruitment as follows: face-to-face, 94%; postal, 40%; and email, 14%. Table 3 shows sample characteristics. The 734 participants provided 965 assessments, with 616 (64%) of the assessments collected via REDCap.
The RMT analysis provided evidence of reliability and validity for the BODY-Q scales. Thresholds were ordered for 134 of 138 items. Four items in the Information scale evidenced disordered thresholds. For these items, we simplified the scoring by collapsing across 2 categories as follows, with subsequent RMT analysis using the rescored data: very dissatisfied, 0; somewhat dissatisfied, 0; somewhat satisfied, 1; very satisfied, 2.
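The rescoring applied to the 4 Information items amounts to a lookup that collapses the 2 dissatisfied categories into a single score. A minimal sketch, with hypothetical responses:

```python
# Collapsed scoring for items with disordered thresholds, as described above.
COLLAPSED_SCORES = {
    "very dissatisfied": 0,
    "somewhat dissatisfied": 0,
    "somewhat satisfied": 1,
    "very satisfied": 2,
}


def rescore(responses):
    """Convert response labels to the collapsed integer scores."""
    return [COLLAPSED_SCORES[label] for label in responses]


# Hypothetical responses to one item.
print(rescore(["very dissatisfied", "somewhat dissatisfied",
               "somewhat satisfied", "very satisfied"]))
```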
Table 4 shows the item fit statistics for each scale organized by item location. Item fit was within −2.5 to +2.5 for 115 of 138 items, and all items had nonsignificant χ2 P values after Bonferroni adjustment. The item residual correlations were above 0.30 for 8 pairs of items, with 1 pair above 0.40 (range, 0.31–0.47). In subtest analyses, the correlated items were found to have marginal impact on scale reliability (≤0.02 difference in PSI value). The targeting of person measurements and the distribution of item locations defined a continuum of measurement for each scale. DIF was detected for 17 items on one or more variables (the exception was age group). When items were split on each variable with DIF and the new person locations were correlated with the original person locations, DIF had a negligible impact (Pearson correlations ≥ 0.97).
The reliability statistics and other statistics of scale performance are shown in Table 5. PSI values were above 0.70 for 16 of 18 scales with extremes included and all 18 scales with extremes excluded. Cronbach α values were 0.90 and higher for all 18 scales. For the TRT, 170 of 354 participants recruited from the Hamilton bariatric clinic provided an e-mail and agreed to participate and 44 complied (12% response rate). The respondents did not differ from the 310 nonrespondents by age, sex, ethnicity, BMI, number of obesity-specific symptoms, or BODY-Q scale scores. The TRT value was 0.87 or higher for 16 of the 17 scales and was 0.65 for the Physical scale. Scale-wise missing data were up to 5%. The proportion of participants who scored at the floor and ceiling was more than 50% for 3 of the experience scales (exception Information).
The Symptom checklist (Table 6) was completed by 431 weight loss participants of whom 110 completed it again 6 months later, providing a total of 541 assessments. The most common symptom was feeling tired during the day followed by back and joint pain. For all 10 symptoms, the mean BMI was lowest for the group of participants who never experienced the symptom and was highest for 7 symptoms for the group of participants who experienced the symptom all the time.
Correlations between scales (Table 7) tended to be higher, as predicted, for scales measuring similar constructs, with the exception of Body Image and Physical where correlations were stronger with appearance over HRQL scales. Higher BMI and more obesity-specific symptoms correlated with scale scores for appearance and HRQL. Patient characteristics correlated only weakly with BODY-Q scale scores.
Figures 1 and 2 show the mean scores for appearance and HRQL scales by clinical group, respectively. Mean scores differed significantly across patient group for all appearance and HRQL scales (P ≤ 0.003). The lowest (worst) scores were for participants waiting for bariatric surgery, all of whom were obese (BMI ≥ 38), and the highest (best) scores were for cosmetic surgery patients who had had body contouring. For all scales, participants with a lot of excess skin reported the lowest mean scores, whereas participants who reported having a little excess skin reported the highest mean scores (Table 8).
E-mails for the 6-month follow-up were provided by 299 of 354 participants from the Hamilton bariatric clinic (121 complied; 34% response) and from 42 of 49 weight loss participants from Aberdeen (13 complied; 31% response). The 134 respondents did not differ from the 269 nonrespondents in terms of age, sex, ethnicity, or BMI, but on the BODY-Q scales, they did report higher (better) scores on the Doctor scale (P = 0.03 on Mann–Whitney U Test). Most participants (N = 100) had lost weight since the initial assessment. Participants from Hamilton lost more weight on average than the Scottish patients (31.1 versus 15.2 pounds, P = 0.001 on independent sample t test). Table 9 shows results for paired t tests. Significant improvement was reported on 7 of the 8 appearance scales (exception Skin) and 4 of the 5 HRQL scales (exception sexual), and significant worsening was reported for 2 (doctor, office staff) experience scales. Effect sizes were small to moderate in size.
The BODY-Q is a comprehensive PRO instrument designed for weight loss and/or body contouring patients. The BODY-Q scales showed evidence of reliability and validity in a large international sample of patients, as well as the ability to detect change after weight loss. BODY-Q scales worked the same (without bias) across patients who varied by age, sex, country, patient type, and use of paper versus electronic data collection. The BODY-Q revealed that satisfaction with appearance and HRQL are lower for patients who report more obesity-related symptoms, higher BMI, and more excess skin, and for patients who are pre- versus post-body contouring.
BODY-Q experience scales measure issues identified as important to weight loss and body contouring patients. These 4 scales provide specific versus generic29,30 indicators of quality. BODY-Q experience and outcome scales could be used to provide patient-centered information for quality improvement purposes, similar to the use of the BREAST-Q.12,13 For example, the BREAST-Q was used in a national audit of close to 8000 breast reconstruction and mastectomy patients treated in NHS and independent hospitals in England, Wales, and Scotland31 and was included in the American Society of Plastic Surgeons Tracking Operations and Outcomes for Plastic Surgeons (TOPS) program launched in 2002 as a Health Insurance Portability and Accountability Act compliant, secure and confidential national database of plastic surgery procedures and outcomes.32
The BODY-Q addresses the lack of rigorously designed PRO instruments for use in cosmetic body contouring. In a U.K. Department of Health literature review of PRO instruments for cosmetic surgery, only 3 met international recommendations for PRO instrument development and validation, ie, BREAST-Q,12,13 FACE-Q,33 and Skindex.34 Recently, the Body-QoL,35 a PRO instrument developed in Chile, was published. Compared with the BODY-Q, the Body-QoL is more limited in scope: its focus is on body contouring patients only, and it contains a limited number of scales that measure satisfaction with body (ie, the abdomen), sex life, self-esteem and social performance, and physical symptoms.
Our study has some limitations. First, 7% of online participants opted out of the survey before reaching the end, which could be due to the length of the survey and/or the fact that REDCap does not include a feature to let respondents know how far they have progressed. Second, although there are advantages to internet surveys, the response rate to e-mailed invitations was much lower than postal and face-to-face recruitment. Third, TRT reliability for the Scar scale is needed from patients who are post body contouring. Fourth, some participants (mainly obesity class III) scored at the floor on some appearance scales. Finally, our sample was primarily cross-sectional, which was suitable for PRO instrument development. Longitudinal studies of weight loss and cosmetic patients are now needed to measure change in satisfaction with appearance and HRQL with weight loss, the development of excess skin, and body contouring.
To conclude, the BODY-Q provides a means to collect evidence-based outcomes data from the patient perspective. As with the BREAST-Q12–14 and FACE-Q,36,37 the BODY-Q is available free-of-charge to nonprofit users. We encourage the plastic surgery community to use these PRO instruments. Such data are needed to inform patient selection and education, comparative effectiveness research, and healthcare policy decisions.
We are grateful for grant funding provided by the Plastic Surgery Foundation in the form of a National Endowment for Plastic Surgery grant and the operating grant we received from the Canadian Institutes for Health Research. In addition, Anne Klassen held a CIHR Mid-Career Award in Women’s Health.
1. American Society of Plastic Surgeons. 2014 Plastic Surgery Statistics. Available at: http://www.surgery.org/sites/default/files/2014-Stats.pdf. Accessed October 1, 2015.
2. de Zwaan M, Georgiadou E, Stroh CE, et al. Body image and quality of life in patients with and without body contouring surgery following bariatric surgery: a comparison of pre- and post-surgery groups. Front Psychol. 2014;5:1310.
3. Azin A, Zhou C, Jackson T, et al. Body contouring surgery after bariatric surgery: a study of cost as a barrier and impact on psychological well-being. Plast Reconstr Surg. 2014;133:776e–782e.
4. van der Beek ES, Geenen R, de Heer FA, et al. Quality of life long-term after body contouring surgery following bariatric surgery: sustained improvement after 7 years. Plast Reconstr Surg. 2012;130:1133–1139.
5. van der Beek ES, Te Riele W, Specken TF, et al. The impact of reconstructive procedures following bariatric surgery on patient well-being and quality of life. Obes Surg. 2010;20:36–41.
6. de Brito MJ, Nahas FX, Barbosa MV, et al. Abdominoplasty and its effect on body image, self-esteem, and mental health. Ann Plast Surg. 2010;65:5–10.
7. Song AY, Rubin JP, Thomas V, et al. Body image and quality of life in post massive weight loss body contouring patients. Obesity (Silver Spring). 2006;14:1626–1636.
8. Soldin M, Mughal M, Al-Hadithy N; Department of Health; British Association of Plastic, Reconstructive and Aesthetic Surgeons; Royal College of Surgeons England. National commissioning guidelines: body contouring surgery after massive weight loss. J Plast Reconstr Aesthet Surg. 2014;67:1076–1081.
9. University of Oxford Department of Public Health. Patient-reported outcome measurement: a structured review of patient-reported outcome measures for patients undergoing cosmetic surgical procedures. Available at: http://phi.uhce.ox.ac.uk/pdf/Cosmetic%20Surgery%20PROMs%20Review2013.pdf. Accessed October 1, 2015.
10. Available at: http://www.qualitymetric.com. Accessed October 1, 2015.
11. Moorehead MK, Ardelt-Gattinger E, Lechner H, et al. The validation of the Moorehead-Ardelt Quality of Life Questionnaire II. Obes Surg. 2003;13:684–692.
12. Pusic AL, Klassen AF, Scott AM, et al. Development of a new patient-reported outcome measure for breast surgery: the BREAST-Q. Plast Reconstr Surg. 2009;124:345–353.
13. Cano SJ, Klassen AF, Scott AM, et al. The BREAST-Q: further validation in independent clinical samples. Plast Reconstr Surg. 2012;129:293–302.
14. Cano SJ, Klassen AF, Scott AM, et al. A closer look at the BREAST-Q(©). Clin Plast Surg. 2013;40:287–296.
15. Aaronson N, Alonso J, Burnam A, et al. Assessing health status and quality-of-life instruments: attributes and review criteria. Qual Life Res. 2002;11:193–205.
16. US Food and Drug Administration. Guidance for industry patient-reported outcome measures: Use in medical product development to support labeling claims. Available at: http://www.fda.gov/downloads/Drugs/.../Guidances/UCM193282.pdf. Accessed October 1, 2015.
17. Patrick DL, Burke LB, Gwaltney CJ, et al. Content validity–establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: part 1–eliciting concepts for a new PRO instrument. Value Health. 2011;14:967–977.
18. Patrick DL, Burke LB, Gwaltney CJ, et al. Content validity–establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO Good Research Practices Task Force report: part 2–assessing respondent understanding. Value Health. 2011;14:978–988.
19. US Food and Drug Administration. Clinical outcome assessment qualification program. Available at: http://www.fda.gov/Drugs/DevelopmentApprovalProcess/DrugDevelopmentToolsQualificationProgram/ucm284077.htm. Accessed October 1, 2015.
20. Reavey PL, Klassen AF, Cano SJ, et al. Measuring quality of life and patient satisfaction after body contouring: a systematic review of patient-reported outcome measures. Aesthet Surg J. 2011;31:807–813.
21. Klassen AF, Cano SJ, Scott A, et al. Satisfaction and quality-of-life issues in body contouring surgery patients: a qualitative study. Obes Surg. 2012;22:1527–1534.
22. Klassen AF, Cano SJ, Scott A, et al. Assessing outcomes in body contouring. Clin Plast Surg. 2014;41:645–654.
23. Harris PA, Taylor R, Thielke R, et al. Research electronic data capture (REDCap)–a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009;42:377–381.
24. Rasch G. Probabilistic Models for Some Intelligence and Attainment Tests. Chicago, Ill: MESA Press; 1993.
25. Hobart JC, Cano SJ. Improving the evaluation of therapeutic intervention in MS: The role of new psychometric methods. Health Technol Assess. 2009;13:1–200.
26. Wright BD, Masters GN. Rating Scale Analysis: Rasch Measurement. Chicago, Ill: MESA Press; 1982.
27. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16:297–334.
28. Streiner D, Norman G. Health Measurement Scales: A Practical Guide to Their Development and Use. 4th ed. Oxford: Oxford University Press; 2008.
29. Available at: www.pressganey.com. Accessed August 31, 2015.
30. Jenkinson C, Coulter A, Bruster S. The Picker Patient Experience Questionnaire: development and validation using data from in-patient surveys in five countries. Int J Qual Health Care. 2002;14:353–358.
32. Pusic AL, Klassen AF, Scott AM, et al. Development and psychometric evaluation of the FACE-Q satisfaction with appearance scale: a new patient-reported outcome instrument for facial aesthetics patients. Clin Plast Surg. 2013;40:249–260.
34. Chren MM, Lasek RJ, Quinn LM, et al. Skindex, a quality-of-life measure for patients with skin disease: reliability, validity, and responsiveness. J Invest Dermatol. 1996;107:707–713.
35. Danilla S, Dominguez C, Cuevas P, et al. The Body-QoL(®): patient reported outcomes in body contouring surgery patients [corrected]. Aesthetic Plast Surg. 2014;38:575–583.
36. Klassen AF, Cano SJ, Scott A, et al. Measuring patient-reported outcomes in facial aesthetic patients: development of the FACE-Q. Facial Plast Surg. 2010;26:303–309.
37. Pusic AL, Klassen AF, Scott AM, et al. Development and psychometric evaluation of the FACE-Q satisfaction with appearance scale: a new patient-reported outcome instrument for facial aesthetics patients. Clin Plast Surg. 2013;40:249–260.