The objective of breast reconstructive surgery is to restore the shape of the breast and thereby its aesthetic appearance, to improve the patient's self-image. Aesthetic outcome is therefore a primary outcome measure of breast reconstructive surgery. Valid and reliable tools to assess aesthetic outcomes after plastic reconstructive surgery are scarce. Several studies report aesthetic outcomes, in the form of an assessment by the patient, by the surgeon, or by an independent professional.1–12 Some authors use questionnaires, whereas others use photographs to assess aesthetic outcomes. However, measures for the assessment of aesthetic outcomes of breast reconstruction vary widely between studies and are often ill defined. To enable comparison between outcomes, for instance between individual surgeons or between different surgical techniques, it is important to have a valid and reliable scoring system.
Although satisfaction with aesthetic outcome in itself is a subjective parameter, a standardized scoring tool can be useful to objectify the rating of aesthetic outcomes. To our knowledge, no formally validated and reliable scoring method to assess aesthetic outcome of breast reconstruction surgery exists today. Visser et al.13 introduced a method for scoring aesthetic outcome after breast reconstruction with the use of 5 standardized photographs, which are then rated using a 5-point Likert scale with respect to volume, shape, symmetry, scars, and nipple areola complex.
The aim of this study is to determine if this method, which we have named the Aesthetic Items Scale (AIS), is a valid and reliable tool to assess aesthetic outcome after breast reconstructive surgery.
Women with a proven high risk of breast cancer who underwent either a bilateral or a contralateral prophylactic mastectomy with subsequent implant-based breast reconstruction in the VU University Medical Center in Amsterdam, between January 1999 and February 2012, were eligible for the study. The breast reconstruction surgery was either a direct-to-implant procedure or a 2-stage tissue expander/implant procedure. Fifty women gave informed consent to have photographs taken for aesthetic evaluation. This population was used to assess the AIS. The Medical Ethical Committee at the VU University Medical Center approved the study.
AIS for Evaluating Aesthetic Outcome after Breast Reconstruction
The methodology of the AIS was introduced and described previously by Visser et al.13 For this measurement, 5 standardized photographs of the breast area are taken using a wide-angle digital camera. The photographs cover the breast region between shoulder level and the level of the umbilicus. Patients are placed in front of a uniform background and instructed to place their hands on their buttocks. The photographs are taken from 5 angles: a frontal view, from each lateral side, and at an angle of 45 degrees between frontal and lateral view at each side. The 5 photographs are combined in an overview sheet and presented on a screen for assessment. The breasts are evaluated with respect to volume, shape, symmetry, scars, and nipple areola complex. For each of these items a 5-point Likert scale is used for scoring. The scale points are “very dissatisfied,” “dissatisfied,” “neutral,” “satisfied,” and “very satisfied.”
We labeled the summed score of the 5 items as the Total Aesthetic Score (TAS).
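As an illustration of how the TAS follows from the 5 item scores, a minimal sketch is given here. The numeric coding of the Likert labels (1 to 5) and the names `LIKERT`, `AIS_ITEMS`, and `total_aesthetic_score` are assumptions made for this example; the labels and items themselves are those described above.

```python
# Hypothetical numeric coding of the five Likert labels (1-5); the labels are
# taken from the text, but the numbers assigned to them are an assumption.
LIKERT = {
    "very dissatisfied": 1,
    "dissatisfied": 2,
    "neutral": 3,
    "satisfied": 4,
    "very satisfied": 5,
}

# The five items of the AIS as listed in the text.
AIS_ITEMS = ("volume", "shape", "symmetry", "scars", "nipple areola complex")


def total_aesthetic_score(ratings):
    """Sum the five AIS item scores given by one rater for one patient."""
    missing = [item for item in AIS_ITEMS if item not in ratings]
    if missing:
        raise ValueError(f"missing AIS items: {missing}")
    return sum(LIKERT[ratings[item]] for item in AIS_ITEMS)


# Illustrative ratings, not study data.
example = {
    "volume": "satisfied",
    "shape": "neutral",
    "symmetry": "satisfied",
    "scars": "very satisfied",
    "nipple areola complex": "dissatisfied",
}
print(total_aesthetic_score(example))  # 4 + 3 + 4 + 5 + 2 = 18
```

Under this assumed coding, the TAS ranges from 5 (all items “very dissatisfied”) to 25 (all items “very satisfied”).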
Reliability and Agreement
Patients assessed the images of their own breasts directly after the photographs were taken. To examine reliability of the AIS, a panel of 5 plastic surgeons specialized in breast reconstruction and a panel of 3 mammography nurses (1 referent radiographer of the Dutch mammography screening program and 2 radiology assistants) independently assessed the aesthetic outcomes. These evaluators were blinded to patient information, and the 5 photographs were presented on 1 overview sheet per patient on a normal computer screen. The evaluators needed between 1 and 2 hours to evaluate the total series of 50 patients.
In addition to the measurement by the AIS, an overall rating for the aesthetic appearance was given on a scale of 1–10 for both reconstructed breasts, named the Overall Aesthetic Rating (OAR). Patients were asked, “considering aesthetic outcome, how would you rate your breasts on this scale?” By comparing the TAS with the OAR, we can appraise whether overall satisfaction with the aesthetic outcome is based on the 5 items that form the AIS or whether other factors play a role. Furthermore, we asked patients to state the most important reason why they did not give the highest rating to their breast(s).
To determine interobserver reliability, we calculated intraclass correlation coefficients (ICCs) using a two-way random model. The ICC is used to assess the conformity of measurements made by multiple observers measuring the same quantity. We consider ICC values < 0.5 to indicate low agreement, values from 0.5 to 0.65 moderate agreement, values from 0.66 to 0.80 substantial agreement, and values > 0.8 high agreement. ICCs were calculated for each of the items of the AIS, the TAS, and the OAR. The ICCs were determined separately for the panel of plastic surgeons and for the panel of mammography nurses.
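For readers who want to see how such a coefficient is obtained, the following is a minimal sketch of a two-way random-effects ICC computed from the ANOVA mean squares. It assumes the absolute-agreement form and returns both the single-rater and the average-rater coefficient, because the text does not specify which variant was reported. The function names and the small score matrix are illustrative only and are not study data. The category boundaries follow the thresholds stated above; how values between 0.65 and 0.66 are assigned is an assumption.

```python
def icc_two_way_random(scores):
    """Two-way random-effects ICC, absolute agreement (assumed variant).

    scores: list of rows, one row per patient, one column per rater.
    Returns (single-rater ICC, average-of-raters ICC).
    """
    n = len(scores)        # number of patients (rows)
    k = len(scores[0])     # number of raters (columns)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for patients, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-patient mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average


def classify_icc(icc):
    """Map an ICC to the agreement categories used in the text."""
    if icc < 0.5:
        return "low"
    if icc < 0.66:   # assumption: values between 0.65 and 0.66 count as moderate
        return "moderate"
    if icc <= 0.80:
        return "substantial"
    return "high"


# Illustrative matrix: 6 patients rated by 4 raters (not study data).
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
single, average = icc_two_way_random(demo)
print(round(single, 2), round(average, 2))  # 0.29 0.62
print(classify_icc(single), classify_icc(average))  # low moderate
```

The demonstration matrix shows why the choice of variant matters: the same ratings give a low single-rater ICC but a moderate ICC for the averaged ratings.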
In addition, we assessed specific interobserver agreement for each item. Using the R statistical programming language (version 3.2.1, R Foundation, Boston, Mass.), we compared the scores between each individual observer within a panel. We then calculated the percentage of scores that were identical between observers [full agreement (FA)] and the percentage of scores that differed by at most 1 category between observers (FA ± 1 category).
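The two agreement percentages can be sketched as follows. This example assumes that agreement is pooled over all pairs of observers within a panel and over all patients; the text does not spell out the exact pooling, and the original analysis was done in R, so this Python function (`pairwise_agreement`, a hypothetical name) is only an illustration of the idea.

```python
from itertools import combinations


def pairwise_agreement(scores, tolerance=0):
    """Percentage of rater-pair comparisons that differ by at most `tolerance`.

    scores: list of rows, one row per patient, one item score per rater.
    tolerance=0 gives full agreement (FA); tolerance=1 gives FA +/- 1 category.
    """
    agree = 0
    total = 0
    for row in scores:                       # one patient at a time
        for a, b in combinations(row, 2):    # every pair of raters
            total += 1
            if abs(a - b) <= tolerance:
                agree += 1
    return 100.0 * agree / total


# Illustrative item scores for 4 patients and 3 raters (not study data).
item_scores = [
    [4, 4, 5],
    [3, 4, 4],
    [2, 4, 3],
    [5, 5, 5],
]
fa = pairwise_agreement(item_scores, tolerance=0)
fa_plus_minus_1 = pairwise_agreement(item_scores, tolerance=1)
print(round(fa, 1), round(fa_plus_minus_1, 1))  # 41.7 91.7
```

In this toy example, 5 of the 12 rater pairs agree exactly and 11 of 12 agree within 1 category.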
To assess whether plastic surgeons, mammography nurses, and patients evaluate aesthetic outcome similarly, we looked at the correlation between the average scores of plastic surgeons, the average scores of mammography nurses, and those given by the patient.
The Pearson’s correlation coefficient (ρP) was used to determine correlations between the TAS and the OAR. Correlations between the scores of the separate items of the tool and the overall rating were determined using Spearman’s correlation (ρS). We consider a value > 0.80 as an indication of a high correlation. ICCs and correlation coefficients were calculated using SPSS (version 22.0, SPSS Inc., Chicago, Ill.).
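To make the difference between the two coefficients concrete, the sketch below computes Pearson's correlation on raw scores and Spearman's correlation as Pearson's correlation on average ranks, which handles the ties that Likert scores produce. The study used SPSS for these calculations; the function names and the example values here are assumptions for illustration and are not study data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # sorted positions i..j share this rank
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative TAS and OAR values for 5 patients (not study data).
tas = [18, 22, 15, 20, 12]
oar = [7, 9, 6, 8, 4]
print(round(pearson(tas, oar), 2))   # 0.99
print(round(spearman(tas, oar), 2))  # 1.0
```

Both example values exceed 0.80 and would therefore count as a high correlation under the criterion stated above.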
Intraclass Correlation Coefficient
Interobserver reliability represented by the ICC was higher among plastic surgeons than that among mammography nurses. The ICCs for plastic surgeons on separate items ranged between 0.56 for “shape” and 0.82 for “nipple areola complex.” The ICCs for mammography nurses ranged between 0.37 for “volume” and 0.62 on the item “scars.” In the plastic surgery panel, the ICCs for the TAS and the OAR were 0.79 and 0.74, respectively. In the mammography nurses panel, these values were 0.55 and 0.53, respectively (Table 1).
Plastic surgeons, mammography nurses, and patients score rather similarly on the average score per item of the AIS and on the Likert scale of the AIS. We do see that the plastic surgeons consistently score slightly higher than the mammography nurses (with the exception of the item “nipple”; Fig. 1).
Specific Rater Agreement
Specific agreement between raters was better among plastic surgeons than among mammography nurses on the items of the AIS. FA ranged from 44% to 56% in plastic surgeons and from 30% to 39% in nurses. Agreement allowing a difference of at most 1 category (FA ± 1 category) ranged from 88% to 93% in surgeons and from 74% to 92% in nurses (Table 2).
Correlation between TAS and OAR
The correlation between TAS and OAR was high for plastic surgeons and mammography nurses, both individually and for the average of the panels. Pearson’s correlations (ρP) ranged from 0.81 to 0.95. In patients, the TAS and the OAR were correlated less strongly (ρP = 0.69; Table 3).
Agreement between Rating by Professionals and Patients
Agreement between the rating of the patient and the rating by professionals was assessed using the ICC. The agreement between plastic surgeons and patient was generally higher than that between mammography nurses and patient. Intraclass correlation between the average rating of plastic surgeons and the patient was 0.6 for both the TAS [95% confidence interval (CI), 0.35–0.76] and the OAR (95% CI, 0.38–0.76). A total overview is given in Table 4.
Correlation between AIS (per Item) and OAR
In plastic surgeons, volume, shape, and symmetry correlated strongly with the overall rating (ρS = 0.73, ρS = 0.86, and ρS = 0.80, respectively). The association between the items scars and nipple areola complex and the overall rating given by the plastic surgeons was less strong (ρS = 0.61 and ρS = 0.58, respectively). For the mammography nurses, we found strong associations between volume and shape and the overall rating (ρS = 0.73, ρS = 0.87), whereas symmetry, scars, and nipple areola complex showed a moderate or low correlation with the overall rating (ρS = 0.68, ρS = 0.58, and ρS = 0.53, respectively). In patients, shape and symmetry correlated moderately with the overall rating (ρS = 0.63 and ρS = 0.66), whereas the correlation between volume, scars, and nipple areola and the overall rating was low (ρS = 0.35, ρS = 0.47, and ρS = 0.46, respectively; Table 5). In all individual plastic surgeons and mammography nurses, shape was associated most strongly with the overall rating, with ρS ranging from 0.71 to 0.89. Symmetry was strongly associated with the overall rating in 2 plastic surgeons and 1 mammography nurse (ρS = 0.77, ρS = 0.70, ρS = 0.74), and in 2 mammography nurses, volume and overall rating were strongly related (ρS = 0.79, ρS = 0.77; Table 6).
Why Patients Did Not Rate Their Breast(s) with a 10 on the OAR
Twenty-nine patients expressed the main reason why they did not rate their breast with a maximum score of 10. Fifteen patients stated that their opinion was influenced by dissatisfaction with the sensation of the nipple. Of these women, 9 patients were awaiting further modifications on their nipples (eg, tattooing), 4 patients were dissatisfied with the location of the nipples, and 2 patients were dissatisfied with the size of the nipples. Five patients mentioned that irregularities of the reconstructed breast affected their opinion. Six patients stated that their rating was influenced by the differences in shape and size between the left and right breast. Although we specifically asked patients to rate their breasts based on the photographs, and to address additional items that may be evaluated from photographs, 6 patients stated that rigidity affected their rating of the aesthetic outcome, 5 patients said that the shape of the breasts was affected by movement, and several patients took pain, hypersensitivity, and numbness into consideration when evaluating aesthetic outcome. Nipple location and the need for corrections to the nipple were each mentioned by 2 surgeons as factors that should be taken into consideration when evaluating aesthetic outcome. Three plastic surgeons suggested that the satisfaction with the inframammary fold should be noted. Two plastic surgeons felt that some photographs could not be evaluated properly due to poor quality. One plastic surgeon and 1 mammography nurse commented on the décolleté, the position of the implant, and difference between the left and right breast, due to use of different operation techniques (Table 7; see figure, Supplemental Digital Content 1, http://links.lww.com/PRSGO/A392).
Because aesthetic outcome is a subjective measure, it has proven difficult to objectify the outcome of breast reconstructive procedures. In this study, we aimed to determine the reliability and validity of the AIS to assess the aesthetic outcome of breast reconstruction. Our results indicate that the AIS is a reliable tool to assess aesthetic outcome by experienced professionals, but it also shows that the concept “aesthetic outcome” differs between professionals (plastic surgeons and mammography nurses) and patients.
Agreement between experienced plastic surgeons is higher than that between mammography nurses. Hence, for professional rating, it is better to use experienced plastic surgeons. We compared the summed score of the AIS with an overall grading of aesthetic outcome (OAR). The reliability of both measures is similar in plastic surgeons and both scores are highly correlated. Hence, it seems that the AIS represents the same concept as “overall aesthetic outcome” in plastic surgeons. Shape, volume, and symmetry seem to be the most important items determining aesthetic outcome. However, agreement on these individual items is only moderate (ICC, 0.56–0.64). Agreement on “nipple” is high (ICC, 0.81) but is not strongly related to overall aesthetic outcome. A total score is often more reliable than the scores of separate items. Because reliability of individual items (except nipple) is only moderate, one could argue that giving one overall score is preferable. Subcategorizing into subitems could give a false sense of “detailed accuracy” of the measurement. Nipple areola complex could be assessed separately, as this item is not strongly related to the OAR.
Patients assess aesthetic outcome differently than professionals. This difference is apparent from the low to moderate ICC scores between the ratings of patients and professionals.
In patients, the TAS does not correlate highly with the OAR. Only shape and symmetry seem to be moderately related to the overall aesthetic outcome. Hence, in patients, the AIS does not represent the concept of “overall aesthetic outcome” very well. The additional remarks suggest that other factors play a role in the patients’ satisfaction with the aesthetic outcome.
We chose to look at interobserver reliability only, because we think that an aesthetic outcome measure is especially relevant for comparing outcomes between different cohorts of patients and between different studies. Intraobserver reliability will become relevant if the AIS is also used as a follow-up tool over time.
When evaluating aesthetic outcome of breast reconstructions, the use of static photographs is still a preferred method because of its simplicity and low cost. Use of real-time digital video has been proposed,2 but the interobserver reliability found in that study was lower than in this study. However, the authors reported a better rater agreement when using video footage compared with photographs to assess overall aesthetics (0.55 versus 0.51) and shape (0.51 versus 0.49). Video also showed a greater degree of correlation with patient self-assessment scores in comparison to photography (0.31 versus 0.28), but agreement is low in both cases and the difference is marginal.2 The use of 3-dimensional imaging and even 4-dimensional breast scanning, including dynamic assessment of the breast reconstruction, is emerging14; however, its applicability still has to be proven.
Due to the subjectivity of the AIS, it is impossible to find reliability outcomes comparable with results of objective parameters, but with an ICC of 0.8 for the TAS, agreement of the total score of the AIS can be considered as good.15 The surgeons judge the photos from a technical and detailed perspective and judge the items very precisely. They assess the surgical performance and surgical skills. For this reason, this tool seems to be good for professionals with the purpose of judging surgical outcomes. The authors would like to stress that it is not a tool to evaluate patient satisfaction, though patients seem to interpret the AIS in this way and are unable to distinguish the difference. Hence, there is also a demand for a simple tool to evaluate aesthetic outcome in daily practice. Such a tool should serve to improve communication and understanding between patient and professional, when they speak about the aesthetic outcome.
We conclude that the AIS is a valid and reliable method for evaluating aesthetic outcome of implant-based breast reconstruction, when used by experienced plastic reconstructive surgeons. Hence, this tool is suitable to compare aesthetic outcomes between groups of patients. However, the ratings of professionals agreed only moderately with those of the patients. In patients, the AIS did not represent their overall rating of aesthetic outcome, suggesting that patients take different factors into account when evaluating aesthetic outcome. Professionals can use this method to evaluate their surgical results, but more questions should be asked to map the patient's satisfaction with her breasts.
1. Brandberg Y, Arver B, Johansson H, et al. Less correspondence between expectations before and cosmetic results after risk-reducing mastectomy in women who are mutation carriers: a prospective study. Eur J Surg Oncol. 2012;38:38–43.
2. Gilmour A, Mackay IR, Young D, et al. The use of real-time digital video in the assessment of post-operative outcomes of breast reconstruction. J Plast Reconstr Aesthet Surg. 2014;67:1357–1363.
3. Asplund O, Nilsson B. Interobserver variation and cosmetic result of submuscular breast reconstruction. Scand J Plast Reconstr Surg. 1984;18:215–220.
4. Brandberg Y, Malm M, Blomqvist L. A prospective and randomized study, “SVEA,” comparing effects of three methods for delayed breast reconstruction on quality of life, patient-defined problem areas of life, and cosmetic result. Plast Reconstr Surg. 2000;105:66–74; discussion 75.
5. Cardoso MJ, Cardoso J, Santos AC, et al. Interobserver agreement and consensus over the esthetic evaluation of conservative treatment for breast cancer. Breast. 2006;15:52–57.
6. Charfare H, MacLatchie E, Cordier C. A comparison of different methods of assessing aesthetic outcome following breast-conserving surgery and factors influencing aesthetic outcome. Br J Med Pract. 2010;3(1):310–5.
7. Edsander-Nord A, Brandberg Y, Wickman M. Quality of life, patients’ satisfaction, and aesthetic outcome after pedicled or free TRAM flap breast surgery. Plast Reconstr Surg. 2001;107:1142–1153; discussion 1154.
8. Gui GP, Tan SM, Faliakou EC, et al. Immediate breast reconstruction using biodimensional anatomical permanent expander implants: a prospective analysis of outcome and patient satisfaction. Plast Reconstr Surg. 2003;111:125–138; discussion 139.
9. Harris JR, Levene MB, Svensson G, et al. Analysis of cosmetic results following primary radiation therapy for stages I and II carcinoma of the breast. Int J Radiat Oncol Biol Phys. 1979;5:257–261.
10. Mosahebi A, Ramakrishnan V, Gittos M, et al. Aesthetic outcome of different techniques of reconstruction following nipple-areola-preserving envelope mastectomy with immediate reconstruction. Plast Reconstr Surg. 2007;119:796–803.
11. Sacchini V, Luini A, Tana S, et al. Quantitative and qualitative cosmetic evaluation after conservative treatment for breast cancer. Eur J Cancer. 1991;27:1395–1400.
12. Gahm J, Jurell G, Edsander-Nord A, et al. Patient satisfaction with aesthetic outcome after bilateral prophylactic mastectomy and immediate reconstruction with implants. J Plast Reconstr Aesthet Surg. 2010;63:332–338.
13. Visser NJ, Damen TH, Timman R, et al. Surgical results, aesthetic outcome, and patient satisfaction after microsurgical autologous breast reconstruction following failed implant reconstruction. Plast Reconstr Surg. 2010;126:26–36.
14. Potter S, Harcourt D, Cawthorn S, et al. Assessment of cosmesis after breast reconstruction surgery: a systematic review. Ann Surg Oncol. 2011;18:813–823.
15. Cardoso MJ, Cardoso J, Amaral N, et al. Turning subjective into objective: the BCCT.core software for evaluation of cosmetic results in breast cancer conservative treatment. Breast. 2007;16:456–461.
Copyright © 2017 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of the American Society of Plastic Surgeons. All rights reserved.