Hysterectomy is second only to cesarean delivery as the most common major surgery performed on women of reproductive age in the United States.1 There is longstanding controversy as to whether hysterectomy is overused and whether, for certain conditions, medical treatment or more conservative surgery may be just as effective and associated with fewer adverse outcomes.2
Dysfunctional uterine bleeding is diagnosed when abnormal bleeding occurs unrelated to demonstrable local pathology, pharmacological agents, intrauterine contraception, or systemic disorders of hemostasis; it manifests with abnormal timing, volume, and/or duration of flow.3 It is frequently associated with symptoms such as fatigue, pelvic pain, and decreased quality of life. Currently in the United States, most affected women are initially offered medical therapy, but when this approach is not effective or not tolerated, surgery may be selected. Surgical options include hysterectomy and endometrial ablation, a procedure which is designed to destroy the endometrium while leaving the uterus otherwise intact.
We initiated the Surgical Treatments Outcomes Project for Dysfunctional Uterine Bleeding (STOP-DUB) in 1996. Its primary objective was to compare effectiveness of hysterectomy and endometrial ablation in women with dysfunctional uterine bleeding for whom medical therapy had not been effective and who were candidates for surgery that would end fertility.
MATERIALS AND METHODS
Further detail on study methods is provided in a separate publication.4 Before applying for funding and finalizing the study design, we identified the available evidence addressing our study question5–7 and performed an informal systematic review.8
Institutional review boards associated with the Coordinating Center, the Chair's Office, the American College of Obstetricians and Gynecologists, and 33 Clinical Centers in the United States and Canada approved the STOP-DUB protocol and subsequent modifications to the protocol. A Data and Safety Monitoring Committee reviewed study data, unmasked to treatment group, on an interim basis for adverse or beneficial treatment effects. The Surgical Treatments Outcomes Project for Dysfunctional Uterine Bleeding comprised two groups of patients: 1) patients who were randomized either to hysterectomy or endometrial ablation and 2) an observational cohort who at the time of enrollment were either “provisionally ineligible” for the randomized controlled trial (RCT), or who did not wish to be randomly assigned. Provisionally ineligible patients who became eligible for the RCT were assigned to one of the surgical treatment groups.
To be eligible, patients were required to be at least 18 years of age, premenopausal, with dysfunctional uterine bleeding for at least 6 months (characterized by one or a combination of excess duration, amount, or unpredictability of flow), and refractory to medical therapy for at least 3 months. A woman was excluded if she was postmenopausal or had bilateral oophorectomy, was pregnant, wished to retain her fertility, or refused to consider surgery (other exclusion criteria are covered more completely in a separate article).4 Women were considered “provisionally ineligible” and enrolled in the observational study if they could possibly become eligible over the next few months.4
Thirty-one of the 33 certified Clinical Centers screened at least one woman for the RCT, and 25 enrolled at least one. After undergoing a preeligibility screen by Clinical Center staff and providing initial informed consent, patients underwent a series of baseline clinical examinations and tests.4 The Clinical Center subsequently collected data from randomly assigned patients at a preoperative visit, during study surgery and the preprocedural and postprocedural institutional stays, and at a 4-week (±2 weeks) follow-up visit after study surgery. Clinical Centers also collected hospital or surgicenter operative notes and pathology reports. Clinical Centers recorded ad hoc “intercurrent” follow-up visits and patient contacts, along with the purpose of each visit (eg, for adverse events, reoperation, dysfunctional uterine bleeding symptoms), for a minimum of 24 months after surgery.
All enrolled women provided informed consent to enrollment and follow-up in the observational study or randomized trial. The Coordinating Center conducted a baseline telephone interview before a woman's official enrollment and follow-up interviews at 3, 6, and 12 months after surgery and at 6-month intervals thereafter, for at least 24 months and up to 5 years. Observational patients were followed for at least 6 months after baseline.
Telephone interviewers asked RCT patients at the start of the interview not to reveal the type of surgery to which they were assigned. Interviewers recorded demographic information, bleeding history and experience, other health experiences (eg, bodily and other pain, sleep, fatigue, urinary function), sexual function, health-related quality of life,9,10 EuroQOL (version EQ-5D),11,12 employment, housework, and leisure activities, out-of-pocket costs for bleeding and symptoms, treatments received, and health provider visits. We used several sources of data to identify perioperative events, defined as events occurring during the intraoperative period or the following 42 days.
Randomly assigned patients had an equal probability of being assigned to hysterectomy and endometrial ablation. The randomization schedule, developed by the Coordinating Center, was stratified by Clinical Center and within each Center on patient age (younger than 45 years compared with 45 years or older). Randomization used permuted blocks of size two, four, or eight, always starting with a block size of two, with the size randomly selected thereafter. The Coordinating Center reviewed data faxed by the Clinical Center to confirm patient eligibility before randomization and subsequently telephoned the Clinical Center with the computer-generated randomized assignment.
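The allocation scheme described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Coordinating Center's actual program: the function names, the center labels, the seed, and the choice to restart with a block of two in every stratum are all hypothetical.

```python
import random

BLOCK_SIZES = (2, 4, 8)
ARMS = ("hysterectomy", "endometrial ablation")


def permuted_block_schedule(n_assignments, rng):
    """Build the assignment list for one stratum (one Center x age group)."""
    schedule = []
    first_block = True
    while len(schedule) < n_assignments:
        # The first block always has size two; later sizes are drawn at random.
        size = 2 if first_block else rng.choice(BLOCK_SIZES)
        first_block = False
        # Each block holds equal numbers of both arms, in random order.
        block = list(ARMS) * (size // 2)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_assignments]


def build_schedules(centers, n_per_stratum, seed):
    """One independent schedule per Clinical Center and age stratum."""
    rng = random.Random(seed)
    return {
        (center, age_group): permuted_block_schedule(n_per_stratum, rng)
        for center in centers
        for age_group in ("<45", ">=45")
    }


# Hypothetical center labels and stratum size, for illustration only.
schedules = build_schedules(["center_01", "center_02"], n_per_stratum=12, seed=2024)
```

Because every completed block is balanced, the two arms can differ within a stratum by at most half of the largest block size. Small strata can still add up to unequal group totals across the trial.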
Total hysterectomy was performed using a vaginal, laparoscopic, or abdominal approach, under general or regional anesthesia. Only women aged 45 years or older and assigned to hysterectomy were allowed oophorectomy. Endometrial ablation techniques allowed in STOP-DUB were restricted to resectoscopic endometrial ablation using electrodesiccation/coagulation or vaporization and nonresectoscopic endometrial ablation with a thermal balloon (ThermaChoice, Ethicon Women's Health and Urology, Somerville, NJ). Patients assigned to endometrial ablation were allowed to have tubal occlusion on request. Each surgeon designated the type of endometrial ablation and the hysterectomy approach before randomization.
The primary outcomes of the randomized trial were the effect of surgery on the major problem (symptom) the woman identified as her reason for seeking treatment (“problem solved”), bleeding, pain, and fatigue, measured 1 year after surgery. Additional planned outcomes included the effect of surgery at time points after 1 year, changes in quality of life outcomes, surgical complications, additional surgery (reoperation), and resource utilization.
We originally conceived of STOP-DUB as an equivalence trial and based our sample size estimate (400 patients in each group) on the primary outcome of “major problem solved.” While recruitment was ongoing, we revised our original sample size estimate for several reasons. First, 1 year after the study start, the Data and Safety Monitoring Committee recommended that we add bleeding, pain, and fatigue as primary outcomes. Second, new information about endometrial ablation suggested that it was not likely to eliminate bleeding entirely, indicating that we could not expect equivalence of the two surgeries for the bleeding outcome. Finally, slower-than-expected recruitment limited the number of women we could follow for a minimum of 24 months, given 5-year funding.
Accordingly, in 1999, assuming a Type I error of 0.05 and 90% power for each index, we revised our planned sample size estimate to 242 randomly assigned patients, based on recruitment of 135 per group and attrition of 10% at 1 year.4 Using the MOS 36-Item Short-Form Health Survey (SF-36)–derived measures of pain, fatigue (Energy and Vitality), and problem solved (General Health Perception), a sample size of 121 in each group would be adequate to detect a difference of 10 points on Pain (assuming a mean of 74.8 and standard deviation [SD] of 22.7), at least 9 points on Energy and Vitality (mean 59.4, SD 19.7) or General Health Perception (mean 74.3, SD 19.4), and at least 7.1 points on the SF-36 Mental Health Index (all scaled to a 0–100 range; SD estimates from U.S. norms for females aged 35 to 44 years).13
Data were double entered into Oracle 8.0 (Redwood Shores, CA) at the Coordinating Center. For the primary analyses, which used SAS 9.1 (SAS Institute Inc, Cary, NC), patients were analyzed in the treatment group to which they were assigned, regardless of treatment actually received. Subsequent analyses, including analyses of adverse events, took actual surgery type and reoperation into account.
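The adequacy of 121 patients per group can be checked with the usual normal-approximation formula for comparing two means with a two-sided test. The sketch below assumes equal group sizes and a common SD. It is not necessarily the method used for the original estimate, and the function and variable names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.90):
    """Patients per group needed to detect a mean difference `delta`.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# (detectable difference, SD) pairs quoted in the text for each SF-36 index.
targets = {
    "Pain": (10, 22.7),
    "Energy and Vitality": (9, 19.7),
    "General Health Perception": (9, 19.4),
}
required = {name: n_per_group(d, sd) for name, (d, sd) in targets.items()}

# 135 recruited per group with 10% attrition, in whole patients.
analyzable_per_group = 135 * 90 // 100

for name, n in required.items():
    print(f"{name}: {n} per group needed, {analyzable_per_group} expected")
```

Under these assumptions, each of the three indices needs fewer than 121 analyzable patients per group. Pain is the most demanding because it has the largest SD relative to the difference sought.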
Primary outcomes examined include derived measures on which our sample size estimate was based (SF-36 Energy and Vitality, General Health Perception, and Mental Health Index), as well as responses to direct questions related to the primary outcomes (bleeding, pain, fatigue, problem solved). Bleeding questions asked about experience “over the past 3 months.” Questions relating to the SF-36, as well as pain and fatigue, asked about “the past 4 weeks.” For adverse events, we compiled all events identified and associated with the index surgery (study surgery actually received), but not including reoperation. The three gynecologists on the Steering Group (M.G.M., J.N., T.Y.) grouped the events into broad categories. To check the validity of this approach, an independent gynecologist with no study affiliation compared the individual listing of events with the grouped data.
We compared treatment groups on outcomes with Pearson χ2 tests and Fisher exact tests for categorical outcomes, and t tests for continuous outcomes. We investigated treatment effect at all interview points using longitudinal data analysis methodology; specifically, we used mixed-model linear regression analysis, with clinical site as a random effect and repeated measures on patients modeled with exchangeable covariance matrices. These methods take into account correlations among a given participant's responses over time and potential correlation within clinical site. These models, which include time by treatment group interactions, provide estimates of time trends in treatment effects and group differences in these trends. No adjustments were made for multiple outcomes or comparisons in our analyses.
RESULTS
Between November 25, 1997, and June 7, 2001, at 31 STOP-DUB Clinical Centers, we screened 1,721 women, of whom 1,484 were classified as ineligible at the end of study recruitment. There were 1,969 reasons for ineligibility (100%), the main ones relating to a lack of willingness to comply with the study requirements (1,053 reasons; 53.5% of the total). The remaining reasons were related to the patient's clinical evaluation (368; 18.7%), provisional eligibility criteria (152; 7.7%), personal or health-related history (94; 4.8%), and other factors (302; 15.3%) (Munro MG, Clark MA, Langenberg P, for the STOP-DUB [Surgical Treatments Outcomes Project for Dysfunctional Uterine Bleeding] Research Group. Ovulatory status and other characteristics of patients participating in the surgical treatments outcomes project for dysfunctional uterine bleeding. Unpublished manuscript).
Beginning in January 1998, we randomly assigned a total of 237 women from 25 Clinical Centers, 123 to endometrial ablation and 114 to hysterectomy. Figure 1 shows the Consolidated Standards of Reporting Trials flow chart. Over the course of the STOP-DUB enrollment period, 41 of 237 entered the randomized trial by transferring from the observational study. More women assigned to endometrial ablation than to hysterectomy refused surgery or did not have the surgery that was assigned (14 of 123 compared with 6 of 114).
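The Pearson χ2 comparison described in the Methods can be illustrated with the counts of women who refused or did not have their assigned surgery. The sketch below is illustrative only: no test result for this comparison is reported here, the function name is hypothetical, and the statistic is computed without a continuity correction.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Rows: assigned arm. Columns: did not have assigned surgery, had assigned surgery.
chi2, p_value = pearson_chi2_2x2(14, 123 - 14, 6, 114 - 6)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2f}")
```

When expected cell counts are small, the Fisher exact test named in the Methods would be the more appropriate choice.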
Patient follow-up for the randomized study ended June 10, 2003. Women were followed from enrollment to the end of follow-up, with the result that women assigned later had shorter follow-up. Follow-up was completed by 36 of 47 (76.6%) women enrolled for 5 years, 135 of 141 (95.7%) enrolled for 4 years, 191 of 202 (94.6%) enrolled for 3 years, 225 of 237 (94.9%) enrolled for 2 years, and 225 of 237 (94.9%) enrolled for 1 year (Fig. 1).
Probably because of stratification by age and site, and small sample sizes, the two randomized groups were somewhat different in size. Patients in the two groups seem to be similar in their demographic and prognostic characteristics (Table 1). Because patients were randomly assigned, any baseline differences observed between the two groups are assumed to be due to chance. More than 80% of randomly assigned women named excessive bleeding as the major problem leading them to consider surgery. We found no baseline or 6-month differences between women missing 24-month data and those with 24-month data for key demographic and outcome variables.
There were 18 bilateral oophorectomies (15 in women assigned to hysterectomy and 3 in women assigned to ablation who had hysterectomy). Seven of the eight women with unilateral oophorectomy and six of the 18 with bilateral oophorectomy were younger than 45 years. Numbers were too small to draw meaningful conclusions regarding this subgroup in comparison with the other women.
At 12 months, the intention-to-treat analysis indicated that women with hysterectomy had less bleeding, and less bodily and pelvic pain. These differences persisted to 48 months of follow-up (Table 2). Fatigue seemed similar for the two groups at 12 months of follow-up and at subsequent follow-ups. The vast majority of women reported that the major problem they named at baseline was solved by 12 months, although the percentage was somewhat smaller for women assigned to endometrial ablation compared with hysterectomy (87.9% compared with 93.2%); this beneficial effect of both treatments on problem solved persisted to 48 months.
When we analyzed women according to the treatment actually received, ie, 1) those who had endometrial ablation and no reoperation, 2) those who had endometrial ablation who had reoperation within 24 months, and 3) those who had hysterectomy, our findings were similar to those from the intention-to-treat analysis (data not shown). Both analyses indicated a somewhat smaller proportion of women indicating “problem solved” in the endometrial ablation group, but we do not interpret this finding to be of particular clinical significance. When bleeding was reported by women who had had hysterectomy (n=10), it seemed to be cases of spotting. Although endometrial ablation controlled the amount of bleeding, the frequency, predictability, and duration of bleeding remained issues at 24 months for about 15% of women having endometrial ablation only. There were no meaningful differences between the groups in pain, fatigue, and problem solved at 24 months.
We saw few differences between groups when women were analyzed as part of the group to which they were assigned (Fig. 2) for SF-36-based primary outcomes. At 6 months only, the hysterectomy group reported lower levels of pain and fatigue compared with the endometrial ablation group. When we compared the women as treated (those who had endometrial ablation and no reoperation, those who had endometrial ablation and reoperation within 24 months, and those who had hysterectomy), only the Pain Index was different among groups (data not shown). In this case, it seems that women who had reoperation reported more severe pain early in the study, but over time, and as they had additional surgery, the reported pain decreased. The endometrial ablation–reoperation group differed significantly on the Pain Index from the endometrial ablation group without further surgery at 6 months (P=.03) and 12 months (P=.07). The endometrial ablation–reoperation group differed significantly on pain from the hysterectomy group at 6 months (P=.001) and 12 months (P=.03).
The average EuroQOL “feeling thermometer scores” reported at 24 months were 75.2 and 77.8 (range 0–100) for endometrial ablation and hysterectomy, respectively (Table 3). We did not observe a differential treatment effect, whether for the “thermometer score” or the final score, based on no statistically significant interactions between treatment group and time (P=.16, .06, .06 for the thermometer, U.K. and U.S. scores respectively). There were small but significantly higher scores at 6 months for those assigned to hysterectomy.
There were almost four times as many adverse events reported for those receiving hysterectomy (n=48, 40.6%) compared with those receiving endometrial ablation (n=12, 10.9%) for the index surgery (Table 4). In addition, there were almost six times as many postoperative infections seen in the hysterectomy group.
We did not randomize patients to a specific type of endometrial ablation (resectoscopic or nonresectoscopic) or surgical approach for hysterectomy; this decision was left to the Study Gynecologists. STOP-DUB Study Gynecologists adhered to their prerandomization designation of surgery type 80% of the time. Of the 110 patients receiving endometrial ablation, approximately equal proportions received nonresectoscopic and resectoscopic endometrial ablation, and at 24 months similar proportions reported that their major problem had been solved. Of the 118 patients receiving hysterectomy, most received vaginal hysterectomy (70), with smaller numbers receiving abdominal hysterectomy (30) and laparoscopic hysterectomy (18). No differences in the primary outcomes were observed among hysterectomy subgroups.
By 24 months of follow-up, 27 of 110 women who received endometrial ablation initially had had reoperation; by 48 months this number had grown to 32, and by 60 months, to 34 (Fig. 3). Thirty-two of the 34 women with endometrial ablation and reoperation had hysterectomy, and two had a repeat endometrial ablation as their second surgery.
DISCUSSION
Both endometrial ablation and hysterectomy are effective in solving the problem that led women to seek surgery and in relieving pain, fatigue, and bleeding (although endometrial ablation is less effective for bleeding), whether we assessed SF-36 indices or responses to specific questions about bleeding, pain, or fatigue. Treatment benefits reached a threshold by approximately 24 months, regardless of the outcome assessed. When patients were analyzed “as treated,” we obtained results similar to those of the intention-to-treat analysis, with the possible exception of pain. In this case, women initially receiving endometrial ablation who had reoperation within the first 24 months after their initial surgery seemed to have had slower-to-improve pelvic pain. This may indicate that pain is a factor in deciding to have reoperation. It is possible that there is a true difference between treatment groups for certain outcomes that we were unable to detect due to inadequate power; however, our results are similar to those of other studies.
The Cochrane review comparing the relative effectiveness of hysterectomy and endometrial resection/ablation14 identified four trials that measured satisfaction after surgery.5–7,15 At 1 and 2 years after surgery, women who had hysterectomy were statistically significantly more likely to be very or moderately satisfied compared with those who had endometrial resection or ablation. At 3 and 4 years of follow-up, there were no significant differences between posttreatment satisfaction rates in groups (two studies). A statistically significant difference was detected for SF-36 general health perception measured 2 years after surgery, with women having a hysterectomy reporting significantly higher scores.6,15
We found no evidence of a differential effect by type of surgery for the EuroQOL feeling thermometer score or on the health utility scores using either the U.S. or U.K. algorithms, which agrees overall with the findings of the Cochrane review. We observed low values both on the visual analog scale and with the EQ-5D algorithm. Both baseline and follow-up utility values observed in STOP-DUB randomly assigned women were lower than or similar to those reported in several other studies dealing with the treatment of menorrhagia.16–18 Notably, the STOP-DUB baseline health utility values were similar to those of patients with age-related macular degeneration with visual acuity of 20/200 or worse in the better eye, or who were legally blind,19 and were also comparable to those of patients with stage 3 malignant esophageal dysphagia.20 Thus, women considering surgery to relieve dysfunctional uterine bleeding report a poor quality of life, and surgery, although associated with improvement, does not resolve all problems affecting the women's health utility.
Our reoperation rate of approximately 31% (34 of 110) at 5 years was similar to, but lower than, that of the Aberdeen Study,21 which found 38% reoperation by 4 years in an intention-to-treat analysis. However, nearly all STOP-DUB women had hysterectomy as their reoperation, compared with about one half of the Aberdeen women. We do not know whether this difference between studies is related to patient or physician preferences, or other factors. The Cochrane review comparing endometrial resection/ablation with hysterectomy showed that, combining data across five studies, endometrial resection and ablation had a statistically significantly increased risk of reoperation at 1, 2, 3, and 4 years after the initial surgery.14
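The crude cumulative reoperation proportions behind the 5-year figure can be reproduced from the counts reported in the Results. This is a simple arithmetic sketch with illustrative variable names. It ignores censoring, although women enrolled later had shorter follow-up, so it is not a substitute for a time-to-event estimate.

```python
# Cumulative reoperations among the 110 women who initially received
# endometrial ablation, keyed by months of follow-up.
reoperations_by_month = {24: 27, 48: 32, 60: 34}
initial_ablations = 110

cumulative_pct = {
    month: round(100 * count / initial_ablations, 1)
    for month, count in reoperations_by_month.items()
}

for month, pct in cumulative_pct.items():
    print(f"{month} months: {pct}% with reoperation")
```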
As expected, there were more adverse events with hysterectomy compared with endometrial ablation. Similar results were found by the Cochrane review of randomized trials.14 The exception was related to events associated uniquely with endometrial ablation, that is, uterine perforation, cervical laceration, and fluid overload.
Although our trial did not randomize endometrial ablation technique used, and our numbers were small, our findings on perioperative events associated with the various endometrial ablation techniques are consistent with results from a Cochrane review of randomized trials comparing various endometrial ablation techniques.22
Women with dysfunctional uterine bleeding and any bleeding pattern (ovulatory and anovulatory) were eligible for STOP-DUB, provided they had a normal endometrial cavity and a limited size and number of intramural or subserosal leiomyomas, with none that were submucosal. This population is somewhat different from the populations examined in the five previous RCTs.5–7,15,23 Three trials allowed inclusion of patients with submucosal leiomyomas,7,15,23 one does not seem to have excluded them,6 and one, like STOP-DUB, excluded patients with such lesions.5 Four of the five RCTs described their patients as having “menorrhagia,”5,6,15,23 and one trial focused on patients with dysfunctional uterine bleeding,7 although submucosal myomas were apparently present in 20% (21 of 104) of the patients assigned to endometrial resection. The reports from two trials15,23 indicate that the study population was restricted to women who were apparently ovulatory (“regular menstrual cycles between 21 and 35 days”), whereas the other three RCTs apparently enrolled all women meeting their criteria, regardless of ovulatory status.
We believe our results are applicable at least to U.S. and Canadian populations. Most women enrolled in STOP-DUB had low-to-middle household incomes, were not highly educated, and were insured. This mirrors the characteristics of women receiving hysterectomy in the United States.2 Although we were unable to find information that would allow us to assess whether the STOP-DUB study population is similar to U.S. women with dysfunctional uterine bleeding overall, we have no reason to believe they are different. Our multicenter study drew from a variety of practice types and locations. In STOP-DUB, 94.9%, 94.6%, and 95.7% of women eligible for follow-up were followed at 24, 36, and 48 months, respectively.
Our results indicate that both hysterectomy and endometrial ablation provide satisfactory results for women with dysfunctional uterine bleeding that has not responded to medical therapy. While almost one third of women having endometrial ablation will have reoperation within 5 years, hysterectomy is associated with more perioperative morbidity. It is reasonable to recommend that women select the type of surgery they receive for treatment of dysfunctional uterine bleeding based on their individual preferences and situations.