Defining the clinically important difference in pain outcome measures

Farrar, John T.; Portenoy, Russell K.; Berlin, Jesse A.; Kinman, Judith L.; Strom, Brian L.

doi: 10.1016/S0304-3959(00)00339-0

1. Introduction

Pain is inherently subjective and pain measurement relies primarily on the verbal report of patients (Bromm, 1984). Interest in the measurement and meaning of this subjective phenomenon has developed in a number of fields simultaneously, with the conclusion that ultimately the patient's report must be accepted as a valid representation of the perception of pain. However, the wide variation in the pain experience among individuals leads to a large variability in the pain scale ratings of patients who experience similar stimuli or interventions. Additionally, pain scale measurements are often interpreted in different ways by different researchers and clinicians, depending on the criteria they choose to apply.

By design, many pain intensity measures yield data that may appear to be normally distributed and continuous. The conventional method of analyzing normally distributed data is to perform a central tendency analysis, e.g. calculating a change in pain score for each patient and then comparing the mean value of the change between the treatment and placebo groups (Laska et al., 1991). A similar approach is often used for ordinal scales as well. When the underlying variable, e.g. change in pain score, is both continuous and normally distributed, it is well known that such comparisons of central tendency will provide greater statistical power than the comparison of proportions of patients achieving a certain threshold value. In these situations, the t-test or linear regression methods may provide sufficient power to detect even small differences in mean changes in pain (or percentage changes) between groups. However, situations might arise in which a particular treatment produces a substantial benefit in a moderate proportion of patients (say 50%), but no change, or even a perceived worsening, in others. This would lead to a bimodal distribution of change scores, violating the statistical assumptions underlying such tests. With bimodal data, the group mean values may differ little or not at all even though the proportion of patients achieving a clinically important benefit differs significantly between groups. The relative frequency of these two situations remains to be determined.

Other limitations of the analysis of the central tendency include both potential statistical and clinical problems. Statistically, there is evidence that patients use the beginning, middle, and end of measurement scales preferentially (Huskisson and Scott, 1976; Serlin et al., 1995). While group data may superficially appear to be normally distributed and are often analyzed using statistics that assume a normal distribution, methods that do not make this assumption may be more appropriate (Huskisson and Scott, 1976; Price et al., 1983).

In addition, long-standing clinical experience and experimental evidence suggest that patients tend to describe their pain as a percentage change throughout most of the scale, i.e. a change from 9/10 to 6/10 (absolute change 3, percentage change 33%) is likely to be perceived as equivalent to, rather than greater than, the change from 6/10 to 4/10 (absolute change 2, percentage change 33%). Price et al. (1983) have recommended that the pain scale be used as a ratio scale.
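The arithmetic behind the two changes quoted above can be checked with a minimal sketch. The function name `percent_pain_reduction` is an illustrative assumption and does not come from the original analysis.

```python
def percent_pain_reduction(baseline: float, follow_up: float) -> float:
    """Percentage reduction in pain intensity relative to the baseline score."""
    if baseline <= 0:
        raise ValueError("baseline pain must be positive to compute a percentage change")
    return 100.0 * (baseline - follow_up) / baseline


# The two changes discussed in the text, on a 0-10 numeric rating scale.
for baseline, follow_up in [(9, 6), (6, 4)]:
    absolute = baseline - follow_up
    percent = percent_pain_reduction(baseline, follow_up)
    print(f"{baseline}/10 -> {follow_up}/10: "
          f"absolute change {absolute}, percentage change {percent:.0f}%")
```

Both changes come out at about 33% even though the absolute changes differ (3 versus 2 points).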

Clinically, with no absolute reference standard for the measurement of pain, individuals vary in the subjective rating they indicate on scales such as the 0–10 numeric rating scale (NRS), 100 mm visual analog pain scale (VAS-PI), or any of the verbal descriptor scales. Thus, any given change in the VAS-PI score has no intrinsic meaning. In fact, from a clinical perspective, the wide inter-person variability in the report of symptoms and representation on pain scales makes it difficult to interpret changes in mean scores. For example, if the treatment group has a mean change of 3.5 and the placebo group a change of 2.5, both on a scale of 10, how should a clinician interpret the results? This group mean difference of 1.0 could reflect large changes in a few patients, small changes in many patients, or any combination of these outcomes. Thus, a central tendency analysis of a pain scale (e.g. the comparison of two group means or medians) provides a summary statistic that is difficult to interpret in the clinical treatment of individual patients. A clinically more informative statistic is the proportion of patients who achieve a clinically important improvement. This statistic provides the clinician with information about the likelihood of a good response in his or her patient.
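To make the interpretive problem concrete, the following sketch builds two hypothetical treatment groups that both have a mean change of 3.5 against a placebo mean of 2.5. All change scores and the responder threshold of 5 points are invented for illustration only; they are not data or cut-off points from this study.

```python
from statistics import mean

# Hypothetical change scores on a 0-10 scale, ten patients per group.
PLACEBO = [2, 3] * 5              # mean change 2.5
SMALL_IN_MANY = [3, 4] * 5        # every patient improves slightly more than placebo
LARGE_IN_FEW = [6] * 5 + [1] * 5  # half improve a lot, half barely change

THRESHOLD = 5  # illustrative responder threshold, not a value from the paper


def responder_fraction(changes, threshold):
    """Fraction of patients whose change score meets or exceeds the threshold."""
    return sum(c >= threshold for c in changes) / len(changes)


for label, group in [("placebo", PLACEBO),
                     ("small changes in many", SMALL_IN_MANY),
                     ("large changes in few", LARGE_IN_FEW)]:
    print(f"{label}: mean change {mean(group):.1f}, "
          f"responders {responder_fraction(group, THRESHOLD):.0%}")
```

The two treatment groups share the same mean difference from placebo (1.0), yet under this assumed threshold one contains no responders and the other contains 50%.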

Finally, when a clinician is treating a patient with pain, the primary issue is whether the medication prescribed will provide adequate pain relief or whether the patient will require additional ‘rescue’ medication for episodes of pain. Not requiring additional treatment is a clinically valid indicator of the efficacy of the initial therapy. Thus, an analysis of the proportion of patients who achieve this level of relief in a clinical trial would provide information that is more clinically relevant about the efficacy of the medication being evaluated.

Therefore, to study the patient's perception of efficacy, the ‘use of an additional rescue dose’ of medication is a clinically appropriate and easily measured outcome. However, this approach can only be used when the study medication has a relatively fast and consistent onset and a second effective treatment is available to be used as a rescue. This is not often an option outside of experimental acute pain models. Instead, changes in pain scores are used as a surrogate measure to assess the patient's improvement. If we assume that the question being addressed by the clinical trial is best answered by an analysis of the proportion of responders, an appropriate cut-off point needs to be defined to separate responders and non-responders, i.e. the change in pain score that defines a clinically important difference for the individual patient. In fact, in a number of analgesic studies, percentage changes, ranging from 33 to 50%, have been used in the definition of a positive outcome (Max, 1994; McQuay et al., 1995). Several authors have tried to define such cut-off points but have used different outcomes as the gold standard comparison.

A recent clinical trial of a new transmucosal drug delivery system for fentanyl citrate provided the data necessary for an analysis relating changes in pain scores to patients' assessments of an adequate treatment response. Fentanyl citrate is an effective and short-acting opioid medication that achieves peak serum values in 15–30 min and has efficacy for 2.5–3.5 h when given transmucosally (Streisand et al., 1991; Lichtor et al., 1999). This clinical trial demonstrated the efficacy of oral transmucosal fentanyl citrate (OTFC) in treating intermittent episodes of breakthrough pain in patients with well-controlled chronic cancer pain (Farrar et al., 1998). Extending the well-known concept of ‘time to rescue’ (Savarese et al., 1988), this study allowed patients to use additional rescue medication if they did not obtain adequate relief from the study drug after a pharmacologically appropriate 30 min time period. This action by the patient, of taking or not taking an additional rescue dose, is an objectively measurable assessment of the patient's integration of the benefits (efficacy) of the treatment versus its side-effects (risks), providing a patient-based evaluation of the effectiveness of the study medication. This measure is used as a gold standard in the current study to evaluate cut-off points in several pain intensity scales.

2. Methods

The analyses for this study were performed on data collected as part of a multiple cross-over, randomized, double-blind clinical trial that compared oral transmucosal fentanyl citrate (OTFC) with placebo for the treatment of cancer-related breakthrough pain (Farrar et al., 1998). All 130 study patients had chronic cancer-related pain controlled with long-acting opioid drugs and recurrent episodes of breakthrough pain that were treated with opioid rescue drugs in an outpatient setting. No patients had used OTFC before entering the study.

The study began with an open-label titration stage (Phase I) in which all patients were started at the lowest available dosage strength of OTFC (200 μg per unit). Clinically important pain relief was expected within 30 min, based on the pharmacology of the preparation and previous clinical experience (Streisand et al., 1991; Lichtor et al., 1999). For each breakthrough episode, if acceptable pain relief was not achieved by 30 min after the initial OTFC dose, a second dose of the OTFC or a dose of the patient's original intermittent opioid drug could be taken as an additional rescue. The dose per OTFC unit was gradually increased (maximum of 1600 μg per unit) until a single dose was found that adequately controlled 75% or more of the patient's episodes of the target breakthrough pain (defined as not requiring an additional rescue dose). This process provided documented Phase I episodes that were both satisfactorily treated (i.e. not requiring additional rescue medication) and unsatisfactorily treated (i.e. requiring additional rescue medication). By comparing the change in pain measures recorded for the satisfactorily treated episodes to the unsatisfactorily treated ones, a cut-off point that defined clinically important analgesia was calculated for each pain scale.

All patients in Phase I who achieved clinically important pain relief with a single OTFC unit >75% of the time were advanced to the randomized Phase II. The randomization was accomplished by giving each patient ten identically appearing sequentially numbered OTFC units in random order (seven active and three placebo). Up to ten successive episodes of breakthrough cancer pain were treated in a double-blind fashion.

The data collected were pain intensity and pain relief every 15 min for 1 h and a global medication performance rating at the end of 60 min or at the time a patient took an additional rescue dose, whichever came first. The current analysis focused on the values at 30 min, because this was the time point by which 95% of those patients who went on to take an additional rescue dose had done so. The standard summary statistics for these measures were evaluated, including absolute difference in pain intensity (PID, 0–10 scale), percentage difference in pain intensity (PID%, 0–100% scale), pain relief (PR, 0 (none), 1 (slight), 2 (moderate), 3 (lots), 4 (complete)), sum of the absolute difference in pain intensity (SPID, sum of four measurements over 60 min), percentage of maximum total pain relief (% Max TOTPAR over 60 min), and global medication performance (0 (poor), 1 (fair), 2 (good), 3 (very good), 4 (excellent)). If patients did not complete the full 60 min, i.e. took an additional rescue, the last observation was carried forward and summed for the hour to calculate the TOTPAR and SPID.
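The summary measures described above might be computed for a single episode roughly as follows. This is a sketch under stated assumptions: SPID and TOTPAR are taken as unweighted sums over the four 15 min readings, a missing reading is represented by `None`, and the function and key names are hypothetical rather than taken from the trial's analysis code.

```python
from typing import Dict, List, Optional

MAX_PAIN_RELIEF = 4  # top of the 0-4 pain relief scale described above


def locf(values: List[Optional[float]]) -> List[float]:
    """Replace missing readings (None) with the last recorded value."""
    filled: List[float] = []
    last: Optional[float] = None
    for value in values:
        if value is not None:
            last = value
        if last is None:
            raise ValueError("the first reading is missing; nothing to carry forward")
        filled.append(last)
    return filled


def summarize_episode(baseline_pi: float,
                      pi_scores: List[Optional[float]],
                      pr_scores: List[Optional[float]]) -> Dict[str, float]:
    """pi_scores and pr_scores hold readings at 15, 30, 45 and 60 min."""
    pi = locf(pi_scores)
    pr = locf(pr_scores)
    pid = [baseline_pi - score for score in pi]  # pain intensity differences
    return {
        "PID_30": pid[1],
        "PID%_30": 100.0 * pid[1] / baseline_pi,
        "PR_30": pr[1],
        "SPID": sum(pid),
        "TOTPAR": sum(pr),
        "%MaxTOTPAR": 100.0 * sum(pr) / (MAX_PAIN_RELIEF * len(pr)),
    }


# Hypothetical episode: rescue taken after the 30 min reading.
episode = summarize_episode(baseline_pi=6,
                            pi_scores=[5, 4, None, None],
                            pr_scores=[1, 2, None, None])
print(episode)
```

In this invented episode the 30 min values are carried forward to 45 and 60 min before the sums are taken.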

The primary analysis of the association between specific cut-off points and the use of additional rescue was performed using all available data (1268 episodes) from the 130 patients enrolled in Phase I, including patients who did not have both satisfactorily and unsatisfactorily treated episodes (unrestricted group). We used 2×2 tables to calculate the values for the specific cut-off point in each pain scale (e.g. PID%≥33%) compared with the reference standard of no use of additional rescue drug (positive response). Use of additional rescue drug is considered a negative response. Episodes of breakthrough pain were the unit of analysis. Sensitivity, specificity, and accuracy (see Table 1) were calculated. Our criterion for the best cut-off point was to find the highest overall accuracy coupled with the best balance of sensitivity and specificity. That is, given similar levels of accuracy, we would prefer a measure with a reasonable value for both sensitivity and specificity, so as not to over- or underestimate the efficacy. An identical analysis was performed that was restricted to the data on the 88 patients who had at least one episode that was satisfactorily treated and one that was unsatisfactorily treated.
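The 2×2 calculation described above can be sketched as follows, using the standard definitions of sensitivity, specificity, and accuracy with "no additional rescue" as the positive reference outcome. The episode data are invented for illustration, and `cutoff_performance` is a hypothetical helper name.

```python
from typing import Dict, Iterable, Tuple


def cutoff_performance(episodes: Iterable[Tuple[float, bool]],
                       cutoff: float) -> Dict[str, float]:
    """episodes: (scale_value, took_rescue) pairs, one per pain episode.

    Test positive  = scale value at or above the cut-off.
    Truly positive = no additional rescue dose taken.
    """
    tp = fp = fn = tn = 0
    for value, took_rescue in episodes:
        test_positive = value >= cutoff
        truly_positive = not took_rescue
        if test_positive and truly_positive:
            tp += 1
        elif test_positive:
            fp += 1
        elif truly_positive:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical PID% values paired with whether rescue medication was taken.
episodes = [(50, False), (40, False), (20, False), (35, True),
            (10, True), (0, True), (60, False), (33, False)]

result = cutoff_performance(episodes, cutoff=33)
for candidate in (10, 20, 33, 50):
    print(candidate, cutoff_performance(episodes, candidate))
```

Repeating the calculation over several candidate cut-off points, as in the loop, gives the kind of comparison from which a balanced cut-off point can be chosen.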

Table 1: Definition of sensitivity, specificity, and accuracy

Finally, to evaluate the potential use of these cut-off points as outcome measures for a clinical trial, we used the Phase II data from the original trial to calculate the relative risk for each of the cut-off points. Specifically, logistic regression was used to estimate the relative risk between groups (Hosmer et al., 1997). Because the multiple episodes per patient are not fully independent, the P value and 95% confidence intervals were adjusted for clustering by patient, using a Generalized Estimating Equation routine in STATA v6.0 (Miller and Landis, 1991).
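As a simplified illustration of the quantity being estimated, the sketch below computes a crude relative risk of a positive response from group counts. It deliberately ignores the clustering of episodes within patients, so it is not the Generalized Estimating Equation analysis used in the study, and the counts are hypothetical.

```python
def relative_risk(responders_active: int, n_active: int,
                  responders_placebo: int, n_placebo: int) -> float:
    """Crude relative risk: responder proportion on active drug over placebo."""
    risk_active = responders_active / n_active
    risk_placebo = responders_placebo / n_placebo
    return risk_active / risk_placebo


# Hypothetical counts of episodes meeting a cut-off point in each arm.
rr = relative_risk(responders_active=60, n_active=100,
                   responders_placebo=30, n_placebo=100)
print(f"crude relative risk: {rr:.2f}")
```

A clustered analysis would typically leave the point estimate similar but widen the confidence interval, because repeated episodes from the same patient carry less information than independent observations.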

3. Results

In the titration Phase I, 130 patients recorded data on a total of 1268 episodes of pain, of which 349 episodes were treated with an additional rescue dose. Table 1 shows the 2×2 table for the calculation using the cut-off point of PID%≥33%, which resulted in a sensitivity of 73.4%, a specificity of 69.6%, and an accuracy of 72.3%. In other words, episodes of breakthrough pain in which the patient did not take an additional rescue dose after using the OTFC were associated with a 73.4% likelihood of the patient reporting an improvement of ≥33% on the pain intensity scale. Those episodes in which the patient did require an additional rescue dose had a 69.6% likelihood of the patient reporting a scale change of <33%. This process was repeated for a range of PID% cut-off points, which are presented in Table 2. Note that there is little change in accuracy for the PID% cut-off points between 10 and 33%, but a cut-off point of ≥33% also provided a high sensitivity and specificity that were nearly balanced.
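The reported figures can be cross-checked against each other, since accuracy is the average of sensitivity and specificity weighted by the number of episodes without and with additional rescue. The sketch below assumes that all 1268 episodes contributed to the PID% table; the individual cell counts of Table 1 are not reproduced here.

```python
total_episodes = 1268
rescued = 349                           # episodes with an additional rescue dose
not_rescued = total_episodes - rescued  # episodes without additional rescue

sensitivity = 0.734  # reported for PID% >= 33%
specificity = 0.696  # reported for PID% >= 33%

# Accuracy implied by the rounded sensitivity and specificity.
implied_accuracy = (sensitivity * not_rescued
                    + specificity * rescued) / total_episodes
print(f"implied accuracy: {implied_accuracy:.1%}")
```

The implied value agrees with the reported accuracy of 72.3% to within the rounding of the inputs.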

Table 2: Sensitivity, specificity and accuracy of various cut-off points for the indicated pain measurement scales compared with the use of an additional rescue dose

Using the same reasoning, the best cut-off points for other scales were calculated (Table 2). These cut-off points are: 33% or above for the % Max TOTPAR; 2 (moderate) or better for pain relief; 2 or better for absolute pain intensity difference; 2 or better for the SPID/h; and 2 (good) or better for the global performance of the medication. The same analysis in the group of 88 patients (restricted group) produced essentially identical results (data not shown).

Finally, of the initial 130 patients, 93 successfully identified an effective single OTFC dose, 92 agreed to enter the randomized Phase II, and 89 used at least one placebo and one active unit in Phase II. A change in the status of the patient's underlying cancer was the predominant reason for not completing the titration phase. Table 3 shows the results of the double-blind, randomized, placebo-controlled Phase II data re-analyzed using each of the selected cut-off points. The optimal cut-off point analysis provides consistent estimates of the efficacy of OTFC for breakthrough cancer pain across all scales. Compared to the previously published relative risk (RR) of the gold standard outcome (Farrar et al., 1998), the RRs presented in Table 3 are all slightly lower, consistent with the degree of accuracy of the association between the cut-off point and the gold standard.

Table 3: Using the optimal cut-off point values observed to re-analyze the randomized Phase II data of this trial provided consistent estimates of the efficacy of OTFC for breakthrough cancer pain across all scales

4. Discussion

This analysis estimated the levels of change in pain intensity, PID%, % Max TOTPAR, pain relief, absolute pain intensity, and global medication performance scores that are best associated with a patient's own evaluation of a clinically important difference for the treatment of breakthrough cancer-related pain. Our reference standard for a clinically important difference was whether a patient received enough relief in a pharmacokinetically appropriate time period to forgo additional rescue analgesic therapy for that episode of pain. In addition, the scales calculated as a percentage change yielded a slightly better accuracy in predicting adequately treated episodes, with a balanced sensitivity and specificity. This result is consistent with previously published data that patients tend to represent change as a percentage (Price et al., 1983). The global medication performance of 2 (good) or better had excellent accuracy as well, supporting the use of a global treatment performance as an outcome. This elucidation of the cut-off points that indicate the same degree of clinical importance in different scales should allow more direct comparison across studies that use different measures as the primary outcome.

Use of these cut-off points to generate relative risks using the randomized Phase II data consistently demonstrated the benefit of the OTFC therapy, with relative risks varying from 1.64 for pain relief to 1.87 for the percentage difference in pain intensity (PID%), depending on the scale and cut-off point used. This contrasts with the results as traditionally analyzed and presented in the original paper, i.e. a difference in the group means of 1 point on the VAS pain scale, P=0.001 (Farrar et al., 1998). The dichotomous outcome should be more helpful than the analysis of group mean values to clinicians' understanding of what to expect when treating patients, since it provides information about the percentage of patients likely to achieve a clinically important level of improvement. Presenting the effect size and appropriate statistical significance test from both analyses may be most appropriate to support the claim of efficacy for a study treatment. However, if the two results are not both statistically significant, then the authors of the study must carefully consider the implications for the interpretation of their results.

This analysis was possible because of a combination of features of both the disease process and study design. Specifically, breakthrough cancer pain is an acute exacerbation of pain above the constant chronic pain process. The intermittent and recurrent nature of the pain allowed for multiple treatments in the same patient over a relatively short time period. The rapid onset of action of the fentanyl from the OTFC allowed patients to make an early judgment about efficacy and the design allowed early re-medication if the dose was ineffective. The use of a titration phase in patients who had never before received OTFC, with patients starting at a low dose and then being titrated to a dose that was adequate for their pain, permitted the comparison of the change in pain scales reported for both satisfactorily and unsatisfactorily treated episodes. The randomized clinical trial phase provided a data set on the same patient population and allowed us to explore the impact on the effect size using different cut-off points in the comparison of the active drug and placebo. The need for additional medication to treat a specific episode of pain is action-oriented and therefore can be more objectively measured and reflects the patient's determination of the overall efficacy of the study medication. These features all provide strong support for the validity of this analysis.

The desirability of identifying a commonly accepted threshold value and the difficulties in defining a clinically important difference for symptomatic conditions such as pain have been well recognized (Beecher, 1959; Lasagna, 1960; Houde, 1982; Turk et al., 1993; Moore et al., 1997). As expected, when there are no ‘gold standards’, many different criteria have been used, often dependent on the question to be addressed by the researcher, the use anticipated for the measure, or the practical issues related to the data that can be reasonably obtained. However, most of the comparison standards used to determine the clinically important difference have serious flaws that limit their usefulness or are not based on the patient's perception of pain.

For example, some investigators have tried to compare the patient's reported pain with an observation by another individual. One such study compared the patient's report of pain intensity in an emergency room to that of the health care provider. A change of 18 mm/100 mm in the physician's assessment of the patient was found to correlate with the patient's assessment of being a ‘little bit better’ (Todd and Funk, 1996). Although the authors concluded that this is the minimum level of clinical importance that should be considered in clinical trials, there is ample evidence that outside observers are often inaccurate in their assessment of a patient's pain on an individual basis (Grossman et al., 1991).

Another common method has been the use of expert opinion to select the change necessary to define a clinically important difference. Some authors have called for a survey of experts, before conducting clinical trials, to determine the minimal difference that should be considered important (van Walraven et al., 1999). In their pioneering work on the use of meta-analysis to combine outcomes from smaller trials, Moore and McQuay chose a 50% cut-off point for the % Max TOTPAR reasoning that it ‘is a simple clinical endpoint of pain half relieved, easily understood by professionals and patients’ (Moore et al., 1996). While this value has been successfully applied to a number of subsequent studies providing statistically significant results (Moore et al., 1997, 1998), its importance to the patients in whom it is applied has never been established. Our analysis demonstrates that a value of 33% is a more appropriate cut-off point for this measure. The difference between 33 and 50% will not affect the conclusion of the published positive studies, since a positive study that uses 50% will also be positive using 33%. However, setting the bar too high may result in underestimating the efficacy of other treatments and has implications for future study design.

In their 1996 paper, Moore and McQuay also point out that the response of patients to analgesics recorded on pain intensity and pain relief scales is not normally distributed which renders the group mean values an ‘inaccurate description of the underlying distribution of individual responses’ (Moore et al., 1996). They then test the hypothesis that it is possible to infer the dichotomous data from the group mean values, finding that the proportion of patients with 50% maximum total pain relief is highly correlated with the group mean of the continuous analysis of the % Max TOTPAR in 11 out of 13 of their own post-operative pain studies and a variety of simulated data sets. They suggest that in studies which present only the mean value, the more useful dichotomous data can be inferred with confidence and call for further testing of this hypothesis in other pain models and with other interventions.

Outstanding work has also been carried out by the Outcome Measures in Rheumatology Clinical Trials (OMERACT) group to develop a uniform approach to studies in rheumatoid arthritis (Cranney et al., 1999). This group clearly outlined the distinction between clinically important and statistically significant differences, but again used expert opinion as the comparison. They brought some aspects of clinical decision making into the process by asking experts to choose patients they would consider to have improved from among 64 scenarios based on real cases. By comparing the level of change reported by the patient to the expert assessment, they concluded that a 36% improvement overall on the combined scale score (pain score, number of tender joints, and global medication performance) should be considered clinically important (Goldsmith et al., 1993). This value has now been used in a large number of subsequent trials with good results. However, again the correlation to a patient-based standard has not yet been established.

The OMERACT group argued that the ability to demonstrate differences in response between the placebo and active treatment is evidence for the validity of the cut-off point (Anonymous, 1997). Although a 36% cut-off point for a clinically important change is similar to ours, differentiating treated from untreated patients, while important, is not directly related to the validity of the cut-off point from the patient's point of view of what is clinically important. Since a clinically important improvement over time due to a placebo response is every bit as real to the patient as changes attributable to an active drug, both changes should have equal weight in determining the cut-off point. On the other hand, very small and unimportant differences can achieve statistical significance in large trials and still be of no clinical importance to the patients.

Another method of defining clinically important differences has been to compare different scales with each other in order to argue that they are comparable and clinically relevant. Guyatt has defined the minimal important difference (MID) as the smallest change reported by patients that correlates with the patient stating that he or she is ‘moderately better’ compared to his or her own state at an earlier point. Guyatt and colleagues used a seven point Likert scale centered at no-change as the standard. In a separate study, they also asked patients to compare themselves with the description of another patient with similar problems and to decide if they were moderately better than the comparison patient (Juniper et al., 1994; Redelmeier et al., 1996). This approach is clearly patient-based, but remains a comparison of two subjective scales. A limitation of this method is that the patient's response to the single question of being better or not may be biased by the patient's sense of what the interviewer wanted to hear or what the patient thought the answer ought to be. In addition, the timing and order of any such question can affect the results (Ware, 1978; Grootendorst et al., 1997). The results of our study support the use of a moderate or better improvement on the pain relief or global treatment performance scale as being clinically important outcomes.

In our study, we compare the patient's report of pain intensity and pain relief to the patient's action of taking or not taking an additional dose of rescue medication. Our standard measure can also be considered to reflect the subjective patient experience of pain, since only the patient can decide whether he or she wants to take an additional rescue dose. However, the measurement of this action (yes or no) is objective. In addition, this action allows the patient to integrate his or her perception of the amount of relief and any potential side-effects experienced. Although there are also other factors that can influence a patient's decision about taking an additional rescue dose of medication, this will not interfere with the relative value of the cut-off points for different scales, since the scales were measured simultaneously for each episode of pain. In addition, the action of taking an additional rescue medication can be considered an answer to an important question health care providers ask, namely ‘Does the patient need anything additional for his or her pain or is the treatment given good enough?’

It is important to acknowledge the wide disagreement on the choice of the best outcome measures for pain clinical trials (Houde, 1982; Max and Portenoy, 1993; LeResche, 1997). Some of these disagreements can be explained by differences in the questions that are being studied and the data that can be practically collected. There will be many situations in which the ‘additional rescue dose’ measure will not be an option. Since the data presented in this paper relate a number of different pain measures in such a way that the cut-off points can be considered of comparable clinical importance, use of these cut-off points will allow comparison across studies with different primary outcomes. Using these values and an extension of the method Moore and McQuay (Moore et al., 1996) presented above, it also may be possible to estimate the dichotomous data necessary for meta-analysis from studies that present only group mean values.

The potential limitations of the use of an additional dose of rescue medication as an outcome measure must also be considered. First, factors other than pain level may play a role in the patient's decision making about taking an additional rescue dose. However, because each patient serves as his or her own control, we would expect that any factors that affect the decision of whether or not to use an additional rescue dose would not change substantially for the multiple episodes of pain treated in the same patient during the 2-week study period. Thus, the use of this measure as an outcome should provide a balanced assessment of the patient's perception of efficacy for both the active drug and placebo groups in a clinical trial. Second, there are many clinically relevant factors such as mood, expectation, and learning that affect a patient's report of pain. The current practice for dealing with these potential confounders is to measure them as well as possible, use a mathematical model to determine their relative importance, and then adjust for them during the analysis. Our approach was to allow the patient to integrate all these potentially confounding factors into his or her own decision about the clinical importance of the response to the treatment. It seems more valid to allow the clinical trial subject to integrate the factors that go into his or her response rather than to use a mathematical model created by the investigator.

The results of the current analysis could be questioned because of limitations inherent in the original study (Farrar et al., 1998). All randomized clinical trials raise concerns about generalizability. The response to pain and its meaning in cancer patients may be different than for patients with other painful conditions. For example, the use of opioid medications is more acceptable in cancer patients, so patients may make a bigger effort to treat their pain aggressively. If so, this could result in an underestimate of the size of the clinically important difference in our study. It is reassuring to note the similarities between our values and the ones that have been successfully used in clinical trials of multiple other conditions (Leijon and Boivie, 1989; Max et al., 1992; Sheiner, 1994; Moore et al., 1996). In addition, breakthrough cancer pain is often self-limited. This is highlighted by the observation that no additional rescue medication was required in 66% of the episodes treated with placebo units in the randomized phase (Farrar et al., 1998). Thus, there may be concerns as to the applicability of this model to clinical trials of other types of acute or chronic pain or pain that requires longer term treatment. Ideally, these values would be calculated for each population of patients before conducting the actual clinical trial. At the very least, these results will need to be validated in other pain models to increase our confidence in their generalizability.

In conclusion, we have used a clinically relevant and easily measured indicator of an individual patient's assessment of the efficacy of his or her drug treatment for cancer-related breakthrough pain. Using this outcome as a standard, we have determined the degree of change on several pain measurement scales that represents a clinically important difference. Using the optimal cut-off point values observed to re-analyze the randomized phase of this trial provided consistent estimates of the efficacy of OTFC for breakthrough cancer pain across all scales. Although the choice of scale and analysis for other clinical trials will depend on the study question and design, this analysis suggests the cut-off point values that can be used to indicate that a treatment produces clinically important analgesia. This information should lead to a more standardized approach to the design of studies of pain treatments and better validity, comparability, and clinical utility of the results of future clinical trials.

Acknowledgements

This study was supported in part by a grant from Anesta Corp., Salt Lake City, UT and NIH-K08-NS01865.

References

Anonymous. OMERACT III. Outcome measures in arthritis clinical trials. Cairns, Australia, April 16–19, 1997. Proceedings. J Rheumatol. 1997;24:763-802.
Beecher HK. Measurement of subjective responses. New York: Oxford University Press; 1959. 51 pp.
Bromm B. The measurement of pain in man. In: Bromm B, editor. Pain measurement in man. Amsterdam: Elsevier; 1984. p. 3-13.
Cranney A, Welch V, Tugwell P, Wells G, Adachi JD, McGowan J, Shea B. Responsiveness of endpoints in osteoporosis clinical trials – an update. J Rheumatol. 1999;26:222-228.
Farrar JT, Cleary J, Rauck R, Busch M, Nordbrock E. Oral transmucosal fentanyl citrate: randomized, double-blinded, placebo-controlled trial for treatment of breakthrough pain in cancer patients. J Natl Cancer Inst. 1998;90:611-616.
Goldsmith CH, Boers M, Bombardier C, Tugwell P. Criteria for clinically important changes in outcomes: development, scoring and evaluation of rheumatoid arthritis patient and trial profiles. OMERACT Committee. J Rheumatol. 1993;20:561-565.
Grootendorst PV, Feeny DH, Furlong W. Does it matter whom and how you ask? Inter- and intra-rater agreement in the Ontario Health Survey. J Clin Epidemiol. 1997;50:127-135.
Grossman SA, Sheidler VR, Swedeen K, Mucenski J, Piantadosi S. Correlation of patient and caregiver ratings of cancer pain. J Pain Sympt Manage. 1991;6:53-57.
Hosmer DW, Hosmer T, Le Cessie S, Lemeshow S. A comparison of goodness-of-fit tests for the logistic regression model. Stat Med. 1997;16:965-980.
Houde RW. Methods for measuring clinical pain in humans. Acta Anaesthesiol Scand Suppl. 1982;74:25-29.
Huskisson EC, Scott J. Graphic representation of pain. Pain. 1976;2:176.
Juniper EF, Guyatt GH, Willan A, Griffith LE. Determining a minimal important change in a disease-specific Quality of Life Questionnaire. J Clin Epidemiol. 1994;47:81-87.
Lasagna L. The clinical measurement of pain. Ann N Y Acad Sci. 1960;86:28-37.
Laska EM, Meisner M, Siegel C. Analytic approaches to quantifying pain scores. In: Max M, Portenoy R, Laska E, editors. Advances in pain research and therapy. New York: Raven Press; 1991. p. 675-683.
Leijon G, Boivie J. Central post-stroke pain – a controlled trial of amitriptyline and carbamazepine. Pain. 1989;36:27-36.
LeResche L. Assessment of physical and behavioral outcomes of treatment. Oral Surg Oral Med Oral Pathol Oral Radiol Endodontics. 1997;83:82-86.
Lichtor JL, Sevarino FB, Joshi GP, Busch MA, Nordbrock E, Ginsberg B. The relative potency of oral transmucosal fentanyl citrate compared with intravenous morphine in the treatment of moderate to severe postoperative pain. Anesth Analg. 1999;89:732-738.
Max MB. Antidepressants as analgesics. Pharmacological approaches to the treatment of chronic pain. The fourth annual Bristol-Myers Squibb Symposium on Pain Research, Seattle, WA, 1994.
Max MB, Portenoy RK. Methodological challenges for clinical trials of cancer pain treatments. In: Chapman CR, Foley KM, editors. Current and emerging issues in cancer pain: research and practice. New York: Raven Press; 1993. p. 283-299.
Max MB, Lynch SA, Muir J, Shoaf SE, Smoller B, Dubner R. Effects of desipramine, amitriptyline, and fluoxetine on pain in diabetic neuropathy. N Engl J Med. 1992;326:1250-1256.
McQuay H, Carroll D, Jadad AR, Wiffen P, Moore A. Anticonvulsant drugs for management of pain: a systematic review. Br Med J. 1995;311:1047-1052.
Miller ME, Landis JR. Generalized variance component models for clustered categorical response variables. Biometrics. 1991;47:33-44.
Moore A, McQuay H, Gavaghan D. Deriving dichotomous outcome measures from continuous data in randomised controlled trials of analgesics. Pain. 1996;66:229-237.
Moore A, Moore O, McQuay H, Gavaghan D. Deriving dichotomous outcome measures from continuous data in randomised controlled trials of analgesics: use of pain intensity and visual analogue scales. Pain. 1997;69:311-315.
Moore RA, Tramer MR, Carroll D, Wiffen PJ, McQuay HJ. Quantitative systematic review of topically applied non-steroidal anti-inflammatory drugs (published erratum appears in Br Med J 1998;316(7137):1059). Br Med J. 1998;316:333-338.
Price DD, McGrath PA, Rafii A, Buckingham B. The validation of visual analogue scales as ratio scale measures for chronic and experimental pain. Pain. 1983;17:45-56.
Redelmeier DA, Guyatt GH, Goldstein RS. Assessing the minimal important difference in symptoms: a comparison of two techniques. J Clin Epidemiol. 1996;49:1215-1219.
Savarese JJ, Thomas GB, Homesley H, Hill CS. Rescue factor: a design for evaluating long-acting analgesics. Clin Pharmacol Ther. 1988;43:376-380.
Serlin RC, Mendoza TR, Nakamura Y, Edwards KR, Cleeland CS. When is cancer pain mild, moderate or severe? Grading pain severity by its interference with function. Pain. 1995;61:277-284.
Sheiner LB. A new approach to the analysis of analgesic drug trials, illustrated with bromfenac data. Clin Pharmacol Ther. 1994;56:309-322.
Streisand JB, Varvel JR, Stanski DR, Le Maire L, Ashburn MA, Hague BI, Tarver SD, Stanley TH. Absorption and bioavailability of oral transmucosal fentanyl citrate. Anesthesiology. 1991;75:223-229.
Todd KH, Funk JP. The minimum clinically important difference in physician-assigned visual analog pain scores. Acad Emerg Med. 1996;3:142-146.
Turk DC, Rudy TE, Sorkin BA. Neglected topics in chronic pain treatment outcome studies: determination of success. Pain. 1993;53:3-16.
van Walraven C, Mahon JL, Moher D, Bohm C, Laupacis A. Surveying physicians to determine the minimal important difference: implications for sample-size calculation. J Clin Epidemiol. 1999;52:717-723.
Ware JE Jr. Effects of acquiescent response set on patient satisfaction ratings. Med Care. 1978;16:327-336.

Keywords: Analgesics; Pain; Pain measurement; Randomized controlled trials; Treatment outcome

© 2000 Lippincott Williams & Wilkins, Inc.