The SF-36 Arthritis-Specific Health Index (ASHI) was constructed to improve the responsiveness of the SF-36 Health Survey to changes in the severity of arthritis through the use of arthritis-specific scoring algorithms. This study compared the responsiveness of the ASHI with that of the generic scales and summary measures scored from the SF-36 in clinical trials of health outcomes for patients with arthritis.
Longitudinal data for patients (n = 835) participating in four placebo-controlled trials were analyzed. Study participants had at least a 6-month history of moderate to severe osteoarthritis or rheumatoid arthritis of the knee or hip. All had undergone a washout period of 3 to 14 days before baseline assessment to bring about a flare state in osteoarthritis or rheumatoid arthritis symptoms. Their average age was 60 years, and 72% were female. Responders and nonresponders were classified on the basis of physician assessments of changes in arthritis severity, made blind to treatment group; treated and untreated (placebo) groups were also compared. For the SF-36 ASHI, the generic physical (PCS) and mental (MCS) component summary measures, and each of the eight subscales scored from the SF-36 (acute version), change scores were computed by subtracting scores before treatment from scores at the 2-week follow-up. To evaluate empirical validity, analyses of variance were performed. For each measure, an F-ratio was computed for the comparison between clinically defined groups of responders and nonresponders and for the comparison between groups of patients assigned to placebo versus drug therapy. Relative validity (RV) coefficients were computed for the ASHI in comparison with PCS, MCS, and the best SF-36 scale to determine which was more responsive.
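The scoring steps described above (change scores, an F-ratio per measure, and an RV coefficient) can be illustrated with a minimal sketch. It assumes that RV is the ratio of a measure's F-ratio to the F-ratio of the reference measure, which is the usual definition but is not spelled out here. The function names and the small data sets are hypothetical and are not taken from the trials.

```python
from statistics import mean


def change_scores(baseline, follow_up):
    """Follow-up score minus pre-treatment score, patient by patient."""
    return [after - before for before, after in zip(baseline, follow_up)]


def f_ratio(*groups):
    """One-way ANOVA F-ratio: between-group over within-group mean square."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_validity(f_measure, f_reference):
    """Assumed definition: RV = F(measure) / F(reference measure)."""
    return f_measure / f_reference


# Hypothetical change scores for two groups (e.g. drug vs. placebo),
# one pair of lists per measure. Values are made up for illustration.
measure_drug = change_scores([40, 42, 38, 45], [52, 55, 49, 58])
measure_placebo = change_scores([41, 39, 44, 40], [44, 41, 48, 42])
reference_drug = change_scores([35, 37, 33, 40], [45, 49, 41, 52])
reference_placebo = change_scores([36, 34, 39, 35], [40, 36, 44, 38])

f_measure = f_ratio(measure_drug, measure_placebo)
f_reference = f_ratio(reference_drug, reference_placebo)
rv = relative_validity(f_measure, f_reference)
print(f"F(measure) = {f_measure:.2f}, F(reference) = {f_reference:.2f}, RV = {rv:.2f}")
```

Under this definition, an RV above 1 means the measure separates the two groups better than the reference measure does, and an RV below 1 means it separates them less well.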
In analyses of each of the four trials and all trials combined, RV coefficients for the ASHI were higher than those for both of the generic SF-36 summary measures and for the most valid SF-36 scale (Bodily Pain), with only one exception. Across 40 tests of validity in distinguishing treated from untreated patients, the ASHI was 5% to 19% more valid than the best SF-36 scale (RV = 1.05-1.19; RV = 1.10 in all trials combined). The generic summary measures (PCS and MCS) were much less valid in these tests (RV = 0.67 and 0.27, respectively). In analyses of responders and nonresponders, RV coefficients for the ASHI ranged from 0.70 to 1.22 (RV = 1.04 in all trials combined), in comparison with the best SF-36 subscale, which was always Bodily Pain. RV coefficients were lower for PCS (RV = 0.75) and much lower for MCS (RV = 0.18) in comparisons of treatment outcomes based on all trials combined.
The ASHI appears to be more valid than the eight SF-36 scales and PCS and MCS summary measures for purposes of distinguishing between treated and untreated patients and between clinical responders and nonresponders. This study demonstrates the feasibility of improving the validity of the SF-36 through the use of arthritis-specific scoring while retaining the option of generic scoring, which makes it possible to also compare results across diseases and treatments.