Fibromyalgia (FM) is a common systemic disorder and an important public health problem estimated to affect approximately 2% to 4% of the general population.1 FM is associated with a reduced threshold for pain, generally identified by an increased sensitivity to pressure at particular points on the body. The disorder is characterized by chronic widespread pain often accompanied by persistent fatigue, muscle stiffness, and sleep disturbances. FM is also often associated with other functional syndromes such as irritable bowel syndrome and depression.
These symptoms represent an important burden for the patient and generate a high level of disability that needs to be considered and treated. Current therapeutic options are based on a multimodal approach that includes pharmacological treatment, physical exercise, and education. Antidepressants are the cornerstone of many treatment paradigms.2–6 The 5-HT and noradrenaline reuptake inhibitors are of particular interest because of their dual actions. Among them, milnacipran (MLN) has demonstrated its benefit in the treatment of patients with FM. MLN obtained a marketing authorization in the USA in 2009 and in Australia in 2011. Marketing authorization was refused in Europe in 2008.
During the submission process, regulatory bodies questioned the treatment-effect size and the distribution of this effect in patients enrolled in clinical trials. The Australian agency (Therapeutic Goods Administration) considered that the overall efficacy of MLN against placebo is moderate, although clinically significant, and could be translated into a real and strong improvement in some categories of patients. The determination of the symptom domains and the identification of the “good patients” who could benefit the most from MLN are the primary objectives of this manuscript.
MATERIALS AND METHODS
Source data come from the 3 placebo-controlled clinical trials used as pivotal trials in the application dossier of MLN to treat FM submitted to the Therapeutic Goods Administration in 2011. These are the clinical trials MLN-MD-027 (ClinicalTrials.gov Identifier NCT00098124) conducted in the United States with the doses MLN 100 and 200 mg, F02207-GE-3028 (ClinicalTrials.gov Identifier NCT00436033) conducted in Europe with the dose MLN 200 mg, and MLN-MD-039 (ClinicalTrials.gov Identifier NCT00314249) conducted in the United States with the dose MLN 100 mg. Marketing authorization was granted in 2012. Detailed public information about efficacy and safety results is given in the AusPAR, which is available at https://www.tga.gov.au/file/1303/download. Data from these 3 clinical trials were assembled into a global database, which consists of 3116 FM patients, including 836 patients who were administered MLN 200 mg, 917 patients administered MLN 100 mg, and 1363 patients administered placebo.
Efficacy assessments include pain intensity as measured on a 0 to 100 mm visual analogue scale and the Patient Global Impression of Change (PGIC) score. In the 3 clinical trials, the primary efficacy criterion was a binary composite criterion expressed in terms of clinical response or nonresponse, wherein a clinical response is obtained if the pain intensity decreases by at least 30% and the PGIC score is 1 or 2 at the last visit. Several multi-item inventories and questionnaires were also used:
- Fibromyalgia Impact Questionnaire (FIQ) (20-item, revised version 2002): an FM-specific questionnaire intended to assess physical functioning, work status (missed days of work and job difficulty), depression, anxiety, morning tiredness, pain, stiffness, fatigue, and well-being.10
- Multidimensional Fatigue Inventory (MFI) (20-item): an instrument to assess different domains of fatigue including general fatigue, physical fatigue, mental fatigue, reduced motivation, and reduced activity.11
- Short-form 36 questions (SF-36): a general health instrument intended to assess 8 health concepts including physical functioning, role limitations because of physical health problems, bodily pain, social functioning, general mental health, role limitations because of emotional problems, vitality, and general health perceptions. The SF-36 can be divided into 2 aggregate summary measures: the Physical Component Summary (PCS) and the Mental Component Summary (MCS).12,13
- Beck Depression Inventory (21-item): a survey intended to assess symptoms of depressed mood such as hopelessness and irritability, feelings such as guilt or feelings of being punished, as well as physical symptoms such as fatigue, weight loss, and lack of interest in sex.14,15
- Multiple Ability Self-Report Questionnaire (MASQ): an instrument to assess patient perception of cognitive functioning on 5 cognitive domains including language, visuoperceptual, verbal memory, visual memory, and attention.16
To enter the trials, patients had to have a diagnosis of FM according to the 1990 ACR criteria and report a pain intensity >40 mm on the visual analogue scale at the randomization visit. Of note, the 2010 ACR criteria were not used because the clinical trials were initiated before that year. In addition, patients had to be free of severe psychiatric illness, as attested by the MINI questionnaire (Mini International Neuropsychiatric Interview), and had to withdraw from CNS-active therapies commonly used for FM. The designs of these 3 clinical trials included an escalating-dose period lasting up to 1 month, with a possibility to adapt the dose for tolerability reasons, and a 3-month fixed-dose period.
Patients’ baseline characteristics are derived from the 163 variables reported at the randomization visit. Continuous variables were categorized into 3 classes (low, medium, high) defined by the 33.3 and 66.6 percentiles (ie, the tertiles) measured on the whole sample.
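The tertile categorization described above can be illustrated with a minimal Python sketch. The function name `tertile_classes`, the handling of values that fall exactly on a cut point, and the example scores are assumptions made for illustration; the interpolation rule is the default of `statistics.quantiles` and may differ slightly from the percentile definition actually used in the analysis.

```python
from statistics import quantiles


def tertile_classes(values):
    """Label each value 'low', 'medium' or 'high' using the sample tertiles."""
    # quantiles(..., n=3) returns the two cut points that split the sample in thirds
    low_cut, high_cut = quantiles(values, n=3)
    labels = []
    for value in values:
        if value <= low_cut:        # assumption: ties go to the lower class
            labels.append("low")
        elif value <= high_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


# Hypothetical baseline scores, not taken from the trial database
example_scores = [10, 20, 30, 40, 50, 60, 70, 80, 90]
example_labels = tertile_classes(example_scores)
```

With nine evenly spaced values, three fall into each class; on real data the class sizes depend on ties around the cut points.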
Efficacy is characterized by the 12 binary outcomes, defined in Table 1, which indicate clinical responses or nonresponses in the different FM symptoms. These outcomes were preclassified into 4 functional categories: “Pain and global” contains the composite criterion, PGIC2, Pain30%, Pain50%, and PainEarly30%; “Function” contains FIQ50%, FIQRefresh30%, FIQStiffness30%, and PCS6; “Mood and mental” contains BDI50% and MCS6; and “Fatigue” contains MFI20%. Outcomes characterize the efficacy at the last observed visit, except PainEarly30%, which reflects early improvement in pain at the end of the escalating dose period.
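As an illustration of how a binary outcome is derived from continuous measurements, the composite criterion (a decrease in pain intensity of at least 30% together with a PGIC score of 1 or 2) can be sketched as follows. The function and parameter names are hypothetical, and the handling of missing data and of the last observed visit is not reproduced here.

```python
def is_composite_responder(baseline_pain_mm, last_pain_mm, pgic_score):
    """Return True when both parts of the composite criterion are met."""
    if baseline_pain_mm <= 0:
        raise ValueError("baseline pain intensity must be positive")
    # Relative decrease in pain between baseline and the last visit
    relative_decrease = (baseline_pain_mm - last_pain_mm) / baseline_pain_mm
    pain_response = relative_decrease >= 0.30
    pgic_response = pgic_score in (1, 2)
    return pain_response and pgic_response
```

For example, a hypothetical patient whose pain falls from 60 mm to 40 mm (a 33% decrease) with a PGIC score of 2 is a responder, whereas the same decrease with a PGIC score of 3 is a nonresponse.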
The first objective of the analysis was the determination of symptom domains that are relevant in FM patients in daily practice. A Ward hierarchical analysis was performed on the whole patient set to reveal natural groupings of homogeneous observations into clusters of efficacy outcomes and individual patients. This biclustering analysis was based on the Euclidean distance and was performed using the R software.
Next, baseline characteristics were investigated, alone and in combination, to determine the clinical features that may be predictive of a substantially improved MLN effect. A systematic exploration of the database was carried out using the Ariana data-mining platform Knowledge Extraction and Management (KEM). KEM is a hierarchical clustering method based on the Galois lattices theory.17,18 Consequently, any baseline predictor is either a single baseline characteristic (ie, a single baseline predictor) or a combination of a single baseline predictor with another single baseline characteristic. Unexpected and nonredundant association rules between baseline characteristics and efficacy outcomes can be discovered systematically and without any preestablished assumption.
A baseline characteristic was declared to be a predictor of a substantially improved treatment effect if it identified a subgroup in which the treatment effect was statistically significantly greater than the treatment effect in the rest of the whole patient set (ie, P<0.05). Clinical messages from data mining about the predictive effect depend on the accuracy of the statistical methods used. In our analysis, P-values are provided by Fisher exact tests and are adjusted for multiplicity to control the false-discovery rate using the Benjamini-Hochberg method.19 In other words, we control the expected proportion of false predictors among all the identified baseline predictors. An additional requirement was that any subgroup identified by a candidate baseline predictor contain at least 5% of the sample. For MLN 100 mg, this represents a minimum of 46 patients among the 917 patients randomized in this group.
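To make the clustering step concrete, the following sketch implements Ward agglomerative clustering in plain Python and applies it to both dimensions of a small responder matrix. It is not the R code used for the analysis: the function names, the use of squared Euclidean distances with the Lance-Williams update, and the toy matrix are all assumptions for illustration, and the quadratic search is only practical for small inputs.

```python
def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def ward_clusters(rows, k):
    """Merge rows with Ward's criterion until k clusters remain."""
    clusters = {i: [i] for i in range(len(rows))}
    dist = {(a, b): squared_euclidean(rows[a], rows[b])
            for a in clusters for b in clusters if a < b}
    next_id = len(rows)
    while len(clusters) > k:
        # Closest pair of clusters under the current Ward distances
        (a, b), d_ab = min(dist.items(), key=lambda item: item[1])
        n_a, n_b = len(clusters[a]), len(clusters[b])
        updated = {}
        for c, members in clusters.items():
            if c in (a, b):
                continue
            n_c = len(members)
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            # Lance-Williams update for Ward's method
            updated[c] = ((n_a + n_c) * d_ac + (n_b + n_c) * d_bc
                          - n_c * d_ab) / (n_a + n_b + n_c)
        dist = {pair: d for pair, d in dist.items()
                if a not in pair and b not in pair}
        for c, d in updated.items():
            dist[(c, next_id)] = d  # next_id is larger than every existing id
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


# Toy matrix (hypothetical): one row per patient, one column per binary outcome,
# 1 = responder and 0 = nonresponder
patients = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
outcomes = [list(column) for column in zip(*patients)]

patient_groups = ward_clusters(patients, 2)   # clusters of patients
outcome_groups = ward_clusters(outcomes, 2)   # clusters of outcomes
```

Clustering the rows and the columns separately, as done here, is the usual way a clustered heat map is ordered; whether the original analysis used exactly this variant of Ward's method is not stated.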
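The multiplicity adjustment and the minimum subgroup size can be sketched in a few lines of Python. This is a generic implementation of the Benjamini-Hochberg step-up adjustment, not the code of the KEM platform; the function names and the example P-values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values stay monotone in the raw P-values
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def minimum_subgroup_size(group_size, fraction=0.05):
    """Smallest whole number of patients representing the required fraction."""
    return math.ceil(fraction * group_size)


raw_p = [0.01, 0.04, 0.03, 0.20]           # hypothetical raw P-values
adjusted_p = benjamini_hochberg(raw_p)
selected = [p < 0.05 for p in adjusted_p]  # candidate predictors retained
min_size_mln100 = minimum_subgroup_size(917)
```

In this toy example only the first candidate remains below 0.05 after adjustment, and the size threshold for the 917 patients of the MLN 100 mg group evaluates to 46, in line with the figure given above.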
The effect of MLN against placebo on binary outcomes is expressed in terms of the odds ratio (OR). As a reminder, an “odds” is simply the rate of clinical responders divided by the rate of nonresponders. In our context, an OR is the ratio of the odds for MLN to the odds for placebo. Hence, the OR value indicates the extent to which the odds for the MLN group increases or decreases relative to placebo. An OR value of 1 indicates no MLN effect, whereas an OR value >1 indicates an effect in favor of MLN.
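The definition above translates directly into a short computation. The counts in the example are hypothetical and are not taken from the trial database; the function names are likewise illustrative.

```python
def odds(responders, total):
    """Odds of response: responders divided by nonresponders."""
    nonresponders = total - responders
    if nonresponders == 0:
        raise ValueError("odds are undefined when every patient responds")
    return responders / nonresponders


def odds_ratio(resp_mln, n_mln, resp_placebo, n_placebo):
    """Odds of response on MLN divided by odds of response on placebo."""
    return odds(resp_mln, n_mln) / odds(resp_placebo, n_placebo)


# Hypothetical example: 40 of 100 responders on MLN, 25 of 100 on placebo
example_or = odds_ratio(40, 100, 25, 100)
```

Here the odds are 40/60 for MLN and 25/75 for placebo, giving an OR of 2, that is, an effect in favor of MLN.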
RESULTS
Clustering analysis allows the characterization of patients’ answers on the 12 efficacy outcomes. Figure 1 shows a heat map, which is a graphical representation of outcome values. Efficacy outcomes are shown in the rows and patients in the columns, with responders in white and nonresponders in black. Combining the heat map with hierarchical clustering is a way of arranging the rows and the columns to place similar values near each other on the basis of the similarity (or distance) between them.
Regarding efficacy outcomes, Figure 1 explicitly exhibits 3 symptom domains, namely, “Pain and global,” “Mood and central status,” and “Function.” It is worthwhile to note that MFI20% for “Fatigue” is clustered within “Function” and should be considered as a part of this symptom domain in FM patients.
Figure 1 also provides clusters of homogeneous patients relative to the profiles of answers. Cluster 1 mostly consists of patients who are nonresponders on outcomes related to pain. However, many of them respond on one of the other outcomes. This suggests that in real life, some patients may experience only a small reduction in pain (eg, 10% to 20%), yet find that their functioning, fatigue, and/or global status are substantially improved. Cluster 2 mostly consists of patients who respond on the pain-related outcomes.
Interestingly, many patients who are BDI responders (BDI50%) respond neither on the mental (MCS6) nor the physical (PCS6) component of SF-36. In contrast, some BDI nonresponders respond on one or the other SF-36 component. This confirms that the mental and physical quality of life is not directly related to the mood in FM patients.
Influence of Baseline Characteristics on the Effect of MLN
In a first naive approach, a data-mining analysis was conducted on each of the 12 efficacy binary outcomes mentioned in Table 1 separately. A total of 8453 baseline characteristics and combinations of baseline characteristics were found to be associated with a substantially improved effect of MLN 100 mg and/or MLN 200 mg in at least 1 outcome. However, this finding does not lead to clinically relevant interpretations.
In a more informed strategy, we sought to determine the baseline predictors of an effect of MLN 100 mg, which is the registered dosage in FM, on at least 3 outcomes related to pain and at least 1 outcome among the 7 others. Table 2 provides OR values for each efficacy outcome in the subgroups identified by baseline predictors. Four subgroups are identified by single baseline characteristics. All of them exhibit a substantial improvement in the treatment effect on the composite criterion, which is OR=1.9 in the whole patient set. The 4 subgroups consist of the 30% of patients with high pain intensity (OR=2.9; P=0.009), the 39% of patients with low anxiety or catastrophizing level (FIQ19) (OR=2.8; P=0.004), the 14% of patients without major sleeping problems (literally, “changes in sleeping patterns” in BDI16) (OR=2.9; P=0.021), and the 21% of patients with physical limitations in the daily life effort (literally, “capacity to carry groceries” in SF-36 item 3c) (OR=3.0; P=0.017).
Table 2 also exhibits combinations of these 4 baseline characteristics with others in a hierarchical manner. In total, 48 combinations were found to be predictive of an effect of MLN 100 mg. To avoid redundancy of information, we removed the combinations based on a questionnaire item if another combination was based on a score or subscore derived from this item. For example, the 2 combinations of FIQ19 “low anxiety” with the item MFI18 “I don’t feel like doing anything” and with the subscore MFI-RM “Reduced motivation” were found to be predictive. As the subscore MFI-RM is derived from the item MFI18, the combination based on MFI18 was removed from the list in Table 2. Hence, the number of predictive combinations displayed is reduced from 48 to 24.
Subgroups identified by combinations of baseline characteristics may exhibit a very large treatment effect. For the composite criterion, the 15% of patients who combine high pain intensity with medium capacity to follow phone calls exhibit an OR of 5.1 (P<0.001), and the 14% of patients who combine low anxiety level with no change in sleeping patterns exhibit an OR of 5.0 (P<0.001). This latter subgroup is also associated with the greatest mean of ORs over the improved outcomes (OR=4.2) among the identified subgroups. Improved outcomes in this subgroup are the composite criterion, the global impression of change (PGIC2) (OR=4.9; P=0.001), pain (Pain30%) (OR=2.8; P=0.015), and morning stiffness (FIQStiffness50%) (OR=3.8; P=0.002).
The combination of baseline characteristics allowed the identification of 2 subgroups of patients who experienced a substantially improved treatment effect on the early pain outcome (PainEarly30%). These subgroups consist of the 20% of patients who combine high pain intensity with low indecisiveness (BDI13) (OR=3.3; P=0.011) and the 13% of patients who combine low anxiety (FIQ19) with low physical quality of life (SF-36 PCS) (OR=3.3; P=0.024). Finally, although a high pain intensity is a single baseline predictor, it is worthwhile to note that the 16% of patients who combine low anxiety with low pain intensity benefited considerably from MLN 100 mg with, for example, OR=3.0; P=0.043 on the composite criterion.
For the dose MLN 200 mg, we found 3 single baseline characteristics that were predictive of a substantially improved effect on pain and at least 1 of the other symptoms. Two of them are low scores on MFI items, namely “I dread having to do things” (MFI09) and “I think I do very little in a day” (MFI10). These 2 baseline characteristics are related to a low level of pessimistic belief (associated with catastrophizing). The third single baseline characteristic is few “changes in sleeping patterns” (BDI16). These findings are consistent with the results obtained for MLN 100 mg with anxiety and sleeping problems. However, high pain intensity and low functional capacities, which are single baseline predictors for MLN 100 mg, are not found to be predictive for MLN 200 mg. This is not a contradictory result. It only means that the effect of MLN 200 mg in these subgroups is not substantially greater than the effect of MLN 200 mg in the rest of the whole patient set. It also suggests that the effect of MLN 200 mg is more evenly distributed in the whole patient set than that of MLN 100 mg. The latter observation is supported by the decrease in the total number of baseline predictors, based on single baseline characteristics and combinations, from 52 for MLN 100 mg to 18 for MLN 200 mg.
DISCUSSION
Results presented in this manuscript provide additional insight about FM, its symptoms, and treatment. The clustering analysis indicates that improvement in fatigue goes with improvement in function, and not with pain, suggesting that the mechanisms of fatigue and pain are different. Next, the predictive data-mining analysis provides profiles of patients who benefit the most from a treatment with MLN. This information is useful to optimize prescriptions in daily practice. For example, if a patient comes to the physician with symptoms of anxiety, altered sleep, or both, it appears relevant to treat these symptoms before prescribing MLN. Conversely, if the pain intensity is high and there are some limitations in the functional ability, administration of MLN as first-line treatment is likely to be particularly beneficial. Hence, patient profiling can help physicians prescribe treatment to the right patients at the right time.
The appeal of predictive data-mining methods is the possibility to analyze a large amount of information without assuming a formal link between potential baseline predictors and outcome variables. However, it is important to keep in mind that, in our context, the predictive profiles were identified in the patients enrolled in the 3 clinical trials included in the Australian application dossier using the available information at baseline from the inventories and questionnaires (ie, scores, subscores, items). Only clinical practitioners can state the extent to which each of the identified profiles, especially those described by combinations based on subscores and items, actually correspond to real patients.
Although the analysis was implemented using the KEM algorithm, which is based on logical rules, the interpretation of data mining about the predictive effect of MLN depends on the relevance of the clinical decisions (eg, to derive responses from continuous efficacy outcomes and to characterize the global improvement in FM) and the accuracy of the statistical methods (eg, to test the treatment effect and to adjust for multiplicity). Other limitations of the analysis are those of any meta-analysis carried out on a pool of clinical trials. The sensitivity, which is the capacity to separate an active treatment from placebo, may vary across trials because of different patient characteristics and operating procedures. In our context, this impact is limited as the study designs and the eligibility criteria were similar in the 3 clinical trials.
Clinical development programs are usually designed to obtain marketing authorization, pricing, and reimbursement by demonstrating the efficacy on a broad population. However, phase 3 clinical trials do not reflect the daily practice of physicians, and global results do not allow an appreciation of the treatment effect on particular types of patients. Clustering and predictive data-mining methods, as implemented in our analysis, offer an opportunity to address some of these aspects. Results and interpretation allow a better characterization of the treatment effect and can be of interest to regulatory bodies. Although our analysis was carried out after dossier evaluation, such analyses can be conducted before submission, for example, at a predefined time in the clinical development program or within each clinical trial.
REFERENCES
1. Clauw D. Fibromyalgia: a clinical review. JAMA. 2014;311:1547–1555.
2. Carette S, McCain GA, Bell DA, et al. Evaluation of amitriptyline in primary fibrositis. A double-blind, placebo-controlled study. Arthritis Rheum. 1986;29:655–659.
3. Goldenberg DL, Felson DT, Dinerman H. A randomized, controlled trial of amitriptyline and naproxen in the treatment of patients with fibromyalgia. Arthritis Rheum. 1986;29:1371–1377.
4. Carette S, Bell MJ, Reynolds WJ, et al. Comparison of amitriptyline, cyclobenzaprine, and placebo in the treatment of fibromyalgia. A randomized, double-blind clinical trial. Arthritis Rheum. 1994;37:32–40.
5. Arnold LM, Keck PE, Welge JA. Antidepressant treatment of fibromyalgia. A meta-analysis and review. Psychosomatics. 2000;41:104–113.
6. O’Malley PG, Balden E, Tomkins G, et al. Treatment of fibromyalgia with antidepressants: a meta-analysis. J Gen Intern Med. 2000;15:659–666.
7. Clauw DJ, Mease P, Palmer RH, et al. Milnacipran for the treatment of fibromyalgia in adults: a 15-week, multicenter, randomized, double-blind, placebo-controlled, multiple-dose clinical trial. Clin Ther. 2008;30:1988–2004. Erratum in: Clin Ther. 2009;31(2):446.
8. Branco JC, Zachrisson O, Perrot S, et al. A European multicenter randomized double-blind placebo-controlled monotherapy clinical trial of milnacipran in treatment of fibromyalgia. J Rheumatol. 2010;37:851–859.
9. Arnold LM, Gendreau RM, Palmer RH, et al. Efficacy and safety of milnacipran 100 mg/day in patients with fibromyalgia: results of a randomized, double-blind, placebo-controlled trial. Arthritis Rheum. 2010;62:2745–2756.
10. Bennett RM. The Fibromyalgia Impact Questionnaire (FIQ): a review of its development, current version, operating characteristics and uses. Clin Exp Rheumatol. 2005;23(suppl 39):S154–S162.
11. Smets EM, Garssen BJ, Bonke B, et al. The Multidimensional Fatigue Inventory (MFI) psychometric qualities of an instrument to assess fatigue. J Psychosom Res. 1995;39:315–325.
12. Ware JE, Kosinski M, Keller SK. SF-36 Physical and Mental Health Summary Scales: A User’s Manual. Boston, MA: The Health Institute; 1994.
13. Ware JE, Kosinski M, Dewey JE. How to Score Version 2 of the SF-36 Health Survey (Standard and Acute forms). Lincoln, RI: Quality Metric Incorporated; 2000.
14. Beck AT, Ward CH, Mendelson M, et al. An inventory for measuring depression. Arch Gen Psychiatry. 1961;4:561–571.
15. Beck AT, Steer RA, Brown GK. Manual for Beck-Depression Inventory-II. San Antonio, TX: Psychological Corporation; 1996.
16. Seidenberg M, Haltiner A, Taylor MA, et al. Development and validation of a Multiple Ability Self-Report Questionnaire. J Clin Exp Neuropsychol. 1994;16:93–104.
17. Liquiere M, Sallantin J. Structural Machine Learning With Galois Lattice and Graphs. ICML’98: 5th International Conference on Machine Learning. Madison, WI: 1998; 305–313.
18. Ganter B, Wille R. Formal Concept Analysis: Mathematical Foundations. Berlin, Germany: Springer; 1999.
19. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B. 1995;57:289–300.