Journal Logo

Research Paper

Machine-learning–based knowledge discovery in rheumatoid arthritis–related registry data to identify predictors of persistent pain

Lötsch, Jörna,b,*; Alfredsson, Larsc; Lampa, Jond

Author Information
doi: 10.1097/j.pain.0000000000001693

1. Introduction

Persistent pain is one of the most common conditions encountered by health care professionals.60 It is triggered by a variety of causes,77 including rheumatoid arthritis (RA), which is characterized by repeated inflammations in joints that lead to joint destruction and various forms of functional impairment. Indeed, pain in RA64 owes to joint inflammation and nociceptive activation elicited by the inflamed synovia.65 A persistent hallmark feature of RA is pain, and patients often rate freedom from pain as their top priority when asked about treatment outcomes.63 However, despite the development of effective antirheumatic agents,69 a significant group of patients still experiences significant pain72,75 even after apparent inflammation control.2,89

Early detection of patients with RA at risk of developing persistent pain is clinically desirable for timely initiation of effective multimodal medical and psychosocial therapeutic interventions.19,56 Among known risk factors figure, eg, impaired pain processing and pain sensitization,49 female sex,36 or psychosocial factors.33,34,49 The detection of risk factors requires prospective longitudinal studies. In addition, data gathered in pain- or RA-related registries may provide further candidate risk factors.25 However, registries are designed with a general purpose in mind, and therefore, data sets drawn from them often only partly cover the data obtained with a prospective study. To this adds that data analysis strategies are usually not included in registries.

The utility of registries with pain data25 is enhanced by developments in computational and data science, which facilitate extracting information from data and generating knowledge from this information.57 This enables combining hypothesis-driven research with data-driven research.20 Specifically, in the present analysis, the hypothesis was derived from repeated reports about patterns in pain-related phenotypes42,43,46 that patients with RA may segregate into groups with respect to a pain-related phenotype. Furthermore, the hypothesis was derived from prior knowledge that a limited set of clinical, laboratory, and demographic parameters, acquired either at the time of RA diagnosis or 3 months later, is associated with this pain-related phenotype group structure of patients with RA and confers information suitable to identify patients at risk of persistent pain at early stages of RA treatment. These hypotheses were pursued using machine-learning methods.

Specifically, machine-learning is the currently most popular implementation of artificial intelligence. It may possibly be summarized as a set of methods that can automatically detect patterns in data and use the uncovered patterns to observe structures such as subgroups, to predict or classify future data, or to extract information from the data suitable to derive new knowledge.15,21,53 Machine-learning is increasingly used in pain research45 and other clinical and biomedical research areas such as psychiatric13,14,27,28 or neurological44 diseases. Both unsupervised and supervised algorithms have been successfully applied. That is, in unsupervised learning, a class or group structure is unknown, and the task is to find such a structure. By contrast, in supervised learning, a class or group structure is first known. An algorithm can be trained to associate a case to the correct class using available information. This finally enables the algorithm to use such information to assign a new and formerly unseen case to its correct class. In the present analysis, both unsupervised and supervised methods were combined (Table 1) to (1) identify pain-phenotype groups and to (2) select informative parameters about future persistent pain and judging their performance in the identification of patients at risk.

Table 1
Table 1:
Overview of the key steps in data analyses outlined in Figure 1.

2. Methods

2.1. Study design and setting

This was a retrospective analysis. It was a registry study based on the Swedish Epidemiological Investigation of RA. The study was initiated in 1996, has a participation rate among cases of 95%, and has been described elsewhere.7,71 All included patients were diagnosed by a rheumatologist and fulfilled the American College of Rheumatology 1987 criteria for RA.4 Further inclusion criteria were age >18 years and the ability to understand Swedish. The study base comprised the population, aged 18 to 70 years, living in parts of Sweden during a time period from 1996 and onwards. The study base now includes approximately 5000 patients.

2.2. Participants

In this study, only patients who had been followed up from diagnosis and performed a 5-year follow-up were included. Hence, the queried data had been collected between 2011 and 2016 and included n = 789 patients. Incident RA cases, diagnosed according to the 1987 ACR criteria and >18 years of age,4 were included at the time of diagnosis. At this time, patients answered an extensive questionnaire and provided blood samples for several laboratory analyses. The study had 94% response rate, and of these individuals, more than 95% provided blood samples. The study population was linked to the SRQ, which covers about 80% of the prevalent RA cases in Sweden.22 The SRQ is used in clinical practice for patient follow-up and includes information on disease activity and pain, acquired at diagnosis and at each follow-up visit.

All patients had been treated according to the updated local and international guidelines directed at reaching disease remission by optimizing antirheumatic treatment at each follow-up visit. The decision to change treatment is based on the physicians' assessment of the Disease Activity Score 28 (DAS2824), which is a composite index based on clinical examination, laboratory inflammatory parameters, and patient-reported global health. The study followed the Declaration of Helsinki and was approved by the Ethics Committee of Northern Stockholm. Informed written consent on data acquisition and publication in an anonymized manner was obtained from each participating patient.

2.3. Objectives

The main objectives of the present analysis were (1) the identification of a possible group structure of RA patients with respect to the intensity of pain they had communicated over up to 60 months after RA diagnosis and (2) the identification of those clinical, laboratory, and demographic parameters that provided early information about the membership of a patient to one of the identified pain-related phenotype groups. In addition, it was analyzed whether parameters acquired at the time of RA diagnosis were the most informative in this respect. Furthermore, it was analyzed whether data acquisition at 3 months after the diagnosis should be preferred.

2.4. Variables and measurement

Pain was assessed using a 100-mm visual analogue scale (VAS) ranging from 0, “n pain,” to 100, “worst imaginable pain.”88 Pain intensity was recorded at the time of RA diagnosis (time point “zero”, hereafter named "baseline") and at 3, 6, 12, 24, 36, 48, and 60 months after diagnosis and therapy start. As widely accepted,77 pain was regarded as persistent when it lasted or recurred for more than 3 to 6 months.51 Pain in RA is regarded as severe when rated VAS >40 mm at the presently used scale.78,79 However, as specified as an objective of the present analysis, no previous definition of pain phenotypes was used. A possible group structure among the patients with respect to pain was a target of the assessment.

To predict the development and course of persistent pain, d = 21 different variables acquired at month 0 or 3 from RA diagnosis were queried from the registry. This included demographic parameters (age and sex), patient-rated parameters including patient global assessment (PGA), a parameter reflecting function, health assessment questionnaire (HAQ), objective clinical parameters of disease severity including erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), swollen joint count (SJC), and tender joint count (TJC), the composite index of the disease activity score using ESR (DAS28) and CRP (DAS28CRP), respectively,24 the ratio between swollen and tender joints (STR), and the status of anti–citrullinated protein antibodies, known to be important in RA pathogenesis.69

2.5. Data analysis

Data analysis was performed using the R software package59 (version 3.4.4 for Linux; http://CRAN.R-project.org/) on an Intel Core i7—7500U notebook computer running on Ubuntu Linux 18.04.1 64-bit. The data set comprised (1) demographic, clinical, and laboratory parameters expected to provide predictive information for (2) the intensity of pain rated by the patients on a 100-mm VAS. The temporal limit between predictive parameters and predicted pain was set at 3 months, agreeing with a definition of persistent pain requiring that it should last at least 3 months.50 Hence, d = 21 demographic, clinical, and laboratory parameters were included in the analysis, comprising the patients' sex, age, patient-rated parameters (PGA and HAQ) acquired at months 0 and 3, parameters of disease severity (ESR, CRP, SJC, and TJC) acquired at months 0 and 3, anti–citrullinated protein antibodies acquired at month 0, and composite indexes including DAS28 and DAS28CRP. Moreover, STR was calculated as the quotient of SJC and TJC as described previously.37 By contrast, pain-related parameters were queried by means of a 100-mm VAS starting from 3 months after the start of the therapy, with an observation period up to 60 months.

The analysis followed a data-driven approach comprising 2 main steps (Fig. 1). First, the VAS ratings acquired between 3 and 60 months from RA diagnosis and therapy start were assessed for a subgroup structure of pain-related phenotype classes among patients with respect to the intensity and persistence of pain in RA. To this end, the distribution of the medians of the VAS pain scores acquired between 3 and 60 months after RA diagnosis was investigated by analyzing the probability density function as described previously.42,83 That is, the Pareto density estimation, which is a kernel density estimator particularly suitable for the discovery of groups in the data,80 was explored for a multimodal distribution of the individual median VAS ratings over the observation period. A group structure was analyzed by fitting a Gaussian mixture model (GMM) to the Pareto density estimations as, where N (x|mi, si) denotes Gaussian probability densities (components) with mean values mi and SDs si. M denotes the number of components in the mixture. Models with M = 1,…,5 Gaussian modes were tested. To avoid any subjective components in the fitting, an automated evolutionary algorithm5 was implemented as the R library39 “DistributionOptimization” (https://cran.r-project.org/package=DistributionOptimization). The final model was selected based on the Akaike information criterion (AIC),1 on the Bayesian information criterion (BIC),67 and on likelihood ratio tests.74 Subject assignment to the identified subgroups was obtained using the Bayes theorem6 that provided the decision limits needed to assign an individual observation to mode i, based on the calculation of the posterior probabilities.

Figure 1.
Figure 1.:
Flow chart of the data analysis: The figure provides an overview of the applied data sciences approach in 2 main steps (indicated in blue: output space preparation and input space mapping comprising feature selection and verification). After identification of a pain-related phenotype group structure (“output space”), among candidate demographic, clinical, or laboratory parameters, those were selected that provided the most relevant information to train machine-learned algorithms with the goal to enable them to assign a patient to the correct pain-phenotype group. This was implemented as principal component analysis or random forest analysis, followed by the computed ABC analysis. The selected features were subsequently tested in 5 different algorithms (classification and regression trees, k-nearest neighbors, support vector machines, multilayer perceptrons, and naive Bayes classifiers) about whether they provided information for successful phenotype class association. The machine-learning part of the feature selection was repeated using a reduced set of candidate features (loop arrow). VAS, visual analogue scale.

Second, a mapping of the demographic, clinical, and laboratory parameters corresponding to these phenotype classes was sought. This included a selection of relevant parameters among the candidates providing the most relevant information to assign a patient to one of the identified pain-phenotype classes. Features selection was performed using (1) principal component analysis (PCA) and (2) supervised machine-learning implemented as random forests.11,31 Principal component analysis55 was performed to project the high-dimensional data into a q-dimensional space of PCs. The first PC has the largest possible variance in the data, and each succeeding orthogonal component is chosen for the highest possible remaining variance, with a usually applied accepted cutoff set at a PC's eigenvalue of 1.29,32 The most relevant parameters are identified through the so-called loadings on the PCs. Adequate data transformation was addressed using PC-corr analysis.16 Any types of normalization and dimension are used, and the analysis was performed using the R script “pccorrv2.R” freely available at https://github.com/biomedical-cybernetics/PC-corr_net.16

However, the main approach to the identification of the most relevant parameters for phenotype class assignment used supervised machine-learned algorithms implemented as feature selection62 by means of random forest analysis,11,31 followed by the computed ABC analysis as proposed previously.47 Specifically, the candidate parameters were used to train random forest classifiers11,31 to assign a subject to the correct pain-phenotype group. Such random forests are built from many decision trees41 in which the dichotomous splits are set randomly along the data range. The classification is obtained as the majority vote for class membership provided by the decision trees, ie, random forest uses collective decisions or ensemble learning for classification. Following the concept of a nested cross-validation analysis, the analysis was performed using 1000 different disjoint training and test data sets, created using Monte-Carlo resampling to split the original data set into a training (2/3 of the data) and test (1/3 of the data) data set. During this procedure, for every candidate parameter, the information about its importance for a correct class assignment becomes available as the mean decrease in classification accuracy when the respective parameter (feature) was excluded from random forest building. These calculations were performed using the R libraries40 “randomForest” (https://cran.r-project.org/package=randomForest).

The values of the mean decrease in classification accuracy when the respective parameter (feature) was excluded from random forest building were used to select the most relevant parameters. Specifically, after each random forest analysis, these values were submitted to the computed ABC analysis.82 This is a categorization technique for the selection of the most important subset among a larger set of items. ABC analysis aims to divide a set of data into 3 disjointed subsets called “A,” “B,” and “C.”87 It originates from economical sciences where subset “A” is regarded to contain the most profitable features, which for the same reason were also chosen in the present context. In the computed ABC analysis, the set limits are set on a mathematical basis. That is, the border between subsets “A” and “B,” which is the relevant limit in the present analysis, is found as follows. The values of the mean decrease in classification accuracy when the respective parameter was excluded from random forest building were plotted cumulatively in decreasing order of importance. This plot shows the cumulative fraction that each feature contributes to the total classification at the ordinate and the feature number at the abscissa. An extreme scenario would be when one single feature provides all needed information to assign a subject to the correct class. This would be drawn at the upper left corner of the coordinate system. The limit between subsets “A” and “B” is now obtained as the x-value of the point on the cumulative curve that is closed to that extreme point (for more details, see Ref. 81). After feature selection, the utility of the selected parameters was assessed and compared with that of other parameters not selected during previous data analysis steps. These calculations were performed using our R library82 “ABCanalysis” (http://cran.r-project.org/package=ABCanalysis).

Different types of supervised machine-learning methods were chosen, including decision trees implemented as (1) classification and regression trees,12 prototyping implemented as (2) k-nearest neighbors (kNNs),18 neuronal networks implemented as (3) support vector machines,17 and (4) multilayer perceptron (MLP).61 In addition, (5) naive Bayesian classifiers were used as a classical method.6 Tree-structured rule-based classifiers41 analyze ordered variables xi by recursively splitting the data at each node into children nodes, starting at the root node. During learning, the splits are modified such that misclassification is minimized. In the present form, the information gain as judged by the reduction in entropy was used to find optimal (local) dichotomic decisions as used for the C4.5 decision tree method.58 The calculations were performed using the “rpart” function of the similarly named R package (B. Ripley; https://cran.r-project.org/package=rpart). During kNN learning,18 the entire labeled training data set is stored while a test case is placed in the feature space in the vicinity of the test cases at the smallest high-dimensional distance. The test case receives the class label according to the majority vote of the class labels of the k training cases in its vicinity. The present analyses were performed with k = 5 and the Euclidean distance, which also corresponds to the default of the R package “KernelKnn” (Mouselimis L, https://cran.r-project.org/package=KernelKnn). Support vector machines classify data mainly based on geometrical and statistical approaches used for finding an optimum decision surface (hyperplane) that can separate the data points of one class from those belonging to another class in the high-dimensional feature space.17 A Gaussian kernel with a radial basis was used. The analyses were performed using the R library35 “kernlab” (https://cran.r-project.org/package=kernlab). The perceptron61 was built from artificial neurons. Those are provided with a number of input channels, a processing level, and an output level that connects a neuron to one or multiple other artificial neurons. Experiments using any combination of 1 to 3 hidden layers with 2 to 32 neurons each indicated no improvement of classification accuracy beyond 4 neurons in a single hidden layer. The analyses were performed using the R library RNNS8 (https://cran.r-project.org/package=RSNNS). Naive Bayesian classifiers were used that provide the probability that a data point being assigned to a specific class calculated by application of the Bayes theorem.6 The calculations were performed using the R package85 “klaR” (https://cran.r-project.org/package=klaR).

Possible overfitting issues were addressed in the present analysis fourfold. First, before the data analysis, the classification algorithms were tuned with respect to available hyperparameters. For example, the number of k in kNN was tested between 3 and 9, and the best performing variant was chosen. Similarly, several sizes of the MLP were tested including 1 to 3 hidden layers with up to 20 neurons each. Second, analyses were performed in 1000 cross-validation runs using random splits of the original data set into training (2/3 of the data) and test (1/3 of the data) data subsets. The reported classification performances are the median of the performances obtained during the 1000 runs. Third, a negative control condition was created as described previously. A classification better than chance when trained with permuted data would hint at possible overfitting. Fourth, different classifiers were applied to avoid that the analysis relied on a single method in which overfitting had occurred occasionally.

3. Results

3.1. Participants and descriptive data

Main characteristics of the study sample are summarized in Table 2. A minimum of 4 VAS pain ratings was available from n = 288 patients providing an analyzed cohort of 209 women and 79 men, aged on average 52.2 ± 12.3 years (mean ± SD). The number of available VAS pain ratings decreased with time, providing 268, 236, 280, 273, 175, 109, and 46 ratings acquired at 3, 6, 12, 24, 36, 48, and 60 months after RA diagnosis, respectively. In the patients meeting the required minimum of 4 VAS ratings, the d = 21 features were complete without missing values. There was neither a difference between pain levels at baseline in included vs nonincluded patients (Wilcoxon test48,86: W = 35,088, P = 0.8485), nor at any further time point (details not shown).

Table 2
Table 2:
Main characteristics of the study sample.

3.2. Main results

3.2.1. Pain-related phenotype classes

The distribution of the medians of the VAS pain scores acquired between 3 and 60 months after RA diagnosis was multimodal (Fig. 2). In this distribution, fitting a GMM identified 3 phenotype subgroups comprising patients with low (n = 87), moderate (n = 121), or high (n = 80) median pain scores (hereafter called “persistent pain groups” 1 [low], 2 [moderate], and 3 [high]). The clinical characteristics of baseline and 3 months of all 3 groups are displayed in Table 3. Specifically, as shown in Figure 2, the 3 groups are represented by the 3 Gaussian modes describing the shape of the probability density distribution of the individual median VAS ratings acquired between 3 and 60 months after RA diagnosis. The modes have their mean values at VAS ratings of 5, 22, or 50, respectively. They were separated from each other at VAS = 11.4 and VAS = 36.5 mm, respectively, as indicated by the Bayesian decision limits between each 2 neighbored Gaussian modes (perpendicular magenta lines in Fig. 2A). The 3-modal distribution of the individual median VAS ratings (Fig. 2) was supported by the AIC and BIC that both were the smallest with M = 3 (AIC: 2577.878, 2560.687, 2540.561, 2556.896, and 2723.497 for M = 1-5, respectively; BIC: 2585.204, 2582.665, 2573.527, 2600.852, and 2778.441 for M = 1-5, respectively). Three Gaussian modes as the best solution were further supported by the results of the likelihood ratio tests, which indicated a statistically significant improvement of the fit for M = 2 vs M = 1 Gaussian modes (change in log likelihood ΔLL = −12.595, P = 1.49 × 10−5) and a further improvement when using M = 3 modes (ΔLL = −13.064, P = 8.97 × 10−6), whereas the fit was not improved by adding a fourth mode (M = 4 vs M = 3: ΔLL = 5.168, P = 1) or a fifth mode (M = 5 vs M = 4: ΔLL = −80.3, P = 1). The satisfactory quality of the fit was also supported by the quantile–quantile plot (Fig. 2) and by a nonsignificant Kolmogorov–Smirnov test (P = 0.197).

Figure 2.
Figure 2.:
Distribution of the medians of the VAS ratings calculated between 3 months up to the rating acquired at 60 months after the start of the therapy: (A) The probability density function (PDF, black line) estimated by means of the Pareto density estimation (PDE80) overlaid on a histogram could be fitted using a Gaussian mixture model (GMM) given as
, with M = 3 modes. The fit is shown as a red line, and the M = 3 mixes are indicated as differently colored dashed lines (G #1-#3). The Bayesian boundaries between the Gaussians are indicated as a perpendicular magenta line. The inset shows a quantile–quantile (QQ) plot comparing the observed distribution of median VAS ratings (ordinate) with the distribution expected from the GMM (abscissa). The blue dots symbolize the quantiles of observed data vs predicted data, and the red line indicates identity. The close vicinity of the dots to this line indicates satisfactory fits of the data by the respective GMM. (B) Median VAS pain ratings, separately for each Gaussian mode: The widths of the boxes are proportional to the respective numbers of subjects per group. The quartiles and medians (solid horizontal line within the box) are used to construct a “box and whisker” plot. The whiskers add 1.5 times the interquartile range (IQR) to the 75th percentile or subtract 1.5 times the IQR from the 25th percentile and are expected to include 99.3% of the data if normally distributed. The notches indicate the confidence interval around the median based on
. Single data are overlaid over the boxplots as dots. (C) Time courses of VAS pain ratings per subgroup defined by the Gaussian mode membership (Fig. 2): The Spaghetti plots show the individual VAS ratings (dots), connected by straight lines for each subject. The overlaid box and whisker plots show basic summary statistics of pain VAS ratings for each time point. The figure has been created using the R software package59 (version 3.4.4 for Linux; http://CRAN.R-project.org/) and the R packages “beeswarm” (https://cran.r-project.org/package=beeswarm) and “ggplot2” (https://cran.r-project.org/package=ggplot2). VAS, visual analogue scale.
Table 3
Table 3:
Descriptive statistics (median and interquartile range) of the baseline and 3-month values of the d = 21 candidate features, comprising demographic, clinical, and laboratory parameters, assessed for their utility to assign a patient with RA to the pain-related phenotype group identified in the first step of the analysis (Fig. 2).

3.2.2. Parameters relevant for pain-phenotype class assignment

PC-corr analysis of the d = 21 candidate demographic, clinical, or laboratory features (Table 3; and supplementary Figure 1, available at http://links.lww.com/PAIN/A873) suggested noncentered PCA and log transformation of the data as best suited for separation of the pain-phenotype groups. Noncentered PCA and log transformation produced a significant group segregation along the first dimension (PC1), with P = 0.017, area under the receiver operating characteristic curve (AUC-ROC) = 0.658, and area under the precision versus recall curve (AUC-PR) = 0.58; PC1 explained 92.3% of the variance. However, Kolmogorov–Smirnov tests68 were significant for log-transformed, but nonsignificant for square root–transformed PGA. The latter transformation corresponded to the second-best result of the PC-corr analysis with a PC1 explaining 90.3% of the variance. Principal component analysis according to the PC-corr results (noncentered, log-transformed data except square root–transformed PGA) resulted in 3 PCs with eigenvalues >1 (supplementary Figure 2, available at http://links.lww.com/PAIN/A873), explaining 93.45%, 3.98%, and 1.18% of the total variance, respectively. Phenotype group segregation was most pronounced along PC1 consistent with the initial PC-corr analysis. PC1 carried loadings mainly from the parameters PGA acquired at 0 and 3 months relative to therapy start (supplementary Figure 1 bottom right, available at http://links.lww.com/PAIN/A873). Sex was the main parameter reflected in PC3; however, this component explained only 1.18% of the total variance.

3.2.3. Prediction of pain phenotype

Using all d = 21 parameters, random forests followed by the computed ABC analysis of the feature importance measure identified |{ABC set A}| = 3 parameters as the most frequent size of set “A” during the 1000 runs on randomly created independent training and test data. The most frequent members of set “A” were PGA at 3 months, HAQ at 3 months, and DAS28CRP at the 3-month visit. Rerunning the feature selection without the partly redundant composite parameter DAS28CRP at 3 months resulted in |{ABC set A}| = 2 as the most frequent set size, with PGA and HAQ at 3 months as the most frequent members. This contrasts with parameters of systemic inflammation, ESR and CRP, that were not sufficiently often member of the most relevant ABC set “A” to meet the criteria for being chosen for the final feature set. Training different machine-learning algorithms with these 2 parameters provided consistent results of approximately 70% overall balanced accuracy of phenotype group assignment (Table 4). Rerunning the analysis after omitting the patient-rated parameters, PGA and HAQ, resulted in a most frequent set size |{ABC set A}| = 2 and the parameters TJC and SJC at 3 months as its most frequent members (Fig. 3). However, omitting the previously identified best parameters led to a substantial drop of the overall classification accuracy to only 57% to 59% (Table 5). Therefore, the analysis was stopped at this point. Quality control measures provided the expected results, ie, training the algorithms with randomly permutated data obtained a balanced classification accuracy of around 50%. In addition, training the algorithms with the original data but using 2 randomly selected parameters that had not been selected during the feature selection step also produced poor classification results.

Table 4
Table 4:
Performance measures of classifiers obtained using different machine-learned methods (classification and regression trees [CART], k-nearest neighbors [kNNs], support vector machines [SVMs], multilayer perceptron [MLP], and naive Bayesian classifiers [Bayes]) in the classification task to identify, from the selected parameters PGA and HAQ at the 3-month follow-up visit, the membership to the pain-related phenotype groups defined by the membership to the 3 Gaussian modes (M1-M3) found in the distribution of median pain 3 months after therapy start.
Figure 3.
Figure 3.:
Feature selection based on random forest analysis followed by the computed ABC analysis: (A) Mean decrease in classification accuracy when the respective feature is excluded from the random forest analysis (ACPA, anti–citrullinated protein antibodies; CRP, C-reactive protein; DAS28, Disease Activity Score 28; ESR, erythrocyte sedimentation rate; HAQ, health assessment questionnaire; PGA, patient global assessment; SJC, swollen joint count; STR, swollen–tender joint ratio; TJC, tender joint count). The plot displays a typical example from 1000 bootstrap resampled data subsets. (B) The numerical value of the mean decrease in classification accuracy associated with each item was submitted to the computed ABC analysis. The ABC plot (blue line) shows the cumulative distribution function of the mean decreases in accuracy, along with the uniform distribution (green line), ie, each feature has the same chance to contribute to the classification accuracy (for further details about the computed ABC analysis, see Ref. 82). The red lines indicate the borders between ABC sets “A,” “B,” and “C.” Only set “A” containing the most profitable items was selected as the pain-relevant questionnaire subsets. (C) Bar plot of the number of parameters found in ABC set A during the 1000 runs. (D) Bar plot of the features found most frequently in ABC set “A.” The final feature set was given by the number of items according to the most frequent size of set “A,” starting from the most frequent member of set “A.” The figure has been created using the R software package59 (version 3.4.4 for Linux; http://CRAN.R-project.org/) and our R package82 “ABCanalysis” (http://cran.r-project.org/package=ABCanalysis).
Table 5
Table 5:
Performance measures of classifiers obtained using different machine-learned methods (classification and regression trees [CART], k-nearest neighbors [kNNs], support vector machines [SVMs], multilayer perceptron [MLP], and naive Bayesian classifiers [Bayes]) in the classification task to identify, from the selected parameters TJC and SJC at the 3-month visit, the membership to the pain-related phenotype groups defined by the membership to the 3 Gaussian modes (M1-M3) found in the distribution of median pain 3 months after therapy start.

4. Discussion

Pain is complex, and in RA, it comprises both inflammatory pain from joints and noninflammatory pain developing during the disease. In the present study, a subgroup of subjects was identified in whom pain between 6 months and up to 5 years was in median higher than in the other patients. Although this subgroup was identified in a data-driven approach without previous assumptions about a possible group structure of RA patients with respect to persistent pain, its separation from other patients at VAS = 36.5 mm well agrees with the conventional limit of high pain intensity drawn at VAS = 40 mm. Of note, although the latter has been the result of expert reasoning and agreement,78,79 the present limit is purely based on data analysis.

After identification of a clinically relevant subgroup that can be regarded as RA patients with persistent pain, among d = 21 candidate parameters, 4, all acquired at 3 months after RA diagnosis, were found to provide useful information about the later assignment of the patient to the high-pain phenotype. The 3-month outcome was chosen in line with earlier data using this time point as the first possible stable evaluation of antirheumatic treatment84 and also pain course.2 Among the candidate parameters, PGA and HAQ were the strongest predictors of later persistent pain. Among parameters related to objective assessments, TJC and SJC were found to suite this purpose best, although their predictive performance was modest. By contrast, the utilization of data from the time of RA diagnosis may differ considerably from all other data. This is a state when the disease is not yet under inflammatory control. Indeed, parameters acquired earlier than 3 months after diagnosis did not pass the present feature selection analyses. Furthermore, an inclusion of the composite DAS28CRP marker, which had been dismissed after clinical discussion and because of its redundancy with DAS28 and CRP, did not change the present results substantially. DAS28CRP at the third month was strongly correlated with both, TJC (Spearman's ρ = 0.8970) and SJC (ρ = 0.78). Thus, the correlation with its component DAS28 (ρ = 0.92) was only slightly higher than that with TJC. By contrast, its correlation with its other component CRP was weaker (ρ = 0.4), and CRP had not been identified as informative for pain-phenotype association in the present context. Thus, the relevant information from DAS28CRP was probably captured by its correlated TJC, which outperformed the components DAS28 and CRP in the classifier.

All patients in this study were treated according to the updated local and international guidelines,9,26 the latter directed at reaching remission by optimizing antirheumatic treatment at each follow-up visit. The current patient cohort, therefore, clearly represents the monitoring and treatment decisions with a basis in clinical practice. Interestingly, although all patients have had adequate antirheumatic treatment, the analysis could identify 3 distinct patterns of persistent pain. Moreover, while parameters of inflammation and disease activity, as measured by means of ESR, CRP, and DAS28,23 did not differ markedly between the persistent pain groups at the diagnosis, already at 3 months significant differences were noted. Specifically, values of DAS28 and HAQ at 3 months were higher in the high-pain group (supplementary Figure 1, available at http://links.lww.com/PAIN/A873). Higher HAQ is in line with pain being associated with decreased functional ability, as reported earlier,49 and with known importance also for work ability.54 The higher DAS28 is in line with the high prevalence of nonresponders to antirheumatic therapy in this group and corroborates the earlier reported frequency of therapy resistance early in the disease.84 It is a known fact that a significant part of clinical response quantification relies on patient-reported data, such as PGA, which comprise an important part of the DAS28 calculation. Changes in DAS28 from diagnosis to month 3 were lowest in the high-pain group (−1.25 ± 1.52) and highest in the low-pain group (−2.8 ± 1.5), although within all 3 groups the differences were statistically significant (repeated measures analysis of variance with factors “Gaussian mode” [1, 2, 3] and “time” [0, 3 months]: factor Gaussian mode: df = 2,234, F = 8.19, P = 0.00,036, factor “time”: df = 1,234, F = 415.2, P < 2 × 10−16, interaction “Gaussian mode” by “time”: df = 2,234, F = 21.32, P = 3.11 × 10−9; post hoc t tests73 for differences between 3 and 0 months: all P < × 10−7). The present data support this and also imply that almost one-third of all patients did not respond clinically to treatment after 3 months. In this context, it is important to note that although some few patients in the low-pain group had occasional high pain ratings on singular occasions (Fig. 2C), these patients tended to report decreased pain at the next follow-up visit. The reasons for these fluctuations in pain reporting are not known. They may be due to insufficient disease control. However, as shown in Figure 2C, all these patients report decreased pain at the next follow-up, thereby on a group level the patients are still not characterized by persistent high pain levels. Moreover, median values are consistently low at all visits (Fig. 2).

As shown in Table 3, pain and PGA at diagnosis were significantly higher in the group that would later be classified as high-pain group, whereas ESR, CRP, and DAS28 ESR did not differ markedly. This finding is in line with earlier reports that the pain course and disability during RA is mostly dependent on noninflammatory factors,30 and severe pain may persist or develop despite low disease activity at diagnosis. Moreover, other secondary consequences of the disease with possible impact pain are mainly related to PROs rather than inflammation at diagnosis.54 There are several reports that low inflammation at diagnosis may characterize a phenotype of the disease, where pain is less responsive to antirheumatic treatment. Therefore, these patients may be at risk of developing persistent pain and also, as discussed earlier, refractory pain conditions such as chronic widespread pain.3

As already pointed out, both PGA and HAQ are parameters that had been rated by the patients, and therefore, they could be confounded by the actual pain. To perform an analysis directed at investigating which of the non-patient reported outcome parameters could predict the pain group assignment, PGA and HAQ were omitted and the analyses rerun. The resulting second-line parameters SJC and TJC still provided relevant information to obtain phenotype group assignment better than guessing after the patient-rated parameters were omitted. However, they provided only modest classification performance. Nevertheless, both SJC and TJC represented early markers of pain-phenotype group association that continued to differ among the 3 pain-phenotype groups throughout the later observation period (supplementary Figure 3, available at http://links.lww.com/PAIN/A873). This was supported by the results of analyses of variance for repeated measures, indicating significant effects of the pain phenotype, respectively, Gaussian mode membership for both, TJC and SJC (supplementary Figure 3, available at http://links.lww.com/PAIN/A873). This observation is in line with an earlier report that found the difference between baseline TJC and SJC to be directly associated with the level of clinical response in RA.52 Interestingly, TJC displayed stronger prediction, in terms of its importance in the classification algorithms, than SJC for discriminating persistent pain groups. This is well in line with earlier data that TJC displays a higher association with pain than SJC.2 It may be speculated that TJC represents both joint inflammation and a component of patient-reported pain, whereas SJC solely depicts an objective assessment of joint inflammation by the rheumatologist. The different predictive ability of TJC compared with SJC further supports that noninflammatory factors are vital contributors to the persistence of pain during RA course, which has also been discussed earlier.10

The present approach to an RA-related pain phenotype shows limitations of registry studies, of which most neither have been designed to test a specific research hypothesis, nor do provide necessarily complete data for a later established research question. Notably, in the present data, while apparently having a 5-year follow-up after RA diagnosis to study a pain-related phenotype, data became increasingly sparse with longer distance to the day of RA diagnosis. Nevertheless, setting an arbitrary cutoff at 36 months after RA diagnosis and maintaining the requirement of more than 3 pain ratings would have resulted in 28 patients less in the analysis, which while reducing the cohort size is nevertheless unlikely to have changed the results substantially. Furthermore, the study included replications using data splits and resampling techniques in cross-validation experiments. However, a completely independent control data set was not available. The median VAS rating across a minimum of available queries provided a clinically meaningful solution based on the registry data. Other data analytical solutions are nevertheless possible. For example, when calculating linear regressions of the VAS ratios at 3 to 60 months and using the individual slopes and y-intercepts for clustering, the obtained group memberships overlapped at a high significance level (3 clusters, χ2 test vs the 3 GMM-based groups: χ2 = 137.3, df = 4, P value < 2.2 × 10−16) with the present patients grouping, indicating that the here presented results are not limited to the particular method used for pain-phenotype definition. Another possible limitation may arise from the analyzed cohort of 286 patients as compared to the 789 patients who were initially available in the registry. This omission of patients was due to lack of sufficient pain data to be included in the analysis. Although the selection was made on predefined conditions, it may be argued that the analyzed study population is then not representative of the general population with RA. To address this possible issue, the pain ratings provided by the included patients were compared with those available from the nonincluded patients. This indicated that (1) the included and nonincluded patients did not differ with respect to the pain rated at baseline or at any further time point (Wilcoxon tests, see Results section). Moreover, (2) the distribution of the VAS ratings acquired between 3 and 60 months was similar in the included and nonincluded patients and did not hint at any further group that, by the predefined selection criterion, had been omitted from the analysis (supplementary Figure 4, available at http://links.lww.com/PAIN/A873).

The obtained subgroup separation at VAS = 36.5 mm between subjects who often suffered from high pain from the other patients agreed well with the limit of VAS >40 mm for persistent pain, which is clinically accepted in the rheumatic arthritis context.78,79 Furthermore, it should be noted that the present associative approach at candidate factors predictive of a clinically relevant phenotype does not provide any hits at causality; this has to be substituted from expert knowledge of RA experts. In addition, the limited value of the data is also reflected in the only modest predictive performance of the machine-learned classifiers. A balanced classification accuracy of 70% or 60% is moderate or only modest, ie, 20% or 10% better than simple guessing, respectively (Tables 4 and 5, respectively). This implies that the registry data do not suffice to base a predictive diagnosis of future persistent pain solely on them. For clinical applicability, further parameters are probably needed to raise this accuracy as close to 100% as possible. Nevertheless, the present classifier is not devoid of clinical utility and has the advantage that it uses simply accessible parameters, without a need to acquire more of complex and/or expensive biomarkers. This predictive component of the results emphasizes the preference of machine-learning to classical statistical approaches.

The present analysis was designed with a classification task in mind; hence, the preference of machine-learning as classification is among key tasks of machine-learning, whereas classical statistics mainly results in the identification of significant effects of particular features, without implicitly assessing the classification performance. However, with the present d = 21 parameters, further machine-learned algorithms are unlikely to provide substantial improvements of balanced classification accuracy over that what has been obtained with the presently used algorithms. They had been chosen to represent different principal implementations of machine-learning, such as prototype-based learning (eg, kNN) or ensemble learning–based classifications (eg, random forests) and artificial neuronal networks (eg, support vector machines and MLP), with the addition of a classical method such as naive Bayes. All of them delivered consistent results with only marginal differences. Given the limited number of parameters, it is unlikely that further algorithms, such as adaptive boosting of simple decision trees66 or rule-based classifiers (for a review, see Ref. 45), would provide the desired higher classification accuracy. As stated above, a need for further parameters not yet available in the present data set is a more likely solution of the modest performance problem.

A further possible limitation of the study may be that pain assessments of joint pain could be influenced also from concurrent pain, for example, headache. However, the impact of such a confounder should be minor because the questions regarding pain used in the register were directed at the assessment of joint pain only. Female sex as a previously proposed risk factor for persistent pain in RA36,76 was not among selected parameters predicting phenotype group membership. A χ2 test of the sex vs Gaussian mode cross table was also not significant (χ2 = 1.114, P = 0.573), and when setting a cutoff at a median of VAS = 40 mm according to a widely accepted standard, the resulting 2 groups did also not differ with respect to sex (χ2 = 0.324, P = 0.569), indicating that the negative result does not owe to the present modeling-based grouping of the patients with respect to the phenotype. Similarly, older age, which is known as a further risk factor for persistent pain in RA, was not selected, and the phenotype groups did not differ with respect to the patients' age (Kruskal–Wallis test38: χ2 = 1.52, P = 0.468). Moreover, the initial pain at RA diagnoses differed significantly among groups, with medians of 39-, 55-, and 64-mm VAS in the low, medium, and high pain-phenotype groups, respectively (χ2 = 18.801, P = 8.27 × 10−5).

5. Conclusions

The present results show that machine-learning is suited to extract knowledge from data queried from disease-related registries. Such data collections provide both, strengths and limitations, and become increasingly available, reflecting general developments toward computational and data-driven science.57 Supported by the results of Gaussian mixture modeling, the analysis provided 3 distinct pain-related phenotypes that showed significantly different distributions concerning several clinical and laboratory parameters. Interestingly, although parameters of inflammation and disease activity did not differ markedly between the persistent pain groups at the diagnosis, already at 3 months significant differences were noted, with higher disability in the high-persistent pain group. Patient-rated parameters PGA and HAQ, acquired at 3 months from RA diagnosis and therapy start, were identified as best predicting pain parameters during the period of 6 to 60 months. When pushing the analytical focus toward functional parameters, SJC and TJC, also acquired at 3 months from RA diagnosis, proofed as most informative about pain rated later during the observation period. The results indicate that early functional parameters of RA are informative for the development and degree of persistent pain, however, not earlier than 3 months after RA diagnosis.

Conflict of interest statement

The authors have no conflicts of interest to declare.

Appendix A. Supplemental digital content

Supplemental digital content associated with this article can be found online at http://links.lww.com/PAIN/A873.

Acknowledgements

The authors acknowledge Lena Nise and Helga Westerlind for valuable help with data sets. We thank the EIRA group and the Swedish Rheumatology Quality Register.

This work was funded by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 602919 (“GLORIA”, J. Lötsch, J. Lampa), the Swedish Rheumatism Association (J. Lampa) and by the Landesoffensive zur Entwicklung wissenschaftlich-ökonomischer Exzellenz (LOEWE), LOEWE-Zentrum für Translationale Medizin und Pharmakologie (J. Lötsch). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

[1]. Akaike H. A new look at the statistical model identification. IEEE Trans Aut Control 1974;19:716–23.
[2]. Altawil R, Saevarsdottir S, Wedrén S, Alfredsson L, Klareskog L, Lampa J. Remaining pain in early rheumatoid arthritis patients treated with methotrexate. Arthritis Care Res (Hoboken) 2016;68:1061–8.
[3]. Andersson ML, Svensson B, Bergman S. Chronic widespread pain in patients with rheumatoid arthritis and the relation between pain and disease activity measures over the first 5 years. J Rheumatol 2013;40:1977–85.
[4]. Arnett FC, Edworthy SM, Bloch DA, McShane DJ, Fries JF, Cooper NS, Healey LA, Kaplan SR, Liang MH, Luthra HS, Medsger TA Jr, Mitchell DM, Neustadt DH, Pinals RS, Schaller JG, Sharp JT, Wilder RL, Hunder GG. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum 1988;31:315–24.
[5]. Ashlock D. Evolutionary computation for modeling and optimization. New York, Springer Verlag: Springer Science & Business Media, 2006.
[6]. Bayes M, Price M. An essay towards solving a problem in the doctrine of chances. By the Late Rev. Mr. Bayes, F. R. S. Communicated by Mr. Price, in a letter to John Canton, A. M. F. R. S. Philos Trans 1763;53:370–418.
[7]. Bengtsson C, Berglund A, Serra ML, Nise L, Nordmark B, Klareskog L, Alfredsson L, Eira Study G. Non-participation in EIRA: a population-based case-control study of rheumatoid arthritis. Scand J Rheumatol 2010;39:344–6.
[8]. Bergmeir C, Benitez JM. Neural networks in R using the Stuttgart Neural Network Simulator: RSNNS. J Stat Softw 2012;46:1–26.
[9]. Borigini MJ, Paulus HE. Innovative treatment approaches for rheumatoid arthritis. Combination therapy. Baillieres Clin Rheumatol 1995;9:689–710.
[10]. Boyden SD, Hossain IN, Wohlfahrt A, Lee YC. Non-inflammatory causes of pain in patients with rheumatoid arthritis. Curr Rheumatol Rep 2016;18:30.
[11]. Breiman L. Random forests. Mach Learn 2001;45:5–32.
[12]. Breimann L, Friedman JH, Olshen RA, Stone CJ. Classification and Regression Trees. Boca Raton: Chapman and Hall, 1993.
[13]. Chekroud AM, Gueorguieva R, Krumholz HM, Trivedi MH, Krystal JH, McCarthy G. Reevaluating the efficacy and predictability of antidepressant treatments: a symptom clustering approach. JAMA Psychiatry 2017;74:370–8.
[14]. Chekroud AM, Zotti RJ, Shehzad Z, Gueorguieva R, Johnson MK, Trivedi MH, Cannon TD, Krystal JH, Corlett PR. Cross-trial prediction of treatment outcome in depression: a machine learning approach. Lancet Psychiatry 2016;3:243–50.
[15]. Chollet F, Allaire JJ. Deep Learning with R. Shelter Island. New York: Manning Publications Co, 2018.
[16]. Ciucci S, Ge Y, Durán C, Palladini A, Jiménez-Jiménez V, Martínez-Sánchez LM, Wang Y, Sales S, Shevchenko A, Poser SW, Herbig M, Otto O, Androutsellis-Theotokis A, Guck J, Gerl MJ, Cannistraci CV. Enlightening discriminative network functional modules behind Principal Component Analysis separation in differential-omic science studies. Sci Rep 2017;7:43946.
[17]. Cortes C, Vapnik V. Support-vector networks. Machine Learn 1995;20:273–97.
[18]. Cover T, Hart P. Nearest neighbor pattern classification. IEEE Trans Inf Theor 1967;13:21–7.
[19]. Dableh LJ, Yashpal K, Henry JL. Neuropathic pain as a process: reversal of chronification in an animal model. J Pain Res 2011;4:315–23.
[20]. Defining the scientific method. Nat Methods 2009;6:237.
[21]. Dhar V. Data science and prediction. Commun ACM 2013;56:64–73.
[22]. Eriksson JK, Askling J, Arkema EV. The Swedish Rheumatology Quality Register: optimisation of rheumatic disease assessments using register-enriched data. Clin Exp Rheumatol 2014;32(5 suppl 85):S-147–9.
[23]. Fransen J, Creemers MC, Van Riel PL. Remission in rheumatoid arthritis: agreement of the disease activity score (DAS28) with the ARA preliminary remission criteria. Rheumatology (Oxford) 2004;43:1252–5.
[24]. Fransen J, van Riel PL. The disease activity score and the EULAR response criteria. Clin Exp Rheumatol 2005;23(5 suppl 39):S93–99.
[25]. Freytag A, Scriba B, Kaiser U, Meissner W. Pain registries and similar data collections: a systematic review [in German]. Schmerz 2016;30:568–75.
[26]. Furst DE, Keystone EC, Braun J, Breedveld FC, Burmester GR, De Benedetti F, Dörner T, Emery P, Fleischmann R, Gibofsky A, Kalden JR, Kavanaugh A, Kirkham B, Mease P, Sieper J, Singer NG, Smolen JS, Van Riel PL, Weisman MH, Winthrop K. Updated consensus statement on biological agents for the treatment of rheumatic diseases, 2010. Ann Rheum Dis 2011;70(suppl 1):i2–36.
[27]. Gao C, Sun H, Wang T, Tang M, Bohnen NI, Müller MLTM, Herman T, Giladi N, Kalinin A, Spino C, Dauer W, Hausdorff JM, Dinov ID. Model-based and model-free machine learning techniques for diagnostic prediction and classification of clinical outcomes in Parkinson's disease. Sci Rep 2018;8:7129.
[28]. Gurke R, Etyemez S, Prvulovic D, Thomas D, Fleck SC, Reif A, Geisslinger G, Lötsch J. A data science-based analysis points at distinct patterns of lipid mediator plasma concentrations in patients with dementia. Front Psychiatry 2019;10:41.
[29]. Guttman L. Some necessary conditions for common factor analysis. Psychometrika 1954;19:149–61.
[30]. Gwinnutt JM, Sharp CA, Symmons DPM, Lunt M, Verstappen SMM. Baseline patient reported outcomes are more consistent predictors of long-term functional disability than laboratory, imaging or joint count data in patients with early inflammatory arthritis: a systematic review. Semin Arthritis Rheum 2018;48:384–98.
[31]. Ho TK. Random decision forests. Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 1), 14–16 August 1995. Montreal: IEEE Computer Society, 1995. p. 278.
[32]. Kaiser HF, Dickman K. Analytic determination of common factors. Am Psychol 1959;14:425.
[33]. Kapoor M, Martel-Pelletier J, Lajeunesse D, Pelletier JP, Fahmi H. Role of proinflammatory cytokines in the pathophysiology of osteoarthritis. Nat Rev Rheumatol 2011;7:33–42.
[34]. Kapoor SR, Hider SL, Brownfield A, Mattey DL, Packham JC. Fibromyalgia in patients with rheumatoid arthritis: driven by depression or joint damage? Clin Exp Rheumatol 2011;29(6 suppl 69):S88–91.
[35]. Karatzoglou A, Smola A, Hornik K, Zeileis A. Kernlab—an S4 package for kernel methods in R. J Stat Softw 2004;11:1–20.
[36]. Kovacs WJ, Olsen NJ. Sexual dimorphism of RA manifestations: genes, hormones and behavior. Nat Rev Rheumatol 2011;7:307–10.
[37]. Kristensen LE, Bliddal H, Christensen R, Karlsson JA, Gulfe A, Saxne T, Geborek P. Is swollen to tender joint count ratio a new and useful clinical marker for biologic drug response in rheumatoid arthritis? Results from a Swedish cohort. Arthritis Care Res (Hoboken) 2014;66:173–9.
[38]. Kruskal WH, Wallis WA. Use of ranks in one-criterion variance anaylsis. J Am Stat Assoc 1952;47:583–621.
[39]. Lerch F. Bestimmung von gaussmixturmodellen durch einen evolutionären algorithmus und den chi quadrat test. Faculty of mathematics and computer science. Germany: Philipps University Marburg, 2016:101.
[40]. Liaw A, Wiener M. Classification and regression by randomForest. R News 2002;2:18–22.
[41]. Loh WY. Fifty years of classification and regression trees. Int Stat Rev 2014;82:329–48.
[42]. Lötsch J, Dimova V, Lieb I, Zimmermann M, Oertel BG, Ultsch A. Multimodal distribution of human cold pain thresholds. PLoS One 2015;10:e0125822.
[43]. Lötsch J, Geisslinger G, Heinemann S, Lerch F, Oertel BG, Ultsch A. QST response patterns to capsaicin- and UV-B-induced local skin hypersensitization in healthy subjects: a machine-learned analysis. PAIN 2018;159:11–24.
[44]. Lötsch J, Schiffmann S, Schmitz K, Brunkhorst R, Lerch F, Ferreiros N, Wicker S, Tegeder I, Geisslinger G, Ultsch A. Machine-learning based lipid mediator serum concentration patterns allow identification of multiple sclerosis patients with high accuracy. Sci Rep 2018;8:14884.
[45]. Lotsch J, Ultsch A. Machine learning in pain research. PAIN 2017;159:623–30.
[46]. Lötsch J, Ultsch A. A machine-learned knowledge discovery method for associating complex phenotypes with complex genotypes. Application to pain. J Biomed Inform 2013;46:921–8.
[47]. Lötsch J, Ultsch A. Random forests followed by computed ABC analysis as a feature selection method for machine-learning in biomedical data. In: Imaizumi T, editor. Proceedings of the Conference of the International Federation of Classification Societies IFCS-2017 Tokyo, Japan, August 8–10 2017. New York: Springer, 2017, p.170.
[48]. Mann HB, Whitney DR. On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 1947;18:50–60.
[49]. McWilliams DF, Walsh DA. Pain mechanisms in rheumatoid arthritis. Clin Exp Rheumatol 2017;35(suppl 107):94–101.
[50]. Merskey H, Bogduk N; Taxonomy IAftSoPTFo. Classification of chronic pain: descriptions of chronic pain syndromes and definitions of pain terms. Washington: IASP Press, 1994.
[51]. Merskey H, Bogduk NE. Classification of chronic pain. Seattle: IASP Press, 1994.
[52]. Michelsen B, Kristianslund EK, Hammer HB, Fagerli KM, Lie E, Wierød A, Kalstad S, Rødevand E, Krøll F, Haugeberg G, Kvien TK. Discordance between tender and swollen joint count as well as patient's and evaluator's global assessment may reduce likelihood of remission in patients with rheumatoid arthritis and psoriatic arthritis: data from the prospective multicentre NOR-DMARD study. Ann Rheum Dis 2017;76:708–11.
[53]. Murphy KP. Machine Learning: A Probabilistic Perspective. Cambridge: The MIT Press, 2012.
[54]. Olofsson T, Söderling JK, Gülfe A, Kristensen LE, Wallman JK. Patient-reported outcomes are more important than objective inflammatory markers for sick leave in biologics-treated patients with rheumatoid arthritis. Arthritis Care Res (Hoboken) 2018;70:1712–16.
[55]. Pearson K. LIII. On lines and planes of closest fit to systems of points in space. Lond Edinb Dublin Philosophical Mag J Sci 1901;2:559–72.
[56]. Pergolizzi JV, Gharibo C, Ho KY. Treatment considerations for cancer pain: A Global perspective. Pain Pract 2015;15:778–92.
[57]. President's Information Technology Advisory C. Report to the President: Computational Science. Ensuring America's Competitiveness. Arlington:National Coordination Office for Information Technology Research and Development, 2005.
[58]. Quinlan JR. Induction of decision trees. Machine Learn 1986;1:81–106.
[59]. R Development Core Team. R. A Language and Environment for Statistical Computing, 2008.
[60]. Reid MC, Eccleston C, Pillemer K. Management of chronic pain in older adults. BMJ 2015;350:h532.
[61]. Rosenblatt F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 1958;65:386–408.
[62]. Saeys Y, Inza I, Larrañaga P. A review of feature selection techniques in bioinformatics. Bioinformatics 2007;23:2507–17.
[63]. Sanderson T, Morris M, Calnan M, Richards P, Hewlett S. What outcomes from pharmacologic treatments are important to people with rheumatoid arthritis? Creating the basis of a patient core set. Arthritis Care Res (Hoboken) 2010;62:640–6.
[64]. Schaible HG. Mechanisms of chronic pain in osteoarthritis. Curr Rheumatol Rep 2012;14:549–56.
[65]. Schaible HG, von Banchet GS, Boettger MK, Bräuer R, Gajda M, Richter F, Hensellek S, Brenn D, Natura G. The role of proinflammatory cytokines in the generation and maintenance of joint pain. Ann N Y Acad Sci 2010;1193:60–9.
[66]. Schapire RE, Freund Y. A short introduction to boosting. J Jpn Soc Artif Intell 1999;14:771–80.
[67]. Schwarz G. Estimating the dimension of a model. Ann Stat 1978;6:461–4.
[68]. Smirnov N. Table for estimating the goodness of fit of empirical distributions. Ann Math Stat 1948;19:279–81.
[69]. Smolen JS, Aletaha D, Barton A, Burmester GR, Emery P, Firestein GS, Kavanaugh A, McInnes IB, Solomon DH, Strand V, Yamamoto K. Rheumatoid arthritis. Nat Rev Dis Primers 2018;4:18001.
[70]. Spearman C. The proof and measurement of association between two things. Am J Psychol 1904;15:72–101.
[71]. Stolt P, Bengtsson C, Nordmark B, Lindblad S, Lundberg I, Klareskog L, Alfredsson L. Quantification of the influence of cigarette smoking on rheumatoid arthritis: results from a population based case-control study, using incident cases. Ann Rheum Dis 2003;62:835–41.
[72]. Strand V, Wright GC, Bergman MJ, Tambiah J, Taylor PC. Patient expectations and perceptions of goal-setting strategies for disease management in rheumatoid arthritis. J Rheumatol 2015;42:2046–54.
[73]. Student. The probable error of a mean. Biometrika 1908;6;1–25.
[74]. Swets JA. The Relative Operating Characteristic in Psychology: a technique for isolating effects of response bias finds wide use in the study of perception and cognition. Science 1973;182:990–1000.
[75]. Taylor PC, Moore A, Vasilescu R, Alvir J, Tarallo M. A structured literature review of the burden of illness and unmet needs in patients with rheumatoid arthritis: a current perspective. Rheumatol Int 2016;36:685–95.
[76]. Tengstrand B, Ahlmén M, Hafström I. The influence of sex on rheumatoid arthritis: a prospective study of onset and outcome after 2 years. J Rheumatol 2004;31:214–22.
[77]. Treede RD, Rief W, Barke A, Aziz Q, Bennett MI, Benoliel R, Cohen M, Evers S, Finnerup NB, First MB, Giamberardino MA, Kaasa S, Kosek E, Lavandʼhomme P, Nicholas M, Perrot S, Scholz J, Schug S, Smith BH, Svensson P, Vlaeyen JWS, Wang SJ. A classification of chronic pain for ICD-11. PAIN 2015;156:1003–7.
[78]. Tubach F, Ravaud P, Martin-Mola E, Awada H, Bellamy N, Bombardier C, Felson DT, Hajjaj-Hassouni N, Hochberg M, Logeart I, Matucci-Cerinic M, van de Laar M, van der Heijde D, Dougados M. Minimum clinically important improvement and patient acceptable symptom state in pain and function in rheumatoid arthritis, ankylosing spondylitis, chronic back pain, hand osteoarthritis, and hip and knee osteoarthritis: results from a prospective multinational study. Arthritis Care Res (Hoboken) 2012;64:1699–707.
[79]. Uhlig T, Kvien TK, Glennas A, Smedstad LM, Forre O. The incidence and severity of rheumatoid arthritis, results from a county register in Oslo, Norway. J Rheumatol 1998;25:1078–84.
[80]. Ultsch A. Pareto density estimation: a density estimation for knowledge discovery. In: Baier D, Werrnecke KD, editors. Proceedings of the Innovations in Classification, Data Science, and Information Systems—Proceedings 27th Annual Conference of the German Classification Society (GfKL). Heidelberg: Springer-Verlag, 2003.
[81]. Ultsch A, Lötsch J. Functional abstraction as a method to discover knowledge in gene ontologies. PLoS One 2014;9:e90191.
[82]. Ultsch A, Lötsch J. Computed ABC analysis for rational selection of most informative variables in multivariate data. PLoS One 2015;10:e0129767.
[83]. Ultsch A, Thrun MC, Hansen-Goos O, Lötsch J. Identification of molecular fingerprints in human heat pain thresholds by use of an interactive mixture model R toolbox (AdaptGauss). Int J Mol Sci 2015;16:25897–911.
[84]. van Vollenhoven RF, Ernestam S, Geborek P, Petersson IF, Cöster L, Waltbrand E, Zickert A, Theander J, Thörner A, Hellström H, Teleman A, Dackhammar C, Akre F, Forslind K, Ljung L, Oding R, Chatzidionysiou A, Wörnert M, Bratt J. Addition of infliximab compared with addition of sulfasalazine and hydroxychloroquine to methotrexate in patients with early rheumatoid arthritis (Swefot trial): 1-year results of a randomised trial. Lancet 2009;374:459–66.
[85]. Weihs C, Ligges U, Luebke K, Raabe N. klaR Analyzing German Business Cycles. Data Analysis and Decision Support. Berlin: Springer-Verlag, 2005. pp. 335–43.
[86]. Wilcoxon F. Individual comparisons by ranking methods. Biometrics 1945;1:80–3.
[87]. Wild A. Best practice in inventory management. New York: Wiley, 1997.
[88]. Wolfe F, Michaud K. Assessment of pain in rheumatoid arthritis: minimal clinically significant difference, predictors, and the effect of anti-tumor necrosis factor therapy. J Rheumatol 2007;34:1674–83.
[89]. Yunus MB. The prevalence of fibromyalgia in other chronic pain conditions. Pain Res Treat 2012;2012:584573.
Keywords:

Data science; Machine-learning; Registry study; Persistent pain; Rheumatoid arthritis

Supplemental Digital Content

© 2019 International Association for the Study of Pain