Comparison of the psychometric properties of 3 pain scales used in the pediatric emergency department: Visual Analogue Scale, Faces Pain Scale-Revised, and Colour Analogue Scale

Le May, Sylvie a,b,*; Ballard, Ariane a,b; Khadra, Christelle a,b; Gouin, Serge c; Plint, Amy C. d; Villeneuve, Edith e; Mâsse, Benoit b; Tsze, Daniel S. f; Neto, Gina g; Drendel, Amy L. h; Auclair, Marie-Christine b; McGrath, Patrick J. i; Ali, Samina j,k

doi: 10.1097/j.pain.0000000000001236
Research Paper

Appropriate pain measurement relies on the use of valid, reliable tools. The aim of this study was to determine and compare the psychometric properties of 3 self-reported pain scales commonly used in the pediatric emergency department (ED). The inclusion criteria were children aged 6 to 17 years presenting to the ED with a musculoskeletal injury and self-reported pain scores ≥30 mm on the mechanical Visual Analogue Scale (VAS). Self-reported pain intensity was assessed using the mechanical VAS, Faces Pain Scale-Revised (FPS-R), and Colour Analogue Scale (CAS). Convergent validity was assessed by Pearson correlations and the Bland–Altman method; responsiveness to change was assessed using paired sample t tests and standardized mean responses; and reliability was estimated using relative and absolute indices. A total of 456 participants were included, with a mean age of 11.9 years ± 2.7 and a majority were boys (252/456, 55.3%). Correlations between each pair of scales were 0.78 (VAS/FPS-R), 0.92 (VAS/CAS), and 0.79 (CAS/FPS-R). Limits of agreement (95% confidence interval) were −3.77 to 2.33 (VAS/FPS-R), −1.74 to 1.75 (VAS/CAS), and −2.21 to 3.62 (CAS/FPS-R). Responsiveness to change was demonstrated by significant differences in mean pain scores among the scales (P < 0.0001). Intraclass correlation coefficient and coefficient of repeatability estimates suggested acceptable reliability for the 3 scales at, respectively, 0.79 and ±2.29 (VAS), 0.82 and ±2.07 (CAS), and 0.76 and ±2.82 (FPS-R). The scales demonstrated good psychometric properties for children with acute pain in the ED. The VAS and CAS showed a strong convergent validity, whereas FPS-R was not in agreement with the other scales.

Mechanical Visual Analogue Scale, Faces Pain Scale-Revised, and Colour Analogue Scale have strong responsiveness and reliability. A strong agreement was found between Visual Analogue Scale and Colour Analogue Scale.

a Faculty of Nursing, University of Montreal, Montreal, QC, Canada

b CHU Sainte-Justine Research Centre, Montreal, QC, Canada

c Division of Emergency Medicine, Department of Pediatrics, CHU Sainte-Justine, Montreal, QC, Canada

d Department of Pediatrics and Emergency Medicine, University of Ottawa, Ottawa, ON, Canada

e Department of Anesthesia, CHU Sainte-Justine, Montreal, QC, Canada

f Department of Pediatrics, Columbia University College of Physicians and Surgeons, New York, NY, United States

g Emergency Department, Children's Hospital of Eastern Ontario, Ottawa, ON, Canada

h Department of Pediatrics, Section of Emergency Medicine, Medical College of Wisconsin, Milwaukee, WI, United States

i IWK Health Centre, Nova Scotia Health Authority and Dalhousie University, Halifax, NS, Canada

j Women and Children's Health Research Institute, Edmonton, AB, Canada

k Department of Pediatrics, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada

Corresponding author. Address: Faculty of Nursing, University of Montreal, P.O. Box 6128, Succursale Centre-Ville, Montreal, QC H3C 3J7, Canada. Tel.: 514-566-8892. E-mail address: (S. Le May).

Sponsorships or competing interests that may be relevant to content are disclosed at the end of this article.

Received September 22, 2017

Received in revised form March 23, 2018

Accepted March 28, 2018

1. Introduction

Accurate assessment and management of pain represent a major challenge to health care professionals, especially for those working in pediatric emergency departments (EDs).7,39 Assessment of pain in children is particularly complex considering both subjectivity of the pain experience and the limits and variability of children's cognitive and social development.20,50,58 Accurate pain measurement in children depends on the effective utilization of valid, reliable, and developmentally appropriate and age-appropriate scales.20,50,55 Considering that children develop the capacity to self-report pain at approximately 4 years of age, self-report should always be used as the primary source of information.54,55,58 Self-report may be particularly useful because it allows children to identify pain location and quantify its intensity, which are important factors to determine the best therapeutic option.58

The Visual Analogue Scale (VAS), Colour Analogue Scale (CAS), and Faces Pain Scale-Revised (FPS-R) are well-established tools for the self-report assessment of children's acute pain.18,41,51 The VAS has been validated and is recommended for use in children aged 8 to 17 years old,41 but another study has also used it in children aged 6 to 8 years old.47 As for the FPS-R, this scale is the recommended scale for use in children aged 4 to 12 years old.21,52 However, it is also considered valid to be used in older children.27,54 The CAS has been considered as a valid and reliable scale for self-report of acute pain in children aged 5 years and older.14

Although these self-reported measures of pain intensity have been widely studied, there is no consensus regarding their agreement or whether they can be used interchangeably. However, it is particularly important to rigorously evaluate the psychometric properties of the scales in the specific setting in which they are intended to be used.29,39 Indeed, no large study has compared, rated, and validated these 3 scales, altogether, in the ED setting. This study will add to the corpus of knowledge on psychometric properties of 3 common self-reported pain scales used with children.

The aim of our study was to determine and compare key psychometric properties of 3 self-reported pain scales commonly used in pediatric EDs. Specifically, we assessed the convergent validity/agreement, the responsiveness, and the reliability of the mechanical VAS, CAS, and FPS-R in children reporting acute pain related to a musculoskeletal injury.

2. Methods

2.1. Study design

We performed a secondary analysis34 on responses to pain scales used with the participants of a Randomized Controlled Trial of Oral analgesics Utilization for CHildren with Musculoskeletal injury (OUCH trial).33 The OUCH trial aimed to assess the efficacy of a combination of morphine and ibuprofen, as compared to either morphine or ibuprofen alone, for the management of pain in children presenting to the ED with a musculoskeletal injury.33 This trial was conducted at 3 university-affiliated pediatric EDs in Canada: CHU Sainte-Justine (Montreal, QC), Stollery Children's Hospital (Edmonton, AB), and Children's Hospital of Eastern Ontario (Ottawa, ON). The study was approved by the Research Ethics Board (REB) of each site where the trial was conducted, in accordance with the Declaration of the World Medical Association. One or both parents of each participant provided written informed consent, and participants provided verbal or written assent.

2.2. Participants

A consecutive sampling was used for recruitment for the OUCH trial. Children between the ages of 6 and 17 years presenting to the ED with a musculoskeletal injury (suspected fracture or severe sprain) to an upper or a lower limb, with a self-reported pain score greater than 29 mm, on a 0- to 100-mm VAS, were eligible to participate. This inclusion criterion was established to recruit children with moderate to severe pain to justify the administration of morphine or a combination of analgesics, as the cutoff between mild and moderate pain is 30 mm on the VAS.31 Information regarding the VAS is detailed in the section on Instruments. All participants spoke either French or English. We excluded children with (1) a known allergy to morphine, ibuprofen, or artificial colouring; (2) a suspicion of child abuse (as determined by the triage nurse); (3) an inability to self-report pain; (4) chronic pain requiring daily analgesics; (5) a nonsteroidal anti-inflammatory drug or opioid use within 3 hours before triage presentation; (6) an injury to more than 1 limb; (7) known hepatic or renal disease/dysfunction; (8) known bleeding disorder; (9) a neurocognitive disability precluding assent and participation in the study; and (10) known history of sleep apnea or loud snoring in the past 5 days.

2.3. OUCH trial

Eligible patients were enrolled in the study after initial assessment by the ED triage nurse. Recruited participants were randomly assigned to receive 1 of 3 treatments: (1) a combination of oral morphine (0.2 mg/kg [maximum 15 mg]) and oral ibuprofen (10 mg/kg [maximum 600 mg]), (2) a combination of oral morphine (0.2 mg/kg [maximum 15 mg]) and a placebo of oral ibuprofen, or (3) a combination of oral ibuprofen (10 mg/kg [maximum 600 mg]) and a placebo of oral morphine. An unbalanced allocation ratio of 2:2:1 was used to increase the statistical power of the 2 groups including oral morphine to assess the safety profile of this opioid. In addition, it was hypothesized, based on a previous study62 and the expert opinion of a pediatric anesthesiologist from our team (E.V.), that the difference would be smaller between the 2 morphine groups because morphine is a stronger analgesic than ibuprofen. Randomization was stratified by center and by pain intensity (moderate: 30-69 mm; severe: ≥70 mm). As this was a double-blinded study, children and their parents, research nurses, and physicians were unaware of the treatment received. The study blinding could be broken to administer rescue analgesia if the child's safety was compromised. Concerning the results of the OUCH trial, there was no statistically significant difference between the 3 groups at 60 minutes after the medication administration (primary outcome). There was also no statistically significant difference when compared by pain intensity groups. Therefore, considering the equivalent results in the 3 study groups, the results of the OUCH trial did not influence the results of the current study concerning the validity and reliability of the VAS, CAS, and FPS-R.

2.4. Instruments

The VAS, CAS, and FPS-R are among the most commonly used measures of pain intensity in the pediatric population, in both research and clinical settings.22,57 These 3 scales are considered validated measures for the assessment of pain in children according to the assessment criteria developed by Cohen et al.18 The criteria of Cohen et al.18 for classifying an assessment as well established are the following: “(1) The measure must have been presented in at least 2 peer-reviewed articles by different investigators or investigatory teams; (2) Sufficient details about the measure to allow for critical evaluation and replication; and (3) Detailed information indicating good validity and reliability in at least 1 peer-reviewed article” (p. 913).

The VAS is a continuous scale for self-report of pain intensity in children aged 6 years and older.38,40,48 The version of the VAS used in this study is the mechanical VAS, which is a plastic ruler with a mechanical slider consisting of a 100-mm horizontal line with the anchors labelled “no pain” (0 mm) and “very much pain” (100 mm).47 Children were asked to move the slider along the line to the point indicating their pain intensity. The plastic ruler had a numerical rating on the back allowing the research nurse to quickly record the number representing the child's pain. It has been successfully used and validated in many pediatric ED pain studies.4,12,17,23,35

The CAS was developed by McGrath et al.40 and consists of a 100-mm long plastic scale with a wedge-shaped figure gradually progressing from white (lower anchor) to red (upper anchor), end anchored by the descriptor “no pain” and “worst pain” with a mechanical slider to indicate pain intensity.15,47 It has a numerical rating on the back side numbered from 0 to 10 cm (with increments of 0.25 cm). This scale was developed specifically for measuring children's pain ratings in clinical settings. Children were asked to move the slider to the intensity of the red colour that best described their current pain.47 The CAS is considered a valid and reliable self-report scale for acute pain assessment of children aged 5 years and older presenting to the ED.14,15,39,40

The FPS-R is a self-report ordinal scale validated to measure pain in children aged between 4 and 17 years.27 It was initially adapted from the FPS developed by Bieri et al.6 to derive a common metric from 0 to 10 to enhance compatibility scoring with other pediatric pain measures.27,52 This scale consists of 6 faces, from left to right, each showing a greater level of pain than the previous one, with scores varying from 0, 2, 4, 6, 8, and 10. On the back side of the scale, the instructions state that the face on the left most position is associated with the score 0 and shows “no pain,” and the face on the right most position is associated with the score 10 and shows “very much pain.”27,29 Each child had to indicate which face best represented the pain they had at the time of assessment. Validity and reliability of the FPS-R has been extensively explored and established in children aged 4 to 17 years.16,25,42,52

2.5. Data collection procedure

Pain intensity was first assessed through self-report using the VAS, second the FPS-R, and third the CAS, at specific study times, including recruitment after triage (time of recruitment [T-R]), before medication administration (T-0), and at 60 minutes after medication administration (T-60), as this latter time reflected the peak plasma level and action of both orally administered medications.46,61 The 3 scales were always consecutively administered in the same order: VAS, FPS-R, and CAS. The VAS and CAS were not administered consecutively, given the close resemblance of these 2 mechanical scales. Instead, the child's answer was meant to be mitigated by the FPS-R intervening.

2.6. Sample size

The required sample size was not recalculated specifically for the current article because it was a secondary analysis of the results of the OUCH trial.33 Details on the sample size calculation of the OUCH trial are provided in the published article about this trial.33 However, a power analysis was run before each inferential statistic performed. All power analyses results were equal to 1 indicating enough power to run the tests.

2.7. Psychometric data analyses

To homogenize data from the VAS and FPS-R with the 0 to 10 continuous CAS, data from the VAS were converted into continuous values from 0 to 10, and data from the FPS-R were treated as continuous instead of ordinal. Convergent validity, a subtype of construct validity referring to the capacity of different scales to measure the same construct and produce the same results,53 was assessed by determining Pearson correlations and level of agreement using the Bland and Altman method.9 Pearson correlations were calculated and, after stratification by age group (6-11; 12-17) and by pain intensity (moderate [30-69 mm]; severe [≥70 mm]), compared using the Fisher r-to-Z transformation for independent samples to determine whether agreement between scales would differ depending on these 2 factors. The age groups were chosen as children aged 6 to 11 years old represent “middle childhood” and children aged 12 to 17 years old represent “adolescents.”63 Groups by pain intensity were categorized as moderate pain intensity (30-69 mm on the VAS) and severe pain intensity (≥70 mm on the VAS), as defined by Kelly.31 Regarding the interscale agreement, results were first converted into numerical values from 0 to 10, and then, the classic Bland and Altman method9 was performed. Bland and Altman's9 95% confidence interval (CI) of the mean difference between paired scores needs to be within the a priori maximum limit of agreement to conclude on agreement and interchangeability of 2 scales. Based on previous studies, we predetermined the maximum limit of agreement at ±2/10 points (±20 mm) when using the VAS, which is equivalent to the coefficient of repeatability in adults.2,19,55 We considered this limit more liberal and permissive than the minimum clinically significant difference in children, which is ±10 mm45 on the 100-mm VAS (or ±1/10).
A recent debate in the literature suggested that the Bland and Altman's 95% CI criterion would be too strict and unrealistic, given the subjectivity about children's capacity to self-report their pain.59 To mitigate this problem, von Baeyer59 proposed that an 80% CI criterion, instead of a 95% CI, would be more adequate in this context. Considering this, we calculated and reported results on scales' agreement based on both 95% CI and 80% CI criteria. Proportion of children exceeding the predetermined limits of agreement (±2/10) was also calculated and compared for each age group (6-11; 12-17 years) and pain severity (30-69 mm; ≥70 mm).
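The agreement analysis described above can be sketched in a few lines. The following is a minimal illustration, not the authors' SAS code: it assumes that the limits of agreement are computed as the mean paired difference ± z × SD of the differences, with z = 1.96 for the 95% criterion and z ≈ 1.28 for the 80% criterion. The function names and the example scores are hypothetical.

```python
from statistics import mean, stdev

def limits_of_agreement(scores_a, scores_b, z=1.96):
    """Return (bias, lower, upper) for paired scores on a common 0-10 metric."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)          # mean paired difference
    sd = stdev(diffs)           # sample SD of the paired differences
    return bias, bias - z * sd, bias + z * sd

def within_max_limit(lower, upper, max_limit=2.0):
    """True when both limits fall inside the a priori +/- max_limit."""
    return -max_limit <= lower and upper <= max_limit

def proportion_exceeding(scores_a, scores_b, max_limit=2.0):
    """Share of children whose paired difference exceeds +/- max_limit."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return sum(abs(d) > max_limit for d in diffs) / len(diffs)

# Hypothetical scores for 6 children (not study data).
vas_mm = [62, 45, 80, 33, 71, 55]           # VAS recorded in mm
cas = [6.0, 4.75, 8.25, 3.5, 6.75, 5.5]     # CAS recorded on 0-10
vas = [v / 10 for v in vas_mm]              # convert VAS to the 0-10 metric

bias95, lo95, hi95 = limits_of_agreement(vas, cas, z=1.96)    # 95% criterion
bias80, lo80, hi80 = limits_of_agreement(vas, cas, z=1.2816)  # 80% criterion (assumed z)
print(within_max_limit(lo95, hi95), proportion_exceeding(vas, cas))
```

Because the 80% limits are always narrower than the 95% limits for the same data, a pair of scales that meets the 95% criterion also meets the 80% criterion, whereas the reverse does not hold.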

Internal responsiveness to change in pain scores was used to determine the ability of an instrument to detect a change over a prespecified time frame, if change has occurred.30 Based on the assumption that the administration of analgesics decreases the intensity of pain over time and therefore the pain score, we compared the preanalgesia mean VAS, FPS-R, and CAS scores with their respective postanalgesia mean scores (60 minutes after medication administration) by performing the paired sample t tests. In addition to mean values and SDs, results on the VAS, FPS-R, and CAS were also reported using medians and interquartile ranges (IQRs). Internal responsiveness was also evaluated using the standardized response mean (SRM), which is a type of effect size estimating change in the measure while accounting for patient variability in change scores.30 The SRM results are interpreted as follows: 0.20: low responsiveness; 0.50: moderate responsiveness; and >0.80: high responsiveness.5,24,37
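As a rough sketch of the SRM described above, the statistic can be computed as the mean of the individual change scores divided by the SD of those change scores. The code below is illustrative only: the function names and example scores are hypothetical, the absolute value is used for interpretation because pain relief yields negative changes, and assigning values that fall between the published thresholds to the lower category is an assumption.

```python
from statistics import mean, stdev

def standardized_response_mean(pre, post):
    """SRM = mean(change) / SD(change), with change = post - pre per child."""
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(changes)

def interpret_srm(srm):
    """Map |SRM| onto the thresholds quoted in the text (0.20, 0.50, >0.80)."""
    magnitude = abs(srm)
    if magnitude > 0.80:
        return "high"
    if magnitude >= 0.50:
        return "moderate"   # assumption: 0.50-0.80 is read as moderate
    if magnitude >= 0.20:
        return "low"        # assumption: 0.20-0.49 is read as low
    return "below the low threshold"

# Hypothetical pre- and postanalgesia scores on a 0-10 metric (not study data).
pre = [6, 7, 5, 8]
post = [4, 6, 4, 5]
srm = standardized_response_mean(pre, post)
print(round(srm, 2), interpret_srm(srm))
```

Dividing by the SD of the change scores, rather than by the baseline SD, is what lets the SRM account for between-patient variability in response.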

Repeatability and stability of measurements on the same subject over time and under similar conditions were assessed through the test–retest reliability. As such, the 3 scales were used to evaluate children's pain scores at 2 different study times where their pain was assumed to be similar (T-R and T-0). Estimation of the test–retest reliability was performed using both relative and absolute indices, as recommended by best practice guidelines on health scale measurement.1,11,36,56 Relative reliability was estimated using the Pearson correlations and the intraclass correlation coefficient (ICC) for a 2-way effect model of the scales' mean pain scores at T-R and T-0.56 Results of these analyses yielded the direction and strength of relationships between test–retest results. Confidence intervals for ICC were also reported. The absolute reliability, which refers to the variability accounting for both random and systematic errors,28,56 was evaluated by calculating the coefficient of repeatability (CR). The CR8,9 refers to the value below which the absolute differences between 2 measurements would lie with a probability of 0.95 and was determined using the Bland and Altman method9 for assessing agreement for repeated measures.10 It was calculated by multiplying the within-subject SD (Sw) by 2.77, according to the following formula: (CR = 2.77 Sw). The Sw was obtained by performing a 1-way analysis of variance.10,32 The CR can be interpreted as follows: a change in pain scores of at least [±CR value] is needed to conclude that a significant change in pain scores occurred and that the intervention administered was beneficial.56
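The CR calculation can be illustrated with a short sketch. It assumes 2 measurements per child (T-R and T-0), in which case the within-subject mean square of a 1-way analysis of variance with child as the factor reduces to the sum of squared paired differences divided by 2n. The multiplier 2.77 corresponds to 1.96 × √2. Function names and example scores are hypothetical.

```python
from math import sqrt

def within_subject_sd(test, retest):
    """Sw for 2 measurements per child.

    For each child, the squared deviations of the 2 scores from the child's
    own mean sum to d^2 / 2, where d is the paired difference. The 1-way
    ANOVA within-subject mean square divides that total by n * (k - 1) = n.
    """
    n = len(test)
    ss_within = sum((a - b) ** 2 for a, b in zip(test, retest)) / 2
    return sqrt(ss_within / n)

def coefficient_of_repeatability(test, retest):
    """CR = 2.77 * Sw, as given in the text."""
    return 2.77 * within_subject_sd(test, retest)

# Hypothetical test-retest scores on a 0-10 metric (not study data).
t_r = [6, 5, 7, 4]
t_0 = [5, 5, 6, 4]
sw = within_subject_sd(t_r, t_0)
cr = coefficient_of_repeatability(t_r, t_0)
print(sw, cr)
```

A change between 2 assessments smaller than the CR cannot be distinguished from measurement error under this model.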

Finally, anchor biases were calculated for the 3 scales using descriptive statistics following 3 patterns: low-end bias (all between 0 and 2), high-end bias (all between 8 and 10), and low/high bias (all 0s and 10s). The anchor biases were computed at T-0, given that this study time occurred before the medication administration and therefore was not influenced by the treatment group.

All statistical analyses were performed with SAS software version 9.3. A P value <0.05 was considered statistically significant.

3. Results

3.1. Characteristics of participants

Data from 456 participants were available for analyses. Mean age of participants was 11.9 years (SD = 2.7) and a majority were boys (252/456, 55.3%). A mean baseline pain score of 60.9 mm (SD = 16.2) was reported using the VAS, which corresponds to a moderate pain intensity. Regarding the type of injury, 277 children (60.7%) presented a soft-tissue injury and 175 (38.4%) had a fracture. A total of 91 (20.0%) children received ibuprofen, 188 (41.2%) received morphine, and 177 (38.8%) were administered a combination of morphine and ibuprofen. Characteristics of participants are detailed in Table 1.

Table 1

3.2. Convergent validity

Pearson correlations between the VAS, FPS-R, and CAS pain scores at 60 minutes after medication administration showed positive and strong correlations, particularly between the VAS and CAS (VAS/CAS: r = 0.92; VAS/FPS-R: r = 0.78; FPS-R/CAS: r = 0.79). When stratified by age group, magnitude of correlation coefficients was similar for younger children (VAS/FPS-R: r = 0.81; CAS/FPS-R: r = 0.82) and older children (VAS/FPS-R: 0.76; CAS/FPS-R: r = 0.77) between the VAS/FPS-R and the CAS/FPS-R. Correlations between the VAS and the CAS (r = 0.92) were the same for both age groups. When stratified by pain intensity, the magnitude of correlation coefficients was similar in children with severe pain (VAS/FPS-R: r = 0.79; CAS/FPS-R: r = 0.81; VAS/CAS: r = 0.90) and in children with moderate pain (VAS/FPS-R: r = 0.73; CAS/FPS-R: r = 0.73; VAS/CAS: r = 0.92) for the 3 pairwise comparisons. Detailed results are presented in Table 2.
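The comparison of correlation coefficients between independent subgroups can be sketched with the Fisher r-to-Z transformation. The example below is illustrative only: the function name is hypothetical, the 2-sided P value is derived from the standard normal distribution, and the group sizes used in the example (197 younger and 256 older children) are borrowed from the VAS/FPS-R agreement counts reported later in this section, on the assumption that the same children contributed to the correlations.

```python
from math import atanh, erf, sqrt

def compare_independent_correlations(r1, n1, r2, n2):
    """Test whether 2 Pearson correlations from independent samples differ.

    Each r is transformed with Fisher's z = atanh(r); the difference is
    divided by its standard error sqrt(1/(n1 - 3) + 1/(n2 - 3)).
    Returns the z statistic and the 2-sided P value.
    """
    z1, z2 = atanh(r1), atanh(r2)
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - normal_cdf)

# VAS/FPS-R correlations by age group; group sizes are an assumption (see above).
z_stat, p_value = compare_independent_correlations(0.81, 197, 0.76, 256)
print(z_stat, p_value)
```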

Table 2

According to the classic Bland and Altman method,9 only the VAS and the CAS showed acceptable agreement, with more than 95% of scores falling within the a priori maximum limit of agreement set at ±2.0 (limits of agreement [95% CI]: −1.73 to 1.75). The VAS/FPS-R (limits of agreement [95% CI]: −3.77 to 2.33) and the CAS/FPS-R (limits of agreement [95% CI]: −2.21 to 3.62) failed to demonstrate sufficient and acceptable agreement at the 95% level, indicating that these scales were not interchangeable. We obtained similar results when applying the 80% CI criteria proposed by von Baeyer.59 Results of the limits of agreement according to the Bland and Altman method9 based on both 95% CI and 80% CI criteria are presented in Table 3.

Table 3

Figure 1 shows the proportion of children exceeding the limits of agreement by subgroup stratification for age (6-11 vs 12-17) and pain intensity (moderate pain [30-69 mm] vs severe pain [≥70 mm]) for the 3 pairwise comparisons. Overall, a proportion of 2.98% (12/403), 13.47% (54/401), and 18.76% (85/453) of children's pain scores was over the predetermined limits of agreement, for the comparisons between the VAS/CAS, FPS-R/CAS, and VAS/FPS-R, respectively.

Figure 1

As for comparisons between VAS/FPS-R and CAS/FPS-R, there was a greater proportion of children falling outside the limits of agreement in the older group (VAS/FPS-R: 19.9% [51/256]; CAS/FPS-R: 14.9% [34/229]) than that in the younger group (VAS/FPS-R: 17.3% [34/197]; FPS-R/CAS: 11.6% [20/172]). On the other hand, the VAS/CAS showed greater agreement in older children than in younger children with only 1.7% (4/231) of them falling outside the limits of agreement. However, only the VAS/CAS demonstrated sufficient agreement at the 95% CI level, with less than 5% of the children's pain scores falling outside the limits of agreement, thus 4.7% (8/172) and 1.7% (4/231), respectively, in younger and older children.

When stratified by pain intensity, the CAS/FPS-R seemed to be in greater agreement in children with severe pain, with 10.5% (13/124) of the scores falling outside the limits of agreement compared with 14.9% (41/277) in children with moderate pain. Conversely, the VAS/FPS-R was in greater agreement in children with moderate pain (17.4% [54/310]) than in children with severe pain (21.7% [31/143]). The VAS/CAS comparison was the only one to show acceptable agreement at the 95% CI level, and only in children presenting with moderate pain (1.8% [5/279]); in children with severe pain, 5.7% (7/124) of the scores fell outside the limits of agreement (Fig. 1).

3.3. Internal responsiveness

The 3 scales demonstrated a good responsiveness to change. The mean pain scores preanalgesia were 5.73 (SD = 1.81; median: 5.6; IQR: 4.5-7.0) for the VAS; 5.73 (SD = 1.82; median: 6.0; IQR: 4.0-6.0) for the FPS-R; and 5.69 (SD = 1.81; median: 5.8; IQR: 4.5-7.0) for the CAS. After analgesic administration, the mean pain scores were 4.30 (SD = 2.30; median: 4.2; IQR: 2.5-6.0) with the VAS; 4.30 (SD = 2.31; median: 4.0; IQR: 2.0-6.0) with the FPS-R; and 4.27 (SD = 2.27; median: 4.3; IQR: 2.5-6.0) with the CAS (Fig. 2). As hypothesized, the mean differences in pain scores were significantly lower 60 minutes after the administration of the medication for the VAS (mean = −1.43, SD = 1.97, P < 0.0001), the FPS-R (mean = −1.61, SD = 2.00, P < 0.0001), and the CAS (mean = −1.42, SD = 1.73, P < 0.0001), which suggest a high responsiveness of the scales to pain relief (Table 4). Finally, SRMs were 0.72 for the VAS, 0.80 for the FPS-R, and 0.82 for the CAS.

Figure 2

Table 4

3.4. Reliability

Regarding the indices of relative reliability, we found a good agreement between the test and retest for each scale, with ICCs of 0.79 (95% CI 0.75-0.82, r = 0.80), 0.76 (95% CI 0.72-0.80), and 0.82 (95% CI 0.78-0.95, r = 0.82), respectively, for the VAS, FPS-R, and CAS. Despite the assumption that there should be no statistically significant difference in the mean pain scores between T-R and T-0, a statistically significant difference was found between the test and the retest for the VAS (mean difference bias: −0.37 ± 1.11, t = −7.04, P < 0.0001), the FPS-R (mean difference bias: −0.41 ± 1.28; t = −6.35, P < 0.0001), and the CAS (mean difference bias: −0.28 ± 1.02, t = −6.35, P < 0.0001). The estimated CR was ±2.29 (95% CI 2.15-2.45) for the VAS, ±2.82 (95% CI 2.64-3.01) for the FPS-R, and ±2.07 (95% CI 1.93-2.23) for the CAS. Results of relative and absolute reliability indices of the test–retest are presented in Table 5.

Table 5

3.5. Anchor biases

The low/high bias, which refers to the frequency of 0/0 and 10/10 scores,60 occurred only in 1.98% of children for the VAS (n = 9), in 2.64% of children for the FPS-R (n = 12), and in 1.50% of children for the CAS (n = 6). Results for the low-end bias and the high-end bias are presented in Table 6.

Table 6

4. Discussion

To our knowledge, this is the first study to simultaneously compare, for the same population, the psychometric properties of the VAS, FPS-R, and CAS, 3 highly recommended pain scales to measure acute pain in children.41

We were mostly interested in looking at the agreement between self-report scales as this is an important and often controversial issue in the pediatric pain field.47 Although some studies reported a lack of agreement across scales,2,47 others have concluded that the scales are equivalent.44 When we applied the 95% CI criterion of the Bland and Altman method,9 the VAS and CAS showed strong agreement, suggesting that they could be used interchangeably. This result may be explained by the similarities between the mechanical version of the VAS used in the current study and the CAS, which also has a mechanical slider and is a variant of the VAS.40

Based on the same criterion, no agreement was found between the FPS-R and the 2 other scales. This result is consistent with Bailey et al.2 who reported an agreement between the VAS and CAS in children presenting to the ED with acute abdominal pain. However, another study47 reported a lack of agreement not only between the VAS/FPS-R and CAS/FPS-R but also between the VAS/CAS. A possible explanation is that the FPS-R has a coarser metric than the VAS and CAS. This has important implications for the field of pediatric pain research and practice. First, pain scores obtained with the FPS-R in one study should not be compared with scores obtained using the CAS or VAS in another. Second, within a single clinical trial, it might not be appropriate to use the FPS-R to assess pain intensity in 1 age group while using the CAS or VAS in another age group with the objective of pooling and/or comparing the scores for analyses, as it may lead to bias in interpretation. Therefore, results with the FPS-R should be analyzed separately, or pain intensity should be assessed with the same or interchangeable scales in all children.

Although the 80% CI criterion showed similar results to the 95% CI criterion for the agreement between scales, we agree with von Baeyer59 that the latter may be too strict for use in children's self-reported ratings of pain intensity. However, the 80% CI criterion proposed could be too permissive. As stated by Miro et al.,43 we already used a liberal standard in the choice of the predetermined limits of agreement (±2/10), allowing for flexibility in the interpretation of scores. Therefore, adding the 80% CI criterion to these limits of agreement largely increases the possibility of concluding that 2 scales agree when they do not. Moreover, it largely reduces the possibility of highlighting the differences and subtleties between different pain scales. Consequently, we believe that a 90% CI criterion could be more appropriate for self-reported pain scales in children.

The pairwise comparisons involving the FPS-R showed better agreement in younger children than in older ones, aligning with the guidelines recommending its use particularly in younger children.27 It is also the only self-reported pain scale that has been validated in children aged 4 years and older.16,25,42,52 On the other hand, the VAS/CAS presented higher agreement in older children. Other studies corroborate our results but recommended that the VAS be used in children aged 8 years and older,41,58 considering the limited ability of younger children to think in abstract terms and to seriate.20 Comparisons between the VAS/FPS-R and the VAS/CAS demonstrated higher agreement in children presenting with moderate pain, whereas the CAS/FPS-R were more in agreement in children with severe pain. This latter result is consistent with Tsze et al.55 who also reported better agreement when pain intensity was severe between CAS/FPS-R in children aged between 4 and 17 years old.

Standardized response mean was added to the paired sample t tests to assess responsiveness of the 3 scales.30 The SRM is one of the most recommended tests because it takes into consideration the between-subject variability of the individual score change over time.26,30 The 3 scales showed good responsiveness to change on both tests. For the paired sample t test, the t value for each scale was greater than 1.96, indicating that a statistically significant change in pain scores occurred over time. Based on the adopted thresholds, SRM of the 3 scales suggested a clinically meaningful improvement of pain scores. The CAS showed higher responsiveness than the 2 other scales.

Regarding relative reliability, the single use of correlations has been largely criticized in the literature because it does not detect systematic errors and is known to be higher than the true reliability.13,53,56 Instead, it is suggested to use the ICC, which reflects both degrees of consistency and agreement.13 After adjusting for any real change or inconsistency in subject responses over time, our ICC results showed that 79% (VAS), 82% (CAS), and 76% (FPS-R) of the variance in children's respective pain scores were attributable to variances in true score. To date, there is no consensus regarding the acceptable level of reliability using the ICC.1,13,28 Shrout and Fleiss49 suggest that an ICC >0.75 is necessary for clinical acceptability in health research. Based on this recommendation, the 3 scales analyzed in the current study show high test–retest reliability.
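The point that a correlation coefficient does not detect systematic error can be made concrete with a small sketch. In the hypothetical data below, every retest score is exactly 1 point higher than the test score: the Pearson correlation is perfect, whereas an absolute-agreement ICC is penalized by the shift. The study reports an ICC for a 2-way model without specifying its form, so the choice of the Shrout and Fleiss ICC(2,1) here is an assumption, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between 2 paired lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def icc_absolute_agreement(test, retest):
    """ICC(2,1): 2-way random effects, absolute agreement, single measure."""
    n, k = len(test), 2
    scores = test + retest
    grand = mean(scores)
    row_means = [(a + b) / 2 for a, b in zip(test, retest)]   # per child
    col_means = [mean(test), mean(retest)]                    # per occasion
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in scores)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                # between-children mean square
    msc = ss_cols / (k - 1)                # between-occasions mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical scores: the retest is systematically 1 point higher.
test_scores = [3, 4, 5, 6, 7, 8]
retest_scores = [4, 5, 6, 7, 8, 9]
r = pearson_r(test_scores, retest_scores)
icc = icc_absolute_agreement(test_scores, retest_scores)
print(r, icc)
```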

Regarding absolute reliability, the calculation of the CRs determined the measurement error, or the smallest real difference that represents true change in pain scores. Our results showed that a minimal difference of ±2.29 on the VAS (or 23 mm), ±2.07 on the CAS, and ±2.82 (or one face) on the FPS-R on a 0 to 10 scale is required to conclude that a true difference has occurred in pain scores or that an intervention has had an effect. However, we should interpret this result with caution, given the low effect sizes and the significant difference in pain intensity scores between T-R and T-0 for the 3 scales. Several studies have calculated the CR of the VAS in children presenting to the ED with acute pain and reported a range from ±6 to 22 mm on a 0 to 100 mm scale.3,4,7,19,23 Differences in the intervals selected between assessments across studies may explain the variability in the results.3 Only 1 study55 has reported CRs for the FPS-R (±0.53) and the CAS (±0.35), which were much lower than ours.
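How a CR is used to judge an observed change can be sketched as follows. The sketch assumes one common formulation, CR = 1.96 × the standard deviation of the test–retest differences; the exact computation used in this study may differ. The function names and the paired scores are hypothetical, and only the ±2.29 value is taken from the VAS result reported above.

```python
import math
import statistics


def coefficient_of_repeatability(test, retest):
    """CR as 1.96 times the SD of the paired test-retest differences."""
    if len(test) != len(retest):
        raise ValueError("test and retest must have the same length")
    differences = [r - t for t, r in zip(test, retest)]
    return 1.96 * statistics.stdev(differences)


def exceeds_measurement_error(change, cr):
    """True when an observed change is larger than the CR in absolute value."""
    return abs(change) > cr


# Hypothetical 0-10 test-retest scores for four children (illustration only).
test_scores = [5, 6, 7, 4]
retest_scores = [6, 5, 8, 3]
example_cr = coefficient_of_repeatability(test_scores, retest_scores)
print(f"CR = {example_cr:.2f}")

# Using the VAS CR reported in this study (2.29 on a 0 to 10 scale):
vas_cr = 2.29
print(exceeds_measurement_error(3.0, vas_cr))  # 3-point drop
print(exceeds_measurement_error(2.0, vas_cr))  # 2-point drop
```

With a CR of ±2.29, a 3-point change on the VAS would be read as exceeding measurement error, whereas a 2-point change would not.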

The occurrence of anchor biases was minimal and, therefore, does not explain the differences in results between the 3 scales. The low occurrence of anchor biases may be explained by the age of the participants, as only children aged 6 years and older were recruited. Anchor biases are usually more frequent in children younger than 5 years. Younger children tend to select 1 of the 2 extreme scores of continuous scales because they sometimes consider pain scales as dichotomous.58,60

4.1. Limitations

Our study has some limitations that might have influenced the results. First, we did not counterbalance the presentation of the scales, which could have subjected our findings to order effects. This might have affected mean self-ratings and, consequently, artificially increased validity and reliability results. To mitigate this limitation, we did not administer the VAS and CAS consecutively, because the CAS is a variant of the VAS. Second, it is possible that a fluctuation in pain scores occurred between T-R and T-0 because the delay between these 2 time points was not necessarily the same for all children. Therefore, reliability results should be interpreted with caution. Third, other environmental, physical, or psychological factors (eg, stressful environment, parental response) occurring during this time frame could have caused some children to feel more or less pain. Finally, our results are applicable to children presenting with acute pain after a musculoskeletal injury, and it may not be appropriate to generalize them to all pediatric populations.

5. Conclusions

In conclusion, as pain assessment in pediatric EDs will always remain a challenge for health care providers, it is important to optimize measurement by using age-appropriate, valid, and reliable scales. Providing recommendations about the choice of scales in daily clinical practice is beyond the scope of this article because scale selection is based on multiple factors such as the child's age, development, clinical status, and preference, among others. Our findings suggest that the VAS, FPS-R, and CAS all have strong psychometric properties in children aged between 6 and 17 years presenting to the ED with a musculoskeletal injury and, therefore, can all be recommended for use in daily clinical practice. However, as the CAS showed slightly higher responsiveness and reliability than both the FPS-R and VAS, we recommend its use for this population. For research purposes, only the VAS and CAS showed sufficient agreement to be used interchangeably. We propose that future research on scale agreement should consider applying the 90% CI criterion when performing the Bland and Altman method, to mitigate the challenges of the 80% CI and 95% CI norms that are currently used.

Conflict of interest statement

The authors have no conflict of interest to declare.

All phases of this study were supported by the Canadian Institutes of Health Research (CIHR) Operational Grant program (MOP #125943) and the Pediatric Emergency Research Canada (PERC) group.

An abstract on the results of this article was published in the Canadian Journal of Pain: LeMay S, Ballard A, Sillyboy JR, Latimer M, and Gélinas C. Pain scales development and validation across age, culture, and clinical contexts of care (abstract). Canadian Journal of Pain 2017;1(1):A36.

Acknowledgments

The authors acknowledge the research nurses and study coordinators at the 3 sites (Maryse Lagacé, Ramona Cook, Nadia Dow, and Zachary Cantor) for their invaluable contribution. Furthermore, they are grateful for the rigorous work done by the data management and biostatistics teams (Josée Robillard, Viet Anh Tran, Reda Eltaani, Dr Thierry Ducruet, and Melanie Fon-Sing).

References

[1]. Atkinson G, Nevill A. Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Med 1998;26:217–38.
[2]. Bailey B, Bergeron S, Gravel J, Daoust R. Comparison of four pain scales in children with acute abdominal pain in a pediatric emergency department. Ann Emerg Med 2007;50:379–83. 383.e371–e372.
[3]. Bailey B, Gabbay J, Daoust R, Gravel J. Theoretical repeatability coefficient of a 100 mm visual analog scale in children. Clin J Pain 2014;30:515–20.
[4]. Bailey B, Gravel J, Daoust R. Reliability of the visual analog scale in children with acute pain in the emergency department. PAIN 2012;153:839–42.
[5]. Beaton DE, Hogg-Johnson S, Bombardier C. Evaluating changes in health status: reliability and responsiveness of five generic health status measures in workers with musculoskeletal disorders. J Clin Epidemiol 1997;50:79–93.
[6]. Bieri D, Reeve RA, Champion GD, Addicoat L, Ziegler JB. The Faces Pain Scale for the self-assessment of the severity of pain experienced by children: development, initial validation, and preliminary investigation for ratio scale properties. PAIN 1990;41:139–50.
[7]. Bijur PE, Silver W, Gallagher EJ. Reliability of the visual analog scale for measurement of acute pain. Acad Emerg Med 2001;8:1153–7.
[8]. Bland J. An introduction to medical statistics. Oxford: Oxford University Press, 2000.
[9]. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307–10.
[10]. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Meth Med Res 1999;8:135–60.
[11]. Bland JM, Altman DG. Applying the right statistics: analyses of measurement studies. Ultrasound Obstet Gynecol 2003;22:85–93.
[12]. Borland M, Milsom S, Esson A. Equivalency of two concentrations of fentanyl administered by the intranasal route for acute analgesia in children in a paediatric emergency department: a randomized controlled trial. Emerg Med Australas 2011;23:202–8.
[13]. Bruton A, Conway J, Holgate S. Reliability: what is it and how is it measured? Physiotherapy 2000;86:94–9.
[14]. Bulloch B, Garcia-Filion P, Notricia D, Bryson M, McConahay T. Reliability of the color analog scale: repeatability of scores in traumatic and nontraumatic injuries. Acad Emerg Med 2009;16:465–9.
[15]. Bulloch B, Tenenbein M. Validation of 2 pain scales for use in the pediatric emergency department. Pediatrics 2002;110:e33.
[16]. Chambers CT, Hardial J, Craig KD, Court C, Montgomery C. Faces scales for the measurement of postoperative pain intensity in children following minor surgery. Clin J Pain 2005;21:277–85.
[17]. Chapman LL, Sullivan B, Pacheco AL, Draleau CP, Becker BM. VeinViewer-assisted Intravenous catheter placement in a pediatric emergency department. Acad Emerg Med 2011;18:966–71.
[18]. Cohen LL, La Greca AM, Blount RL, Kazak AE, Holmbeck GN, Lemanek KL. Introduction to special issue: evidence-based assessment in pediatric psychology. J Pediatr Psychol 2008;33:911–15.
[19]. DeLoach LJ, Higgins MS, Caplan AB, Stiff JL. The visual analog scale in the immediate postoperative period: intrasubject variability and correlation with a numeric scale. Anesth Analg 1998;86:102–6.
[20]. Drendel AL, Kelly BT, Ali S. Pain assessment for children: overcoming challenges and optimizing care. Pediatr Emerg Care 2011;27:773–81.
[21]. Emmott AS, West N, Zhou G, Dunsmuir D, Montgomery CJ, Lauder GR, von Baeyer CL. Validity of simplified versus standard self-report measures of pain intensity in preschool-aged children undergoing venipuncture. J Pain 2017;18:564–73.
[22]. Ferreira-Valente MA, Pais-Ribeiro JL, Jensen MP. Validity of four pain intensity rating scales. PAIN 2011;152:2399–404.
[23]. Gallagher EJ, Bijur PE, Latimer C, Silver W. Reliability and validity of a visual analog scale for acute abdominal pain in the ED. Am J Emerg Med 2002;20:287–90.
[24]. Garratt AM, Ruta DA, Abdalla MI, Russell IT. SF 36 health survey questionnaire: II. Responsiveness to changes in health status in four common clinical conditions. Qual Health Care 1994;3:186–92.
[25]. Goodenough B, Addicoat L, Champion GD, McInerney M, Young B, Juniper K, Ziegler JB. Pain in 4- to 6-year-old children receiving intramuscular injections: a comparison of the Faces Pain Scale with other self-report and behavioral measures. Clin J Pain 1997;13:60–73.
[26]. Guyatt G, Walter S, Norman G. Measuring change over time: assessing the usefulness of evaluative instruments. J Chronic Dis 1987;40:171–8.
[27]. Hicks CL, von Baeyer CL, Spafford PA, van Korlaar I, Goodenough B. The Faces Pain Scale-Revised: toward a common metric in pediatric pain measurement. PAIN 2001;93:173–83.
[28]. Hopkins WG. Measures of reliability in sports medicine and science. Sports Med 2000;30:1–15.
[29]. Huguet A, Stinson JN, McGrath PJ. Measurement of self-reported pain intensity in children and adolescents. J Psychosom Res 2010;68:329–36.
[30]. Husted JA, Cook RJ, Farewell VT, Gladman DD. Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol 2000;53:459–68.
[31]. Kelly AM. Setting the benchmark for research in the management of acute pain in emergency departments. Emerg Med (Fremantle) 2001;13:57–60.
[32]. Kottner J, Audige L, Brorson S, Donner A, Gajewski BJ, Hrobjartsson A, Roberts C, Shoukri M, Streiner DL. Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. Int J Nurs Stud 2011;48:661–71.
[33]. Le May S, Ali S, Plint AC, Masse B, Neto G, Auclair MC, Drendel AL, Ballard A, Khadra C, Villeneuve E, Parent S, McGrath PJ, Leclair G, Gouin S; Pediatric Emergency Research Canada. Oral analgesics utilization for children with musculoskeletal injury (OUCH trial): an RCT. Pediatrics 2017;140.
[34]. Le May S, Ballard A, Sillyboy JR, Latimer M, Gélinas C. Pain scales development and validation across age, culture and clinical contexts of care. Can J Pain 2017;1:A35–A36.
[35]. Le May S, Gouin S, Fortin C, Messier A, Robert MA, Julien M. Efficacy of an ibuprofen/codeine combination for pain management in children presenting to the emergency department with a limb injury: a pilot study. J Emerg Med 2013;44:536–42.
[36]. Lexell JE, Downham DY. How to assess the reliability of measurements in rehabilitation. Am J Phys Med Rehabil 2005;84:719–23.
[37]. Liang MH, Fossel AH, Larson MG. Comparisons of five health status instruments for orthopedic evaluation. Med Care 1990;28:632–42.
[38]. Maunuksela EL, Olkkola KT, Korpela R. Measurement of pain in children with self-reporting and behavioral assessment. Clin Pharmacol Ther 1987;42:137–41.
[39]. McConahay T, Bryson M, Bulloch B. Clinically significant changes in acute pain in a pediatric ED using the Color Analog Scale. Am J Emerg Med 2007;25:739–42.
[40]. McGrath PA, Seifert CE, Speechley KN, Booth JC, Stitt L, Gibson MC. A new analogue scale for assessing children's pain: an initial validation study. PAIN 1996;64:435–43.
[41]. McGrath PJ, Walco GA, Turk DC, Dworkin RH, Brown MT, Davidson K, Eccleston C, Finley GA, Goldschneider K, Haverkos L, Hertz SH, Ljungman G, Palermo T, Rappaport BA, Rhodes T, Schechter N, Scott J, Sethna N, Svensson OK, Stinson J, von Baeyer CL, Walker L, Weisman S, White RE, Zajicek A, Zeltzer L; PedImmpact. Core outcome domains and measures for pediatric acute and chronic/recurrent pain clinical trials: PedIMMPACT recommendations. J Pain 2008;9:771–83.
[42]. Miro J, Huguet A. Evaluation of reliability, validity, and preference for a pediatric pain intensity scale: the Catalan version of the faces pain scale–revised. PAIN 2004;111:59–64.
[43]. Miro J, Sanchez-Rodriguez E, Castarlenas E. Response to letter from von Baeyer. PAIN 2012;153:2152–4.
[44]. Newman CJ, Lolekha R, Limkittikul K, Luangxay K, Chotpitayasunondh T, Chanthavanich P. A comparison of pain scales in Thai children. Arch Dis Child 2005;90:269–70.
[45]. Powell CV, Kelly AM, Williams A. Determining the minimum clinically significant difference in visual analog pain score for children. Ann Emerg Med 2001;37:28–31.
[46]. Quiding H, Anderson P, Bondesson U, Boreus LO, Hynning PA. Plasma concentrations of codeine and its metabolite, morphine, after single and repeated oral administration. Eur J Clin Pharmacol 1986;30:673–7.
[47]. Sanchez-Rodriguez E, Miro J, Castarlenas E. A comparison of four self-report scales of pain intensity in 6- to 8-year-old children. PAIN 2012;153:1715–19.
[48]. Shields BJ, Palermo TM, Powers JD, Grewe SD, Smith GA. Predictors of a child's ability to use a visual analogue scale. Child Care Health Dev 2003;29:281–90.
[49]. Shrout P, Fleiss J. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;2:420–8.
[50]. Spagrud LJ, Piira T, Von Baeyer CL. Children's self-report of pain intensity. Am J Nurs 2003;103:62–4.
[51]. Stinson J, Yamada J, Dickson A, Lamba J, Stevens B. Review of systematic reviews on acute procedural pain in children in the hospital setting. Pain Res Manag 2008;13:51–7.
[52]. Stinson JN, Kavanagh T, Yamada J, Gill N, Stevens B. Systematic review of the psychometric properties, interpretability and feasibility of self-report pain intensity measures for use in clinical trials in children and adolescents. PAIN 2006;125:143–57.
[53]. Streiner DL, Norman GR, Cairney J. Health measurement scales: a practical guide to their development and use. Oxford, Toronto: Oxford University Press, 2015.
[54]. Tomlinson D, von Baeyer CL, Stinson JN, Sung L. A systematic review of faces scales for the self-report of pain intensity in children. Pediatrics 2010;126:e1168–e1198.
[55]. Tsze DS, von Baeyer CL, Bulloch B, Dayan PS. Validation of self-report pain scales in children. Pediatrics 2013;132:e971–e979.
[56]. Vaz S, Falkmer T, Passmore AE, Parsons R, Andreou P. The case for using the repeatability coefficient when calculating test-retest reliability. PLoS One 2013;8:e73990.
[57]. von Baeyer CL. Children's self-reports of pain intensity: scale selection, limitations and interpretation. Pain Res Manag 2006;11:157–62.
[58]. von Baeyer CL. Children's self-report of pain intensity: what we know, where we are headed. Pain Res Manag 2009;14:39–45.
[59]. von Baeyer CL. Reported lack of agreement between self-report pain scores in children may be due to a too strict criterion for agreement. PAIN 2012;153:2152–3. author reply 2153–2154.
[60]. von Baeyer CL, Forsyth SJ, Stanford EA, Watson M, Chambers CT. Response biases in preschool children's ratings of pain in hypothetical situations. Eur J Pain 2009;13:209–13.
[61]. Wiffen P, Moore R, McQuay H. Bandolier Extra. Oral modified release morphine for the management of severe pain: a UK perspective. Bandolier Extra 2007;2018:1–8.
[62]. Wille C, Bocquet N, Cojocaru B, Leis A, Cheron G. Oral morphine administration for children's traumatic pain [in French]. Arch Pediatr 2005;12:248–53.
[63]. Williams K, Thompson D, Seto I, Contopoulos-Ioannidis DG, Ioannidis JP, Curtis S, Constantin E, Batmanabane G, Hartling L, Klassen T; for the StaR Child Health Group. Standard 6: age groups for pediatric trials. Pediatrics 2012;129:S153–S160.

Keywords: Pediatric pain; VAS; FPS-R; CAS; Scales agreement

© 2018 International Association for the Study of Pain