
Comprehensive Review

Systematic review of the Face, Legs, Activity, Cry and Consolability scale for assessing pain in infants and children

is it reliable, valid, and feasible for use?

Crellin, Dianne J.a,b,c,*; Harrison, Denisea,b,d; Santamaria, Nicka; Babl, Franz E.b,c,e

doi: 10.1097/j.pain.0000000000000305


1. Background

Pain assessment is widely considered to be integral to effective pain management, and patient self-report, when available, is commonly identified as the primary source of information for assessment of pain. However, many patients cannot self-report pain, including infants, young children, the cognitively impaired, and the critically unwell. To facilitate pain assessment for infants and children unable to self-report, in excess of 40 multidimensional observational scales have been developed over the last 2 decades to assess and quantify pain intensity.38 Many of these have since been adapted and used in other populations unable to self-report. These scales are a composite of a number of parameters considered indicative of pain that can be detected and graded by an observer. Commonly, these parameters are a combination of behaviours (facial expressions, body movements, and cry) and physiological parameters (heart rate, oxygen saturation, and blood pressure). The Face, Legs, Activity, Cry and Consolability (FLACC) scale, designed to assess postoperative pain in young children, is one of the most commonly used scales.74 The FLACC scale scores pain intensity by rating 5 behaviours (face, legs, activity, cry, and consolability) on a 0 to 2 scale, resulting in a maximum score of 10 (Table 1).

Table 1: Face, Legs, Activity, Cry and Consolability (FLACC) scale.74
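The scoring rule described above (5 items, each rated 0 to 2, summed to a total of 0 to 10) can be sketched in a few lines. This is a minimal illustration only: the function name `flacc_total` and the dictionary layout are assumptions made for the example, and the item descriptors from Table 1 are not reproduced.

```python
# Minimal sketch of FLACC total scoring, assuming each of the 5 items
# has already been rated 0, 1, or 2 by an observer.
# `FLACC_ITEMS` and `flacc_total` are hypothetical names for illustration.

FLACC_ITEMS = ("face", "legs", "activity", "cry", "consolability")


def flacc_total(scores: dict) -> int:
    """Sum the 5 item ratings into a total score between 0 and 10."""
    missing = [item for item in FLACC_ITEMS if item not in scores]
    if missing:
        raise ValueError(f"missing item ratings: {missing}")
    for item in FLACC_ITEMS:
        if scores[item] not in (0, 1, 2):
            raise ValueError(f"{item} must be rated 0, 1, or 2")
    return sum(scores[item] for item in FLACC_ITEMS)


# Hypothetical observation, not taken from any included study.
example = {"face": 1, "legs": 2, "activity": 1, "cry": 2, "consolability": 0}
total = flacc_total(example)
```

Rejecting incomplete or out-of-range ratings, rather than silently summing them, keeps a total comparable across observers.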

The FLACC scale was published in 1997, developed as a more practical alternative to existing pain measurement scales.74 The authors considered existing scales impractical for clinical use as they were too long or difficult to remember or score. They drew heavily from existing scales and clinical experts to identify appropriate pain-related behaviours and the descriptors to grade these behaviours. The new scale comprised exclusively behavioural items and was originally designed and validated for use in infants and children aged 2 months to 7 years to measure postoperative pain.74 Two observers blinded to each other's scores independently and simultaneously applied the scale to 30 children on 3 occasions each. Results showed almost perfect interrater reliability (r = 0.97). The kappa scores for the 5 items showed moderate to substantial agreement, ranging from 0.52 (face) to 0.82 (cry). Responsiveness to analgesics was shown in a second group of children (n = 30): FLACC scores decreased postanalgesic from 7.0 ± 2.9 to 1.7 ± 2.2 at 10 minutes, 1.0 ± 1.9 at 30 minutes, and 0.02 ± 0.05 at 60 minutes (P < 0.001 at each interval). However, the study was at risk of bias as observers were not blinded to the use of analgesics. Substantial agreement (r = 0.80, P < 0.001) between FLACC scores and scores on the Objective Pain Scale, an existing scale designed to assess postoperative pain in infants,15 was used to demonstrate convergent validity in a third cohort of children (n = 29).

The results of the original study, although promising, were insufficient to confirm the reliability and validity of the FLACC scale when used to assess postoperative pain in infants and children aged 2 months to 7 years. Studies have attempted to provide confirmatory evidence, with many focusing on application of the scale to alternate populations (eg, older children, children with cognitive impairment, and adults) and under alternate circumstances (eg, procedural pain and critical illness).14,71,74,96–98,105–108,112 The results of studies published before 2007 have been summarised in 2 separate systematic reviews.27,110 They each recommend the FLACC scale to assess pain in children unable to self-report based on the strength of the available evidence. However, the conclusions of both systematic reviews suggest that there remain too many limitations to claim the scale as reliable and valid for use in all circumstances associated with its previous use.

The objective of this systematic review was to determine the suitability of the FLACC scale to assess pain in infants and children by rigorously evaluating the psychometric properties of the scale. Specifically, the review aimed to (1) identify and describe studies providing psychometric data and the populations and circumstances to which FLACC has been applied in these studies, (2) systematically review the quality of these studies using appropriate assessment tools, (3) analyse and synthesise the evidence for the psychometric and practical properties (feasibility and utility) of the FLACC scale, and (4) provide contemporary recommendations for the scale's role in pain assessment.

2. Methods

A systematic review was conducted to identify and appraise the evidence for the psychometric properties of the FLACC scale using a protocol developed by the authors and based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA statement).77 The protocol was registered with the International Prospective Register of Systematic Reviews (CRD42014014296) and is available in full text on the PROSPERO Web Site.26

Primary outcomes were the reliability, validity, feasibility, and utility of the FLACC scale for assessing pain in infants and children.

2.1. Inclusion/exclusion criteria

Studies were included in this review if they reported on the reliability, validity, feasibility, or utility of the FLACC scale used to assess pain in infants and children. This included studies where the aim was evaluation of the FLACC scale, comparison between scales, including FLACC, or evaluation of an alternate scale where the FLACC scale was used as the reference. As construct validity can be demonstrated by the difference between known or extreme groups, randomised controlled trials (RCT) using the FLACC scale to measure a study outcome in infants and children were also included in the review. Infants were defined as participants aged from birth to 1 year of age, and children were defined as participants aged from 1 to 18 years.

Studies were excluded from this review if FLACC scores were not reported or analysed separately, if the sample did not include children or their results were not analysed separately, or if the study was not available in full or in English. Additionally, RCTs considered of low quality and at high risk of bias (Jadad scores <3)57 were also excluded from analysis as they were considered unlikely to contribute evidence to the review.

2.2. Search strategy

Electronic databases searched were MEDLINE (1996—week 4, August 2014), EMBASE (1996—week 35, August 2014), the Cochrane Database of Systematic Reviews and Cochrane Controlled Trials (1996—Issue 8, August 2014), Cumulative Index Nursing and Allied Health Literature (CINAHL), and PsycINFO (1996—August 2014) using the Ovid, PubMed, and EBSCOhost platforms. Google Scholar and the reference lists of the included studies and identified reviews were also searched.

The search terms used were combinations of “FLACC” or “Face Legs Activity Cry Consolability” and “infant” or “child.” Because the original publication describing the FLACC scale was published in 1997, the search range included 1 year before this date to account for publication delays and earlier developmental works. No other limits were applied.

2.3. Study selection

Duplicates were removed, and relevant abstracts were reviewed by 2 independent reviewers (D.J.C. and one of N.S., F.E.B., or D.H.). Full-text articles were reviewed where eligibility could not be determined from the abstract. A third reviewer was used to reach consensus where study eligibility remained unclear.

2.4. Data extraction and analysis

Data extraction was completed by 1 reviewer (D.J.C.) and recorded on one of 2 extraction tools: for the psychometric evaluation studies, a modification of the QAREL data extraction form70 designed for appraisal of diagnostic reliability studies was used and for the RCTs, a modification of the Cochrane Collaboration data collection tool designed for intervention studies22 was used. Modifications of these forms included the deletion of irrelevant fields and the addition of fields to capture relevant methods and results not included in the original form.

Data extracted included participant details (eg, numbers, demographics), setting and circumstances of the pain being measured (associated with disease, operative, or procedural), scale description and application (eg, modifications and translation), study methods (design, psychometric properties evaluated, and statistical methods), sources of bias, and study results.

A second reviewer (F.E.B., D.H., or N.S.) checked and confirmed details of the data extraction and where there was disagreement a third assessor extracted data independently to reach consensus.

2.4.1. Quality assessment

Quality of the studies included in this review was assessed using one of 2 tools. The COSMIN checklist and a 4-point rating scale were used to assess the methodological quality of the studies focusing on psychometric evaluation,100 and the Jadad score was used to assess the quality of RCTs.57 Both tools were applied independently by 2 reviewers, and by a third if agreement was not achieved by the first 2 reviewers.

The COSMIN checklist and 4-point rating scale were developed to assess the quality of studies focused on health-related patient-reported outcome measures and provide standards for study design, statistical methods, and acceptable outcome values.100 The checklist is also considered suitable for other clinical rating scales that measure constructs not directly measurable.

The checklist comprises 12 boxes, which focus on measurement properties (9 boxes), interpretability, item response theory methods, and generalizability. The measurement properties addressed by the COSMIN checklist are internal consistency, reliability (test–retest, interrater and intrarater), measurement error, content validity, construct validity (structural validity, hypothesis testing, cross-cultural validity), criterion validity, and responsiveness. Each item within these boxes is scored on a 4-point scale (“poor,” “fair,” “good” or “excellent”) depending on the standard met by the study. The lowest item rating forms the final assessment for that property. The COSMIN taxonomy and the terms commonly used in pain scale evaluation studies are defined in Table 2.
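The rule that the lowest item rating determines the rating for a measurement property can be illustrated with a short sketch. The names `RATING_ORDER` and `property_rating` are hypothetical, and the item ratings in the example are invented rather than drawn from any included study.

```python
# Sketch of the "lowest item rating" rule described for the COSMIN
# 4-point scale. Names and example ratings are assumptions for illustration.

RATING_ORDER = ("poor", "fair", "good", "excellent")  # lowest to highest


def property_rating(item_ratings: list) -> str:
    """Return the lowest rating among the items scored for one property."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    # RATING_ORDER.index raises ValueError for an unrecognised rating.
    return min(item_ratings, key=RATING_ORDER.index)


# Hypothetical reliability box: one "fair" item caps the whole property.
reliability_rating = property_rating(["excellent", "good", "fair", "good"])
```

Under this rule a single weak design feature, such as a small sample, is enough to lower the rating for the entire property.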

Table 2: Pain scale validation strategies and COSMIN taxonomy.

The Jadad scale for assessing the quality of RCTs focuses on randomisation, blinding, and participant follow-up and results in a total score of 5, where 5 is a perfect score.57 A minor adjustment from the original scale was made to the definition for participant follow-up. In this review, we scored follow-up as acceptable if, in the absence of the explicit statement that “there were no withdrawals from the study,” all participants could be accounted for in the results.
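A rough sketch of how a Jadad score and the exclusion threshold used in this review (scores <3) might be computed is given below. The point allocation follows the commonly described form of the Jadad scale (1 point each for reporting randomisation and double blinding, 1 point added or deducted for an appropriate or inappropriate method, and 1 point for follow-up); that breakdown is an assumption here, as this review states only the 3 domains and the maximum of 5. The function and parameter names are hypothetical.

```python
# Sketch of Jadad scoring with the modified follow-up criterion described
# above. The per-item point allocation is assumed from the commonly
# described Jadad scale, not taken from this review.
from typing import Optional


def jadad_score(randomised: bool,
                randomisation_method: Optional[str],
                double_blind: bool,
                blinding_method: Optional[str],
                participants_accounted_for: bool) -> int:
    """Methods are "appropriate", "inappropriate", or None (not described)."""
    adjustment = {"appropriate": 1, "inappropriate": -1, None: 0}
    score = 0
    if randomised:
        score += 1 + adjustment[randomisation_method]
    if double_blind:
        score += 1 + adjustment[blinding_method]
    # Modified criterion: follow-up is acceptable if all participants can be
    # accounted for in the results, even without an explicit statement.
    if participants_accounted_for:
        score += 1
    return max(score, 0)


def eligible_for_review(score: int) -> bool:
    """RCTs with Jadad scores below 3 were excluded from analysis."""
    return score >= 3
```

For example, a hypothetical trial that describes an appropriate randomisation method, states that it was double blind without describing how, and accounts for all participants would score 4 and be retained.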

For RCTs where reliability and/or responsiveness was assessed, these methods were evaluated using the relevant items from the COSMIN tool.

2.5. Data synthesis

The results of the search and study selection were described using the PRISMA flowchart.77 Studies using different designs were included in this review; therefore, pooling of data for meta-analysis was not considered possible. A narrative synthesis of the evidence provided by each study was therefore used to address each of the study outcomes. It was also anticipated that eligible studies would apply the scale to different populations and under different circumstances to those for which the scale was developed and originally tested. These studies were reviewed separately to the studies concentrating on the original population and circumstances. Subgroups were created for studies focusing on similar populations based on parameters such as age (older and younger than the original age range), circumstances of the pain (procedural), and modifications to the scale (translation to another language).

Two approaches to assessing the weight of evidence have been used for similar purposes28,110: the Method Guidelines for systematic reviews in the Cochrane Collaboration Back Review Group103 and the evaluation criteria for Initiative on Methods, Measurement and Pain Assessment in Clinical Trials (IMMPACT) reviews.23 However, each has limitations. The Cochrane Back Review Group guidelines define the weight of evidence required of RCTs and controlled clinical trials evaluating treatment effects. It is reasonable to consider that a greater weight of evidence would be required to confirm the psychometric properties of a scale than would be required where the outcome is demonstrated in RCTs and controlled clinical trials. The IMMPACT review standards make no reference to the quality of the studies providing data to the review, which is a significant limitation of this approach.

3. Results

A total of 78 full-text articles were included in this review (26 psychometric evaluation studies1,5,7,14,21,31,32,43,47,52,59,71,72,74,83,86–88,96–98,105–108,112 and 52 RCTs2–4,6,9,10,12,16–18,20,29,30,36,37,40–42,44,45,48–50,53–56,58,62–69,75,76,80–82,84,85,90–92,95,99,101,104,109,114) (Fig. 1).

Figure 1: PRISMA flow chart detailing the search and study screening results.

3.1. Study and patient characteristics

3.1.1. Psychometric evaluation studies

Twenty-six studies evaluated psychometric properties of the FLACC scale, which included: the original evaluation of FLACC,74 2 additional studies evaluating the psychometric properties of FLACC when applied to the same population under the same circumstances,14,112 and 18 studies evaluating the psychometric properties of FLACC applied to populations or in circumstances other than those for which the scale was originally designed.1,5,7,31,32,47,59,71,72,83,88,96–98,105–108 In 4 of these studies, item descriptors were also modified to better suit the new population (eg, to describe pain behaviours of children with cognitive impairment).59,71,105,106 Seven studies used the FLACC scale translated into another language (Chinese, Thai, Swedish, and Brazilian Portuguese).7,31,32,59,83,96,97 Finally, 5 studies evaluated measurement properties of another pain assessment scale and used FLACC as a reference scale.21,43,52,86,87

These studies are summarised in Table 3, and a more detailed summary of these studies is available in Appendix A (available online as Supplemental Digital Content at http://links.lww.com/PAIN/A134).

Table 3-a: Summary of population, circumstances, and the methods used for psychometric testing for the psychometric evaluation studies.
Table 3-b: Summary of population, circumstances, and the methods used for psychometric testing for the psychometric evaluation studies.
Table 3-c: Summary of population, circumstances, and the methods used for psychometric testing for the psychometric evaluation studies.
Table 3-d: Summary of population, circumstances, and the methods used for psychometric testing for the psychometric evaluation studies.

3.1.2. Randomised controlled trials

A total of 86 trials that potentially met the inclusion criteria were identified. However, 34 RCTs were excluded on retrieval of the full-text article, as they did not report FLACC scores separately (13 trials) or had a Jadad quality score of less than 3 (21 trials). The population, setting and circumstances, and quality scores for the 52 included studies are summarised in Table 4. More details about these studies are available in Appendix B (available online as Supplemental Digital Content at http://links.lww.com/PAIN/A135).

Table 4-a: Summary of randomised controlled trials.
Table 4-b: Summary of randomised controlled trials.
Table 4-c: Summary of randomised controlled trials.

In 26 RCTs, the FLACC scale was applied to the original population (infants and children aged 2 months to 7 years) and under similar conditions (postoperative pain) as the original scale.2–4,9,18,20,36,41,45,53–56,58,62–64,66–69,91,95,99,101,109 In 10 of the remaining 26 trials, the sample included infants and children aged older and younger than the age range for which FLACC was originally intended,10,37,42,44,49,50,84,85,90,92 and in the other 16 trials, the FLACC scale was used to measure procedural pain,6,12,16,17,29,30,40,48,65,75,76,80–82,104,114 of which 10 included older children12,16,17,65,75,76,80–82,114 and 1 included younger children.30
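The study counts reported in this section can be reconciled with simple arithmetic. The sketch below uses only the figures stated in the text; the variable names are hypothetical labels chosen for the example.

```python
# Reconciliation of the study counts reported in Section 3, using only
# figures stated in the text. Variable names are illustrative labels.

potentially_eligible_rcts = 86
excluded_rcts = {
    "flacc_scores_not_reported_separately": 13,
    "jadad_score_below_3": 21,
}
included_rcts = potentially_eligible_rcts - sum(excluded_rcts.values())

psychometric_studies = 26
total_included = psychometric_studies + included_rcts

rct_subgroups = {
    "original_population_postoperative": 26,
    "outside_original_age_range": 10,
    "procedural_pain": 16,
}

# Each check mirrors a statement in the text.
checks = {
    "34 RCTs excluded": sum(excluded_rcts.values()) == 34,
    "52 RCTs included": included_rcts == 52,
    "78 articles in total": total_included == 78,
    "subgroups cover all RCTs": sum(rct_subgroups.values()) == included_rcts,
}
```

All 4 checks hold for the figures as reported.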

3.2. Psychometric properties and study quality

The 26 included psychometric evaluation studies evaluated the following measurement properties: reliability (15 studies14,47,59,71,74,83,88,96–98,105–108,112), internal consistency (4 studies14,31,47,98 plus one excluded as results not reported separately for children108), content validity (2 studies14,97), structural validity (one study14), hypothesis testing (14 studies1,5,7,14,21,43,71,74,86–88,97,98,107,112), cross-cultural validity (one study32), criterion validity (7 studies7,31,59,83,98,105,108), and responsiveness (13 studies5,7,14,52,59,72,74,83,88,97,98,105,107 plus one excluded as paediatric data not analysed separately108). Measurement error was not reported in any of the studies. The RCTs all compared known groups using the FLACC scale to measure a study outcome (n = 52). In addition, 3 trials evaluated reliability,40,81,104 and 2 trials assessed the scale's responsiveness.6,104

The quality of the methods used to measure the psychometric properties of the FLACC scale was variable, ranging from “poor” to “good,” with only 9 studies scoring “good” for at least 1 property. The COSMIN checklist scores for the psychometric properties for each study are shown in Table 5. A number of common design limitations affected the quality of these methods, the most noteworthy of which are listed in Table 3.

Table 5: COSMIN checklist (quality) scores for psychometric parameters.

The design of the psychometric evaluation studies and the RCTs allowed unbiased blinded assessment of reliability. Despite this, 8 of the psychometric evaluation studies scored “fair” or “poor” on the COSMIN checklist71,74,83,96,105,106,108,112 and the RCTs scored similarly. Reliability assessments were most commonly limited by small sample sizes and statistical analysis techniques. Sample size also had a substantial impact on the quality of the methods used to assess internal consistency.

The methods used for hypothesis testing in the psychometric studies (correlation with alternate scales and comparison between known groups) were limited by small sample sizes, failure to report missing values or to describe adequately how they were managed, and insufficient detail describing the comparator scale used for convergence testing and its measurement properties. As low-quality RCTs were excluded, the quality of the included RCTs was relatively higher than for the psychometric evaluation studies.

Correlation of the FLACC scale with another scale was described as criterion validity testing in 6 studies.7,31,59,83,98,108 However, an alternate behavioural scale, which cannot be considered a gold standard for pain assessment, was used in 4 of these studies,7,59,98,105,108 resulting in low COSMIN scores. Hence, these evaluations are more correctly considered convergence testing rather than concurrent (criterion) testing.

Responsiveness to analgesics and procedural pain was commonly used to demonstrate the validity of the FLACC scale.7,14,52,59,72,74,83,88,97,98,105–107 However, the quality of most methods used to assess this was poor as the observers, although blinded to group allocation, were not blinded to the patient's circumstances (eg, administration of analgesics or pain producing procedures) potentially biasing their assessments.52,59,72,74,83,88,105–107

3.3. Data synthesis

The levels of evidence of the psychometric evaluation studies and the RCTs are provided in Tables 3 and 4, respectively. The following sections provide a synthesis of this evidence in an attempt to draw conclusions about the feasibility, reliability, validity, and clinical utility of the FLACC scale.

3.3.1. Reliability

Of the 18 studies addressing the reliability of the FLACC scale, only 3 studies evaluated reliability in the population and circumstances for which FLACC was originally intended (infants and children aged 2 months to 7 years; for postoperative pain).14,74,112

Merkel et al. (1997), the authors of the original FLACC scale, reported near perfect levels of agreement between observers (r = 0.97). Agreement for each category ranged from a kappa value of 0.52 (face) to 0.82 (cry). The strength of these results was limited by the small sample size and the use of a nonweighted kappa to describe agreement for an ordinal scale.74 Both factors contributed to a COSMIN score of “fair.” Bringuier et al.14 reported near perfect interrater agreement (intraclass correlation coefficient [ICC] = 0.9) for FLACC scores in a study rated “good” on the COSMIN checklist. The study by Willis et al., which reported percentage agreement between 6 pairs of observations, scored “poor” on the COSMIN checklist and therefore adds limited evidence to support interrater reliability.112
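The concern about using a nonweighted kappa for an ordinal item can be made concrete with a small sketch. A nonweighted kappa treats a disagreement of 0 vs 2 the same as 0 vs 1, whereas a weighted kappa penalises larger disagreements more heavily. Linear weights are used below as one common choice; this is an assumption, as the review does not specify a weighting scheme. The ratings are invented for illustration and the function name `cohen_kappa` is hypothetical.

```python
# Sketch comparing nonweighted and linearly weighted Cohen's kappa for
# two observers rating one ordinal item (0, 1, or 2).
# Data and names are hypothetical; linear weighting is an assumed choice.


def cohen_kappa(rater_a, rater_b, categories, weights="none"):
    """Cohen's kappa computed as 1 - observed/expected weighted disagreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must supply the same non-zero number of ratings")
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)

    # Cross-tabulate the paired ratings.
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        counts[index[a]][index[b]] += 1
    row = [sum(counts[i]) / n for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) / n for j in range(k)]

    def disagreement(i, j):
        if weights == "none":
            return 0.0 if i == j else 1.0
        if weights == "linear":
            return abs(i - j) / (k - 1)
        raise ValueError("weights must be 'none' or 'linear'")

    observed = sum(disagreement(i, j) * counts[i][j] / n
                   for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k))
    if expected == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1 - observed / expected


# Hypothetical ratings of a single item by two observers.
observer_1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
observer_2 = [0, 1, 1, 2, 2, 2, 0, 1, 1, 1]
unweighted = cohen_kappa(observer_1, observer_2, [0, 1, 2])
weighted = cohen_kappa(observer_1, observer_2, [0, 1, 2], weights="linear")
```

In this invented example every disagreement is by a single category, so the weighted coefficient is higher than the nonweighted one, which illustrates how the choice of statistic can change the apparent level of agreement on an ordinal scale.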

Video-taping of infants and children for later scoring by observers was only a feature of one of the studies focusing on postoperative pain in infants and children aged between 2 months and 7 years.14 However, observers did not score these segments of video a second time to enable calculation of intrarater reliability. Hence, there has been no evaluation of intrarater reliability of the FLACC scale when applied to infants and children from the original age group experiencing postoperative pain.

Of the remaining 15 studies, 7 studies focused on children undergoing a painful procedure,40,47,81,83,88,98,104 4 of which included either infants younger than 2 months or children older than 7 years.81,83,88,104 Four studies concentrated on older children with cognitive impairment.71,105–107 Three of these also applied the scale with modified descriptors,71,105,106 one of which also assessed application by the parents.105 Finally, 4 studies evaluated the scale after translation to another language (2 Thai96,97 and 2 Swedish59,83), all applied to different populations or under unique circumstances. All but 1 study40 evaluated the interrater reliability, and 7 assessed the intrarater reliability of the FLACC scale.47,71,88,97,98,107

The reliability for observers applying the FLACC scale to children experiencing procedural pain was reported as almost perfect agreement with intraclass correlations ranging from 0.85 to 0.99.40,47,83,88,98,104 Three of these 6 studies used rigorous methods and included sufficiently similar populations to more confidently draw conclusions about application of the FLACC scale to infants and young children undergoing a painful procedure.47,88,98 Two studies also reported interrater reliability separately for each phase of the procedure (eg, baseline and during the procedure) with contrasting results. Vaughan et al.104 and Gomez et al.47 reported similar interrater reliability during the painful phase of the procedure (95% confidence interval [CI], [0.92-0.99] and ICC = 0.95, respectively).47,104 However, ICCs for the baseline phases differ markedly (95% CI, [0.93-0.99] and ICC = 0.4, respectively). These results provide evidence to support the reliability of the scale for assessing pain during the painful phase of the procedure but raise questions about the reliability during nonpainful phases of a procedure. It is possible that the lower correlation demonstrated in the study by Gomez et al. is a function of lower FLACC scores with little variance. However, it should be noted that the intrarater correlation was 0.88 for the same phase of the procedure. These factors make it difficult to interpret the low interrater reliability results in their study.

Intrarater reliability of the FLACC scale has been assessed in 4 studies applying the scale to children experiencing procedural pain47,88,98 and in 3 studies, ICC ranged from 0.88 to 0.98.47,88,98 The age ranges of children included in each of these studies varied with no 2 studies including similar age ranges. This makes it difficult to draw conclusions about the intrarater reliability of the FLACC scale when used to measure acute procedural pain.

Application of the FLACC scale to children with cognitive impairment has been evaluated in 4 psychometric evaluation studies,71,105–107 all conducted by the same research group. The study participants included children older than the original age range, and in 3 of the 4 studies, the item descriptors had been modified to better suit assessment of pain in children with cognitive impairment. Reliability coefficients for interrater reliability ranged from 0.507 to 0.92 for total scores71,105–107 and 0.339 to 0.826 for scale items.71,107 The quality of these studies was generally “fair” to “good,” and these studies provide promising evidence for the interrater reliability of the FLACC scale applied to children with cognitive impairment, both with and without modification to the descriptors. The intrarater reliability of the FLACC scale modified to be applied to children with cognitive impairment aged between 4 and 19 years was reported in 1 “good” quality study (ICC = 0.8-0.883).107

Reliability testing has been reported for the Thai and Swedish versions of the scale.59,83,96,97 The reliability of the Thai version applied to assess children experiencing postoperative pain from the original age group was tested on 2 occasions by the same authors and on 1 occasion applied by the parent.96,97 After minor modifications to ensure the same meaning in Thai as was intended in the English version, interrater agreement (ICC = 0.949 and 0.948) and intrarater agreement (ICC = 0.095-0.99) were similarly high. It may be possible to cautiously accept the reliability of the Thai version of the FLACC scale applied to infants and children aged 2 months to 7 years experiencing postoperative pain.

The 2 studies testing the Swedish version of the FLACC scale reported application to specific and different populations to each other, and in 1 study, the cry item had been modified.59,83 Conclusions about the reliability of the Swedish version of the FLACC scale cannot be drawn at this time.

3.3.2. Internal consistency

Four studies evaluated internal consistency of the FLACC scale.14,31,47,98 Two studies used immunisation pain in children aged 2 to 18 months,47,98 one used postoperative pain in children aged 1 to 7 years14 and one used pain in children aged 7 to 17 years with cancer after translation of the scale to Brazilian Portuguese.31 Taddio et al.98 included 120 children and reported an alpha correlation coefficient of 0.88 at both baseline and during the immunisation. Gomez et al.47 reported that different items of the scale make more or less significant contributions to the overall score depending on the circumstance, eg, consolability made the largest contribution during needle insertion (0.903), whereas 15 seconds after needle insertion, the cry item made the larger contribution (0.957). In contrast to Taddio's results, all items scored low at baseline. This study was limited by a small sample size (n = 29), which was considered insufficient to meet the requirements for a high-quality examination of internal consistency. However, the findings raise important questions about the performance of the scale during different phases of a painful procedure. The study details for the other 2 studies are shown in Table 3 and the results in Appendix A (available online as Supplemental Digital Content at http://links.lww.com/PAIN/A134).14,31
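Internal consistency of a multi-item scale is commonly summarised with Cronbach's alpha, and the sketch below assumes that this is the "alpha" coefficient referred to above. It computes alpha from a matrix of item scores (one row per child, one column per item) using sample variances. The function name and the data are hypothetical.

```python
# Sketch of Cronbach's alpha:
#   alpha = k / (k - 1) * (1 - sum(item variances) / variance of totals)
# Assumes the reported "alpha" is Cronbach's alpha; data are invented.
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per participant, all the same length."""
    if len(rows) < 2:
        raise ValueError("at least 2 participants are required")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("each participant needs the same 2 or more item scores")
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical FLACC item scores (face, legs, activity, cry, consolability).
scores = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [2, 1, 2, 2, 1],
    [2, 2, 2, 2, 2],
    [1, 0, 1, 1, 0],
]
alpha = cronbach_alpha(scores)
```

Because alpha depends on the variance of the total scores, it is undefined when every child receives the same total and becomes unstable when scores cluster near 0, which is one reason estimates at a low-scoring baseline are hard to interpret.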

3.3.3. Validation

The first attempt to establish content validity was made by Bringuier et al.14 in a study comparing the psychometric properties of 4 behavioural pain assessment scales. This assessment did not determine whether the scale was sufficiently comprehensive to assess pain and whether each item was an appropriate measure of pain and relevant to all groups for whom it had been designed. As these factors are key to content validation, this study was assigned a low score on the COSMIN checklist. Only 1 other attempt to establish the content validity of the scale has been made and that occurred after translation into Thai. In light of these results, it cannot be said that content validation for the FLACC scale has been achieved.

Three attempts to validate the FLACC scale, applied to the same population and under the same conditions as those for which it was designed, were identified.14,74,112 Validity of the FLACC scale was inferred in the original validation study using correlation with the Objective Pain Scale (r = 0.8) and by responsiveness of the score to treatment with analgesics (from 7 ± 2.9 to 1 ± 1.9 at 30 minutes after analgesic). However, limitations of the methods used in this study significantly reduce the strength of its findings (Table 3).

Willis et al.112 sought to validate the FLACC scale by correlating FLACC scores with self-reported pain scores. They demonstrated almost perfect correlation (r = 0.83, P = 0.0001) between the FLACC scores and self-report in older children (aged 5 to 7 years), which is much higher than the summary concordance reported in a 2008 meta-analysis (child and parent, r = 0.64; child and nurse, r = 0.58).113 The correlation between FLACC scores and self-report in younger children (aged 3 to 5 years) in the study by Willis et al. was not similarly high (r = 0.254, P = 0.381). This result may reflect the increasing evidence that young children have difficulty using self-report pain scales.34,35,93,102 In light of only fair study quality, the results should be cautiously accepted as contributing evidence of FLACC validation for children aged 5 to 7 years.

Bringuier et al.14 provided equivocal evidence of the FLACC scale's validity applied to the original population and circumstances. Responsiveness was shown by change over time in postoperative pain scores using repeated-measures analysis of variance. However, only the P value was reported (P < 0.001). Correlations between FLACC and 3 other behavioural rating scales (0.88-0.94) and facial action summary score (FASS) (0.71-0.78) were high. Conversely, correlations with self-report of pain at the same times were only fair to moderate (0.31-0.51). Furthermore, FLACC scores were also moderately correlated with self-reported anxiety at 2 time points (0.63 in postoperative acute care unit and 0.63 one day postoperatively) suggesting limited capacity for the scale to discriminate between pain and anxiety.

A study aimed at assessing the psychometric properties of the Toddler, Preschooler Postoperative Pain Scale (TPPPS) compared the validity of this scale to other observational measures of pain, including the FLACC scale.52 Results showed that FLACC scores were responsive to the administration of analgesics to children aged 1 to 5 years experiencing postoperative pain (FLACC = 5 [interquartile range (IQR), 4.25-7.75] to FLACC = 0 [IQR, 0-0], P < 0.0002). However, the risk of bias was high as observers were not blinded to the treatment.

Nineteen RCTs provided evidence (3 contributing moderate levels45,63,66) supporting the validity of the scale to measure pain in children aged 2 months to 7 years experiencing postoperative pain.2,3,9,18,20,36,41,45,53,54,63,64,66,68,69,91,95,99,109

Finally, another 3 psychometric evaluation studies used the FLACC scale as a reference to evaluate a newly developed scale and comprised samples which included older children21 and younger infants86,87 than the original studies. The FLACC scale was only used as the reference scale to which the newly developed scale was compared to demonstrate concurrent validity of the new scale. Agreement with a scale not previously validated offers no evidence of FLACC validity. Where these studies used other methods such as scale responsiveness to demonstrate validity, only the new scale was assessed. Nine RCTs10,37,42,44,49,50,84,85,90 contribute evidence (2 contributing moderate levels10,37) towards the validation of the FLACC scale to measure postoperative pain for older children than was originally intended (7 years). No RCTs included younger infants.

The remaining psychometric evaluation studies (n = 16)1,5,7,31,43,59,71,72,83,88,97,98,105,107,108 and RCTs (n = 16)6,12,16,17,29,30,40,48,65,75,76,80–82,104,114 applied the FLACC scale under different circumstances, to different populations (other than age related differences), and/or after modification to the scale descriptors or translation into languages other than English. The focus of 5 psychometric evaluation studies was the assessment of the validity of the FLACC scale applied to children undergoing a painful procedure.1,5,82,88,98 Two other studies, focusing on assessment of acute and procedural pain, did not report results independently for procedural pain and cannot add evidence to the validity of the FLACC scale for procedural pain assessment.43 Sixteen trials included assessment of procedural pain. Study participants in these 20 studies ranged from neonates to adolescents, and on 1 occasion, the scale was translated into another language.82

A study by Taddio et al.98 applied FLACC in its original form to infants aged 2 to 6 months undergoing immunisation. The results of this study, rated “fair” on the COSMIN checklist, demonstrated that the FLACC scale was responsive to immunisation pain (FLACC scores preimmunisation mean = 0.6 [SD = 1.6] and postimmunisation mean = 6.5 [SD = 3.0], P < 0.001), was strongly correlated with other measures (eg, Neonatal Infant Pain Scale, r = 0.92, P < 0.001 and Modified Behavioural Pain Scale [MBPS], r = 0.84, P < 0.001), and could discriminate between known groups (receipt of different vaccines, mean = 5.3 [SD = 3.30] vs mean = 7.8 [SD = 1.9], P < 0.001). Babl et al.5 aimed to determine whether the FLACC scale can distinguish between pain and distress in infants and children using discrimination between known groups (painful and nonpainful procedures) and responsiveness (before, during, and after a painful procedure). However, the quality of their methods was rated “fair” and they did not report P values, limiting the capacity to interpret the results. Low levels of evidence derived from 4 RCTs support the validity of FLACC to measure procedural pain in samples aged within the original limits (2 months to 7 years).6,29,40,48

Studies by Ahn and Ranger included infants less than 2 months in their studies using FLACC to assess procedural pain.1,88 Ahn and Jun demonstrated the capacity of the FLACC scale to differentiate between neonates undergoing painful (mean FLACC score = 4.58 [SD = 2.42]) and nonpainful stimuli (routine care mean FLACC score = 1.41 [SD = 1.86] and auditory stimulus mean FLACC score = 0.69 [SD = 1.38]), P < 0.001. Their results also show a strong correlation between FLACC scores and CRIES scores across each category of procedure (r = 0.826, 0.843, and 0.824; P < 0.01 in all). Ranger et al.88 also demonstrated responsiveness with changes in scores across phases of the procedure (baseline 0.25 [SD = 0.12], 95% CI, [0.01-0.51]; tactile 3.25 [SD = 0.56], 95% CI, [2.08-4.23]; and noxious 6.7 [SD = 0.66], 95% CI, [5.32-8.08], P < 0.001). However, they were unable to demonstrate similar responsiveness to the administration of analgesics and could show no correlation between FLACC scores and near-infrared spectroscopy results.
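To give a sense of the size of the painful versus nonpainful differences reported above, the summary statistics can be converted to a standardised mean difference. The following is a minimal sketch only: it assumes independent groups of equal size (the group sizes and study design are not restated here), and the function name `cohens_d_equal_n` is illustrative rather than anything used by the original authors.

```python
import math


def cohens_d_equal_n(mean_a, sd_a, mean_b, sd_b):
    """Standardised mean difference (Cohen's d).

    Assumption: two independent groups of equal size, so the pooled SD
    is the root mean square of the two SDs.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Mean FLACC score and SD for each stimulus, as reported in the text above.
painful = (4.58, 2.42)
routine_care = (1.41, 1.86)
auditory = (0.69, 1.38)

d_routine = cohens_d_equal_n(*painful, *routine_care)
d_auditory = cohens_d_equal_n(*painful, *auditory)

print(f"painful vs routine care: d = {d_routine:.2f}")
print(f"painful vs auditory stimulus: d = {d_auditory:.2f}")
```

If the same neonates were observed under each stimulus, a paired effect size would be more appropriate, but this cannot be computed from means and SDs alone.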

An additional 12 RCTs included older children (10 trials12,16,17,65,75,76,80–82,114) and younger infants (2 trials30,104) in trials using the FLACC scale to assess procedural pain, 7 of which contribute low levels of evidence and 1 moderate level of evidence towards validation of the FLACC scale for assessing procedural pain. Three trials also reported scale responsiveness.6,82,104 However, observers were not blinded to the circumstances (procedure or analgesics), and the quality of these methods scored “poor.” Therefore, these results contribute little to the evidence.

The use of the FLACC scale as a valid measure of pain in children with cognitive impairment aged between 4 and 18 to 21 years has been examined in 3 studies.71,105,107 Malviya et al.71 reported correlations between FLACC scores and Nursing Assessment of Pain Intensity scores ranging from 0.78 to 0.87 (P < 0.01), between FLACC and parent-applied Visual Analogue Scale scores ranging from 0.65 to 0.82 (P < 0.01), and between FLACC scores and child report ranging from 0.67 (P = 0.051) to 0.86 (P < 0.001). They also demonstrated scale responsiveness with lower scores after analgesics assessed by video observers (6.1 ± 2.6 vs 1.9 ± 2.7; P < 0.001) and bedside observers (6.1 ± 2.5 vs 2.2 ± 2.4; P < 0.001). Voepel-Lewis et al. (2005) reported agreement between observer FLACC scores and Numeric Rating Scale (ICC = 0.81 [CI, 0.70-0.89]) and child rating (kappa = 0.65). Additionally, they demonstrated FLACC scale responsiveness to analgesics in their 2002 study107 (blinded nurses' scores: 5.1 ± 2.9 vs 2.2 ± 3.0, P = 0.001) and again in their 2005 study105 (FLACC 6.6 ± 2.4 vs 2.6 ± 2; P = 0.003). It should be noted that in 2 of these studies, the descriptors for the scale were modified to include pain behaviours of the included children and that these studies were all conducted by the same research group. There are no RCTs of sufficient quality to support these results.

The FLACC scale was translated into 4 languages (Brazilian Portuguese, Chinese, Swedish, and Thai), and validity was assessed in 6 psychometric evaluation studies.7,31,32,83,96,97 Only 2 studies by the same authors focus on a similar population and circumstances (7 to 17-year olds with oncological disease) after translation into the same language (Brazilian Portuguese).31,32 These studies provide insufficient data to support the use of FLACC after translation to these 4 languages and do not contribute to validation of the English version of the scale.

The remaining studies (n = 3) each applied the FLACC scale to different populations and circumstances and cannot be grouped.59,72,108 Single studies do not provide sufficient validation data to draw conclusions about the validity of the scale applied to that population. These results are presented in Table 4.

3.3.4. Feasibility and utility

The utility of the FLACC scale has been evaluated on 9 occasions7,14,59,71,72,96,97,106,107 and the feasibility on 3 occasions97,98,106 in a range of populations and circumstances. Due to heterogeneity of studies, populations of children, and circumstances, it is not possible to confidently draw broad conclusions about scale feasibility or utility. Taddio et al.98 have made the most objective attempt to determine the feasibility of several behavioural scales used to assess procedural pain, including the FLACC scale. Three observers recorded a pain score after one viewing of video-taped segments of 120 infants undergoing an immunisation procedure, then viewed the segment as frequently as necessary to reach a final score. The correlations between these scores were almost perfect for all scales (0.97-0.99, FLACC = 0.98), and there was no difference in the proportion of final pain scores achieved after the first viewing across scales (50%-66%, FLACC = 50%, P = 0.06). Application of the FLACC scale took the longest time in total (5 hours 25 minutes to 6 hours 50 minutes, FLACC = 6 hours 55 minutes), and only 20% preferred the FLACC scale (80% preferred the MBPS).

Nine studies evaluated the clinical utility of the FLACC scale, across heterogeneous populations, circumstances, and/or after modification or translation.7,14,59,71,72,96,97,106,107 Only 1 cohort, children aged 4 to 19 years with cognitive impairment, was studied on more than 1 occasion (3 studies).71,106,107 Each of these studies was conducted by the same research group and after modification to individualise the descriptors for the children included in the study. These studies used similar approaches to each other to show good levels of agreement between observers' scores coded to clinically meaningful categories: “mild,” “moderate,” and “severe.”71,106,107 These data provide a reasonable foundation to accept the clinical utility of a modified FLACC scale applied to children aged 4 to 19 years with cognitive impairment.
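Agreement between observers on categorised scores of this kind is commonly summarised with Cohen's kappa. The sketch below illustrates the calculation only. The cut-offs used to code totals as “mild,” “moderate,” and “severe” (0 to 3, 4 to 6, and 7 to 10) are an assumption made for illustration and are not taken from the studies reviewed, and the observer scores are hypothetical.

```python
from collections import Counter


def categorise(total):
    """Code a FLACC total (0-10) into a category.

    Assumed cut-offs for illustration only: 0-3 mild, 4-6 moderate,
    7-10 severe. The reviewed studies may have used different bands.
    """
    if not 0 <= total <= 10:
        raise ValueError("FLACC total must be between 0 and 10")
    if total <= 3:
        return "mild"
    if total <= 6:
        return "moderate"
    return "severe"


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of cases where both raters chose the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical FLACC totals from two observers scoring the same 8 children.
observer_1 = [0, 2, 5, 7, 9, 4, 1, 6]
observer_2 = [1, 3, 4, 8, 6, 5, 2, 7]

kappa = cohens_kappa(
    [categorise(s) for s in observer_1],
    [categorise(s) for s in observer_2],
)
print(f"kappa = {kappa:.2f}")
```

Because kappa depends on where the category boundaries fall, results from studies using different cut-offs are not directly comparable.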

4. Discussion

This systematic review is the first comprehensive and robust review of the psychometric properties of the FLACC scale to be undertaken since it was developed. Previously published reviews assessing the psychometrics of the FLACC scale offer limited insight into the quality of the studies contributing evidence and make limited attempts to quantify the weight of evidence required to support their recommendations.27,110 This review attempts to address these limitations and provides a unique platform for making recommendations about the application of the FLACC scale in practice and identifying directions for future research and development.

The FLACC scale was developed because existing behavioural pain assessment scales were considered too long and difficult to score and remember and impractical to apply clinically.74 For example, the Children's Hospital of Eastern Ontario Pain Scale comprises 6 items that contribute to a total score ranging from 4 to 13.73 Similar to existing scales from which it was in part derived,8,15 the FLACC behaviours are scored on a consistent scale and the total score ranges from 0 to 10. The most obvious advance on existing scales is the potential ease with which clinicians might remember the behaviours as the first letter of each item has been used to name the scale (Table 1). Whilst feasibility was a primary reason for the development of the FLACC scale, it was not tested by the authors of the scale and has yet to be examined in the original population, circumstances, and setting. Taddio et al.98 completed a robust assessment of the feasibility of application of the FLACC scale to assess pain in infants undergoing immunisation. However, their conclusions suggest a clinician preference for the MBPS rather than the FLACC scale.
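The consistent scoring structure described above (5 items, each rated 0 to 2, summed to a total of 0 to 10) can be expressed in a few lines. This is an illustrative sketch of the arithmetic only, not a clinical tool: the function name `flacc_total` and the example ratings are hypothetical, and the behavioural descriptors that determine each item rating (Table 1) are not encoded.

```python
FLACC_ITEMS = ("face", "legs", "activity", "cry", "consolability")


def flacc_total(item_scores):
    """Sum the five FLACC item ratings (each 0, 1, or 2) into a 0-10 total."""
    missing = [item for item in FLACC_ITEMS if item not in item_scores]
    if missing:
        raise ValueError(f"missing item ratings: {missing}")
    for item in FLACC_ITEMS:
        if item_scores[item] not in (0, 1, 2):
            raise ValueError(f"{item} must be rated 0, 1, or 2")
    return sum(item_scores[item] for item in FLACC_ITEMS)


# Hypothetical ratings for a single observation.
example = {"face": 1, "legs": 2, "activity": 1, "cry": 0, "consolability": 2}
print(flacc_total(example))
```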

Procedural pain assessment presents unique challenges for a scale designed to assess postoperative pain in children. Although there is increasing concern about the use of physical restraint during procedures, in clinical practice, restraint continues to be used.25 No attempts have been made to determine the impact of restraint on the feasibility of using the FLACC scale where restraint is likely to directly interfere with the behaviour of the child, the capacity of carers to console, or the capacity of the assessor to assess the behaviour.

There is widespread acceptance that fear and anxiety generally accompany pain during a procedure and that the behaviours associated with these emotions may significantly modify or mimic the behaviours of children experiencing procedural pain. Data demonstrating the scale's responsiveness to pain and analgesics and the capacity of the scale to differentiate between known groups undergoing painful and nonpainful procedures are cited as evidence of validity. However, questions about the capacity of the FLACC scale to discriminate pain from fear can be raised from these same data. Babl et al.5 demonstrated the FLACC scale's responsiveness across the phases of a procedure. However, they note that FLACC scores were still high for nonpainful procedures and during the preparation phases of all procedures. These data were confirmed by other studies using responsiveness to support validity.88

Despite the data providing support for the validity of the FLACC scale, when examined closely, a number of concerns about the validity of the FLACC scale present themselves.47 For example, the results of 2 separate studies, one using the Facial Action Coding System39,46 and the other the Child Facial Coding System, demonstrate that infants and children rarely showed “jaw clenching” or “chin quivering” as an indication of pain, both of which are descriptors for the FLACC facial expression item.14,19 This is echoed in work completed by Breau et al.13 and more recently by Chang et al.,19 where observers coded the postoperative facial expressions of 44 infants and children aged 13 to 74 months using the facial items of the Child Facial Coding System and those found in common behavioural scales, including the FLACC. Results confirmed concerns about reliability, face, and convergent validity, and the authors concluded that where behavioural descriptors are inconsistent with what is observed empirically the scales are likely to perform more poorly.

Furthermore, several of the descriptors of the FLACC are open to interpretation as demonstrated by Harrison et al.,51 who showed in their recently published study that clinicians reinterpret the facial expression descriptors to include behaviours more commonly seen in infants and children experiencing pain and score accordingly. It is also unclear from the original scale description to what lengths efforts to console the child should be made before the consolability item is scored. This item is particularly problematic for procedural assessment where conduct of the procedure may impede attempts to console the child. Despite these concerns, the consolability item has been shown to make the largest contribution during needle insertion.47 Rigorous attempts to examine the descriptors for the items included in the scale and clarify how these items should be scored have not been attempted, so doubt remains about the accuracy of the descriptors and in turn this arguably forms the basis for concerns about the validity of the FLACC scale.

Many behavioural scales are derivatives of others, and the FLACC scale is no exception. Studies aiming to validate these scales frequently use correlation between the newly developed scale and other behavioural scales to claim validity. The logic of using behavioural scales all derived from each other or a similar foundation to validate each other seems circular and likely to confirm only that they all test the same construct, but not that this construct is necessarily pain. This is a problem often faced by researchers attempting to validate tools assessing a construct where there is no clear gold standard.89,94 To accept these correlations as evidence of the validity of the FLACC scale, the validity of the comparator scale must be established for the population and circumstances to which they are applied for comparison purposes. The authors of the studies included in this review using this approach claim validity for the comparator scales but frequently cite unconvincing evidence.

In the absence of a gold standard for comparison, researchers use multiple approaches to validation. Responsiveness is an example of a technique considered well suited to assessing the validity of pain scores. Unfortunately, many of the studies where the responsiveness of the FLACC scale to anticipated changes in pain is demonstrated did not blind observers to these circumstances. The data from 2 studies, including one measuring the reliability of the FLACC scale applied to children receiving an immunisation, confirm the notion that clinicians alter their scores to account for the circumstances at the time.33 Furthermore, as pain behaviours may also be the behaviours of a frightened child, responsiveness may in fact be to changes in the child's levels of fear and anxiety.

Psychometric properties are not intrinsic to the scale but rather reflect an interaction between the scale, the population to which it is applied, and the circumstances under which it is applied.94 This review has drawn attention to the diversity of populations and circumstances in which the FLACC scale has been tested and applied. Previous reviews have largely focused on the distinction between assessment of pain in children experiencing postoperative and procedural pain and have made recommendations for use of the FLACC scale for these 2 groups. It is tempting to consider that the data supporting the psychometrics of the scale applied to one cohort can contribute unreservedly to the evidence for the psychometrics of another. However, it is widely proposed and supported by a growing body of evidence that the pain behaviours of children vary significantly with age. A number of studies show that the behaviours of premature neonates are blunted when compared with those of their term contemporaries.24,60,61,78,79,111 The results of 2 studies from this review also challenge the notion that pain behaviours are consistent across age groups.83,112 Correlations between FLACC scores and self-report varied across the age groups included in these studies, 3 to 7 years old112 and 5 to 16 years old.83 It is not clear where the boundaries exist to distinguish one age-related cohort demonstrating unique pain behaviours from another. However, scales have been developed to assess neonates and preterm infants on the assumption that they are likely to exhibit unique responses to pain compared with older infants.
Until data are available to demonstrate that infants less than 2 months of age and children older than 7 years behave similarly to infants and children aged 2 months to 7 years, studies including infants and children outside the age limits for which FLACC was originally developed and tested must be considered to include a new population and will need reliability and validity data to address application in this population. Similarly, studies addressing application of the scale where there are other population or circumstantial differences are needed to provide psychometric data.

Unfortunately, the small numbers of studies evaluating the measurement properties of the FLACC scale in discrete and definable groups, the limitations in the methods of many of these studies, and the concerns about the validity of the item descriptors make it difficult to confidently make recommendations about the psychometrics of the FLACC scale. However, clinicians and researchers continue to seek guidance about the role of the FLACC scale in assessing pain in different populations of children and circumstances so despite these significant limitations, recommendations based on the strength of available evidence are provided.

4.1. Recommendations

The weight of evidence for the reliability and validity of the FLACC scale applied to infants and children aged 2 months to 7 years experiencing postoperative pain is sufficient to recommend the scale's use under these circumstances. However, in the absence of feasibility data, it is not possible to recommend the scale as feasible for practice. Similarly, as the evidence is limited to the results of 1 study, it is only possible to suggest that the scale may have clinical utility when applied to this population and setting.

4.1.1. Age

There are insufficient data supporting the measurement properties of the scale used to assess postoperative pain in infants younger than 2 months of age, in particular, in neonates, to recommend the FLACC for this age group. The body of evidence addressing application of the scale to older children is larger and is sufficient to cautiously recommend the FLACC scale as valid for assessing postoperative pain in older children. However, no recommendations about the scale's feasibility or clinical utility under these circumstances can be made.

4.1.2. Cognitive impairment

There are also sufficient data to cautiously recommend the FLACC scale, particularly after modification to better suit the child, as valid for assessing pain in children aged 4 to 19 years with cognitive impairment. Based on limited data, application in this cohort is probably feasible and likely to be clinically useful, but the data are insufficient to make a stronger assertion.

4.1.3. Procedural pain

For reasons elucidated earlier, accepting that available evidence supports the psychometrics of the FLACC scale used to measure procedural pain is problematic. Since publication of the 2007 reviews, data have been published that contribute some evidence to reliability, add to the validity, and suggest a clinician preference for an alternative to measure procedural pain but continue to leave some key questions unanswered. It is no longer possible to recommend the FLACC scale for procedural pain assessment despite the absence of an acceptable alternative.

4.1.4. Language and other modifications

The FLACC scale has been evaluated after modification, most often translation to another language. However, most are single studies of variable quality testing the measurement properties of the modified scale. Many of these studies report positive results but in the absence of a greater weight of evidence, no recommendation to support the scale's use in these circumstances can be made.

4.2. Future directions

There are a number of concerns regarding the FLACC scale that need to be addressed before recommendations for its use in clinical and research practice can be confirmed. Review of the appropriateness of the descriptors for the items of the scale, specifically the faces item, is urgently needed. This may culminate in modification of the scale, provoking the need for evaluation of the measurement properties of the newly modified scale. Before attempts to validate this or a modified version of the scale take place, the feasibility of using the scale in various populations and in a range of circumstances and clinical settings should also be explored. Adaptation of the scale or the descriptors to account for the circumstances of a procedure, eg, restraint, may also be needed to improve the feasibility and validity of the scale for these circumstances. Furthermore, modifications to the scale should be informed by a better understanding of the measurement properties of individual scale items and their relationship with the other scale items and the total score. To date, there are insufficient data for the various populations and circumstances to which the scale is applied in these studies to support the measurement properties of the scale items.

Additionally, the capacity for scales to discriminate between pain and distress is a cause for concern. Blount contends that behavioural indices are unlikely to be specific to pain or distress, which would make this discrimination unachievable.11 However, to continue using the FLACC scale in circumstances where the aim is to measure pain independently of distress, for example, in studies evaluating the efficacy of pain relief, data demonstrating the capacity of the FLACC scale to discriminate between pain and distress are required. Finally, more compelling evidence demonstrating reliability, validity, and clinical utility for the range of populations, circumstances, and settings to which the scale is applied is also needed.

To provide this psychometric data, careful study design using robust methods for validation is required to reduce the bias that detracts from the results of many existing studies. The tools used in this review such as the COSMIN checklist, although not designed for this purpose, could be used to guide the development of the methods for future psychometric evaluation studies.

Finally, there are some promising data to support future research efforts focusing on the FLACC scale. However, review of alternate behavioural scales and pain assessment modalities to identify a method with better psychometric and practical properties than the FLACC scale for the assessment of pain in children is also warranted.

4.3. Limitations

There are a number of limitations to this review. A positive publication bias may well mean that unpublished data are available that conflict with the bulk of the published data supporting the reliability, validity, and clinical utility of the FLACC scale. Similarly, excluding studies not published in English may have excluded a source of data that support translation of FLACC to another language. Restriction of the focus of this study to patients from birth to 18 years means that data examining the measurement properties of FLACC in adults have not been included. The role of FLACC to assess pain in nonverbal adults is being increasingly explored but readers should be cautioned against generalising the results of this review to this population.

Only a small number of studies provided data addressing detailed item analysis, and none applied the FLACC under consistent circumstances or to a consistent population. Hence, item analysis has not been provided in this review.

There were no appropriate objective criteria available to quantify the weight of evidence considered sufficient to demonstrate the measurement properties of an assessment tool. In this review, the weight of evidence considered adequate was derived from a subjective assessment by the authors of this review.

5. Conclusions

The results of this review challenge the long-held view that the strong psychometric properties of the FLACC scale support its use to assess pain in children from infancy to adolescence in a range of circumstances. The data used to support the psychometric properties of the scale, in particular validity, are either absent or limited and frequently derived from studies with methodological flaws. Continued application of a scale designed for postoperative pain assessment to procedural pain assessment is unsupported. It is clear that further work is required to provide a foundation from which confident recommendations about the future of the FLACC scale in paediatric pain assessment can be made.

Conflict of interest statement

The authors have no conflicts of interest to declare.

Acknowledgements

D. J. Crellin is a trainee member of Pain in Child Health, a research training initiative of the Canadian Institutes of Health Research.

Supplemental Digital Content

Supplemental Digital Content associated with this article can be found online at http://links.lww.com/PAIN/A134, and http://links.lww.com/PAIN/A135.

References

[1]. Ahn Y, Jun Y. Measurement of pain-like response to various NICU stimulants for high-risk infants. Early Hum Dev 2007;83:255–62.
[2]. Amin SM. Evaluation of gabapentin and dexamethasone alone or in combination for pain control after adenotonsillectomy in children. Saudi J Anaesth 2014;8:317.
[3]. Anand VG, Kannan M, Thavamani A, Bridgit MJ. Effects of dexmedetomidine added to caudal ropivacaine in paediatric lower abdominal surgeries. Indian J Anaesth 2011;55:340–6.
[4]. Ashrey E, Bosat B. Single-injection penile block versus caudal block in penile pediatric surgery. Ain-Shams J Anaesthesiol 2014;7:428.
[5]. Babl FE, Crellin D, Cheng J, Sullivan TP, O'Sullivan R, Hutchinson A. The use of the faces, legs, activity, cry and consolability scale to assess procedural pain and distress in young children. Pediatr Emerg Care 2012;28:1281–96.
[6]. Babl FE, Goldfinch C, Mandrawa C, Crellin D, O'Sullivan R, Donath S. Does nebulized lidocaine reduce the pain and distress of nasogastric tube insertion in young children? A randomized, double-blind, placebo-controlled trial. Pediatrics 2009;123:1548–55.
[7]. Bai J, Hsu L, Tang Y, van Dijk M. Validation of the COMFORT Behavior scale and the FLACC scale for pain assessment in Chinese children after cardiac surgery. Pain Manag Nurs 2012;13:18–26.
[8]. Barrier G, Attia J, Mayer MN, Amiel-Tison C, Shnider SM. Measurement of post-operative pain and narcotic administration in infants using a new clinical scoring system. Intensive Care Med 1989;15(suppl 1):S37–39.
[9]. Batra YK, Rajeev S, Panda NB, Lokesh VC, Rao KLN. Intrathecal neostigmine with bupivacaine for infants undergoing lower abdominal and urogenital procedures: dose response. Acta Anaesthesiol Scand 2009;53:470–5.
[10]. Bharti N, Praveen R, Bala I. A dose-response study of caudal dexmedetomidine with ropivacaine in pediatric day care patients undergoing lower abdominal and perineal surgeries: a randomized controlled trial. Paediatr Anaesth 2014;24:1158–63.
[11]. Blount RL, Loiselle KA. Behavioural assessment of pediatric pain. Pain Res Manag 2009;14:47–52.
[12]. Boots BK, Edmundson EE. A controlled, randomised trial comparing single to multiple application lidocaine analgesia in paediatric patients undergoing urethral catheterisation procedures. J Clin Nurs 2010;19:744–8.
[13]. Breau LM, McGrath PJ, Craig KD, Santor D, Cassidy KL, Reid GJ. Facial expression of children receiving immunizations: a principal components analysis of the child facial coding system. Clin J Pain 2001;17:178–86.
[14]. Bringuier S, Picot MC, Dadure C, Rochette A, Raux O, Boulhais M, Capdevila X. A prospective comparison of post-surgical behavioral pain scales in preschoolers highlighting the risk of false evaluations. PAIN 2009;145:60–8.
[15]. Broadman L, Rice L, Hannallah RS. Testing the validity of an objective pain scale for infants and children. Anesthesiology 1988;69:A770.
[16]. Brown NJ, Kimble RM, Rodger S, Ware RS, Cuttle L. Play and heal: randomized controlled trial of Ditto intervention efficacy on improving re-epithelialization in pediatric burns. Burns 2014;40:204–13.
[17]. Chadha NK, Lam GOA, Ludemann JP, Kozak FK. Intranasal topical local anesthetic and decongestant for flexible nasendoscopy in children: a randomized, double-blind, placebo-controlled trial. JAMA Otolaryngol Head Neck Surg 2013;139:1301–5.
[18]. Chandler JR, Myers D, Mehta D, Whyte E, Groberman MK, Montgomery CJ, Ansermino JM. Emergence delirium in children: a randomized trial to compare total intravenous anesthesia with propofol and remifentanil to inhalational sevoflurane anesthesia. Paediatr Anaesth 2013;23:309–15.
[19]. Chang J, Versloot J, Fashler SR, McCrystal KN, Craig KD. Pain assessment in children: validity of facial expression items in observational pain scales. Clin J Pain 2015;31:189–97.
[20]. Cho JE, Kim JY, Hong JY, Kil HK. The addition of fentanyl to 1.5 mg/ml ropivacaine has no advantage for paediatric epidural analgesia. Acta Anaesthesiol Scand 2009;53:1084–7.
[21]. Chorney JM, Tan ET, Martin SR, Fortier MA, Kain ZN. Children's behavior in the postanesthesia care unit: the development of the child behavior coding system-PACU (CBCS-P). J Pediatr Psychol 2011;37:338–47.
[22]. Cochrane Collaboration. Data collection form for intervention reviews: RCTs and non-RCTs. The Cochrane Collaboration, 2014. http://www.cochrane.org/search/site/Data%20collection%20form%20Intervention%20review%20%E2%80%93%20RCTs%20and%20non-RCTs. Accessed 8 September 2014.
[23]. Cohen LL, La Greca AM, Blount RL, Kazak AE, Holmbeck GN, Lemanek KL. Introduction to special issue: evidence-based assessment in pediatric psychology. J Pediatr Psychol 2008;33:911–15.
[24]. Craig KD, Whitfield MF, Grunau RV, Linton J, Hadjistavropoulos HD. Pain in the preterm neonate: behavioural and physiological indices. PAIN 1993;52:287–99.
[25]. Crellin D, Babl FE, Sullivan TP, Cheng J, O'Sullivan R, Hutchinson A. Procedural restraint use in preverbal and early-verbal children. Pediatr Emerg Care 2011;27:622–7.
[26]. Crellin D, Santamaria N, Babl F, Harrison D. A systematic review of the Faces Legs Activity Cry Consolability (FLACC) pain scale for assessment of pain in children. 2014. PROSPERO. International Prospective Register of Systematic Reviews. Available at: http://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42014014296. Accessed 17 October 2014.
[27]. Crellin D, Sullivan TP, Babl FE, O'Sullivan R, Hutchinson A. Analysis of the validation of existing behavioral pain and distress scales for use in the procedural setting. Paediatr Anaesth 2007;17:720–33.
[28]. Crichton A, Knight S, Oakley E, Babl FE, Anderson V. Fatigue in child chronic health conditions: a systematic review of assessment instruments. Pediatrics 2015;135:e1015–31.
[29]. Curry DM, Brown C, Wrona S. Effectiveness of oral sucrose for pain management in infants during immunizations. Pain Manag Nurs 2012;13:139–49.
[30]. Curtis SJ, Jou H, Ali S, Vandermeer B, Klassen T. A randomized controlled trial of sucrose and/or pacifier as analgesia for infants receiving venipuncture in a pediatric emergency department. BMC Pediatr 2007;7:27.
[31]. da Silva FC, Santos Thuler LC, de Leon-Casasola OA. Validity and reliability of two pain assessment tools in Brazilian children and adolescents. J Clin Nurs 2011;20:1842–8.
[32]. da Silva FC, Thuler LCS. Cross-cultural adaptation and translation of two pain assessment tools in children and adolescents. J Pediatr 2008;84:344–9.
[33]. De Ruddere L, Goubert L, Stevens MA, Deveugele M, Craig KD, Crombez G. Health care professionals' reactions to patient pain: impact of knowledge about medical evidence and psychosocial influences. J Pain 2014;15:262–70.
[34]. de Tovar C, von Baeyer CL, Wood C, Alibeu JP, Houfani M, Arvieux C. Postoperative self-report of pain in children: interscale agreement, response to analgesic, and preference for a faces scale and a visual analogue scale. Pain Res Manag 2010;15:163–8.
[35]. Decruynaere C, Thonnard JL, Plaghki L. How many response levels do children distinguish on faces scales for pain assessment? Eur J Pain 2009;13:641–8.
[36]. Dewhirst E, Fedel G, Raman V, Rice J, Barry ND, Jatana KR, Elmaraghy C, Merz M, Tobias JD. Pain management following myringotomy and tube placement: intranasal dexmedetomidine versus intranasal fentanyl. Int J Pediatr Otorhinolaryngol 2014;78:1090–4.
[37]. Diao M, Li L, Cheng W. To drain or not to drain in Roux-en-Y hepatojejunostomy for children with choledochal cysts in the laparoscopic era: a prospective randomized study. J Pediatr Surg 2012;47:1485–9.
[38]. Duhn LJ, Medves JM. A systematic integrative review of infant pain assessment tools. Adv Neonatal Care 2004;4:126–40.
[39]. Ekman P, Friesen WV. Facial action coding system: a technique for the measurement of facial movement. Palo Alto: Consulting Psychologists Press, 1978.
[40]. El-Sharkawi HFA, El-Housseiny AA, Aly AM. Effectiveness of new distraction technique on pain associated with injection of local anesthesia for children. Pediatr Dent 2012;34:e35–8.
[41]. Elshammaa N, Chidambaran V, Housny W, Thomas J, Zhang X, Michael R. Ketamine as an adjunct to fentanyl improves postoperative analgesia and hastens discharge in children following tonsillectomy—a prospective, double-blinded, randomized study. Paediatr Anaesth 2011;21:1009–14.
[42]. Fernandes ML, Pires KCC, Tiburcio MA, Gomez RS. Caudal bupivacaine supplemented with morphine or clonidine, or supplemented with morphine plus clonidine in children undergoing infra-umbilical urological and genital procedures: a prospective, randomized and double-blind study. J Anesth 2012;26:213–18.
[43]. Fournier-Charrière E, Tourniaire B, Carbajal R, Cimerman P, Lassauge F, Ricard C, Reiter F, Turquin P, Lombart B, Letierce A. EVENDOL, a new behavioral pain scale for children ages 0 to 7 years in the emergency department: design and validation. PAIN 2012;153:1573–82.
[44]. Frawley GP, Downie S, Huang GH. Levobupivacaine caudal anesthesia in children: a randomized double-blind comparison with bupivacaine. Paediatr Anaesth 2006;16:754–60.
[45]. Ghai B, Ram J, Makkar JK, Wig J, Kaushik S. Subtenon block compared to intravenous fentanyl for perioperative analgesia in pediatric cataract surgery. Anesth Analg 2009;108:1132–8.
[46]. Gilbert CA, Lilley CM, Craig KD, McGrath PJ, Court CA, Bennett SM, Montgomery CJ. Postoperative pain expression in preschool children: validation of the child facial coding system. Clin J Pain 1999;15:192–200.
[47]. Gomez RJ, Barrowman N, Elia S, Manias E, Royle J, Harrison D. Establishing intra- and inter-rater agreement of the face, legs, activity, cry, consolability scale for evaluating pain in toddlers during immunization. Pain Res Manag 2013;18:e124–8.
[48]. Grove GL, Zerweck CR, Ekholm BP, Smith GE, Koski NI. Randomized comparison of a silicone tape and a paper tape for gentleness in healthy children. J Wound Ostomy Continence Nurs 2014;41:40–8.
[49]. Hall NJ, Pacilli M, Eaton S, Reblock K, Gaines BA, Pastor A, Langer JC, Koivusalo AI, Pakarinen MP, Stroedter L, Beyerlein S, Haddad M, Clarke S, Ford H, Pierro A. Recovery after open versus laparoscopic pyloromyotomy for pyloric stenosis: a double-blind multicentre randomised controlled trial. Lancet 2009;373:390–8.
[50]. Hamers JP, Huijer Abu-Saad H, Geisler FE, van den Hout MA, Schouten HJ, Halfens RJ, van Suijlekom HA. The effect of paracetamol, fentanyl, and systematic assessments on children's pain after tonsillectomy and adenoidectomy. J Perianesth Nurs 1999;14:357–66.
[51]. Harrison D, Sampson M, Reszel J, Abdulla K, Barrowman N, Cumber J, Fuller A, Li C, Nicholls S, Pound CM. Too many crying babies: a systematic review of pain management practices during immunizations on YouTube. BMC Pediatr 2014;14:134.
[52]. Hartrick CT, Kovan JP. Pain assessment following general anesthesia using the Toddler Preschooler Postoperative Pain Scale: a comparative study. J Clin Anesth 2002;14:411–15.
[53]. Hippard HK, Govindan K, Friedman EM, Sulek M, Giannoni C, Larrier D, Minard CG, Watcha MF. Postoperative analgesic and behavioral effects of intranasal fentanyl, intravenous morphine, and intramuscular morphine in pediatric patients undergoing bilateral myringotomy and placement of ventilating tubes. Anesth Analg 2012;115:356–63.
[54]. Hong JY, Han S, Kim W, Kim E, Kil H. Effect of dexamethasone in combination with caudal analgesia on postoperative pain control in day-case paediatric orchiopexy. Br J Anaesth 2010;105:506–10.
[55]. Hong JY, Lee I, Shin S, Park E, Ban S, Cho J, Kil H. Caudal midazolam does not affect sevoflurane requirements and recovery in pediatric day-case hernioplasty. Acta Anaesthesiol Scand 2008;52:1411–14.
[56]. Hughes J, Lindup M, Wright S, Naik M, Dhesi R, Howard R, Sommerlad B, Kangesu L, Sury M. Does nasogastric feeding reduce distress after cleft palate repair in infants? Nurs Child Young People 2013;25:26–30.
[57]. Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJ, Gavaghan DJ, McQuay HJ. Assessing the quality of reports of randomized clinical trials: is blinding necessary? Control Clin Trials 1996;17:1–12.
[58]. Jindal P, Khurana G, Dvivedi S, Sharma JP. Intra and postoperative outcome of adding clonidine to bupivacaine in infraorbital nerve block for young children undergoing cleft lip surgery. Saudi J Anaesth 2011;5:289–94.
[59]. Johansson M, Kokinsky E. The COMFORT behavioural scale and the modified FLACC scale in paediatric intensive care. Nurs Crit Care 2009;14:122–30.
[60]. Johnston CC, Stevens B, Craig KD, Grunau RV. Developmental changes in pain expression in premature, full-term, two- and four-month-old infants. PAIN 1993;52:201–8.
[61]. Johnston CC, Stevens B, Yang F, Horton L. Developmental changes in response to heelstick in preterm infants: a prospective cohort study. Dev Med Child Neurol 1996;38:438–45.
[62]. Jonnavithula N, Durga P, Kulkarni DK, Ramachandran G. Bilateral intra-oral, infra-orbital nerve block for postoperative analgesia following cleft lip repair in paediatric patients: comparison of bupivacaine vs bupivacaine-pethidine combination. Anaesthesia 2007;62:581–5.
[63]. Jonnavithula N, Durga P, Madduri V, Ramachandran G, Nuvvula R, Srikanth R, Damalcheruvu MR. Efficacy of palatal block for analgesia following palatoplasty in children with cleft palate. Paediatr Anaesth 2010;20:727–33.
[64]. Kil HK, Kim WO, Han SW, Kwon Y, Lee A, Hong JY. Psychological and behavioral effects of chloral hydrate in day-case pediatric surgery: a randomized, observer-blinded study. J Pediatr Surg 2012;47:1592–9.
[65]. Kim CH, Yoon JU, Lee HJ, Shin SW, Yoon JY, Byeon GJ. Availability of a 5% lidocaine patch used prophylactically for venipuncture- or injection-related pain in children. J Anesth 2012;26:552–5.
[66]. Kim NY, Kim SY, Yoon HJ, Kil HK. Effect of dexmedetomidine on sevoflurane requirements and emergence agitation in children undergoing ambulatory surgery. Yonsei Med J 2014;55:209–15.
[67]. Kundu A, Lin Y, Oron AP, Doorenbos AZ. Reiki therapy for postoperative oral pain in pediatric patients: pilot data from a double-blind, randomized clinical trial. Complement Ther Clin Pract 2014;20:21–5.
[68]. Loetwiriyakul W, Asampinwat T. Caudal block with 3 mg/kg bupivacaine for intraabdominal surgery in pediatric patients: a randomized study. Asian Biomed 2011;5:93–9.
[69]. Lorenzo AJ, Lynch J, Matava C, El-Beheiry H, Hayes J. Ultrasound guided transversus abdominis plane vs surgeon administered intraoperative regional field infiltration with bupivacaine for early postoperative pain control in children undergoing open pyeloplasty. J Urol 2014;192:207–13.
[70]. Lucas NP, Macaskill P, Irwig L, Bogduk N. The development of a quality appraisal tool for studies of diagnostic reliability (QAREL). J Clin Epidemiol 2010;63:854–61.
[71]. Malviya S, Voepel-Lewis T, Burke C, Merkel S, Tait AR. The revised FLACC observational pain tool: improved reliability and validity for pain assessment in children with cognitive impairment. Paediatr Anaesth 2006;16:258–65.
[72]. Manworren RCB, Hynan LS. Clinical validation of FLACC: preverbal patient pain scale. Pediatr Nurs 2003;29:140–6.
[73]. McGrath PJ, Johnson G, Goodman JT, Schillinger J, Dunn J, Chapman J. CHEOPS: a behavioural scale for rating postoperative pain in children. In: Fields HL, Dubner R, Cervero F, editors. Advances in Pain Research and Therapy, Volume 9. New York: Raven Press, 1985. p. 395–402.
[74]. Merkel SI, Voepel-Lewis T, Shayevitz JR, Malviya S. The FLACC: a behavioral scale for scoring postoperative pain in young children. Pediatr Nurs 1997;23:293–7.
[75]. Miller K, Rodger S, Bucolo S, Greer R, Kimble RM. Multi-modal distraction. Using technology to combat pain in young children with burn injuries. Burns 2010;36:647–58.
[76]. Miller K, Rodger S, Kipping B, Kimble R. A novel technology approach to pain management in children with burns: a prospective randomized controlled trial. Burns 2011;37:395–405.
[77]. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 2009;6:e1000097.
[78]. Morison SJ, Grunau RE, Oberlander TF, Whitfield MF. Relations between behavioral and cardiac autonomic reactivity to acute pain in preterm neonates. Clin J Pain 2001;17:350–8.
[79]. Morison SJ, Holsti L, Grunau RE, Whitfield MF, Oberlander TF, Chan HW, Williams L. Are there developmentally distinct motor indicators of pain in preterm infants? Early Hum Dev 2003;72:131–46.
[80]. Natarajan Surendar M, Kumar Pandey R, Kumar Saksena A, Kumar R, Chandra G. A comparative evaluation of intranasal dexmedetomidine, midazolam and ketamine for their sedative and analgesic properties: a triple blind randomized study. J Clin Pediatr Dent 2014;38:255–61.
[81]. Newbury C, Herd DW. Amethocaine versus EMLA for successful intravenous cannulation in a children's emergency department: a randomised controlled study. Emerg Med J 2009;26:487–91.
[82]. Nilsson S, Enskär K, Hallqvist C, Kokinsky E. Active and passive distraction in children undergoing wound dressings. J Pediatr Nurs 2013;28:158–66.
[83]. Nilsson S, Finnstrom B, Kokinsky E. The FLACC behavioral scale for procedural pain assessment in children aged 5-16 years. Paediatr Anaesth 2008;18:767–74.
[84]. Nilsson S, Kokinsky E, Nilsson U, Sidenvall B, Enskar K. School-aged children's experiences of postoperative music medicine on pain, distress, and anxiety. Paediatr Anaesth 2009;19:1184–90.
[85]. Nord D, Belew J. Effectiveness of the essential oils lavender and ginger in promoting children's comfort in a perianesthesia setting. J Perianesth Nurs 2009;24:307–12.
[86]. Ramelet AS, Rees N, McDonald S, Bulsara M, Abu-Saad HH. Development and preliminary psychometric testing of the Multidimensional Assessment of Pain Scale: MAPS. Paediatr Anaesth 2007;17:333–40.
[87]. Ramelet AS, Rees NW, McDonald S, Bulsara MK, Huijer Abu-Saad H. Clinical validation of the multidimensional assessment of pain scale. Paediatr Anaesth 2007;17:1156–65.
[88]. Ranger M, Johnston CC, Rennick JE, Limperopoulos C, Heldt T, du Plessis AJ. A multidimensional approach to pain assessment in critically ill infants during a painful procedure. Clin J Pain 2013;29:613–20.
[89]. Rutjes AWS, Reitsma JB, Coomarasamy A, Khan KS, Bossuyt PMM. Evaluation of diagnostic tests when there is no gold standard. A review of methods. Health Technol Assess 2007;11:iii, ix–51.
[90]. Saha N, Saha DK, Rahman MA, Islam MK, Aziz MA. Comparison of post operative morbidity between laparoscopic and open appendectomy in children. Mymensingh Med J 2010;19:348–52.
[91]. Sethi S, Ghai B, Ram J, Wig J. Postoperative emergence delirium in pediatric patients undergoing cataract surgery—a comparison of desflurane and sevoflurane. Paediatr Anaesth 2013;23:1131–7.
[92]. Singh R, Kharbanda M, Sood N, Mahajan V, Chatterji C. Comparative evaluation of incidence of emergence agitation and post-operative recovery profile in paediatric patients after isoflurane, sevoflurane and desflurane anaesthesia. Indian J Anaesth 2012;56:156–61.
[93]. Stanford EA, Chambers CT, Craig KD. The role of developmental factors in predicting young children's use of a self-report scale for pain. PAIN 2006;120:16–23.
[94]. Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use. Oxford: Oxford University Press, 2008.
[95]. Stuth EA, Berens RJ, Staudt S, Robertson FA, Scott JP, Stucke AG, Hoffman GM, Troshynski TJ, Tweddell JS, Zuperku EJ. The effect of caudal vs intravenous morphine on early extubation and postoperative analgesic requirements for stage 2 and 3 single-ventricle palliation: a double blind randomized trial. Paediatr Anaesth 2011;21:441–53.
[96]. Suraseranivongse S, Kraiprasit K, Petcharatana S. Postoperative pain assessment in ambulatory pediatric patients by parents. J Med Assoc Thai 2002;85(suppl 3):S917–22.
[97]. Suraseranivongse S, Santawat U, Kraiprasit K, Petcharatana S, Prakkamodom S, Muntraporn N. Cross-validation of a composite pain scale for preschool children within 24 hours of surgery. Br J Anaesth 2001;87:400–5.
[98]. Taddio A, Hogan ME, Moyer P, Girgis A, Gerges S, Wang L, Ipp M. Evaluation of the reliability, validity and practicality of 3 measures of acute pain in infants undergoing immunization injections. Vaccine 2011;29:1390–4.
[99]. Takmaz SA, Uysal HY, Uysal A, Kocer U, Dikmen B, Baltaci B. Bilateral extraoral, infraorbital nerve block for postoperative pain relief after cleft lip repair in pediatric patients: a randomized, double-blind controlled study. Ann Plast Surg 2009;63:59–62.
[100]. Terwee CB, Mokkink LB, Knol DL, Ostelo RW, Bouter LM, de Vet HC. Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist. Qual Life Res 2012;21:651–7.
[101]. Townsend JA, Ganzberg S, Thikkurissy S. The effect of local anesthetic on quality of recovery characteristics following dental rehabilitation under general anesthesia in children. Anesth Prog 2009;56:115–22.
[102]. Tsze DS, von Baeyer CL, Bulloch B, Dayan PS. Validation of self-report pain scales in children. Pediatrics 2013;132:e971–9.
[103]. van Tulder M, Furlan A, Bombardier C, Bouter L; The Editorial Board of the Cochrane Collaboration Back Review Group. Updated method guidelines for systematic reviews in the Cochrane Collaboration Back Review Group. Spine 2003;28:1290–9.
[104]. Vaughan M, Paton EA, Bush A, Pershad J. Does lidocaine gel alleviate the pain of bladder catheterization in young children? A randomized, controlled trial. Pediatrics 2005;116:917–20.
[105]. Voepel-Lewis T, Malviya S, Tait AR. Validity of parent ratings as proxy measures of pain in children with cognitive impairment. Pain Manag Nurs 2005;6:168–74.
[106]. Voepel-Lewis T, Malviya S, Tait AR, Merkel S, Foster R, Krane EJ, Davis PJ. A comparison of the clinical utility of pain assessment tools for children with cognitive impairment. Anesth Analg 2008;106:72–8.
[107]. Voepel-Lewis T, Merkel S, Tait AR, Trzcinka A, Malviya S. The reliability and validity of the Face, Legs, Activity, Cry, Consolability observational tool as a measure of pain in children with cognitive impairment. Anesth Analg 2002;95:1224–9.
[108]. Voepel-Lewis T, Zanotti J, Dammeyer JA, Merkel S. Reliability and validity of the face, legs, activity, cry, consolability behavioral tool in assessing acute pain in critically ill patients. Am J Crit Care 2010;19:55–61.
[109]. Voepel-Lewis TD, Malviya S, Burke C, D'Agostino R, Hadden SM, Siewert M, Tait AR. Evaluation of simethicone for the treatment of postoperative abdominal discomfort in infants. J Clin Anesth 1998;10:91–4.
[110]. von Baeyer CL, Spagrud LJ. Systematic review of observational (behavioral) measures of pain for children and adolescents aged 3 to 18 years. PAIN 2007;127:140–50.
[111]. Williams AL, Khattak AZ, Garza CN, Lasky RE. The behavioral pain response to heelstick in preterm neonates studied longitudinally: description, development, determinants, and components. Early Hum Dev 2009;85:369–74.
[112]. Willis MHW, Merkel SI, Voepel-Lewis T, Malviya S. FLACC behavioural pain assessment scale: a comparison with a child's self-report. Pediatr Nurs 2003;29:195–8.
[113]. Zhou H, Roberts P, Horgan L. Association between self-report pain ratings of child and parent, child and nurse and parent and nurse dyads: meta-analysis. J Adv Nurs 2008;63:334–42.
[114]. Zier JL, Rivard PF, Krach LE, Wendorf HR. Effectiveness of sedation using nitrous oxide compared with enteral midazolam for botulinum toxin A injections in children. Dev Med Child Neurol 2008;50:854–8.
Keywords: Pain assessment; FLACC scale; Infant; Child; Psychometric properties

© 2015 International Association for the Study of Pain