Czaikowski, Brianna L.; Liang, Hong; Stewart, C. Todd
Despite advances in technology and state-of-the-art monitoring devices, a comprehensive clinical assessment is the means to recognize subtle changes in a patient’s neurological status, whether conscious or not. It also must allow for accurate and consistent communication of these changes by nurses and other healthcare providers. To assist in this assessment and thereby provide quality patient care, several validated neurological assessment tools have been developed over the past 50 years.
In 1966, one of the first neurological assessment tools developed was the Ommaya “vital sign” card (Cohen, 2009; Ommaya, 1966). In 1974, it was expanded by Teasdale and Jennett (1974) into the Coma Index, then became known as the Glasgow Coma Scale (GCS). The GCS is used internationally, in both prehospital and hospital settings, to predict morbidity, mortality, and long-term outcomes in acute neuroscience patients. Over the years, many efforts have been made to improve the GCS, yet these tools have rarely been accepted into practice (Cohen, 2009).
The Full Outline of UnResponsiveness (FOUR) Score Coma Scale was developed in 2005 by researchers at the Mayo Clinic, more specifically, Dr. Eelco Wijdicks, to enhance the clinical assessment of patients by improving communication among healthcare personnel (Wijdicks, Bamlet, Maramattom, Manno, & McClelland, 2005). This original FOUR Score Scale was aimed toward adult patients who were not sedated; however, further studies showed its simplicity, ability to be used outside of the neurological intensive care unit (ICU), and ability to assess patients receiving mild sedation.
The FOUR Score Scale was validated in the adult population and proven to be an effective alternative to the GCS (which was tested and validated in only adult trauma, mainly head injury, patients); however, it was not validated in a pediatric population until 2009. The Cohen (2009) study focused on validating the FOUR Score Scale in pediatric patients, with the exclusion of patients receiving sedatives and/or neuromuscular blocking agents. Cohen found the overall reliability of the FOUR Score Scale to be excellent in this population. On the basis of these findings, we wanted to expand on this area of work by modifying the original FOUR Score Scale specifically for pediatric patients of all age groups and across all developmental stages.
The purpose of this study was to determine whether the Pediatric FOUR Score Scale (PFSS) could enhance the clinical neurological assessment of pediatric intensive care patients, including those intubated and/or sedated, at our children’s hospital. This would be assessed through analysis of the interrater reliability of the PFSS used by nurses and the evaluation of the validity of the PFSS in predicting morbidity, mortality, and long-term outcomes compared with the GCS.
Loss of consciousness can occur any time there is an interruption in blood flow or oxygenation to the brain, because the brain is the most metabolically active organ in the body, and it has no effective way to store oxygen and glucose. Accurate assessments of the level of consciousness lost are necessary in evaluating a patient’s neurological status and therefore determining appropriate medical care (Cohen, 2009).
The GCS measures three aspects of consciousness: eye, motor, and verbal responses. A numeric value is assigned to each response; the highest total score is 15, and the lowest, 3, is given to patients with brain death (Matis & Birbilis, 2008; Rabiu, 2011). Although the GCS has been widely used in all aspects of healthcare, limitations exist, especially in ICUs, because it was only tested and validated in adult trauma, mostly head injury, patients. These limitations include, but are not limited to, the following: (1) the verbal component cannot be tested in intubated patients; (2) sedated patients cannot respond appropriately; (3) it does not include brainstem reflexes or changes in breathing patterns, which reflect severity of coma; and (4) it does not include pediatric developmental milestones (Cohen, 2009; Matis & Birbilis, 2008; Rabiu, 2011; Wijdicks et al., 2005). Other scales have been developed to overcome some of the limitations of the GCS, but they have fallen short on interrater reliability and are seldom used outside their country of origin or area of medical specialty.
The FOUR Score Scale consists of four components (eye response, motor response, brainstem reflexes, and respiration). It also recognizes the locked-in syndrome and stages of herniation, providing families with a better understanding of neurological predictive outcome. The scale has been validated among the adult population, and its interrater reliability and validity have also been documented within multiple areas of nursing, such as the neuro-ICU, medical ICU (MICU), emergency department (ED), and pediatric populations (Cohen, 2009; Iyer et al., 2009; Stead et al., 2009).
The original study (Wijdicks et al., 2005) included 120 adult intensive care patients, categorized to cover all realms of the neurological population, and compared the interrater reliability of the FOUR Score Scale with that of the GCS among neurointensive nurses, neurology residents, and neurointensivists. The scale showed excellent interrater reliability (κw = 0.82; 95% confidence interval [CI] [0.77, 0.88]) and a high degree of internal consistency (Cronbach’s α = 0.86 for the first rater and 0.87 for the second rater). To assess the predictive validity of the FOUR Score Scale, its sensitivity and specificity were compared with those of the GCS. The Modified Rankin Scale was used as the “gold standard” to assess the outcome of the patients, including in-hospital mortality and clinical diagnosis of brain death, and for comparison with the FOUR Score Scale and the GCS. The FOUR Score Scale was also found to provide greater neurological detail than the GCS (Wijdicks et al., 2005). Its significant advantages included elimination of the verbal component and incorporation of brainstem reflexes.
In 2008, Wijdicks published another cohort study composed of 69 patients with neurological complaints admitted in the ED. The raters using the FOUR Score Scale and the GCS were ED physicians, residents, and nurses. The overall kappa reported for the FOUR Score Scale was 0.882, compared with 0.862 for the GCS. Limitations included that approximately half of the studied population consisted of “alert” patients; therefore, incorporating more stuporous or comatose patients would have been desirable (Stead et al., 2009).
Dr. Wijdicks expanded his studies in 2009 to other populations of inpatients, including those in the MICU. The raters for this study included nurses, fellows, and ICU staff consultants. Excellent interrater reliability among staff with a κ value of 0.96 or greater was reported. Also, the FOUR Score Scale correctly identified 100 MICU patients with neurological disability of acute metabolic derangements, sepsis, and shock, meaning it matched the outcome of their other assessment criteria, the GCS. This study concluded that the FOUR Score Scale was a good predictor of the patient’s prognosis, even when expanding its use to include critically ill patients, and again pointed out many advantages over the GCS in another hospital setting (Iyer et al., 2009).
Also in 2009, the FOUR Score Scale was validated in a pediatric population. This application study followed 60 neuroscience patients, aged 2–18 years, over a 1-year period. Patients who received sedatives and neuromuscular blockades were excluded. Consistent with the original 2005 study, the Modified Rankin Scale was used to compare neurological functional outcomes. The raters for this study were 35 pediatric critical care nurses with clinical care experience ranging from 1 to 40 years. Comparatively, the overall reliability for the pediatric GCS was good (κw = 0.74, 95% CI [0.59, 0.87]) but was found to be excellent for the FOUR Score Scale (κw = 0.95, 95% CI [0.91, 0.99]). Sensitivity and specificity were incorporated into the calculation of the predictive validity (Cohen, 2009).
When doing a literature search to determine if the FOUR Score Scale was right for our children’s hospital, we noted that there was no current literature on a PFSS and that there were some limitations to the previous studies related to our specific patient population. Our eight-bed pediatric ICU (PICU) has an average census of 4.1 patients and a death rate of 2% (per 100 patients), based on a total average of 503 patients over a 1-year period. The limitations we identified included our small neuroscience and brain death population, our large sedated population, and various developmental levels of patients.
Because of these various limitations, we planned to modify the FOUR Score Scale for pediatric patients with the goal of prospectively studying the PFSS in ICU patients at our children’s hospital. Our current practice for assessing a patient’s neurological status is the use of the pediatric GCS for patients under the age of 2 years and the original GCS for all other patients, whereas the Richmond Agitation Sedation Scale (RASS) score is used as a sedation scale. However, this practice does not reflect an accurate neurological assessment, especially in intubated and/or sedated pediatric individuals. The GCS was developed to assess the level of consciousness of patients to aid in predicting their neurological prognosis. However, it was only tested and validated in adult patients with head injuries and requires a verbal response that cannot be obtained from intubated patients. The RASS relies on a patient’s auditory and visual acuity, so it is not suitable for patients with severe impairments. It also does not take into account sedation effects on neurological status and was never validated in pediatric patients. Thus, the PFSS would essentially combine these two scales (GCS and RASS) into one, which could be used to assess any pediatric patient despite their diagnosis and developmental level and whether they are intubated and/or sedated.
The PFSS includes (1) age appropriate responses for children, inclusive of all developmental milestones; (2) diagnosis, inclusive of sedated patients; and (3) age-appropriate respiratory rates. We included five different developmental categories based on the American Association of Critical Care Nurses guidelines: infant (0–12 months), toddler (1–3 years), preschooler (3–5 years), school-age (6–12 years), and adolescent (13+ years; Slota, 2006a).
Healthy infants will track objects and open their eyes spontaneously but will not follow direct commands. They also have a higher respiratory rate and an irregular breathing pattern compared with toddlers and adults. Healthy infants can sit supported, crawl, or walk depending on their age in months. They are also nonverbal but communicate well in other ways such as crying, smiling, and interacting with adults and other children (Ely et al., 2003); however, they cannot make the thumbs-up or peace-sign hand gestures. Toddlers provide a little more information. They begin with one-word sentences and expand their vocabulary quickly in a short time. Healthy toddlers can follow simple commands. Toddlers have a higher respiratory rate than adults, but lower than infants, and tend to also breathe in an irregular pattern; however, this does not mean they have increased work of breathing (Ely et al., 2003). Preschool and school-aged children can take simple commands and incorporate more difficult tasks. Adolescents are very similar to the adult population (Ely et al., 2003).
We also created the PFSS with the consideration of children in each of these developmental categories who are developmentally and/or physically disabled (e.g., Down’s syndrome, cerebral palsy, paralysis). Because the PFSS was modified from a developmental standpoint, our hope was that the outcomes would not vary from those seen with the original FOUR Score Scale.
Normal respiratory rates for children were used from the Pediatric Basic Life Support algorithm: infant, <1 year, 30–60 breaths per minute (bpm); toddler, 1–3 years, 24–40 bpm; preschool, 4–5 years, 22–34 bpm; school-age, 6–12 years, 18–30 bpm; and adolescent, 13–18 years, 12–16 bpm (Berg et al., 2010). These rates, along with the developmental categories, are part of a normal physical examination of a child’s neurological system, which according to American Association of Critical Care Nurses guidelines, also includes general appearance, skull examination, level of consciousness, motor function including reflexes, sensory function, cerebellar function, cranial nerve function, fundoscopic examination, and vital signs (Slota, 2006b). The PFSS incorporates several of these examination parameters.
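The age-specific ranges quoted above lend themselves to a simple lookup. The following Python sketch is illustrative only and is not part of the PFSS itself: the table layout and the function names `normal_resp_range` and `is_age_appropriate_rate` are hypothetical, and it assumes that ages are expressed in years and that a range such as "1–3 years" runs up to, but not including, the fourth birthday.

```python
# Normal respiratory-rate ranges (breaths per minute) as quoted in the text
# (Berg et al., 2010). The age boundaries below are an ASSUMPTION about how
# ranges such as "1-3 years" map onto a continuous age in years.
NORMAL_RESP_RATES = [
    # (category, min age inclusive, max age exclusive, low bpm, high bpm)
    ("infant", 0, 1, 30, 60),
    ("toddler", 1, 4, 24, 40),
    ("preschool", 4, 6, 22, 34),
    ("school-age", 6, 13, 18, 30),
    ("adolescent", 13, 19, 12, 16),
]


def normal_resp_range(age_years):
    """Return (category, low, high) for a patient's age in years."""
    for category, min_age, max_age, low, high in NORMAL_RESP_RATES:
        if min_age <= age_years < max_age:
            return category, low, high
    raise ValueError("age outside the pediatric ranges listed in the text")


def is_age_appropriate_rate(age_years, breaths_per_min):
    """True if the observed rate falls inside the quoted normal range."""
    _, low, high = normal_resp_range(age_years)
    return low <= breaths_per_min <= high
```

For example, a rate of 30 breaths per minute would be classed as age appropriate for a 2-year-old but not for a 15-year-old under these ranges.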
In the original FOUR Score Scale study, the Modified Rankin Scale was used as the “gold standard” to compare and contrast the adult GCS and the FOUR Score Scale. The Modified Rankin Scale was developed and validated on adult patients and is primarily used for stroke disability assessment (Wilson et al., 2005). It is not commonly used on children, because it lacks formal validation in pediatric populations. Because we have modified everything for the pediatric population, our “gold standard” was the Pediatric Cerebral Performance Category (PCPC) Scale, which was validated in the pediatric population. The PCPC was developed and used as a tool to assess functional morbidity and cognitive impairment after critical illness or injury in pediatric patients (Fiser, 1992). It projects probable outcomes when more extensive psychometric testing is not realistic or desirable and has been found to be significantly related to the Pediatric Risk of Mortality Score, morbidity, length of stay, and total hospital charges (Fiser et al., 2000). Both the Rankin and PCPC scales have also been tested and validated in past years, inclusive of interrater reliability (Fiser et al., 2000; Wijdicks et al., 2005). The interrater reliability of the PCPC scale was shown to be excellent in the previous studies.
If the correlation is high between the two scales (PFSS and PCPC) and low between the GCS and PCPC, the PFSS will show association between parameters such as morbidity, length of stay, and severity of illness or injury and in turn disprove the GCS, like previous studies cited (Fiser et al., 2000). Therefore, the PFSS should potentially provide greater information than the formally used GCS when assessing critically ill neurologically impaired pediatric patients.
Before creating the PFSS (Figure 1), consent from Dr. Wijdicks to modify the original scale was obtained. The modifications made to the adult FOUR Score Scale to create the PFSS included (1) the removal of “or opened” from category 4 of the eye response. This was done to allow better assessment of the very young who cannot blink on command and certain developmentally disabled patients, such as those with cerebral palsy, whose eyes may be open but are not neurologically intact. (2) The addition of “age appropriate spontaneous movement without stimulation” to category 4 of the motor response was done to allow better assessment of all ages and developmental levels, because very young patients and some others with developmental and/or physical disabilities cannot give the thumbs-up or peace-sign hand gesture or make a fist. (3) “Not intubated” and “age appropriate” were added to the breathing pattern categories of the respirations section. This was intended to clarify when no intubation was in place and to point out that children of different ages have differing respiration rates. These modifications were made by the author (BLC) utilizing the Guidelines for Age-Appropriate Assessment and Nonpharmacologic Management of Pain (Slota, 2006b, 2006c) and then reviewed and approved by a pediatric intensivist (CTS) and pediatric neurologist (MAK) based on guidelines they follow from the American Academy of Neurology. We realize that the changes made to the adult FOUR Score Scale were minimal but felt they were necessary to improve the assessments made in a pediatric patient population, including those intubated and/or sedated as well as developmentally and/or physically disabled.
After institutional review board approval of this prospective study, PICU nurses were asked for their assistance in accomplishing the research. Eight PICU nurses volunteered and were trained as “raters.” Clinical PICU experience ranged from 1 year to 30 years. The training included a 1-hour session for review of the GCS scale (both the original for assessment of children 2 to 19 years and the pediatric for assessment of children up to 23 months of age), RASS, and PCPC and overview of the PFSS. Each nurse was also given an instructional card they could use as a reference during their assessments. They were then given four sample patients and asked to assess the patients using the PFSS. After the training session, each nurse agreed not to discuss their assessments with each other or any other nurses, so as not to bias the nurses’ care of the patients as well as affect the study outcomes. The principal investigator expressed the importance of this during the project, as we were studying the interrater reliability of nurses.
Eighty patients were consented, consecutively enrolled, and participated in the study. These included a variety of patients (cardiac, neurological, postoperative, and trauma) who were admitted to the multidisciplinary PICU, and patients could be intubated and/or receiving sedative agents, such as lorazepam, fentanyl, morphine, dexmedetomidine (Precedex), propofol, and midazolam (Versed). Patients receiving barbiturates, ketamine, and neuromuscular blockers were excluded because of their effects on intracranial pressure. Pregnant patients were also excluded.
Consistent with the previous studies (Cohen, 2009; Iyer et al., 2009; Stead et al., 2009), patients were categorized into four neurological groups based on their level of consciousness: (1) alert; (2) clouding of consciousness: “reduced wakefulness, confusion, and alternating drowsiness and hyperexcitability”; (3) obtundation: “mild to moderate reduction in alertness, reduced interest in the environment, and increased periods of sleep”; and (4) stupor/coma: “unresponsive except to vigorous and repeated stimuli or no verbal or motor response to environmental stimuli” (Slota, 2006d). The nurses participating in the study assigned each patient to a neurological category at the start of each assessment session. This was done because a patient’s neurological status can change frequently, either improving or declining, during their PICU stay and/or because of any sedation that may be given. Therefore, the same patient could have been assigned to more than one category during their study participation.
Seven trained nurses could have assessed each subject during their PICU stay. Two different nurses independently, yet simultaneously (or within minutes of one another), assessed the PICU subjects using the PFSS, GCS, and RASS on three different occasions (between PICU admission and end of day 1, days 2–3, and day 4 or after, until PICU discharge). The nurses were not allowed to assess the same patient twice, and the nurse–rater teams and order in which the assessments (PFSS, GCS, RASS) were performed varied to reduce bias. The data were then compared with the PCPC as “the point of reference.” The PCPC of each subject was assessed by only one nurse (who did not use the PFSS, GCS, or RASS) at PICU admission and discharge to ensure consistency and to prevent the other nurses doing the PFSS, GCS, and RASS assessments from being biased by the PCPC data, which could also influence patient care. Outcome was assessed using the PCPC scale: 1 = normal, 2 = mild disability, 3 = moderate disability, 4 = severe disability, 5 = coma or vegetative state, and 6 = brain death (Fiser et al., 2000).
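As a minimal illustration of how the PCPC categories listed above might be encoded for analysis, the following Python sketch maps each score to its label. The names `PCPC_CATEGORIES`, `describe_pcpc`, and `is_poor_outcome` are hypothetical and are not taken from the study; the grouping of scores 4–6 as a poor outcome follows the definition used in the ROC analysis reported in the results.

```python
# PCPC outcome categories as listed in the text (Fiser et al., 2000).
PCPC_CATEGORIES = {
    1: "normal",
    2: "mild disability",
    3: "moderate disability",
    4: "severe disability",
    5: "coma or vegetative state",
    6: "brain death",
}


def describe_pcpc(score):
    """Return the text label for a PCPC score of 1 to 6."""
    if score not in PCPC_CATEGORIES:
        raise ValueError("PCPC score must be an integer from 1 to 6")
    return PCPC_CATEGORIES[score]


def is_poor_outcome(score):
    """Poor outcome as defined in the results: PCPC score of 4 to 6."""
    describe_pcpc(score)  # reuse the range check
    return score >= 4
```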
Descriptive statistics were used to summarize the patients’ demographic and clinical characteristics. To estimate interrater agreement, each patient (at several different time points) received a pairwise independent test by two different raters, and the agreement of the first and second test (of each score and total score for the PFSS, GCS, and RASS) was used. Weighted kappa (κw) values and their 95% CI as well as standard error (SE) were calculated to evaluate the reliabilities for the PFSS, GCS, and RASS. In our study, a κw statistic of greater than 0.80 was considered excellent agreement. Cronbach’s alpha coefficients were computed to assess the internal consistencies of the PFSS and GCS. Spearman’s correlation coefficient between the PFSS and the GCS was calculated to assess the criterion validity of the PFSS.
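To make the agreement statistic concrete, the following Python sketch computes a weighted kappa for two raters scoring the same patients on an ordinal scale. It is a sketch under stated assumptions rather than the analysis code used in this study: the weighting scheme is not specified in the text, so linear weights are assumed by default (quadratic weights are available as an option), and the function names are hypothetical. The 0.80 threshold for excellent agreement is the one stated above.

```python
def weighted_kappa(rater1, rater2, categories, weights="linear"):
    """Weighted kappa for paired ordinal ratings.

    categories: the ordered list of all possible scores.
    weights: "linear" (ASSUMED default) or "quadratic" disagreement weights.
    """
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("ratings must be paired and non-empty")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two ordered categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)

    # Observed joint proportions of (rater 1 score, rater 2 score).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    pairs = [(i, j) for i in range(k) for j in range(k)]
    obs = sum(disagreement(i, j) * observed[i][j] for i, j in pairs)
    exp = sum(disagreement(i, j) * row[i] * col[j] for i, j in pairs)
    if exp == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - obs / exp


def is_excellent(kappa_w):
    """Threshold used in this study: kappa_w greater than 0.80."""
    return kappa_w > 0.80


# Made-up ratings for illustration only (not study data).
nurse_a = [4, 3, 3, 1, 0, 2]
nurse_b = [4, 3, 2, 1, 0, 2]
example_kappa = weighted_kappa(nurse_a, nurse_b, categories=[0, 1, 2, 3, 4])
```

Weighted kappa penalizes near misses less than large disagreements, which is why it is generally preferred over unweighted kappa for ordinal scores such as these.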
To assess the predictive validity of the PFSS, sensitivity and specificity of the PFSS and GCS were compared with the PCPC scale for the prediction of poor outcomes and in-hospital mortality at PICU admission and discharge. The area under the receiver operating characteristic (ROC) curve (AUC) and its 95% CI were calculated for PFSS and GCS using the average rating of the two nurses at each assessment time point. An ROC curve plots the sensitivity (true-positive rate) against the false-positive rate (1 − specificity); the closer the AUC is to 1.00, the better the assessment tool is able to identify the outcome state (Schonjans, 2008).
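The AUC can also be read as the probability that a randomly chosen patient with the outcome receives a "worse" score than a randomly chosen patient without it. The Python sketch below computes the AUC in that pairwise way. It is illustrative only: the function name is hypothetical, the scores are made up, and it assumes that lower coma scale totals indicate a worse neurological state, so that a lower score counts as predicting the outcome.

```python
def auc_low_score_predicts(scores_with_outcome, scores_without_outcome):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    ASSUMPTION: a LOWER score predicts the outcome (e.g., poor outcome or
    death), so a pair is concordant when the patient with the outcome has
    the lower score. Ties count as half.
    """
    if not scores_with_outcome or not scores_without_outcome:
        raise ValueError("both groups must contain at least one score")
    concordant = 0.0
    for pos in scores_with_outcome:
        for neg in scores_without_outcome:
            if pos < neg:
                concordant += 1.0
            elif pos == neg:
                concordant += 0.5
    return concordant / (len(scores_with_outcome) * len(scores_without_outcome))


# Made-up total scores for illustration only (not study data).
poor_outcome_scores = [4, 10]
good_outcome_scores = [8, 16, 10]
example_auc = auc_low_score_predicts(poor_outcome_scores, good_outcome_scores)
```

An AUC of 0.5 corresponds to chance-level discrimination, and an AUC of 1.0 means that every patient with the outcome scored lower than every patient without it.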
Of the 80 patients who participated in the study, 78 had a total of 121 pairwise ratings and were included in the analysis. The average age of the patients was 6.34 years, and there were seven adverse events (deaths). Nine of the 78 patients (11.5%) had a developmental and/or physical disability (i.e., autism, cerebral palsy, cerebral vascular accident, Guillain–Barré syndrome, spinal muscular atrophy type II). There were a total of 115 records categorizing the patients into the four neurological groups, with the distribution as follows: alert, 62 patient times (53.91%); clouding of consciousness, 13 patient times (11.30%); obtundation, 27 patient times (23.48%); and stupor/coma, 13 patient times (11.30%). Other demographics and clinical outcomes are shown in Table 1.
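The percentages reported for the four neurological groups follow directly from the counts. The short Python sketch below reproduces that arithmetic from the figures given in the text; the variable names are hypothetical.

```python
# Counts of patient times per neurological group, as reported in the text.
group_counts = {
    "alert": 62,
    "clouding of consciousness": 13,
    "obtundation": 27,
    "stupor/coma": 13,
}

total_records = sum(group_counts.values())

# Percentage of all records falling in each group, to two decimal places.
group_percentages = {
    name: round(100 * count / total_records, 2)
    for name, count in group_counts.items()
}

# Share of analyzed patients with a developmental and/or physical disability.
disability_percentage = round(100 * 9 / 78, 1)
```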
The overall reliability was excellent for both the PFSS (κw = 0.89, 95% CI [0.83, 0.94]) and the GCS (κw = 0.89, 95% CI [0.84, 0.94]), whereas the κw of RASS was 0.67, which suggests good interrater agreement (Table 2). Table 2 also shows the reliabilities classified by intubation and day of PICU hospitalization. Cronbach’s alpha coefficient showed good internal consistency for the PFSS (Cronbach’s alpha = 0.78 for the first rating and 0.79 for the second rating) and the GCS (Cronbach’s alpha = 0.76 for the first rating and 0.77 for the second rating). Spearman’s correlation coefficients between the GCS and PFSS were high (ρ = .87 for the first rating and .89 for the second rating).
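For readers less familiar with the internal consistency statistic reported above, the following Python sketch shows how Cronbach's alpha is computed from component scores. It is a generic illustration rather than the study's analysis code: the function name is hypothetical, the component scores are made up, and sample variances are used (the ratio is the same if population variances are used throughout).

```python
from statistics import variance


def cronbach_alpha(component_scores):
    """Cronbach's alpha for a multi-component scale.

    component_scores: one list per scale component (e.g., eye, motor,
    brainstem, respiration), each holding that component's score for
    every patient in the same patient order.
    """
    k = len(component_scores)
    if k < 2:
        raise ValueError("need at least two components")
    n_patients = len(component_scores[0])
    if n_patients < 2 or any(len(c) != n_patients for c in component_scores):
        raise ValueError("each component needs the same number (>= 2) of scores")

    # Total score per patient is the sum across components.
    totals = [sum(patient) for patient in zip(*component_scores)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")

    sum_component_variances = sum(variance(c) for c in component_scores)
    return (k / (k - 1)) * (1 - sum_component_variances / total_variance)


# Made-up component scores for five patients (not study data).
eye = [4, 3, 1, 0, 2]
motor = [4, 4, 1, 0, 3]
brainstem = [4, 4, 2, 1, 4]
respiration = [4, 4, 1, 0, 4]
example_alpha = cronbach_alpha([eye, motor, brainstem, respiration])
```

Alpha approaches 1 when the components rise and fall together across patients, which is the sense in which the scale is internally consistent.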
ROC analysis curves were estimated to compare prediction of poor outcome (defined as PCPC scores of 4–6) between the PFSS and the GCS. The AUC for the PFSS and GCS total scores were 0.9043 and 0.9054, respectively. The AUC of the PFSS was almost the same as for the GCS by comparing 0.9043 with 0.9054 (p = 0.9552). Sensitivity and specificity were 0.9333 and 0.7903, respectively, for a PFSS total score of 12 and 0.9333 and 0.8387, respectively, for a GCS total score of 9 (Figure 2). Unfortunately, a separate analysis could not be done to predict outcome in only the sedated and/or intubated patients because of the small sample sizes, n = 31 and n = 34, respectively.
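Sensitivity and specificity at a given total score can be illustrated with a short Python sketch. The text does not state whether a patient scoring exactly at the reported cutoff counts as test positive, so the sketch assumes that a score at or below the cutoff predicts a poor outcome; the function name and the example data are hypothetical.

```python
def sensitivity_specificity(scores, poor_outcome, cutoff):
    """Sensitivity and specificity of 'score <= cutoff' for poor outcome.

    ASSUMPTION: a total score at or below the cutoff is a positive test.
    scores: total scale scores, one per patient.
    poor_outcome: matching booleans, True if the patient had a poor outcome.
    """
    tp = fn = tn = fp = 0
    for score, poor in zip(scores, poor_outcome):
        predicted_poor = score <= cutoff
        if poor and predicted_poor:
            tp += 1
        elif poor:
            fn += 1
        elif predicted_poor:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient in each outcome group")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up scores and outcomes for illustration only (not study data).
example_scores = [5, 12, 14, 16, 10]
example_poor = [True, True, False, False, False]
example_sens, example_spec = sensitivity_specificity(
    example_scores, example_poor, cutoff=12
)
```

Raising the cutoff generally increases sensitivity at the expense of specificity, which is the trade-off traced out by the ROC curve.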
To compare prediction of in-hospital mortality between the PFSS and the GCS, additional ROC analysis curves were estimated. The AUC for the PFSS and GCS total scores were 0.9296 and 0.9095, respectively. The AUC of the PFSS was slightly higher than that of the GCS (0.9296 vs. 0.9095, p = 0.2113), but the difference did not reach the statistical significance level of .05, which may be because of the small sample size (only seven death events) in this study. Sensitivity and specificity were 0.8574 and 0.9437, respectively, for a PFSS total score of 7 and 0.8571 and 0.9296, respectively, for a GCS total score of 6 (Figure 3).
The results of this study show that the PFSS is excellent for interrater reliabilities and for prediction of poor outcome and in-hospital mortality in a pediatric population. The κw values in the excellent range indicate that there is homogeneity in the nurse raters’ use of the PFSS, suggesting that it is an easy assessment tool that can be used reliably and consistently by nurses of varying experience levels. It can be used in sedated and/or intubated patients and at varying times during a patient’s hospitalization based on the weighted kappa values presented in Table 2. The PFSS’s highly accurate prediction value for poor outcome and in-hospital mortality makes it an informative tool for neurological assessments.
However, the study failed to show that the PFSS is better than the GCS. The results did not indicate any statistically significant difference in the interrater reliabilities for nurse–rater pairs under various situations between the PFSS and the GCS, and the prediction analyses were quite similar when correlated to the PCPC. This was similar to Cohen’s (2009) study that showed that the original FOUR Score Scale and GCS were comparable in predicting outcome in their pediatric study population.
Although our study did not show that the PFSS is better than the GCS, it also did not indicate that it was any worse than the GCS, which has only been validated in adult head trauma patients. Other beneficial findings from our study are that our patient population included (1) intubated and/or sedated patients, for whom the weighted kappa values were similar to those for nonintubated patients, and (2) patients of all developmental stages, including those with developmental disabilities (e.g., Down’s syndrome) and physical disabilities (e.g., cerebral palsy, paralysis). Hence, it can be used across a broad pediatric patient population. This study also expands on the work done by Cohen (2009).
The largest limitation of our study was the small sample size. Additional limitations included changes in the staffing of nurses and a large turnover rate of patients; some patients were only assessed one time and then discharged to the regular pediatric floor or home. Also, ~54% of our population was assigned to the “alert” neurological category. In addition, the use of sedative medications, including benzodiazepines and opiates, alters brain function, causing central nervous system depression, and can affect each patient differently. However, there was still correlation between the PFSS and GCS in sedated patients.
Enhancing the neurological assessment of a pediatric patient by the nursing community allows for early intervention in medical emergencies, whether this means an immediate computed tomography or magnetic resonance imaging, additional laboratory values, the placement of an intracranial monitoring device, or emergency surgery. It would also be of benefit to have greater knowledge of the neurological status of sedated and/or intubated patients, which could potentially decrease the number of ventilator days, the incidence of ventilator-associated pneumonia, and accidental extubations. This would also allow the nursing staff to provide the parents of patients who are intubated and/or sedated with a better understanding of their child’s neurological status. Therefore, having an assessment tool that is easy to use, has excellent interrater reliability, and can accurately predict outcomes in a broader pediatric patient population could greatly improve the quality of patient care.
The demonstration of the reliabilities of the PFSS used by a nursing staff was a step in the right direction in getting this assessment tool noticed by the medical community that cares for pediatric neurological patients. The next step is to evaluate the PFSS in a larger number of patients across a broader spectrum of the pediatric population (i.e., range of ages, intubated and/or sedated, various developmental and/or physical disabilities), so it can eventually be accepted for use.
The authors dedicate this article to Dr. C. Todd Stewart, who tragically passed away after its acceptance. It was an honor and a pleasure to work with him for so many years on this project. Without his dedication, this article would not have been possible.
The authors thank Monica A. Koehn, MD, Department of Neurology (Pediatric Division) for her expert review of the PFSS and the Marshfield Clinic Research Foundation’s Office of Scientific Writing and Publication for assistance in the preparation of this manuscript.