Reproducibility and Inter-rater Reliability of 2 Paediatric Nutritional Screening Tools

Galera-Martínez, Rafael*,†; Moráis-López, Ana*,‡; Rivero de la Rosa, Maria del C.*,§; Escartín-Madurga, Laura||; López-Ruzafa, Encarnación*,†; Ros-Arnal, Ignacio*,¶; Ruiz-Bartolomé, Hector; Rodríguez-Martínez, Gerardo*,||; Lama-More, Rosa A.*,#

Journal of Pediatric Gastroenterology and Nutrition: March 2017 - Volume 64 - Issue 3 - p e65–e70
doi: 10.1097/MPG.0000000000001287
Original Article: Nutrition

Objectives: The aim of the present study was to assess reproducibility and inter-rater reliability of 2 nutritional screening tools (NST): Screening Tool for Risk on Nutritional Status and Growth (STRONGkids) and Screening Tool for the Assessment of Malnutrition in Paediatrics (STAMP).

Methods: Prospective observational multicentre study. Patients aged 1 month or older admitted to paediatric or surgical wards were tested within 24 hours of admission by 2 independent observers: experts specialized in paediatric nutrition (physicians or dieticians) and clinical staff nonexpert in nutrition. Diagnosis on admission, underlying diseases, and length of stay were registered. Statistical analysis: Kappa index (κ) to evaluate agreement between observers.

Results: A total of 223 patients were included (53.4% boys), with mean age of 5.59 (95% confidence interval 4.94–6.22) years. Experts classified 9.9% of patients at high risk with STRONGkids and 19.7% using STAMP, whereas nonexpert staff assigned 6.7% of patients to the high-risk category with STRONGkids and 21.9% with STAMP. Agreement between expert and nonexpert staff was good: 94.78% for STRONGkids (κ 0.72 [P < 0.001]); 92.55% for STAMP (κ 0.74 [P < 0.001]). The rate of malnutrition was significantly higher among high-risk patients with both NST, independent of examiner experience. After adjusting for age, both STRONGkids and STAMP high-risk scores predicted longer length of stay, whether assessed by experts or nonexperts, although differences were higher with STRONGkids.

Conclusions: Agreement between experts and nonexpert staff in nutrition was good, producing a similar high-risk patient profile. Our results demonstrate that these NSTs are appropriate for nutritional screening in settings in which users have no previous experience in the field.

*On behalf of GETNI (Grupo Español de Trabajo en Nutrición Infantil)

†Complejo Hospitalario Torrecárdenas, Almería

‡Hospital Universitario La Paz, Madrid

§Hospital Universitario Virgen Macarena, Sevilla

||IIS Aragón, Hospital Clínico Universitario Lozano Blesa, Zaragoza

¶Hospital Infantil Universitario Miguel Servet, Zaragoza

#Centro Médico D-Medical, Madrid, Spain.

Address correspondence and reprint requests to Rafael Galera-Martínez, MD, PhD, Complejo Hospitalario Torrecárdenas, Camino de los Parrales 328, 1-4. CP 04720, Aguadulce, Almería, Spain (e-mail:

Received 19 March, 2016

Accepted 1 June, 2016

All procedures followed were in accordance with the ethical standards of the responsible Clinical Research Ethics Committee and with the WMA Declaration of Helsinki.

Informed consent was obtained from all patients (parents or legal guardians) for being included in the study.

The authors report no conflicts of interest.

What Is Known

  • There is no consensus regarding the best nutritional screening tool for paediatric patients.
  • Few data are available about the reliability of nutritional screening tools applied by health care professionals without specialized training in nutritional assessment, even though these users would have to complete the questionnaires on admission in clinical practice.

What Is New

  • Our results demonstrate that STRONGkids and STAMP are reliable and reproducible, independent of the expertise of the examiner.
  • Both nutritional screening tools were useful to detect patients with a higher prevalence of malnutrition and a longer length of stay, both in expert and nonexpert hands.

Malnutrition in hospitalized patients is related to worse clinical evolution, prolonged stay, delayed recovery, and increased cost (1,2). In European countries, the prevalence of malnutrition among children admitted to hospital varies from 7.3% to 17.9% (2,3), and its early recognition and the prevention and treatment of the consequences have raised great interest in recent years (4,5).

Recently, several nutritional screening tools (NSTs) have been developed to identify patients at high risk of malnutrition and to design appropriate strategies to prevent adverse outcomes. In this context, screening all patients admitted to hospital is considered a standard of good practice for nutritional support (6,7). There is, however, no consensus on the best NST for paediatric patients. To date, there is insufficient evidence on predictive accuracy to justify the choice of one NST over another (8), partly because there is no gold standard to define malnutrition in paediatrics or, more importantly, to define nutritional risk. Hence, it is essential to consider factors such as inter-rater reliability when making this decision (6,8,9). Few data are available, however, about the reliability of NSTs when they are actually used by nurses or other health care professionals without specialized training in nutritional assessment, even though these users would have to complete the questionnaires on admission when the tools are implemented in clinical practice (8,10).

Our aim was to assess the reproducibility and reliability of 2 NSTs: Screening Tool for Risk on Nutritional Status and Growth (STRONGkids) and Screening Tool for the Assessment of Malnutrition in Paediatrics (STAMP). We compared the results obtained when these tools were used by experts in paediatric nutrition and by clinical staff nonexpert in nutrition, focusing especially on patients classified as high nutritional risk, because they would need further assessment and, possibly, nutritional intervention. An additional aim was to evaluate the relation between NST results and nutritional status on admission and to determine whether they predict a longer length of hospital stay.

Methods

A cross-sectional, multicentre study was performed in 5 Spanish hospitals (3 tertiary and 2 secondary centres). Over a period of 3 weeks, all patients 1 month or older admitted to paediatric and surgical wards were proposed for inclusion. Patients admitted to the Paediatric Intensive Care Unit, Oncology, or Day Surgery Ward and those who could not be weighed and measured at admission were excluded. All patients included were tested within 24 hours of admission by 2 independent observers: experts (registered dieticians or physicians specialized in paediatric nutrition working in the Paediatric Nutrition Unit of each hospital) and nonexpert staff (nurses not specialized in nutrition or paediatric residents who had never worked in a Paediatric Nutrition Unit). Assessed patients were excluded from the study if they were discharged within the first 24 hours.

Two NSTs were applied: STRONGkids and STAMP. STRONGkids was developed and tested in the Netherlands by Hulst et al (11). It assesses nutritional risk by asking 4 questions, 2 to be answered by the child's primary caregiver and the other 2 to be answered by the health care professional. Anthropometric data are not needed to calculate the risk score; they are replaced by a subjective clinical assessment. STAMP was developed and validated for children in the UK by McCarthy et al (12). It combines 2 questions for the child's primary caregivers with an assessment of nutritional status using height and weight. Both tools classify patients into 3 risk categories (low, medium, and high) according to scores defined for each. All nonexpert staff attended a 30-minute presentation on how to complete the tests. Physicians treating the patients were blinded to both the expert and nonexpert NST risk classifications.
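The mapping from a total score to the 3 risk categories can be sketched as follows. The cut-offs shown are an assumption: they are the ones commonly cited from the tools' original publications (11,12) and are not stated in the present article, so they should be checked against those sources before use. The names `CUTOFFS` and `risk_category` are hypothetical.

```python
# Assumed cut-offs (not given in this article; commonly cited from refs 11 and 12):
#   STRONGkids: 0 = low, 1-3 = medium, 4-5 = high
#   STAMP:      0-1 = low, 2-3 = medium, >=4 = high
CUTOFFS = {
    "STRONGkids": (1, 4),  # (lowest medium-risk score, lowest high-risk score)
    "STAMP": (2, 4),
}


def risk_category(tool: str, total_score: int) -> str:
    """Map a total screening score to 'low', 'medium', or 'high' risk."""
    if tool not in CUTOFFS:
        raise ValueError(f"unknown tool: {tool}")
    if total_score < 0:
        raise ValueError("total_score must be non-negative")
    medium_from, high_from = CUTOFFS[tool]
    if total_score >= high_from:
        return "high"
    if total_score >= medium_from:
        return "medium"
    return "low"
```

Keeping the cut-offs in a single table makes it easy to apply the same classification step to the scores recorded by expert and nonexpert observers.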

Weight and supine length in infants (nude) or standing height in children (with minimal clothing) were recorded on admission, using calibrated standard equipment (digital scales, infantometer, or stadiometer) following standard methods (13). Malnutrition was defined using the body mass index standard deviation score (BMI-SDS) according to previous recommendations (14). National growth charts for the Spanish population by Carrascosa et al (15) were used. Further information about age, sex, diagnosis at admission, underlying diagnosis of chronic illnesses, and length of stay (LOS) was recorded in all cases.
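As an illustration of the BMI-SDS definition, the sketch below computes BMI and a simple standard deviation score and applies the thresholds reported in the Results (BMI between −1 and −2 SDS; BMI <−2 SDS for moderate/severe malnutrition). It assumes a plain mean/SD reference for the child's age and sex; the actual charts (15) may require a different transformation, the "mild malnutrition" label for the −1 to −2 band is our assumption, and all function names are hypothetical.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def bmi_sds(bmi_value: float, ref_mean: float, ref_sd: float) -> float:
    """Standard deviation score against an age- and sex-specific reference.

    ref_mean and ref_sd are assumed to come from a growth chart lookup,
    which is not reproduced here.
    """
    return (bmi_value - ref_mean) / ref_sd


def nutritional_status(sds: float) -> str:
    """Classify a BMI-SDS value using the thresholds reported in the text."""
    if sds < -2:
        return "moderate/severe malnutrition"
    if sds < -1:
        return "mild malnutrition"  # assumed label for the -1 to -2 SDS band
    return "not malnourished"
```

For example, with an illustrative (not real) reference mean of 16.0 kg/m² and SD of 2.0, a BMI of 13.0 gives an SDS of −1.5, which falls in the −1 to −2 band.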

The field research was conducted in May 2013 and all actions, which complied with the 1975 WMA Declaration of Helsinki, were approved by the Research and Ethics Committees of the participating hospitals. Informed consent was obtained from parents or legal guardians and from the participants themselves, if they were 10 years or older.

Statistical Analysis

Sample size calculation was based on the Kappa index (κ). Previous data have shown a κ value of approximately 0.6 (12,16). Taking into account that the different NSTs classified between 8% (11) and 18% (12) of patients as being at high nutritional risk, with 95% confidence and confidence interval (CI) precision of ±0.2, 193 patients were required for the study. Qualitative variables were expressed as percentages with CIs (95% CI), and group differences were evaluated using the chi-squared test. Quantitative data were described by the mean and standard deviation. The Kolmogorov-Smirnov test was used to determine whether the analytical variables were normally distributed. The Student t test was used to compare means of normally distributed variables and the Mann-Whitney U test to compare variables that were not normally distributed. Weighted κ index was used to evaluate agreement between observers and was interpreted according to the Altman classification (17), which considers κ index values of 1 to 0.81 as very good agreement, 0.80 to 0.61 as good agreement, 0.60 to 0.41 as moderate, 0.40 to 0.21 as fair agreement, and ≤0.20 as poor agreement. Linear regression was used to assess the relation between risk scores by experts and nonexperts at admission and LOS. Other independent variables showing significant association with LOS in the binary models were considered for the multiple regression model. Variables showing collinearity with NST scores (BMI-SDS, diagnosis at admission, and chronic underlying disease at admission) were removed from the model. Linearity and homoscedasticity were assessed by examination of scatter plots, and the Durbin-Watson test was used to assess the independence hypothesis. R² was used to evaluate goodness of fit. A P value <0.05 was considered significant. Epidat 3.0 software was used for the sample size calculation and SPSS 15.0 for the statistical analysis.
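For readers unfamiliar with the agreement statistic, the following is a minimal sketch of a weighted κ for 2 raters on the 3 ordered risk categories, together with the Altman interpretation bands quoted above. It assumes linear disagreement weights; the weighting scheme actually used in SPSS for this study is not stated, and the function names are hypothetical.

```python
CATEGORIES = ("low", "medium", "high")  # ordered risk categories


def weighted_kappa(rater_a, rater_b, categories=CATEGORIES):
    """Cohen's kappa with linear disagreement weights for two raters."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must score the same non-empty set of patients")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Joint proportions: observed[i][j] = share of patients rated i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Linear disagreement weight: 0 on the diagonal, 1 for low vs high.
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if d_exp == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - d_obs / d_exp


def altman_label(kappa: float) -> str:
    """Interpretation bands of the Altman classification as quoted in the text."""
    if kappa > 0.80:
        return "very good"
    if kappa > 0.60:
        return "good"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    return "poor"
```

Under this classification, the κ values reported in this study (0.72 and 0.74) fall in the "good" band.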

Results

Two hundred and twenty-three patients were included (53.4% boys), with a mean age of 5.59 (95% CI 4.94–6.22) years. Distribution by age groups was homogeneous (Table 1). Approximately 22.4% of patients had a BMI between −1 and −2 SDS, and 3.6% presented moderate/severe malnutrition (BMI <−2 SDS); 15.6% were admitted to a surgical ward. Among medical ward patients, infectious diseases were the most common cause of admission (41.2%), followed by noninfectious gastrointestinal diseases (12.9%). Overall, 15.2% of patients had a chronic underlying disease at admission, with a higher prevalence in those 10 years or older (30.9%) compared with infants (11.8%) (P = 0.02). Mean LOS was 4.14 (95% CI 3.61–4.67) days.



The proportion of patients classified as medium or high risk by the 2 NSTs is shown in Figure 1. Agreement between expert and nonexpert staff was 94.78% for STRONGkids (κ 0.72 [95% CI 0.63–0.80] [P < 0.001]) and 92.55% for STAMP (κ 0.74 [95% CI 0.67–0.81] [P < 0.001]) (Table 2).





Differences were observed in the proportion of patients classified as medium or high risk between diagnosis categories (Fig. 2). We found a higher proportion of high-risk patients among those with a noninfectious gastrointestinal disease using both NSTs (27.6%–34.5%, depending on the examiner's expertise and the NST used). Among patients who had undergone surgery and those admitted because of an infectious disease, notable differences were observed between the STRONGkids and STAMP classifications (Fig. 2).



Focusing on high-risk patients, the mean age of STRONGkids high-risk patients was higher than that of the low- and medium-risk patients (P < 0.05); this was due to a lower proportion of infants and a higher number of children 10 years or older included in the high-risk category with STRONGkids (Table 1). No difference in age was observed between STAMP risk categories. Regarding the diagnosis, by comparison with the other patients, STRONGkids high-risk patients presented a significantly higher proportion of noninfectious gastrointestinal diseases, this being the most frequent cause of admission in this group. Similar differences were not found in STAMP high-risk patients.

High-risk patients, independent of the NST used and the expertise of the examiner, showed a higher prevalence of mild and severe malnutrition when compared to low- and medium-risk patients (Table 1). Likewise, high-risk scores on admission using either STRONGkids or STAMP were associated with longer LOS when compared with low- and medium-risk patients, whether assessed by expert or nonexpert staff (Table 1). After adjustment for age in the linear regression, the relation between high-risk scores and longer LOS remained significant for both experts and nonexperts, although greater differences were observed for STRONGkids high-risk patients (Table 3).
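The age adjustment can be illustrated with a small ordinary least squares sketch in which LOS is regressed on age and a high-risk indicator. This is only an illustration under stated assumptions: the coding of the NST result as a binary high-risk indicator is our assumption, the data below are synthetic (constructed as LOS = 2 + 0.1 × age + 3 × high risk) and are not study data, and the function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_los_model(ages, high_risk, los):
    """Least squares fit of LOS = b0 + b1 * age + b2 * high_risk.

    Returns [b0, b1, b2]; b2 is the age-adjusted difference in LOS
    between high-risk patients and all other patients.
    """
    rows = [[1.0, float(a), float(h)] for a, h in zip(ages, high_risk)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, los)) for i in range(3)]
    return solve_linear_system(xtx, xty)  # normal equations


# Synthetic toy data, NOT study data.
toy_ages = [1, 3, 5, 8, 12, 2, 10]
toy_high_risk = [0, 0, 0, 0, 0, 1, 1]
toy_los = [2 + 0.1 * a + 3 * h for a, h in zip(toy_ages, toy_high_risk)]

intercept, age_coef, high_risk_coef = fit_los_model(toy_ages, toy_high_risk, toy_los)
```

In this toy example the fitted high-risk coefficient recovers the built-in 3-day difference, which is the quantity analogous to the adjusted differences reported in Table 3.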



Discussion

The ideal NST should be reproducible and reliable in the identification of individuals at risk of malnutrition (12) and should be suitable for use by nurses or other health care professionals without specialized training in nutritional assessment (ie, nonmembers of the nutritional support team) during admission (9). Although they share a similar goal, namely to detect patients at high nutritional risk who would need further nutritional assessment, STRONGkids and STAMP differ in several aspects.

STRONGkids was designed to be applied by a physician rather than by nursing staff (11), and it has also been validated in other populations by physicians who are experts in paediatric nutrition (18). This tool, however, includes subjective items (such as subjective clinical assessment) that, a priori, could entail difficulties for staff without previous experience in paediatric nutrition. A study conducted in Belgium evaluated the inter-rater reliability of STRONGkids, although in only 29 patients out of the 368 enrolled in the study, and showed moderate concordance (κ 0.61) (16). More recently, Moeeni et al (10) studied the inter-rater reliability of a simplified version of the STRONGkids questionnaire in 162 children in New Zealand, also finding moderate agreement (κ 0.65). These results were slightly worse than those of the present study but, if we analyse the individual agreement shown by Moeeni et al, of the 15 nurses who participated, the vast majority had an individual agreement above κ 0.65.

In contrast, STAMP was developed to be used by nurses, although it has also been externally validated by experts in nutrition (19). In the original validation study, the questionnaire completed by nurses was compared with a full nutritional assessment performed by a registered dietician. Concordance was fair to moderate (κ 0.54 [0.38–0.69]) (12), but in a convenience subsample of participants (20%), which was independently reviewed by a second registered dietician to assess the reliability of the classification of nutrition risk as determined by the full nutrition assessment, inter-rater reliability increased to 0.921. Our study is the largest to examine reliability for STRONGkids and STAMP and the first to provide separate data to compare them. Global inter-rater agreement between assessors of the 3 NSTs (Paediatric Yorkhill Malnutrition Score, STAMP, and STRONGkids) analysed by Moeeni et al (20) in 15 patients was high for the 3 tools (κ 0.89–0.93), but separate data were not provided.

One important difference between STAMP and STRONGkids is the rate of children classified as high risk: STAMP figures double those of STRONGkids, both when used by expert and nonexpert staff. These results agree with studies that have analysed each NST separately on admitted patients (11,12,16,19,20). Only 2 studies, however, compare STRONGkids and STAMP, and in both cases the NSTs were applied by skilled paediatricians. Ling et al (18) compared these NSTs in a sample of 43 patients in which the high-risk patient rate was much higher with STAMP. Moreover, Moeeni et al (20), in a sample of 162 children, obtained 27% of high-risk patients with STAMP and 4% using STRONGkids. In our study, these marked differences between the 2 tools are in part explained by a higher proportion of surgical patients and patients with infectious diseases classified as high risk by STAMP (Fig. 2, Table 1). These results are in agreement with those previously observed by Ling et al (18). Whether STAMP detects a large number of false positives remains to be answered: data are needed about the outcome of those patients in terms of a worse nutritional and/or clinical evolution. We can also compare our figures in admitted patients with data available from primary care using STAMP applied by nurses; in this setting the high-risk rate was 6.6%, which, as might be expected, is lower than that in admitted patients (21). Finally, although we did not record the time spent by the different observers in applying each NST, this could be another difference between the tools; an estimation made in the study by Ling et al suggests that STAMP is considerably more cumbersome to apply than STRONGkids when it is applied by skilled paediatricians, in part because the latter does not include weight and height data, but further work is needed to verify this (18).

In the present study, the prevalence of acute malnutrition was lower than recently reported European figures from a large multicentre study (7%) (2). In that study, which did not include Spanish patient data, malnutrition prevalence varied greatly between countries (range 4.0%–9.3%). Seasonal differences (our study was conducted in the spring) and patient characteristics given the complexity of each hospital could partly explain these differences. Even so, a recent multicentre study in Spain has shown a reduction in malnutrition figures with respect to previous data while establishing a close relation with diagnosis at admission (22).

Focusing on high-risk patients, malnutrition rates were significantly higher than those in low- and medium-risk patients in both expert and nonexpert hands. Similar results have been reported previously (11,12,16,20,23). Another difference observed in the STRONGkids high-risk category was a mean age above the overall average. This is partly due to the significantly higher proportion of underlying chronic diseases in our sample among children older than 10 years, but it is also explained by the small number of infants classified as high risk by STRONGkids. The latter has been previously described in other studies (11,23) and could reflect a lower sensitivity in this age group. Once again, information about nutritional and clinical evolution would be needed to assess this point, but it is necessary to keep this fact in mind when STRONGkids is applied.

In relation to diagnosis on admission, we observed a higher proportion of noninfectious gastrointestinal disease among STRONGkids high-risk patients when compared with the rest of the sample. One other study has shown this tendency in a small group of patients both for STRONGkids and STAMP (18). Moreover, in a sample of 46 paediatric patients experiencing inflammatory bowel disease, 40% were classified as high risk by both STAMP and STRONGkids (24). Taking into account the number of high-risk patients, and the greater rate of chronic underlying diseases in the >10 years age group in our sample, no firm conclusions can be reached at this point and further studies should corroborate this observation.

Finally, STRONGkids has previously shown a good correlation with patient LOS both when used by skilled paediatricians (11,18,20) and in the Belgian study, in which it was applied by nurses, a paediatric resident, and a dietician in training (16). In contrast, the STAMP high-risk category was not related to longer LOS according to Moeeni et al (20). In our sample, high-risk patients had longer LOS for both NSTs, independent of the expertise of the examiner who applied the test, but the differences were greater for STRONGkids. This association remained significant even after adjusting for age in the multiple regression analysis. At this point, we must consider the fact that patient LOS depends on many factors, not only nutritional risk and age. It remains important, however, to stress the ability of the NST to select a small group of patients with a remarkably longer LOS (eg, the 6.7% of patients classified as high risk with STRONGkids by nonexpert staff stayed 5.79 days longer than low- and medium-risk patients). These data could reflect the ability of the NST to predict adverse outcomes.

Limitations

Firstly, we did not correct for conditions, such as fluid overload, that would affect patient weight at admission. This could have affected the STAMP score of some patients, although our sample size helps to minimize the effect of this possible bias. Secondly, the main objective of the present study was to analyse the reproducibility and reliability of 2 NSTs, once they had been validated by experts. Our findings alone, however, do not sufficiently distinguish one from the other, although, in our opinion, they do provide data relevant to making such a choice. We have shown that NSTs in nonexpert hands can identify patients with acute malnutrition and patients with longer LOS, but NSTs still need to demonstrate their usefulness in contributing to better patient outcomes after screening and subsequent nutritional intervention, and their cost-effectiveness.

Conclusions

We conclude that STRONGkids and STAMP are reliable and reproducible, independent of the expertise of the examiner. Agreement between expert and nonexpert staff was good, producing a similar high-risk patient profile, although with different proportions of high-risk patients. Our results demonstrate that these NSTs are useful in nutritional screening in settings in which users have no previous experience in the field. Although high-risk patients of both NSTs were characterized by a higher prevalence of malnutrition and a longer LOS, the figures were higher with STRONGkids. Nevertheless, which NST offers the highest utility for each condition (surgery, critical care, infectious disease, etc) remains undefined. Further studies are required to determine the precise influence of nutritional characteristics on LOS, taking into account other variables such as underlying conditions, disease epidemiology, food delivery in the hospital setting, or existence of nutrition teams.

Acknowledgments

The authors would like to thank all the participant children and their parents, and the nursing staff and paediatric residents for their cooperation.

References

1. Saunders J, Smith T. Malnutrition: causes and consequences. Clin Med 2010; 10:624–627.
2. Hecht C, Weber M, Grote V, et al. Disease associated malnutrition correlates with length of hospital stay in children. Clin Nutr 2015; 34:53–59.
3. Joosten KF, Zwart H, Hop WC, et al. National malnutrition screening days in hospitalised children in The Netherlands. Arch Dis Child 2010; 95:141–145.
4. Lakdawalla DN, Mascarenhas M, Jena AB, et al. Impact of oral nutrition supplements on hospital outcomes in pediatric patients. JPEN J Parenter Enteral Nutr 2014; 38 (2 suppl):42S–49S.
5. Huysentruyt K, Goyens P, Alliet P, et al. More training and awareness are needed to improve the recognition of undernutrition in hospitalised children. Acta Paediatr 2015; 104:801–807.
6. Kondrup J, Allison SP, Elia M, et al. Educational and Clinical Practice Committee; European Society of Parenteral and Enteral Nutrition (ESPEN). ESPEN guidelines for nutrition screening 2002. Clin Nutr 2003; 22:415–421.
7. Corkins MR, Griggs KC, Groh-Wargo S, et al. Standards for nutrition support: pediatric hospitalized patients. Nutr Clin Pract 2013; 28:263–276.
8. Huysentruyt K, Devreker T, Dejonckheere J, et al. Accuracy of nutritional screening tools in assessing the risk of undernutrition in hospitalized children. J Pediatr Gastroenterol Nutr 2015; 61:159–166.
9. Joosten KF, Hulst JM. Nutritional screening tools for hospitalized children: methodological considerations. Clin Nutr 2014; 33:1–5.
10. Moeeni V, Walls T, Day AS. The STRONGkids nutritional risk screening tool can be used by paediatric nurses to identify hospitalised children at risk. Acta Paediatr 2014; 103:e528–e531.
11. Hulst JM, Zwart H, Hop WC, et al. Dutch national survey to test the STRONGkids nutritional risk screening tool in hospitalized children. Clin Nutr 2010; 29:106–111.
12. McCarthy H, Dixon M, Crabtree I, et al. The development and evaluation of the Screening Tool for the Assessment of Malnutrition in Paediatrics (STAMP©) for use by healthcare staff. J Hum Nutr Diet 2012; 25:311–318.
13. Lee RD, Nieman DC. Anthropometry. Nutritional Assessment. New York: McGraw-Hill; 2009. 160–213.
14. Joosten KF, Hulst JM. Malnutrition in pediatric hospital patients: current issues. Nutrition 2011; 27:133–137.
15. Carrascosa A, Fernández M, Ferrández A, et al. Estudios Españoles de Crecimiento. 2010. Accessed January 17, 2016.
16. Huysentruyt K, Alliet P, Muyshont L, et al. The STRONGkids nutritional screening tool in hospitalized children: a validation study. Nutrition 2013; 29:1356–1361.
17. Altman DG. Practical Statistics for Medical Research. New York: Chapman and Hall; 1991.
18. Ling RE, Hedges V, Sullivan PB. Nutritional risk in hospitalised children: an assessment of two instruments. E-SPEN Eur E J Clin Nutr Metab 2011; 6:e153–e157.
19. Lama More RA, Moráis López A, Herrero Álvarez M, et al. Validation of a nutritional screening tool for hospitalized pediatric patients. Nutr Hosp 2012; 27:1429–1436.
20. Moeeni V, Walls T, Day AS. Nutritional status and nutrition risk screening in hospitalized children in New Zealand. Acta Paediatr 2013; 102:e419–e423.
21. Rub G, Marderfeld L, Poraz I, et al. Validation of a nutritional screening tool for ambulatory use in pediatrics. J Pediatr Gastroenterol Nutr 2016; 62:771–775.
22. Moreno Villares JM, Varea Calderón V, Bousoño García C, et al. Sociedad Española de Gastroenterología. Nutrition status on pediatric admissions in Spanish hospitals; DHOSPE study. Nutr Hosp 2013; 28:709–718.
23. Moeeni V, Walls T, Day AS. Assessment of nutritional status and nutritional risk in hospitalized Iranian children. Acta Paediatr 2012; 101:e446–e451.
24. Wiskin AE, Owens DR, Cornelius VR, et al. Paediatric nutrition risk scores in clinical practice: children with inflammatory bowel disease. J Hum Nutr Diet 2012; 25:319–322.

Keywords: children; clinical outcome; hospitalized; malnutrition; nutritional screening

© 2017 by European Society for Pediatric Gastroenterology, Hepatology, and Nutrition and North American Society for Pediatric Gastroenterology,