Validation of an Instrument to Measure Quality of Life in British Children With Inflammatory Bowel Disease

Ogden, C.A.*; Akobeng, A.K.; Abbott, J.*; Aggett, P.; Sood, M.R.; Thomas, A.G.

Journal of Pediatric Gastroenterology & Nutrition: September 2011 - Volume 53 - Issue 3 - p 280–286
doi: 10.1097/MPG.0b013e3182165d10
Original Articles: Gastroenterology

Objective: To validate IMPACT-III (UK), a health-related quality of life (HRQoL) instrument, in British children with inflammatory bowel disease (IBD).

Patients and Methods: One hundred six children and parents were invited to participate. IMPACT-III (UK) was validated by inspection by health professionals and children to assess face and content validity, factor analysis to determine optimum domain structure, use of Cronbach alpha coefficients to test internal reliability, ANOVA to assess discriminant validity, correlation with the Child Health Questionnaire to assess concurrent validity, and use of intraclass correlation coefficients to assess test-retest reliability. The independent samples t test was used to measure differences between sexes and age groups, and between paper and computerised versions of IMPACT-III (UK).

Results: IMPACT-III (UK) had good face and content validity. The most robust factor solution was a 5-domain structure: body image, embarrassment, energy, IBD symptoms, and worries/concerns about IBD, all of which demonstrated good internal reliability (α = 0.74–0.88). Discriminant validity was demonstrated by significant (*P < 0.05, **P < 0.01) differences in HRQoL scores between the severe, moderate, and inactive/mild symptom severity groups for the embarrassment scale (63.7* vs 81.0 vs 81.2), IBD symptom scale (45.0** vs 64.2* vs 80.6), and the energy scale (46.4* vs 62.1* vs 77.7). Concurrent validity of IMPACT-III (UK) with comparable domains of the Child Health Questionnaire was confirmed. Test-retest reliability was confirmed with good intraclass correlation coefficients of 0.66 to 0.84. Paper and computer versions of IMPACT-III (UK) collected comparable scores, and there were no differences between the sexes and age groups.

Conclusions: IMPACT-III (UK) appears to be a useful tool to measure HRQoL in British children with IBD.

*University of Central Lancashire, Preston

Royal Manchester Children's Hospital, Manchester

School of Medicine and Health, Lancaster University, Lancaster, UK.

Address correspondence and reprint requests to Dr A.G. Thomas, Royal Manchester Children's Hospital, Oxford Road, Manchester, M13 9WL, UK (e-mail:

Received 27 March, 2010

Accepted 4 October, 2010

The authors report no conflicts of interest.

As well as being symptomatically disabling, inflammatory bowel disease (IBD) can have major psychological and social effects (1,2). Children with IBD are more likely to have psychiatric morbidity than children with other chronic disorders (3). Health-related quality of life (HRQoL) research can assess how health, disease, or impairment affects physical, mental, cultural, environmental, and economic aspects of living (4). Outcome measures in IBD clinical trials have traditionally relied on disease activity indices (5), but these measures fail to assess the patients’ subjective view of their experience. HRQoL instruments provide data that are meaningful to health professionals, patients, and their families (6–8). They complement information from disease activity indices and can evaluate how treatments affect patients’ lives. Generic instruments are useful for comparing HRQoL in different groups of patients and healthy individuals. Disease-specific instruments are more sensitive to changes in HRQoL in specific patient groups. Instruments developed to measure HRQoL in adults with IBD include the inflammatory bowel disease questionnaire and the rating form of IBD patient concerns (5,9), but these are not suitable for children.

The first disease-specific HRQoL instrument for children with IBD (IMPACT) was developed and validated in Canada (10,11). It consisted of 6 domains: bowel symptoms, body image, functional/social impairment, emotional impairment, test/treatments, and systemic impairment. To determine whether IMPACT could be used in the United Kingdom we tested whether the concerns of British and Canadian children with IBD were similar. Significant agreement was found between these groups (12). Unfortunately, British and Dutch children found the questions in IMPACT too complicated and/or upsetting. IMPACT was translated, the questions were simplified, and the modified 35-item IMPACT (IMPACT-II) was validated in the Netherlands (13). Seventy-two children with IBD from Canada, Amsterdam, and Manchester completed the original IMPACT and IMPACT-II, with most favouring IMPACT-II (14). A further study in the Netherlands found that Dutch children preferred a Likert scale to a visual analogue scale (VAS) (15). We subsequently undertook a pilot study with 20 British children with IBD who completed IMPACT-II twice, with either a VAS or a Likert scale (16). Fifteen children preferred the Likert scale, but 5 expressed problems with language or phrasing. It was suggested that after rewording the difficult words and inserting a Likert scale, the instrument should be suitable for use in the United Kingdom following psychometric evaluation to ensure that the instrument is reliable and valid. This version is known as IMPACT-III (UK) and differs from previous versions because it contains a Likert scale.

Alongside the development of the paper version of IMPACT-III (UK), we have developed a computerised touch screen version (17). Nineteen children completed (in random order) both paper and computer versions of IMPACT-III (UK) to evaluate which mode of administration they preferred. Seventeen preferred the computer version because it was more fun and easier to understand, and because they liked computers (18).

The objective of this study was to ensure that the IMPACT-III (UK) questionnaire was a reliable and valid instrument to measure HRQoL in British children with IBD.

One hundred ten children with IBD were considered for participation in the study. The parents/guardians and children were given information about the study before informed consent was obtained. (Although some of the adults who took part in the study were not the children's biological parents, we will refer to the parents/guardians as parents and, when discussing gender, we will refer to ‘mothers’ and ‘fathers’ to avoid any cumbersome phrasing.) Children completed the symptom checklist, IMPACT-III (UK) (randomised by computer to either paper or computer versions), the patient sensibility questionnaire, and the Child Health Questionnaire 87-item, child version (CHQ). Parents completed a demographic form. To assess test-retest reliability the children were asked to return to the clinic in 4 to 8 weeks to complete IMPACT-III (UK) and the symptom checklist again. Unfortunately, most parents and children were unable to return to the clinic within this time period, so the test-retest reliability was repeated after the main validation in 50 different children by sending 2 copies of IMPACT-III (UK) and the symptom checklist to the children's homes at 4- to 8-week intervals. The study was approved by the Salford and Trafford research and ethics committee.

Demographic Form

The demographic form documented the child's age, sex, disease type, and whether he or she was recruited from the ward or clinic, as well as the parent's sex, age, and marital status.

Symptom Checklist

The symptom checklist was used to assess symptom severity from the child's perspective; this information was then used to assess discriminant validity and test-retest reliability. It included diarrhoea, blood in stools, stomachaches, fever, and weight loss. A score of 0 to 2 represented no symptoms or mild symptoms; 3 to 5, moderate symptoms; and 6 to 12, severe symptoms (14). A change of 1 or 0 between the 2 time points indicated stable symptoms for test-retest reliability, whereas a change of >1 indicated a change in symptoms.
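The severity bands and the stability rule for the symptom checklist can be written as a short sketch. The function names `severity_group` and `is_stable` are hypothetical, and treating "change" as the absolute difference in either direction is an assumption; the text gives only the cut-offs.

```python
def severity_group(score: int) -> str:
    """Map a symptom checklist total (0-12) to the severity bands given in the text."""
    if not 0 <= score <= 12:
        raise ValueError("symptom score must be between 0 and 12")
    if score <= 2:
        return "inactive/mild"   # no symptoms or mild symptoms
    if score <= 5:
        return "moderate"
    return "severe"


def is_stable(first: int, second: int) -> bool:
    """Stable symptoms: the total changed by 0 or 1 between the 2 time points.

    Assumption: the change is counted in either direction (absolute difference).
    """
    return abs(second - first) <= 1
```

For example, a child scoring 4 at the first time point and 5 at the second would fall in the moderate group on both occasions and would count as stable for the test-retest analysis.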

Patient Sensibility Questionnaire

This questionnaire helped to assess face validity by asking how well the children understood the questions, phrasing, and layout, and whether they would be happy to complete IMPACT-III (UK) again. It also helped to assess content validity by asking whether there were any other questions that the children believed should have been included.

Child Health Questionnaire Child Form-87

The CHQ is a generic HRQoL questionnaire for children (19). In the absence of a criterion standard to measure HRQoL in children with IBD, the CHQ was used to evaluate concurrent validity. This indicates whether domains of IMPACT-III (UK) correlate with comparable domains of the CHQ. The IMPACT-III (UK) version was adapted from IMPACT-II, and the VAS was replaced with a Likert scale.

Statistical Analysis

SPSS version 13.0 (SPSS, Chicago, IL) was used to perform statistical analyses. Factor analysis was performed on data collected using computer and paper versions of IMPACT-III (UK) to help construct domains (20). Factor analysis is a statistical technique that analyses interrelations among a large number of variables and helps to explain these variables in terms of their common underlying factors. The related variables are then placed into a small number of domains. Principal components analysis was conducted for 4-, 5-, 6-, 7-, and 8-factor (domain) structures using varimax rotation (a statistical technique to maximise the variance of each of the factors) (21). Items were included in a domain when factor loadings (correlation between the item and the overall factor or domain) exceeded 0.4. Based on these results a sociologist, a psychologist, and a consultant paediatric gastroenterologist inspected the items within the proposed domains to determine clinical relevance. Internal reliability was tested using Cronbach alpha coefficients. An independent samples t test was used to check whether there were any differences in scores across the 5 domains between those completing computer or paper versions of IMPACT-III (UK), those ages 8 to 12 years versus 13 to 17 years, and also between male and female participants. ANOVA was used to test the ability of IMPACT-III (UK) to distinguish differences in HRQoL scores in patients with different symptom scores (discriminant validity). To compare CHQ and IMPACT-III (UK) (concurrent validity), the Likert scale was coded from 0 to 4 for each item (total score 0–140). The scores from IMPACT-III (UK) were transformed to a scale of 0 (worst possible HRQoL) to 100 (best possible HRQoL). The Pearson correlation coefficient assessed concurrence between appropriate domains of the 2 instruments. Data collected from stable patients who had completed IMPACT-III (UK) twice were analysed to assess test-retest reliability. A 2-way random model and intraclass correlation coefficients (ICCs) were used to estimate the degree of association between the scores.
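The transformation of Likert codes to a 0 to 100 score can be illustrated with a minimal sketch. It assumes a simple linear rescaling in which each item is coded so that 4 is the most favourable response; the text states only the coding range and the 0 to 100 target scale, so the exact scoring rule and the function name `transform_score` are assumptions.

```python
from typing import Sequence


def transform_score(item_codes: Sequence[int]) -> float:
    """Linearly rescale Likert item codes (0-4 each) to a 0-100 score.

    Assumption: 4 is the best response on every item, so 100 means best HRQoL.
    """
    if not item_codes:
        raise ValueError("at least one item is required")
    if any(code < 0 or code > 4 for code in item_codes):
        raise ValueError("each item must be coded 0 to 4")
    max_total = 4 * len(item_codes)  # 140 for the full 35-item questionnaire
    return 100.0 * sum(item_codes) / max_total
```

Under this assumption the same function would apply to a single domain by passing only that domain's items, which keeps domain scores on the same 0 to 100 scale as the total.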

Of the 110 children who were considered for the study, 4 were deemed by the senior doctor responsible for their care to be too depressed and were not approached. One hundred six children and their parents were therefore invited to participate in the study. Four parents did not have time and 5 children did not wish to participate. The remaining 97 children agreed to participate and completed the initial validation (Table 1). There were no significant differences between the assessed demographic and clinical characteristics of participants and nonparticipants.

Face and Content Validity

A paediatric gastroenterologist, a psychologist, a sociologist, and an IBD nurse examined IMPACT-III (UK) and believed the questions were suitable for children ages 8 to 17 years with IBD. Ninety-three of the 94 children who completed the patient sensibility questionnaire believed that IMPACT-III (UK) was easy to understand. Ninety believed that the questions were not too personal, and 87 said they would be happy to complete IMPACT-III (UK) again if requested.

Factor Structure and Internal Reliability of IMPACT-III (UK)

The most statistically robust solution of the principal components analysis was a 5-factor structure. The 5 domain titles (and Cronbach alpha coefficients) chosen to describe these factors were (Table 2): worries/concerns about IBD (α 0.88), embarrassment (α 0.79), body image (α 0.79), energy (α 0.74), and IBD symptoms (α 0.83). Item 9 (How often did you have to miss out on certain things because of your IBD?) loaded equally across 3 domains (worries/concerns about IBD, IBD symptoms, and energy), suggesting that the question was vague and ambiguous. Questions 30 (Do you feel there is someone you can talk to about your IBD?) and 31 (How often did you have to pass wind in the last 2 weeks?) loaded into specific domains (body image and IBD symptoms, respectively), but their factor loadings were below 0.4, suggesting that they were not relevant.
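For readers unfamiliar with the Cronbach alpha coefficient reported for each domain, the sketch below shows the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of the total score), where k is the number of items in the domain. The analyses in this study were run in SPSS; the function name and the toy data are illustrative assumptions and are not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach alpha for one domain.

    rows: one list of item scores per respondent (same items, same order).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses for 4 respondents on a 3-item domain (not study data)
toy = [[0, 1, 1], [2, 2, 3], [3, 3, 4], [4, 4, 4]]
alpha = cronbach_alpha(toy)
```

Values close to 1 indicate that the items in a domain vary together across respondents; items that are unrelated to each other pull the coefficient towards 0.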

Paper Versus Computer Questionnaire

There were no statistically significant differences between domain scores for those completing computer (n = 47) or paper (n = 50) versions of IMPACT-III (UK). The computer group, however, had consistently lower scores than the paper group. The characteristics of participants in the 2 groups were examined to detect any differences (eg, diagnosis, symptom score, drug therapy). The computer group had a higher mean symptom score (2.65) compared with the paper group (2.17) and were more likely to take steroids (19 [40%] cf 8 [16%]) and/or azathioprine (16 [34%] cf 11 [22%]). A general linear model (Table 3) was constructed to measure the proportion of variability in HRQoL scores explained by each independent variable. The percentage of variability in scores between the 2 modes of administration was 37% for the energy domain, 28% for the symptom domain, and 14% for the embarrassment domain. The mode of administration of IMPACT-III (UK) is unlikely to be the reason for this variation because all of the P values were >0.3. The variability was mainly explained by the symptom score (P < 0.01, <0.01, 0.01) and, to some extent, by steroid usage (P < 0.01, 0.55, 0.07). The worries/concerns about IBD and the body image domains showed smaller differences in HRQoL scores.

Discriminant Validity

ANOVA showed significant differences in HRQoL scores between different symptom severity groups for the embarrassment scale (F = 3.24, P = 0.044), symptom scale (F = 13.56, P < 0.001), and the energy scale (F = 11.18, P < 0.001) (Table 4). The body image and worries/concerns about IBD domains did not show any statistically significant differences in mean HRQoL scores between different symptom severity groups (P = 0.25 and 0.41, respectively); however, scores were in the predicted direction.
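The F statistics above come from a one-way ANOVA across the 3 symptom severity groups. As a rough sketch of the calculation (the study itself used SPSS), F is the between-group mean square divided by the within-group mean square. The function name and the toy scores below are assumptions for illustration and are not study data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists, one per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up domain scores for severe, moderate, and inactive/mild groups
f_stat, df_b, df_w = one_way_anova_f([[40, 45, 50], [60, 62, 67], [78, 80, 85]])
```

A P value would then be read from the F distribution with the returned degrees of freedom, a step left to a statistics package in this sketch.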

Concurrent Validity

Comparing similar domains of IMPACT-III (UK) and the CHQ showed the following significant correlations (P < 0.001 in all cases):

1. Symptoms cf physical functioning (r = 0.52), bodily pain (r = 0.72), and global health (r = 0.66)

2. Body image cf self-esteem (r = 0.59) and mental health (r = 0.51)

3. Worries/concerns about IBD cf general health perceptions (r = 0.55) and mental health (r = 0.47)

4. Energy cf physical functioning (r = 0.63), self-esteem (r = 0.58), and mental health (r = 0.61)

It was not considered appropriate to correlate the embarrassment domain of IMPACT-III (UK) with any of the CHQ domains because none appeared comparable.
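The correlations listed above are Pearson coefficients between paired domain scores from the 2 instruments. A minimal sketch of the calculation follows; the function name is hypothetical and the example scores are made up, not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up paired scores: an IMPACT-III (UK) domain and a CHQ domain
impact_scores = [45.0, 60.0, 72.5, 80.0, 90.0]
chq_scores = [50.0, 55.0, 70.0, 85.0, 88.0]
r = pearson_r(impact_scores, chq_scores)
```

Each child contributes one pair of scores, so both lists must be in the same order of participants.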

Sex and Age

There were no statistically significant differences between domain scores for younger (8–12 years) versus older (13–17 years) children or for boys versus girls.
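The comparisons between age groups and between sexes used an independent samples t test. The sketch below computes the pooled-variance (Student) form of the statistic; whether equal variances were assumed in the study is not stated, so that choice, like the function name, is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Return (t, degrees of freedom) for two independent groups of scores.

    Assumption: equal variances in both groups (pooled-variance Student t test).
    """
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least 2 scores")
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t_stat = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t_stat, df
```

The 2-sided P value would be obtained from the t distribution with the returned degrees of freedom, which is omitted here to keep the sketch within the standard library.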

Test-Retest Reliability

Twenty-six girls and 24 boys, ages 9 to 16 years with IBD (Crohn disease 33, ulcerative colitis 8, indeterminate colitis 9) completed the paper version of IMPACT-III (UK) twice for the assessment of test-retest reliability. Thirty-two had stable disease activity (change in symptom score 0 or 1). ICCs ranged from 0.66 to 0.84 (Table 5). It was not possible to assess test-retest reliability of the computer version because most of the children were unable to return to the outpatient clinic in the required timescale.
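To show how an ICC from a 2-way random model is obtained, the sketch below computes the single-measure, absolute-agreement form, often written ICC(2,1). The text does not state which form SPSS was asked to report, so this choice is an assumption, as is the function name.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: one row per child, one column per occasion (2 columns for test-retest).
    """
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 children and 2 occasions")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between children
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between occasions
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Because this form measures absolute agreement, a uniform shift in scores between the 2 occasions lowers the coefficient even when the children keep the same rank order.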

IMPACT-III (UK) is suitable for assessing HRQoL in children with IBD. It should be more sensitive to changes in HRQoL than a generic instrument such as the CHQ. It has good face and content validity; children ages 8 to 17 years are able to understand it and they are happy to complete it in the clinical setting. Different instruments have been validated for use in adults with IBD (5,9). Psychometric evaluation (factor analysis and Cronbach alpha coefficients) suggests that IMPACT-III (UK) has a good internal structure and each domain demonstrated good internal reliability. The computer version can be used in a clinical setting because results of the general linear modelling indicated that the 2 modes of administration collected similar results. IMPACT-III (UK) can successfully discriminate between patients with different symptom severity. Concurrent validity of IMPACT-III (UK) was confirmed because there was significant correlation between similar domains of IMPACT-III (UK) and the CHQ. Test-retest reliability was good, with ICC values between 0.66 and 0.84 (20).

Dutch (13) and Canadian (10,11) versions of IMPACT both contain 6 domains, whereas a recent multicentre study from the United States by Perrin et al (22) suggested that a 4-factor structure was preferable. Our data suggest that a 5-factor structure is more suitable for IMPACT-III (UK) and that 3 of the questions appear to be ambiguous or not relevant. However, these findings are based upon a single study and would require further confirmation before any changes were made to IMPACT-III, which has now been translated into more than 14 languages. A variety of reasons can explain why low statistical robustness was found for 4- and 6-factor analyses for the UK version. Dutch and Canadian researchers did not perform factor analysis and their 6-domain structure was based on clinical relevance. The factor analysis performed by Perrin et al (22) used the IMPACT-II questionnaire, which was developed in Holland and was substantially different from IMPACT-III (UK) because it incorporated a VAS and consisted of 35 items. Finally, results of the different studies could reflect the different cultures in which the studies took place, which may affect the way that HRQoL instruments were interpreted. Each study has attempted to confirm a structure that reflects appropriateness for their population. Cronbach alpha coefficients showed good internal reliability (0.74–0.88) for all 5 domains in IMPACT-III (UK), suggesting that items within each domain are related and measure the concept intended. The 5-domain structure of IMPACT-III (UK) therefore appears most appropriate for measuring HRQoL in British children with IBD. Further research is needed to determine whether we can establish a factor structure that is robust in all languages and countries where IMPACT is being used because this would facilitate multinational research.

Children in different symptom severity groups showed statistically significant differences in scores in the symptom, embarrassment, and energy domains, but not in the worries/concerns about IBD and body image domains. The extent to which children worry about their IBD or body image may be more dependent on factors other than symptom severity. Therefore, discriminant ability of these 2 domains may not necessarily be expected. The symptom checklist relies on patients’ reports of their symptoms, and its value was confirmed in the Dutch validation of IMPACT-II (13). We chose this in preference to a measure of disease activity such as the Paediatric Crohn's Disease Activity Index, which includes blood results. Although this is a potential limitation of the study, we felt it was an ethical issue because children do not routinely have blood tests in the outpatient clinic; it was also believed that if they were worried or upset about having a blood test, this may affect their HRQoL score. Furthermore, symptom severity does not always correlate well with disease activity. HRQoL is a much broader concept than disease activity and is more meaningful to the patients. We were not attempting to develop a new measure of disease activity but wanted to see whether IMPACT-III (UK) could discriminate between patients who had no or mild symptoms and those with more severe symptoms.

In the absence of a criterion standard to measure HRQoL in children with IBD, concurrent validity was demonstrated by comparing scores from appropriate domains between IMPACT-III (UK) and the CHQ. It was not appropriate to correlate the embarrassment scale of IMPACT-III (UK) with any of the CHQ scales because none appeared comparable. A major reason for developing a disease-specific instrument for children with IBD is that it should be more sensitive to changes in HRQoL than a generic instrument such as the CHQ. This requires confirmation in future studies.

IMPACT-III (UK) should be useful for assessing patients in clinical practice as well as being a potentially useful outcome measure in clinical trials involving children with IBD. Two longitudinal studies in North America showed improvements in total IMPACT-II scores (23,24). The first study (23) assessed HRQoL in children before and after attending an IBD camp, whereas the second study assessed HRQoL during the first year after a diagnosis of IBD (24). There is no consensus as to what is a clinically (rather than statistically) significant change in IMPACT score. This should become apparent when a number of clinical trials using IMPACT are completed. The development of IMPACT for use in the United Kingdom should help us to understand more about the effect of IBD on the HRQoL of affected children. Further research is required to develop a disease-specific HRQoL instrument suitable for younger children who are likely to have difficulty reading and answering the questions in IMPACT-III (UK). As more is known about the HRQoL in children with IBD, we can start to explore which interventions are most likely to lead to an improvement in HRQoL.

We are grateful for the considerable help and advice in the planning and conduct of this study from our colleagues in Canada and the Netherlands: Dr B. Derkx, Dr A. Griffiths, Dr H. Loonen, and Dr A. Otley.

