Cross-cultural Adaptation, Reliability, Validity, and Responsiveness of the Simplified-Chinese Version of Neck Disability Index

Lim, Hanniel Han Rong M ClinPhysa; Tang, Zhi Yin BSc (Hons)a; Hashim, Masayu Afiqah Binte Masagoes BSc (Hons)a; Yang, Mingxing MMuscsklSportPhysioa; Koh, Eileen Yi Ling BScb; Koh, Kim Hwee FCFPS, FAMSc

doi: 10.1097/BRS.0000000000003325

Neck pain is a major health problem worldwide, affecting more than 30% of the general population annually.1–3 It is a highly prevalent musculoskeletal condition and poses a significant economic and health burden.4 In Singapore, neck pain remains the most frequently reported musculoskeletal disorder among office workers.5 Although many definitions of neck pain exist, the International Association for the Study of Pain has defined neck pain as pain perceived anywhere in the posterior region of the cervical spine, from the superior nuchal line to the first thoracic spinous process.6

Neck pain can have numerous negative effects on a person's functional ability, work activities, and quality of life.7–9 The intensity of symptoms can also vary widely, causing a similarly large variance in self-reported disability.10 Quantification of neck pain is, therefore, necessary to understand how it affects patients’ perception of disability and the assessment of clinical outcomes.11 This, in turn, supports clinicians’ decision-making in the management of these patients.12

The Neck Disability Index (NDI) is the most commonly used patient self-reported measure of neck pain symptoms and their effect on function and activities.13,14 It has been shown to be a valid and reliable tool15 that has been translated and validated in various languages.16–28 To our knowledge, there is currently no simplified-Chinese version of any neck pain and disability measure. Translating the NDI into simplified Chinese, rather than developing a new comprehensive instrument, would allow comparisons between populations and would permit clinicians and researchers to exchange information across cultural and linguistic barriers. Therefore, the aims of this study were to translate and culturally adapt the NDI into a simplified-Chinese version and to evaluate the reliability, validity, and responsiveness of the new questionnaire in patients with neck pain.




The NDI is a condition-specific instrument for self-report of disability adapted from the Oswestry Low Back Pain Questionnaire.29 It is a valid and reliable questionnaire30 comprising 10 items relating to pain intensity, personal care, lifting, reading, concentration, work, driving, sleeping, recreation, and headache. Each item has six possible responses scored from 0 (no pain and no functional limitation) to 5 (worst pain and maximal limitation). Patients are required to choose the answer that best reflects their condition at the present time. The total score is then presented as a percentage, with higher scores representing greater disability.

Translation and Cultural Adaptation

The linguistic validation procedure was initiated after the developer and copyright holder of the instrument had been contacted and had granted permission for the purpose of this study. The procedure followed the guidelines established by Beaton et al.31 Two independent physiotherapists translated the questionnaire into simplified Chinese (forward translation). Both translators were native readers of simplified Chinese and were also proficient in English. The two forward translations were then compared and discussed by the translators and authors to obtain consensus. The consensus version was backward translated by two other independent English-speaking translators, both of whom were unaware of the questionnaire concept and had not seen the original English questionnaire. An expert review committee, consisting of all the translators, the authors, and two other experienced physiotherapists, reviewed all translations for semantic, idiomatic, experiential, and conceptual equivalence. After consensus was reached, a preliminary final simplified-Chinese version (NDI-SC) was determined. This version was then tested on a small sample of 10 patients with neck pain to determine whether all questions were clear and understandable. No modifications were needed following the preliminary test.

Visual Analogue Scale for Pain

The Visual Analogue Scale (VAS) consists of an 11-point scale on a 100-mm horizontal line, anchored at one end by 0 (“no pain”) and at the other by 10 (“worst possible pain”).32 Patients were asked to quantify their current neck pain by drawing a vertical mark at the point on the horizontal line that best represented their current perception of pain level. The VAS has been shown to be a reliable and valid tool to measure pain intensity.33,34

Global Rating of Change

The Global Rating of Change (GROC) assesses self-perception of change in the patient's condition between sessions.35 Participants are asked to rate the change in their condition on a 15-point transitional scale from −7 (a very great deal worse) to 7 (a very great deal better).


The NDI-SC questionnaire was administered to patients with neck pain, seeking physiotherapy care within a primary healthcare polyclinic in Singapore. Patients eligible for the study were consecutively recruited between January and June 2019. Screening of the participants was carried out by physiotherapists with more than 9 years of experience. Eligibility criteria were: the presence of neck pain, age between 21 and 70 years, the ability to read simplified-Chinese, and absence of symptoms below the elbows related to specific neck disorders. Patients were excluded if they had any of the following co-morbid diagnoses: inflammatory diseases, current infection, cancer or suspected tumors, history of fracture and surgery on the cervical spine, cervical myelopathy or radiculopathy, or clinically recognizable cognitive impairments. All participants provided written informed consent before the study. This study was approved by the SingHealth Centralized Institutional Review Board (CIRB reference 2018/3054), Singapore.


During the first visit, all participants completed the NDI-SC and VAS questionnaires. Demographic information such as age, sex, current educational level, and duration of pain was recorded. Participants returned 1 to 2 weeks later to complete the VAS, the GROC, and the NDI-SC with its items reordered. This retest interval was selected to allow more realistic estimates of the variability to be observed among control subjects in a longitudinal study.36

Statistical Analyses

All statistical procedures were conducted using IBM SPSS 25 (IBM Corp., Armonk, NY). The critical values for significance were set at P < 0.05. All data were assessed for completeness. Descriptive statistics were used to describe the demographic and clinical characteristics of all participants. Mean scores and standard deviations (SD) were calculated at item-level and total scores for both administrations of the NDI-SC.


Reliability was assessed through internal consistency, test-retest reliability, and measurement error. Cronbach α was calculated to determine the internal consistency of the NDI-SC.37 α values >0.8 were considered good to excellent.38 Test-retest reliability was calculated using the Intraclass Correlation Coefficient (ICC) and the Bland and Altman method. The primary reliability measure was the ICC (2,1): a two-way random-effects model with an absolute agreement definition reporting single measures, as participants completed the NDI-SC only once in each session.39 ICC values >0.75 were considered indicative of good to excellent reliability.39 A Bland and Altman plot illustrates the spread of the differences between the test and retest scores for each individual; 95% of the differences are expected to lie within two SD of the mean difference.40 The size of the retest sample was estimated using a method developed to calculate the required number of subjects in a reliability study.41 The type I and type II error probabilities were set at α = 0.05 and β = 0.20, respectively. Under these assumptions, at least 46 participants were required for the test-retest analysis. Participants who scored between −3 and +3 on the GROC were included in the test-retest analysis, on the assumption that they had not undergone any clinically relevant change during the interval. Measurement error was determined by calculating the standard error of measurement (SEM) and the minimal detectable change (MDC).
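To make the two reliability coefficients concrete, the following is a minimal sketch of Cronbach α and the ICC (2,1) in plain Python. It is not the authors' procedure (the study used SPSS); the function names and the data layout (one row per participant) are assumptions made for illustration, and the ICC uses the standard two-way random-effects, absolute-agreement, single-measure formula built from the ANOVA mean squares.

```python
from statistics import mean, variance
from typing import Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach alpha; items[i][j] is the score of respondent i on item j."""
    k = len(items[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in items]) for j in range(k)]
    # Sample variance of the summed total score.
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1); scores[i][j] is the score of subject i at session j."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subjects mean square
    msc = ss_cols / (k - 1)              # between-sessions mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical test and retest totals for five participants.
retest_pairs = [[20, 22], [34, 30], [10, 12], [48, 50], [26, 24]]
print(round(icc_2_1(retest_pairs), 3))
```

Because the absolute-agreement form keeps the between-sessions mean square in the denominator, a systematic shift between test and retest lowers the coefficient, which is why it suits a retest design in which each participant completes the questionnaire once per session.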

Content Validity

Content validity was assessed by the completeness of the item responses in NDI-SC and the size of floor and ceiling effects. Floor and ceiling effects were considered to be present if >15% of the participants achieved the lowest or highest possible total score, respectively.41
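The 15% rule above can be sketched in a few lines of Python. The function name, the returned fields, and the sample scores are hypothetical and serve only to illustrate the check; the threshold is the one stated in the text.

```python
from typing import NamedTuple, Sequence

THRESHOLD = 0.15  # share of participants at an extreme score, as stated in the text


class FloorCeiling(NamedTuple):
    floor_share: float
    ceiling_share: float
    floor_effect: bool
    ceiling_effect: bool


def floor_ceiling(scores: Sequence[float], lowest: float, highest: float) -> FloorCeiling:
    """Share of scores at the lowest and highest possible value, with 15% flags."""
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return FloorCeiling(
        floor_share,
        ceiling_share,
        floor_share > THRESHOLD,
        ceiling_share > THRESHOLD,
    )


# Hypothetical percentage totals for 20 participants (not study data).
totals = [0, 0, 4, 8, 10, 12, 16, 20, 24, 30, 34, 40, 44, 50, 56, 60, 64, 70, 80, 90]
result = floor_ceiling(totals, lowest=0, highest=100)
print(result.floor_share, result.floor_effect)  # 0.1 False
```

The same function can be applied to a single item by passing that item's responses with `lowest=0` and `highest=5`.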

Construct Validity

Construct validity was assessed through exploratory factor analysis using varimax rotation and confirmatory factor analysis.42 Pearson correlation was used to examine the relationship between the NDI-SC and VAS. In accordance with COSMIN guidelines,43 70 participants were required, as a minimum of seven participants is needed for each of the 10 items of the NDI-SC. To account for an estimated 10% drop-out rate, the recruitment target was at least 80 participants.


Responsiveness was assessed using correlation coefficients: the Spearman coefficient quantified the relationship between NDI-SC change scores and the GROC, and the Pearson coefficient quantified the relationship between the change scores of the NDI-SC and those of the VAS. For both coefficients, values >0.6 were considered strong.44
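As an illustration of the two coefficients, the sketch below computes Pearson and Spearman correlations in plain Python, with the Spearman coefficient obtained as the Pearson correlation of tie-averaged ranks. The helper names, the sample data, and the sign convention for change scores (baseline minus follow-up, so that improvement is positive like a positive GROC rating) are assumptions for the example rather than details taken from the study.

```python
from math import sqrt
from statistics import mean
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation (Pearson correlation of average ranks)."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: NDI-SC percentage scores at both visits and GROC ratings.
ndi_first = [40, 36, 52, 28, 30, 44]
ndi_second = [30, 34, 38, 30, 20, 44]
groc = [3, 1, 5, -1, 4, 0]
ndi_change = [a - b for a, b in zip(ndi_first, ndi_second)]  # positive = improvement
print(round(spearman(ndi_change, groc), 2))
```

A rank-based coefficient is a natural fit for the GROC because it is an ordinal scale, whereas the VAS and NDI-SC change scores are treated as continuous.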



Eighty-five patients with neck pain were enrolled in this study. Of these, six did not meet the inclusion criteria and were excluded from the study. Of the remaining subjects who were eligible, 70 provided informed consent and participated in the study. Of those, 14 (20%) did not return or complete the questionnaires for a second time. The remaining questionnaires were completed and none had more than two items of the NDI-SC missing. The final sample consisted of 70 participants with valid NDI-SC scores for statistical analysis (Figure 1).

Figure 1: Flowchart of participants.

All descriptive statistics were reported using mean ± SD. For the study, there were 70 participants (37 male, 33 female) with a mean age of 44 ± 12 years. Demographic and clinical characteristics are summarized in Table 1.

Participant Characteristics (n = 70)

NDI-SC Instrument

Of the 70 NDI-SC scores included in the analyses, 35 (50%) had no missing items, 34 (48.6%) had one missing item, and one (1.4%) had two missing items. Where items were missing, the total score was expressed as a percentage by summing the scores of the answered items and dividing by the maximum score obtainable from those items. Notably, most of the missing values came from the driving item, as most of the participants did not drive.
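The scoring rule for incomplete questionnaires can be sketched as follows. This is an illustrative reading of the description above rather than the authors' code; the function name, the use of `None` for an unanswered item, and the sample responses are assumptions.

```python
from typing import Optional, Sequence

MAX_ITEM_SCORE = 5  # each NDI item is scored from 0 to 5


def ndi_percentage(responses: Sequence[Optional[int]]) -> float:
    """NDI total as a percentage of the maximum obtainable from answered items.

    None marks an unanswered item; it is left out of both the sum and
    the maximum possible score.
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("at least one item must be answered")
    if any(not 0 <= r <= MAX_ITEM_SCORE for r in answered):
        raise ValueError("item scores must lie between 0 and 5")
    return 100.0 * sum(answered) / (MAX_ITEM_SCORE * len(answered))


# Hypothetical respondent who does not drive and left that item blank.
example = [2, 1, 1, 0, 3, 1, 2, None, 1, 0]
print(round(ndi_percentage(example), 1))  # 24.4 (11 of a possible 45 points)
```

Scaling by the answered items keeps scores comparable between participants who drive and those who do not, at the cost of giving each remaining item slightly more weight.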

Test-Retest Reliability

Fifty-six participants returned for the second session to complete the questionnaires. The mean interval between the first and second sessions was 13 ± 5 days. Fifty participants who scored between −3 and +3 on the GROC were included in the test-retest analysis. The ICC was 0.85, indicating good reliability. The Bland and Altman analysis showed that the mean of the difference was −1.6 ± 11.9 (Figure 2). SEM and MDC for the NDI-SC scores were 4.2 and 11.7, respectively.
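The text does not state which formulas were used for the measurement-error statistics. A minimal sketch using the commonly applied definitions, SEM = SD × √(1 − ICC) and MDC at the 95% level = 1.96 × √2 × SEM, together with Bland and Altman 95% limits of agreement, is shown below. The function names and the sample test and retest scores are hypothetical, and the formulas are an assumption about the method rather than a confirmed description of it.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * sqrt(1.0 - icc)


def mdc95(sem: float) -> float:
    """Minimal detectable change at the 95% confidence level."""
    return Z_95 * sqrt(2.0) * sem


def bland_altman_limits(
    test: Sequence[float], retest: Sequence[float]
) -> Tuple[float, float, float]:
    """Mean difference (bias) and the lower and upper 95% limits of agreement."""
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)
    half_width = Z_95 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# With the reported SEM of 4.2, the assumed formula gives an MDC close to
# the reported 11.7.
print(round(mdc95(4.2), 1))  # 11.6

# Hypothetical test and retest scores for four participants.
bias, lower, upper = bland_altman_limits([20, 30, 40, 50], [22, 28, 44, 50])
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

Under these definitions, a change in an individual's NDI-SC score smaller than the MDC cannot be distinguished from measurement error with 95% confidence.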

Figure 2: Bland-Altman plot illustrating the test-retest reliability of the NDI-SC. The central line represents the mean difference between test and retest scores, and the outer reference lines represent the 95% limits of agreement. NDI-SC indicates the simplified-Chinese version of the Neck Disability Index.

Internal Consistency

Cronbach α for the NDI-SC was 0.92 indicating excellent internal consistency. The item-scale correlations between single items and total scores of the NDI-SC were fair to strong with correlation coefficients ranging from 0.27 to 0.72, confirming internal consistency of the NDI-SC (Table 2).

Test-Retest Reliability, Measurement Errors, and Item-scale Correlations, n = 50

Content Validity

No floor or ceiling effects were observed for the total scores as only three participants (0.5%) had the lowest score and none had the highest score. However, all the items with the exception of “Pain Intensity” had floor effects, with 37.7% of all item entries scoring the lowest possible value. There were no ceiling effects for any of the individual items.

Construct Validity

A strong correlation was found between NDI-SC and VAS (Rp = 0.61, P < 0.001).

The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy (0.81) and the Bartlett test of sphericity (194.25, P < 0.001) indicated that the sample was sufficient for a satisfactory factor analysis. A two-factor structure with eigenvalues >1 was extracted, which explained a total of 66% of the total variance of the item scores. The scree test also showed two factors, further supporting internal construct validity (Figure 3). Factor loadings were similar across items: rotated component loadings ranged from 0.664 to 0.832 for factor one and from 0.507 to 0.905 for factor two, with Item 2 (Personal Care) and Item 5 (Headaches) receiving the highest loadings on factor one and factor two, respectively (Table 3).

Figure 3: Scree plot for the 10 items of the simplified-Chinese Neck Disability Index.

Exploratory Factor Analysis for the NDI-SC


The NDI-SC showed moderate responsiveness with GROC (Rs = 0.46, P < 0.001). Table 4 presents the descriptive statistics of the mean change in NDI-SC scores for each GROC grading.

Mean Score Changes for the NDI-SC, According to GROC Grading, n = 56

The correlation between NDI-SC change scores and VAS change scores was also moderate (Rp = 0.59, P < 0.001).


The aims of this study were to translate the NDI into the simplified-Chinese version and to evaluate its psychometric properties. The results showed that the NDI-SC had good reliability, validity, and responsiveness in patients with neck pain.

In our present study, the participants had a mean age of 44 ± 12 years, which was as expected. Although the variation in age was large, similar studies conducted by Shaheen et al,21 Wu et al,26 and Cramer et al27 also reported mean ages of 41 ± 10 years, 43 ± 13 years, and 49 ± 16 years, respectively.

Notably, in our present study, the mean NDI-SC score was lower than that of the previous studies,22,26,27 indicating that participants only had mild disability. This may be attributed to the nature of the primary healthcare setting where participants received care at an earlier onset of neck pain.

The NDI-SC showed excellent internal consistency with a Cronbach α value of 0.92, which is comparable to values reported in earlier studies (0.74–0.97).16–28 Reliability was also tested within a short time interval to minimize changes in participants’ condition. The ICC (2,1) was 0.85, indicating good reliability, in line with other studies (0.81–0.92).16–28 For previous studies that reported higher ICC values, we were unable to ascertain which ICC models were used.23,24 The SEM and MDC were also similar to those of other NDI versions.20,24 The MDC value of the NDI-SC in this study is higher than that reported by Bakhtadze et al,16 but comparable to values reported by other studies.24,30

Given Singapore's move toward a car-lite society, a considerable number of “driving” responses were missing. This finding was not a translation issue and was in line with previous studies.23,25,26 Similar to previous studies, the present study did not find any floor or ceiling effects for the NDI-SC total scores.23,28 However, with the sole exception of “Pain Intensity”, all individual items of the NDI-SC were observed to have floor effects. Neither phenomenon was unique to this study; both were consistent with previous reports.20,22,23,28

Factor analysis revealed a two-factor structure related to “Pain and Disability” and “Brain Processing and Function” that explained 66% of the total variance. The percentage of the variance was comparable with previous studies that also reported having a two-factor structure.19,20–22,25 Some controversies exist in relation to the factorial structure of the NDI as some studies revealed a one-factor structure for the NDI instead.17,23,24,27 However, the discrepancies found in the factor structure of the current study as compared to other studies may be attributed to the influence of cultural differences.45

The correlation between the NDI-SC and VAS was significant and strong (Rp = 0.61). This is in accordance with previous studies that found similar values (0.48–0.75) between the NDI and VAS.22,24–26 Regarding responsiveness, a significant, moderate correlation was observed between NDI-SC change scores and GROC values (Rs = 0.46). The NDI-SC also showed a significant, moderate correlation between its change scores and VAS change scores (Rp = 0.59), which agreed with the results of earlier studies.26–28 Although a longer test-retest interval might have revealed greater responsiveness of the NDI-SC over time, our chosen interval was consistent with those of previous studies.21,46


Criterion validity was not assessed in this study as we did not use an alternative criterion standard for the health-related questionnaire. Furthermore, only the VAS and GROC were used for comparison with the NDI-SC. Although the GROC is commonly used because it is quick and simple to administer, the severity of the patient's condition at the time of scoring may influence the rating. Patients with lower symptom severity are likely to report a greater positive change on the GROC at their retest session.35 Another limitation of the study was the relatively small sample size. However, the KMO value of 0.81 suggested that the sample was adequate for the psychometric evaluation of the NDI-SC.


The NDI has been successfully cross-culturally adapted and translated into the simplified-Chinese version. The NDI-SC is shown to be a reliable, valid, and responsive measurement tool of pain and functional limitation in simplified-Chinese speaking patients with neck pain.

Key Points

  • The NDI was translated into the simplified-Chinese language and culturally adapted for Chinese-speaking patients with neck pain.
  • The NDI-SC demonstrated an excellent level of internal consistency, strong test-retest reliability, content and construct validity, two-factor subscales, and responsiveness.
  • The NDI-SC is a reliable, valid, and responsive instrument to measure functional limitations in patients with neck pain.


The authors acknowledge the support from Goh Boon Kwang, the Head of the Department of Allied Health, and Dr. Tan Ngiap Chuan, the Research Director of SingHealth Polyclinics. The authors also thank the physiotherapy team for their assistance in the enrolment and recruitment process.


