Generalized anxiety disorder (GAD) is one of the most common mental disorders worldwide. Anxiety symptoms are more prevalent during the prenatal period, with an incidence ranging from 15% to 23%, much higher than the 3.1% reported in the general population[1–4]. Maternal anxiety during pregnancy is associated with several adverse outcomes, including spontaneous abortion, preeclampsia, placental abruption, preterm labor, and low birth weight, as well as smaller head circumference and lower mental developmental scores in infants[5–8]. If left untreated, it may also affect the mother–infant relationship. Antenatal anxiety has also been considered one of the strongest predictors of postpartum anxiety and depression[10–12]. In 2018, the American College of Obstetricians and Gynecologists strongly recommended early detection of anxiety disorders.
Prenatal anxiety diagnosis is vulnerable to interference. First, pregnant women experience increased fear related to the health of their babies, their own health, financial matters, childcare, and parenting. Symptoms such as fatigue, muscle tension, poor concentration, sleep difficulties, irritability, and restlessness appear gradually, which may lead physicians to overlook the clinical diagnosis of GAD[14,15]. Second, GAD is a highly comorbid disorder and is particularly often accompanied by depression. The presence of one or more psychiatric comorbidities increases the severity of symptoms and decreases the accuracy of anxiety diagnosis. Effective detection and diagnosis of GAD during pregnancy require reliable and valid screening tools. We therefore used the GAD 7-item (GAD-7) scale, one of the most widely used self-report tools for anxiety symptoms, which is available in multiple languages. The GAD-7 scale is a self-rated assessment tool developed by Spitzer et al. to screen for GAD in primary care populations. In China, the GAD-7 scale was first validated in general hospital outpatients in 2010. Globally, among clinical and general population samples, the GAD-7 scale has demonstrated good reliability and cross-cultural validity as a measure of GAD[4,18]. In 2014, the National Institute for Health and Care Excellence first recommended the use of the GAD-7 scale to assess prenatal anxiety.
Although the GAD-7 scale has proven to be a useful screening tool for GAD in primary care populations, its use as a screening tool for GAD in early pregnant women has not been evaluated. Therefore, this study aimed to test the reliability and validity of the GAD-7 scale in early pregnant women.
Materials and methods
Research design and participants
In this cross-sectional study, 30,823 early pregnant women were recruited from the Obstetrics and Gynecology Hospital of Fudan University, Shanghai, China from January 2017 to December 2020. The inclusion criteria were gestational age of <14 weeks, age range of 20–50 years, and normal cognition and communication ability. The participants were asked to complete the Chinese version of the GAD-7 scale independently via a mobile phone application. This study was approved by the institutional review board of the Obstetrics and Gynecology Hospital of Fudan University (Shanghai, China) (2017-73). Informed consent was obtained from all participants.
After recruitment, 23,339 women were randomly divided into three subgroups using SPSS (IBM SPSS Statistics version 20.0; IBM Corporation, Chicago, IL, USA). The first subgroup (group 1.1) was used for the item response theory (IRT) analysis, the second (group 1.2) for the exploratory factor analysis (EFA), and the third (group 1.3) for the confirmatory factor analysis (CFA). A separate subset of the participants (n = 7,484; group 2) was selected for construct validity, criterion validity, and reliability tests (Supplementary Fig. 1; https://links.lww.com/RDM/A10).
Measurement and procedure
The GAD-7 scale is a 7-item questionnaire developed to identify probable cases of GAD and measure the severity of GAD symptoms. The GAD-7 scale items include the following: 1) nervousness, 2) inability to stop worrying, 3) excessive worry, 4) restlessness, 5) difficulty in relaxing, 6) easy irritation, and 7) fear of something awful happening. Participants were asked to rate how often they have been bothered by each of these seven core symptoms over the past 2 weeks. The response categories are “not at all,” “several days,” “more than half the days,” and “nearly every day,” scored as 0, 1, 2, and 3, respectively. The total GAD-7 scale score ranges from 0 to 21.
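The scoring rule described above can be sketched in a few lines of Python. This is only an illustration, not the mobile application used in the study: the function names are hypothetical, the severity bands use the cut points of 5, 10, and 15 reported in this paper, and the label "minimal" for totals below 5 is an assumption.

```python
# Response options and their scores, as listed in the text.
RESPONSE_SCORES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}


def gad7_total(responses):
    """Sum the seven item scores; the result lies between 0 and 21."""
    if len(responses) != 7:
        raise ValueError("GAD-7 requires exactly seven item responses")
    return sum(RESPONSE_SCORES[r.strip().lower()] for r in responses)


def gad7_severity(total):
    """Map a total score to a severity band using cut points 5, 10, and 15.

    The label "minimal" for totals below 5 is an assumption of this sketch.
    """
    if not 0 <= total <= 21:
        raise ValueError("GAD-7 total must be between 0 and 21")
    if total >= 15:
        return "severe"
    if total >= 10:
        return "moderate"
    if total >= 5:
        return "mild"
    return "minimal"


# Hypothetical respondent who answers "several days" to every item.
example_total = gad7_total(["several days"] * 7)
example_band = gad7_severity(example_total)
```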
The Patient Health Questionnaire-9 (PHQ-9) is a self-administered questionnaire that assesses depressive symptoms. It evaluates the frequency of nine symptoms within the past 2 weeks: 1) anhedonia, 2) depressed mood, 3) trouble sleeping, 4) feeling tired, 5) change in appetite, 6) guilt, self-blame, or worthlessness, 7) trouble concentrating, 8) feeling slowed or restless, and 9) thoughts of being better off dead or hurting oneself. Each item is scored from 0 to 3 (0 = not at all, 3 = nearly every day), and the total score ranges from 0 to 27. A higher score reflects more severe symptoms[20,21].
Basic calculations were performed using IBM SPSS version 20.0. The EFA and IRT analyses were conducted using Mplus version 8.3, and the CFA was conducted using IBM SPSS version 20.0. Statistical significance was set at P <0.05, and all P values were two-tailed.
Discrimination test via the IRT
To evaluate the discriminative ability of each item, we fitted a graded response model, a type of IRT model, in group 1.1. Item characteristic curves (ICCs), which are nonlinear regressions of the probability of a given response on the latent trait, were constructed for each item to reflect its discrimination and difficulty. The discrimination (α) and difficulty (β) parameters were presented. α indicates how well an item differentiates between anxious and non-anxious women; the higher the discrimination parameter, the better the item differentiates between respondents. β represents the level of the latent trait at which a respondent has a 50% probability of endorsing an item at or above a given response category; the higher the difficulty parameter, the more anxious a respondent must be to endorse that category. According to the rules suggested by Baker and Kim, an item with a discrimination parameter of >0.65 was considered to have moderate or good discrimination power and was retained.
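To make the roles of α and β concrete, the following minimal sketch computes the category probabilities of a single four-category item under the standard (Samejima) graded response model. It is not the Mplus estimation used in this study, and the parameter values are hypothetical rather than taken from Table 1.

```python
import math


def grm_category_probabilities(theta, alpha, betas):
    """Return P(X = k | theta) for k = 0..len(betas) under a graded response model.

    theta: latent trait level (here, anxiety) of a respondent.
    alpha: item discrimination parameter.
    betas: ordered threshold (difficulty) parameters, one per category boundary.
    """
    if list(betas) != sorted(betas):
        raise ValueError("difficulty parameters must be in ascending order")
    # Cumulative probabilities P(X >= k); by convention P(X >= 0) = 1
    # and the probability of exceeding the top category is 0.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-alpha * (theta - b))) for b in betas]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulative terms.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(betas) + 1)]


# Hypothetical item: alpha = 2.5, thresholds at 0, 1, and 2 on the trait scale.
probs = grm_category_probabilities(theta=1.0, alpha=2.5, betas=[0.0, 1.0, 2.0])
```

At a trait level equal to a threshold (here θ = β2 = 1.0), the probability of responding in that category or higher is exactly 0.5. A larger α makes the curves steeper around each threshold, which is what a higher discrimination power means in practice.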
Structural validity test via factor analyses
The structural validity of the scale and the latent correlations of the items were analyzed using factor analyses. Items deleted on the basis of the IRT analysis were not included in the factor analyses. We performed the EFA in group 1.2. The Kaiser–Meyer–Olkin (KMO) statistic and Bartlett’s test were calculated first. A parallel analysis was applied to determine the number of factors by comparing the eigenvalues of the actual data with the corresponding eigenvalues of random data. Given the number of factors, the maximum likelihood method and Promax rotation were applied to extract the variables for each factor. Items with a factor loading of greater than 0.3 after rotation were preserved.
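The parallel analysis step can be illustrated with a short, dependency-free sketch. It assumes one common variant (eigenvalues of the full correlation matrix compared with the mean eigenvalues of random normal data of the same shape); the text does not state which variant was used, and the data below are simulated, not study data. The helper names are hypothetical.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix of a list of equal-length numeric rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    return [[cov[a][b] / math.sqrt(cov[a][a] * cov[b][b]) for b in range(p)]
            for a in range(p)]


def eigenvalues_symmetric(matrix, max_sweeps=100, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations (descending)."""
    a = [row[:] for row in matrix]
    p = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if a[i][j] == 0.0:
                    continue
                diff = a[i][i] - a[j][j]
                # Rotation angle that zeroes the (i, j) off-diagonal entry.
                theta = (math.copysign(math.pi / 4, a[i][j]) if diff == 0.0
                         else 0.5 * math.atan(2.0 * a[i][j] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # update columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki + s * akj, -s * aki + c * akj
                for k in range(p):  # update rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik + s * ajk, -s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)


def parallel_analysis(rows, n_simulations=20, seed=0):
    """Count leading factors whose observed eigenvalue exceeds the random mean."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    observed = eigenvalues_symmetric(correlation_matrix(rows))
    random_mean = [0.0] * p
    for _ in range(n_simulations):
        noise = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
        for k, value in enumerate(eigenvalues_symmetric(correlation_matrix(noise))):
            random_mean[k] += value / n_simulations
    n_factors = 0
    for obs, rand in zip(observed, random_mean):
        if obs <= rand:
            break
        n_factors += 1
    return n_factors, observed, random_mean


# Simulated (not real) data: 300 respondents, 7 items driven by one latent trait.
sim = random.Random(1)
data = []
for _ in range(300):
    trait = sim.gauss(0.0, 1.0)
    data.append([0.8 * trait + 0.6 * sim.gauss(0.0, 1.0) for _ in range(7)])
n_factors, observed, random_mean = parallel_analysis(data)
```

For a seven-item correlation matrix the eigenvalues sum to 7, so the proportion of variance attributed to the first factor is its eigenvalue divided by 7.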
We then performed the CFA using the maximum likelihood algorithm in group 1.3. Items with a factor loading of <0.4 and a standardized residual covariance of >1.96 or <−1.96 were removed from the model. The χ2/df (expected to be <5), goodness-of-fit index (expected to be >0.9), comparative fit index (CFI, expected to be >0.9), root mean square error of approximation (RMSEA, expected to be <0.05), and standardized root mean square residual (SRMR, expected to be <0.05) were used to assess the goodness-of-fit of the model.
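As a small worked check of these criteria, the sketch below applies the stated thresholds to the CFA fit indices reported for group 1.3 in this paper and shows the usual point-estimate formula for the RMSEA. The function names are hypothetical, the RMSEA formula shown (with n − 1 in the denominator) is one common convention that software may vary, and the sample size passed to it is a placeholder because the subgroup size is not stated in the text.

```python
import math

# Acceptance thresholds as stated in the text above.
THRESHOLDS = {
    "chi2_df": lambda v: v < 5,
    "cfi": lambda v: v > 0.9,
    "rmsea": lambda v: v < 0.05,
    "srmr": lambda v: v < 0.05,
}


def rmsea(chi2, df, n):
    """RMSEA point estimate from chi-square, degrees of freedom, and sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def check_fit(indices):
    """Return {index name: passes threshold} for every index with a threshold."""
    return {name: rule(indices[name])
            for name, rule in THRESHOLDS.items() if name in indices}


# Fit indices reported for the CFA in group 1.3.
reported = {"chi2_df": 23.991 / 7, "cfi": 0.999, "rmsea": 0.018, "srmr": 0.004}
fit_ok = check_fit(reported)

# Placeholder sample size (assumption): the subgroup size is not given in the text.
example_rmsea = rmsea(chi2=23.991, df=7, n=7000)
```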
The remaining items were assessed for their reliability. We calculated the Cronbach’s alpha coefficient and Guttman split-half coefficient of each domain and the total scale in group 2. Cronbach’s alpha coefficients of ≥0.7 and <0.5 indicate good and unacceptable internal consistency, respectively.
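Both reliability coefficients follow directly from item and total-score variances, as the minimal sketch below shows. The responses are hypothetical, the function names are illustrative, and the split used for the Guttman coefficient (first four items versus last three) is an assumption; the text does not state how the items were split.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def guttman_split_half(rows, first_half):
    """Guttman split-half: 2 * (1 - (var(half A) + var(half B)) / var(total)).

    first_half: indices of the items assigned to half A; the rest form half B.
    """
    half_a = [sum(r[j] for j in first_half) for r in rows]
    half_b = [sum(r) - a for r, a in zip(rows, half_a)]
    total_var = variance([sum(r) for r in rows])
    return 2 * (1 - (variance(half_a) + variance(half_b)) / total_var)


# Hypothetical responses (0-3) of six women to the seven items.
responses = [
    [0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 1, 1],
    [2, 2, 3, 1, 2, 2, 2],
    [0, 1, 0, 0, 0, 0, 1],
    [3, 3, 3, 2, 3, 2, 3],
    [1, 0, 1, 1, 1, 1, 0],
]
alpha = cronbach_alpha(responses)
split_half = guttman_split_half(responses, first_half=(0, 1, 2, 3))
```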
Criterion validity test
To assess the criterion validity, we performed a correlation analysis between the GAD-7 scale and PHQ-9 scores in group 2.
Results
Characteristics of the patients
A summary of the selected sociodemographic and reproductive characteristics of the participants is presented in Supplementary Table 1; https://links.lww.com/RDM/A11. A total of 30,823 participants between the ages of 20 and 50 years (mean age = 31.20 years; standard deviation = 3.99 years) were included. Most participants had a college education. Consistent with the random split, no significant differences were observed in gestational age, educational level, occupation, nulliparity, or body mass index among groups 1.1, 1.2, and 1.3. The total GAD-7 score ranges from 0 to 21, with cut points of 5, 10, and 15 representing mild, moderate, and severe levels of anxiety symptoms, respectively. There were no significant differences among the groups in the proportions of women at each of the three levels of anxiety severity.
Item discrimination according to the IRT in group 1.1
The discrimination parameter (α) for each item ranged from 1.831 to 3.742 (Table 1); thus, all seven items exhibited high discrimination power (α >0.65). The difficulty parameter (β) ranged from −0.017 to 7.951. The ICCs for each item are presented in Fig. 1.
Table 1. Item response theory results (graded response model) in the first random subgroup of group 1.1 (n
Feeling nervous, anxious or on edge
Not being able to stop or control worrying
Worrying too much about different things
Trouble relaxing
Being so restless that it is hard to sit still
Becoming easily annoyed or irritable
Feeling afraid as if something awful might happen
α: discrimination parameter; β1: difficulty parameter 1; β2: difficulty parameter 2; β3: difficulty parameter 3; GAD-7: A brief measure for assessing generalized anxiety disorder; IRT: item response theory; SE: standard error.
β1 Stands for the level of anxiety trait at which respondents have the same probability to choose scores 1 or 0.
β2 Stands for the level of anxiety trait at which respondents have the same probability to choose scores 2 or 1.
β3 Stands for the level of anxiety trait at which respondents have the same probability to choose scores 3 or 2.
EFA results in group 1.2
All items were included in the EFA. The KMO statistic was 0.911, and Bartlett’s test was significant (χ2 = 28826.865, P <0.001) for the seven items in group 1.2, indicating that the items were sufficiently intercorrelated for factor analysis. In the parallel analysis, the eigenvalue of the first factor in the actual dataset was 4.335. Only one factor was identified (Table 2), which explained 61.930% of the variance. The factor loadings of all items were >0.3.
Table 2. Exploratory factor analysis results (Promax rotation) in the second random subgroup (n
CFA results in group 1.3
The CFA was conducted in group 1.3 based on the single-factor model from the EFA. Based on the fit indices, the model fitted the data well (χ2 = 23.991, df = 7, RMSEA = 0.018, SRMR = 0.004, CFI = 0.999, TLI = 0.998) (Table 3).
Table 3. Confirmatory factor analysis results in the third random subgroup (n
Structural and criterion validities and reliabilities in group 2 (external validation sample)
The final model was also verified in the external validation sample with a fairly good fit (χ2 = 18.020, df = 5, RMSEA = 0.019, SRMR = 0.004, CFI = 1.000, and TLI = 0.998). The correlation coefficient between the GAD-7 scale and PHQ-9 score was 0.639 (P <0.001). The Cronbach’s alpha coefficient of the seven items was 0.891, while the Guttman split-half coefficient was 0.794.
Discussion
Our study is the first to examine the psychometric properties (discriminative ability, reliability, and criterion validity) and factor structure of the GAD-7 scale in early pregnant women. Our results indicate that the GAD-7 scale is a reliable tool for detecting GAD in this population. All items exhibited a high discrimination power. The reliability of the Chinese version of the GAD-7 scale was good (Cronbach’s alpha coefficient = 0.891; Guttman split-half coefficient = 0.794). The factor analyses confirmed the unidimensional structure of the GAD-7 scale. The GAD-7 scale score had a strong positive correlation with the PHQ-9 score, which is consistent with previous results.
Several measures have been used to assess anxiety. The Hospital Anxiety and Depression scale (HADS) and State-Trait Anxiety Inventory (STAI) both contain items whose ratings may be confounded by symptoms of normal pregnancy (eg, HADS: “I can sit at ease and feel relaxed”; STAI: “I tire quickly” and “I feel rested”), potentially increasing the incidence of false-positive results. In a Canadian study, 240 perinatal women (155 pregnant and 85 postpartum women) were enrolled and completed the GAD-7 scale and the Edinburgh Postnatal Depression scale (EPDS). The analysis indicated that the psychometric properties of the GAD-7 scale were slightly better than those of the EPDS and EPDS-3A in the detection of GAD in this population. At an optimal cutoff score of 13, the GAD-7 scale yielded fair sensitivity (61.3%) and specificity (72.7%), whereas the EPDS yielded poor specificity (<40%). Area under the curve (AUC) analyses showed that the accuracy of the GAD-7 scale in detecting GAD was moderate (0.71), whereas that of the EPDS was lower (0.62). Meanwhile, the Mini International Neuropsychiatric Interview must be administered by trained professionals and takes approximately 15 minutes to complete; physicians without this training cannot use it to confirm GAD. Good reliability and factorial validity of the GAD-7 scale have been reported in pregnant European and American women. In another study, Spanish-speaking pregnant women (n = 385) were recruited and completed the GAD-7 scale at three time points, once per trimester. In the first trimester, the GAD-7 scale demonstrated good internal consistency (α = 0.89). The proposed one-factor structure was supported by an EFA with optimal implementation of a parallel analysis. Another study included 2,978 Peruvian women who attended their first perinatal care visit and underwent GAD-7 scale screening. Therein, the reliability of the GAD-7 scale was good (Cronbach’s alpha coefficient = 0.891). A cutoff score of 7 or higher, maximizing the Youden index, yielded a sensitivity of 73.3% and a specificity of 67.3%. The one-factor structure of the GAD-7 scale was confirmed using EFA and CFA.
The present study confirmed the reliability and validity of the Chinese version of the GAD-7 scale in a sample of early pregnant women. Furthermore, the discrimination power of each item was high, with the items “Not being able to stop or control worrying,” “Worrying too much about different things,” and “Trouble relaxing” showing higher discrimination power than the remaining items.
Findings regarding the factorial structure of the GAD-7 scale have been mixed. Some studies supported a two-factor structure[33,34]. Items 4 (“Trouble relaxing”), 5 (“Being so restless that it is hard to sit still”), and 6 (“Becoming easily annoyed or irritable”) shared unique variance beyond the unidimensional structure, reflecting agitation and irritability mood states and suggesting a somatic tension/autonomic arousal factor. Meanwhile, studies in different populations supported the one-factor model[18,35–37]. In a previous study, 2,740 adult patients completed the GAD-7 scale in 15 primary care clinics in the United States. The factorial validity demonstrated unidimensionality, and the factor loadings ranged from 0.69 to 0.81. In our study, the population included early pregnant women, and both EFA and CFA confirmed the unidimensional structure of the GAD-7 scale.
This study had several strengths, including the large sample size and the execution of a rigorous analytic plan. Our study expands the literature by including an assessment of the Chinese version of the GAD-7 scale in early pregnant women. The GAD-7 scale takes less than 3 minutes to complete and is easy to score. Despite these strengths, this study had several limitations. First, sensitivity, specificity, and cutoff values were not evaluated. However, another study conducted in our hospital (n = 170) suggested that the optimal cutoff score for the GAD-7 scale among pregnant women was 7, which corresponded to the maximum Youden index of 0.53, an AUC value of 0.83, a sensitivity of 96.8%, and a specificity of 56.1%. Second, all participants were in their first trimester of pregnancy. Further validation of the scale in samples from different trimesters of pregnancy is strongly recommended.
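The Youden index mentioned above is simply sensitivity plus specificity minus 1, and the optimal cutoff is the score that maximizes it. The sketch below recomputes the index from the sensitivity and specificity quoted for the earlier hospital study and shows how a cutoff search could look; the helper names and the toy cases are hypothetical and are not data from any study.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1 (proportions, not percent)."""
    return sensitivity + specificity - 1.0


def best_cutoff(scored_cases, cutoffs=range(0, 22)):
    """Return (cutoff, J) maximizing the Youden index.

    scored_cases: iterable of (gad7_total, has_gad) pairs, where has_gad is the
    reference diagnosis. A woman screens positive when her total >= cutoff.
    Ties are resolved in favor of the lowest cutoff.
    """
    cases = list(scored_cases)
    positives = sum(1 for _, has_gad in cases if has_gad)
    negatives = len(cases) - positives
    best = None
    for cutoff in cutoffs:
        true_pos = sum(1 for score, has_gad in cases if has_gad and score >= cutoff)
        true_neg = sum(1 for score, has_gad in cases if not has_gad and score < cutoff)
        j = youden_index(true_pos / positives, true_neg / negatives)
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Sensitivity 96.8% and specificity 56.1%, as quoted in the text.
j_reported = youden_index(0.968, 0.561)

# Toy cases (hypothetical): (GAD-7 total, reference diagnosis of GAD).
toy_cases = [(2, False), (3, False), (6, False), (5, True), (8, True), (12, True)]
toy_cutoff, toy_j = best_cutoff(toy_cases)
```

Rounded to two decimals, the recomputed index agrees with the reported maximum of 0.53.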
In conclusion, our results suggest that the Chinese version of the GAD-7 scale may be used as a screening tool for GAD in early pregnancy. The GAD-7 scale has good reliability and good factorial and concurrent validity. Women who screen positive may require further investigation to confirm a GAD diagnosis. Future studies on the development of new screening tools for GAD with superior psychometric properties, specifically validated in pregnant women in all three trimesters and in postpartum women, are encouraged.
Supplementary information is linked to the online version of the paper on the Reproductive and Developmental Medicine website.
The authors sincerely thank all the investigators and coordinators who contributed to this study: the colleagues in the Obstetrics and Gynecology Hospital of Fudan University and School of Psychology and Cognitive Science, East China Normal University for data collection and cooperation.
L.G. and X.L.X. designed the study and analyzed the data. W.H., Y.S. and Y.N. collected the data. L.G. acquired funding for the project. X.L.X., L.G. and S.L. drafted the manuscript. X.X. and J.L. reviewed and edited the manuscript.
This work was supported by the National Natural Science Foundation of China (82101786) and China Medical Board (21-428). The funders had no role in the study design, data collection or analysis, decision to publish, or manuscript preparation.
Conflicts of interest
All authors declare no conflict of interest.
References
1. Kessler RC, Chiu WT, Demler O, et al. Prevalence, severity, and comorbidity of 12-month DSM-IV disorders in the National Comorbidity Survey Replication. Arch Gen Psychiatry. 2005;62(6):617–627. doi:10.1001/archpsyc.62.6.617.
2. Fairbrother N, Young AH, Janssen P, et al. Depression and anxiety during the perinatal period. BMC Psychiatry. 2015;15:206. doi:10.1186/s12888-015-0526-6.
3. Soto-Balbuena C, Rodríguez MF, Escudero Gomis AI, et al. Incidence, prevalence and risk factors related to anxiety symptoms during pregnancy. Psicothema. 2018;30(3):257–263. doi:10.7334/psicothema2017.379.
4. Sinesi A, Maxwell M, O’Carroll R, et al. Anxiety scales used in pregnancy: systematic review. BJPsych Open. 2019;5(1):e5. doi:10.1192/bjo.2018.75.
5. Talge NM, Neal C, Glover V, et al. Antenatal maternal stress and long-term effects on child neurodevelopment: how and why? J Child Psychol Psychiatry. 2007;48(3-4):245–261. doi:10.1111/j.1469-7610.2006.01714.x.
6. Ding XX, Wu YL, Xu SJ, et al. Maternal anxiety during pregnancy and adverse birth outcomes: a systematic review and meta-analysis of prospective cohort studies. J Affect Disord. 2014;159:103–110. doi:10.1016/j.jad.2014.02.027.
7. Stein A, Pearson RM, Goodman SH, et al. Effects of perinatal mental disorders on the fetus and child. Lancet. 2014;384(9956):1800–1819. doi:10.1016/S0140-6736(14)61277-0.
8. Chen XN, Hu Y, Hu XW, et al. Risk of adverse perinatal outcomes and antenatal depression based on the Zung self-rating depression scale. Reprod Dev Med. 2021;5(1):23–29. doi:10.4103/2096-2924.313683.
9. Farré-Sender B, Torres A, Gelabert E, et al. Mother-infant bonding in the postpartum period: assessment of the impact of pre-delivery factors in a clinical sample. Arch Womens Ment Health. 2018;21(3):287–297. doi:10.1007/s00737-017-0785-y.
10. Grant KA, McMahon C, Austin MP. Maternal anxiety during the transition to parenthood: a prospective study. J Affect Disord. 2008;108(1-2):101–111. doi:10.1016/j.jad.2007.10.002.
11. Milgrom J, Gemmill AW, Bilszta JL, et al. Antenatal risk factors for postnatal depression: a large prospective study. J Affect Disord. 2008;108(1-2):147–157. doi:10.1016/j.jad.2007.10.014.
12. Verreault N, Da Costa D, Marchand A, et al. Rates and risk factors associated with depressive symptoms during pregnancy and with postpartum onset. J Psychosom Obstet Gynaecol. 2014;35(3):84–91. doi:10.3109/0167482X.2014.947953.
13. American College of Obstetricians and Gynecologists. ACOG committee opinion no. 757: screening for perinatal depression. Obstet Gynecol. 2018;132(5):e208–e212. doi:10.1097/AOG.0000000000002927.
14. Weisberg RB, Paquette JA. Screening and treatment of anxiety disorders in pregnant and lactating women. Womens Health Issues. 2002;12(1):32–36. doi:10.1016/s1049-3867(01)00140-2.
15. Simpson W, Glazer M, Michalski N, et al. Comparative efficacy of the generalized anxiety disorder 7-item scale and the Edinburgh Postnatal Depression Scale as screening tools for generalized anxiety disorder in pregnancy and the postpartum period. Can J Psychiatry. 2014;59(8):434–440. doi:10.1177/070674371405900806.
16. Lieb R, Becker E, Altamura C. The epidemiology of generalized anxiety disorder in Europe. Eur Neuropsychopharmacol. 2005;15(4):445–452. doi:10.1016/j.euroneuro.2005.04.010.
17. Spitzer RL, Kroenke K, Williams JB, et al. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. 2006;166(10):1092–1097. doi:10.1001/archinte.166.10.1092.
18. Löwe B, Decker O, Müller S, et al. Validation and standardization of the Generalized Anxiety Disorder Screener (GAD-7) in the general population. Med Care. 2008;46(3):266–274. doi:10.1097/MLR.0b013e318160d093.
19. National Collaborating Centre for Mental Health. Antenatal and Postnatal Mental Health: Clinical Management and Service Guidance: Updated Edition. Leicester, UK: British Psychological Society; 2014.
20. Xia NG, Lin JH, Ding SQ, et al. Reliability and validity of the Chinese version of the Patient Health Questionnaire 9 (C-PHQ-9) in patients with epilepsy. Epilepsy Behav. 2019;95:65–69. doi:10.1016/j.yebeh.2019.03.049.
21. Gierk B, Kohlmann S, Kroenke K, et al. The somatic symptom scale-8 (SSS-8): a brief measure of somatic symptom burden. JAMA Intern Med. 2014;174(3):399–407. doi:10.1001/jamainternmed.2013.12179.
22. Walters GD, Hagman BT, Cohn AM. Toward a hierarchical model of criminal thinking: evidence from item response theory and confirmatory factor analysis. Psychol Assess. 2011;23(4):925–936. doi:10.1037/a0024017.
23. Baker FB, Kim SH. Item Response Theory: Parameter Estimation Techniques. 2nd ed. Florida, USA: CRC Press; 2004.
24. Mungas D, Reed BR, Kramer JH. Psychometrically matched measures of global cognition, memory, and executive function for assessment of cognitive decline in older persons. Neuropsychology. 2003;17(3):380–392. doi:10.1037/0894-4105.17.3.380.
25. O’Connor BP. SPSS and SAS programs for determining the number of components using parallel analysis and Velicer’s MAP test. Behav Res Methods Instrum Comput. 2000;32(3):396–402. doi:10.3758/bf03200807.
26. Barrett P. Structural equation modelling: adjudging model fit. Pers Individ Dif. 2007;42(5):815–824. doi:10.1016/J.PAID.2006.09.018.
27. Moret L, Mesbah M, Chwalow J, et al. Internal validation of a measurement scale: relation between principal component analysis, Cronbach’s alpha coefficient and intra-class correlation coefficient. Rev Epidemiol Sante Publique. 1993;41(2):179–186.
28. Zhang Z, Zhai A, Yang M, et al. Prevalence of depression and anxiety symptoms of high school students in Shandong province during the COVID-19 epidemic. Front Psychiatry. 2020;11:570096. doi:10.3389/fpsyt.2020.570096.
29. Meades R, Ayers S. Anxiety measures validated in perinatal populations: a systematic review. J Affect Disord. 2011;133(1-2):1–15. doi:10.1016/j.jad.2010.10.009.
30. Vasiliadis HM, Chudzinski V, Gontijo-Guerra S, et al. Screening instruments for a population of older adults: the 10-item Kessler Psychological Distress Scale (K10) and the 7-item Generalized Anxiety Disorder Scale (GAD-7). Psychiatry Res. 2015;228(1):89–94. doi:10.1016/j.psychres.2015.04.019.
31. Soto-Balbuena C, Rodríguez-Muñoz MF, Le HN. Validation of the Generalized Anxiety Disorder Screener (GAD-7) in Spanish pregnant women. Psicothema. 2021;33(1):164–170. doi:10.7334/psicothema2020.167.
32. Zhong QY, Gelaye B, Zaslavsky AM, et al. Diagnostic validity of the Generalized Anxiety Disorder-7 (GAD-7) among pregnant women. PLoS One. 2015;10(4):e0125096. doi:10.1371/journal.pone.0125096.
33. Kertz S, Bigda-Peyton J, Bjorgvinsson T. Validity of the Generalized Anxiety Disorder-7 scale in an acute psychiatric sample. Clin Psychol Psychother. 2013;20(5):456–464. doi:10.1002/cpp.1802.
34. Beard C, Björgvinsson T. Beyond generalized anxiety disorder: psychometric properties of the GAD-7 in a heterogeneous psychiatric sample. J Anxiety Disord. 2014;28(6):547–552. doi:10.1016/j.janxdis.2014.06.002.
35. Donker T, van Straten A, Marks I, et al. Quick and easy self-rating of Generalized Anxiety Disorder: validity of the Dutch web-based GAD-7, GAD-2 and GAD-SI. Psychiatry Res. 2011;188(1):58–64. doi:10.1016/j.psychres.2011.01.016.
36. Bártolo A, Monteiro S, Pereira A. Factor structure and construct validity of the Generalized Anxiety Disorder 7-item (GAD-7) among Portuguese college students. Cad Saude Publica. 2017;33(9):e00212716. doi:10.1590/0102-311X00212716.
37. Maroufizadeh S, Omani-Samani R, Almasi-Hashiani A, et al. The reliability and validity of the Patient Health Questionnaire-9 (PHQ-9) and PHQ-2 in patients with infertility. Reprod Health. 2019;16(1):137. doi:10.1186/s12978-019-0802-x.
38. Gong Y, Zhou H, Zhang Y, et al. Validation of the 7-item Generalized Anxiety Disorder scale (GAD-7) as a screening tool for anxiety among pregnant Chinese women. J Affect Disord. 2021;282:98–103. doi:10.1016/j.jad.2020.12.129.