Background. A number of self-administered questionnaires are available for assessing depression severity, including the 9-item Patient Health Questionnaire depression module (PHQ-9). Because even briefer measures might be desirable for use in busy clinical settings or as part of comprehensive health questionnaires, we evaluated a 2-item version of the PHQ depression module, the PHQ-2.
Methods. The PHQ-2 inquires about the frequency of depressed mood and anhedonia over the past 2 weeks, scoring each as 0 (“not at all”) to 3 (“nearly every day”). The PHQ-2 was completed by 6000 patients in 8 primary care clinics and 7 obstetrics–gynecology clinics. Construct validity was assessed using the 20-item Short-Form General Health Survey, self-reported sick days and clinic visits, and symptom-related difficulty. Criterion validity was assessed against an independent structured mental health professional (MHP) interview in a sample of 580 patients.
Results. As PHQ-2 depression severity increased from 0 to 6, there was a substantial decrease in functional status on all 6 SF-20 subscales. Also, symptom-related difficulty, sick days, and healthcare utilization increased. Using the MHP reinterview as the criterion standard, a PHQ-2 score ≥3 had a sensitivity of 83% and a specificity of 92% for major depression. Likelihood ratio and receiver operating characteristic analysis identified a PHQ-2 score of 3 as the optimal cutpoint for screening purposes. Results were similar in the primary care and obstetrics–gynecology samples.
Conclusion. The construct and criterion validity of the PHQ-2 make it an attractive measure for depression screening.
Depression is a prevalent and disabling condition in the general medical setting. Although many patients with depression receive care exclusively in the primary care rather than the mental health sector, up to half of depression cases in primary care go unrecognized. 1,2 The U.S. Preventive Services Task Force recently concluded that there is sufficient evidence to recommend periodic screening for depression. 1 Numerous well-validated questionnaires are available for depression screening and are similar in terms of their operating characteristics as case-finding instruments. 2–4 One particularly popular instrument is the 9-item depression module of the Patient Health Questionnaire, the PHQ-9. Validated in 6000 patients, the PHQ-9 serves as both a depression severity measure and a diagnostic instrument for the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV), depressive disorders. 5
However, even shorter measures could be desirable in some circumstances. First, the busy nature and competing demands of primary care practice make efficiency a particularly important attribute of any new measure. 6–8 Second, depression is only one of the many disorders for which screening in primary care is encouraged. Thus, brief depression measures could be incorporated as part of comprehensive health questionnaires administered to either patients new to a practice or established patients on a periodic basis. Keeping the number of items asked about any single disorder to a minimum helps maintain a reasonable length for such questionnaires. The same might be true for research studies in which depression is a secondary rather than primary variable, where asking a few rather than many items can reduce respondent burden.
Therefore, we examined the operating characteristics of 2 items from the PHQ-9, depressed mood and anhedonia, which we call the PHQ-2. Previous research has shown that a single question about depressed mood has a sensitivity of 85% to 90% for major depression, 3,9 and adding a second question about anhedonia increases the sensitivity to 95%. 3 The PHQ-2 asks respondents to estimate the frequency of these 2 symptoms over the past 2 weeks with 4 response options ranging from “not at all” to “nearly every day.” Data are analyzed from the 2 major PHQ studies involving 3000 patients in 8 primary care clinics and 3000 patients in 7 obstetrics–gynecology clinics. 10,11 Our aims were to assess the criterion and construct validity of the PHQ-2.
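The scoring just described can be illustrated with a minimal sketch. It assumes each item is scored 0 to 3 and the two item scores are summed to a total of 0 to 6, with a total of 3 or more treated as a positive screen, as reported in the abstract. The function and constant names are hypothetical, and the two intermediate response labels are the standard PHQ wording rather than something stated in this section.

```python
from typing import Dict

# Response options scored 0-3. The end labels ("not at all", "nearly every day")
# come from the text; the two middle labels are assumed standard PHQ wording.
RESPONSE_SCORES: Dict[str, int] = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}

# Screening cutpoint reported in the abstract (score >= 3).
SCREENING_CUTPOINT = 3


def phq2_score(depressed_mood: str, anhedonia: str) -> int:
    """Sum the two item scores; the total ranges from 0 to 6."""
    return RESPONSE_SCORES[depressed_mood] + RESPONSE_SCORES[anhedonia]


def screens_positive(score: int, cutpoint: int = SCREENING_CUTPOINT) -> bool:
    """Return True when the total meets or exceeds the cutpoint."""
    if not 0 <= score <= 6:
        raise ValueError("PHQ-2 total must be between 0 and 6")
    return score >= cutpoint


if __name__ == "__main__":
    total = phq2_score("several days", "more than half the days")
    print(total, screens_positive(total))
```

Keeping the cutpoint as a parameter rather than a hard-coded comparison makes it straightforward to examine other thresholds, which is how a cutpoint analysis would typically proceed.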