Methods: Brief Report

Reducing Socially Desirable Responses in Epidemiologic Surveys: An Extension of the Randomized-response Technique

Moshagen, Morten; Musch, Jochen; Ostapczuk, Martin; Zhao, Zengmei

doi: 10.1097/EDE.0b013e3181d61dbc

Abstract

Poor dental hygiene is a risk factor for dental diseases.1,2 In a survey of dental hygiene in China, 31% of 20- to 29-year-olds admitted to brushing their teeth less than twice a day.3 The validity of these estimates may be questioned, however, because self-reported hygiene practices are likely to be distorted by socially desirable responding.4–6 Validation studies comparing self-report data against gold standard measures have repeatedly shown over-reporting of desirable behaviors such as physical activity7 and under-reporting of undesirable behaviors such as drug use,8,9 energy intake,10,11 and sexual risk behavior.12 Thus, there is reason to suspect that these reported prevalence estimates of dental hygiene habits may have been inflated by social desirability bias.

RANDOMIZED-RESPONSE TECHNIQUE

The randomized-response technique13 was developed to overcome this response bias by increasing the confidentiality of responses. The basic idea is to add random noise to the responses such that there is no direct link between a participant's response and his or her true status.14 In the forced-response variant15 of this technique, a randomization device (with known probability distribution) is used to determine whether participants are asked to respond truthfully or whether they are prompted to provide a prespecified response regardless of their true status. This procedure guarantees that affirmative responses are no longer unequivocally linked to a socially undesirable attribute and therefore no longer stigmatizing for the participants. Consequently, the randomized-response technique encourages more honest responses and, in turn, may provide more valid prevalence estimates of sensitive issues, such as drug use16–18 and sexual behavior.19,20

Despite its successful applications,21,22 the randomized-response technique has been criticized as being susceptible to respondents who are not answering as directed by the randomization device.23 The randomized-response technique underestimates the prevalence of sensitive behaviors to the extent that participants fail to comply with the instructions by denying a sensitive attribute even when prompted to admit to it by the randomization device.

Addressing this issue, Clark and Desharnais24 proposed a cheating-detection modification of the randomized-response technique that explicitly allows for respondents who fail to comply with the instructions. The modification (Fig.) divides the population into 3 disjoint groups. The first group (π) consists of compliant respondents who honestly admit being carriers of the sensitive attribute. The second group (β) consists of compliant respondents who truthfully deny the sensitive attribute. The third group (γ = 1 − π − β) consists of noncompliant cheaters who do not conform to the instructions and deny the sensitive attribute irrespective of the randomization process. By symmetry, there may also be respondents who are not carriers of the sensitive attribute but claim it. However, we expect that such self-incriminating behavior is rare, and we therefore ignore it in the model.24

FIGURE. A multinomial representation of the cheating detection variant of the randomized response technique. To make the model identifiable, 2 independent random samples are questioned with different probabilities of being prompted to reply “no” (P1 and P2).
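The tree structure described above can be written out as a small system of equations. The following is a sketch under our reading of the model, not a reproduction of the original derivation: λ_i is a symbol introduced here for the probability of a "no" response (the response admitting the sensitive attribute) in sample i, and p_i stands for the prompting probabilities P1 and P2. Compliant carriers always answer "no", compliant noncarriers answer "no" only when prompted, and noncompliant respondents never answer "no".

```latex
% lambda_i: probability of a "no" response in sample i (symbol assumed here)
% p_i:      probability of being prompted to reply "no" (P1, P2 in the text)
\begin{align}
\lambda_i &= \pi + \beta\,p_i, \qquad i = 1, 2, \\
\hat{\beta} &= \frac{\hat{\lambda}_2 - \hat{\lambda}_1}{p_2 - p_1}, \\
\hat{\pi} &= \frac{p_2\,\hat{\lambda}_1 - p_1\,\hat{\lambda}_2}{p_2 - p_1}, \\
\hat{\gamma} &= 1 - \hat{\pi} - \hat{\beta}.
\end{align}
```

With a single sample there is one equation in two unknowns, which is why two samples with p_1 ≠ p_2 are needed for identifiability.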

It is important to note that nothing is assumed regarding the true status of noncompliant respondents. It is conceivable that these respondents deny a sensitive behavior in which they have been engaged, but it is also possible that innocent respondents want to rule out even the slightest suspicion and therefore deny that they committed an undesirable act despite being told otherwise by the randomization procedure. The estimated proportion of cheaters can be used to compute an upper bound in a worst-case scenario, which assumes that all noncompliant respondents are, in fact, carriers of the sensitive attribute.25,26

To explore the magnitude of response bias in self-reported hygiene habits, the cheating-detection modification was employed to investigate teeth-brushing behavior among Chinese college students. In addition, the modification was compared with an anonymous self-report measure to estimate how much response bias can be reduced by this method.

METHODS

Participants

A total of 2254 undergraduates (55% women; aged 18–24 years) from the University of Beijing, China, volunteered to participate in this study. Students completed the questionnaire during their regular classes.

Measures and Procedures

The participants completed an anonymous questionnaire comprising demographic information, several questions not pertinent to this study, and the sensitive question: “Do you brush your teeth at least twice a day?” The participants were randomly assigned to 1 of 3 conditions. Two conditions with different probabilities of being prompted to reply “no” (P1 and P2) are required to make the cheating-detection modification identifiable.24 The participants' month of birth was used as the randomization device to keep the randomization procedure simple and transparent. In the low probability condition (P1: n = 900; 56% women), participants born in January or February were instructed to reply “no” independently of their true behavior, whereas participants born in another month were prompted to reply truthfully. In the high probability condition (P2: n = 891; 54% women), participants born in January or February were asked to respond truthfully, whereas the remaining participants were prompted to reply “no.” According to birth statistics provided by the National Bureau of Statistics of China, the randomization probabilities P1 and P2 approximated 0.17 and 0.83, respectively. In the direct questioning condition (n = 463, 54% women), participants were simply asked to respond truthfully.
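The response process in the two randomized-response conditions can be illustrated with a small simulation. This is a minimal sketch, assuming the three-group model described earlier; the function name and the parameter values in the usage note are hypothetical and are not the study's data or estimates.

```python
import random


def simulate_no_fraction(n, p_forced_no, pi, beta, seed=0):
    """Simulate one forced-response sample and return the fraction of "no" answers.

    pi:   share of compliant respondents who carry the sensitive attribute
    beta: share of compliant respondents who do not carry it
    The remaining share (1 - pi - beta) never answers "no", whatever the prompt.
    All values passed in are illustrative assumptions.
    """
    rng = random.Random(seed)
    no_count = 0
    for _ in range(n):
        u = rng.random()
        # The randomization device (month of birth in the study) decides the prompt.
        prompted_no = rng.random() < p_forced_no
        if u < pi:
            answer_no = True            # compliant carrier: "no" either way
        elif u < pi + beta:
            answer_no = prompted_no     # compliant noncarrier: "no" only if prompted
        else:
            answer_no = False           # noncompliant: always "yes"
        no_count += answer_no
    return no_count / n
```

For example, `simulate_no_fraction(100000, 0.17, 0.30, 0.60)` should return a value close to 0.30 + 0.60 × 0.17, the expected proportion of "no" responses under these assumed parameters.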

Statistical Analysis

Closed-form solutions24 for parameter estimation in the cheating-detection modification do not allow a statistical comparison of subgroups. We, therefore, conducted our analysis within the more general framework of multinomial models.27–29 By converting the nonbinary tree model into a statistically equivalent binary tree representation (for details, see Ostapczuk et al25), established statistical procedures of multinomial modeling can be used to estimate the parameters and to test restrictions on them. Parameter estimates were obtained by minimizing the asymptotically χ²-distributed log-likelihood ratio statistic G² using the EM-algorithm.27,30
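For orientation, the closed-form route mentioned above can be sketched in a few lines. This is not the multinomial EM procedure used in this study; it is a moment-based sketch under the assumption that compliant carriers always answer "no", compliant noncarriers answer "no" only when prompted, and noncompliant respondents never answer "no". The function name, argument names, and the delta-method standard errors are illustrative choices made here.

```python
import math


def cheating_detection_estimates(no1, n1, no2, n2, p1, p2):
    """Closed-form estimates of pi, beta, gamma from two samples.

    no1, no2: observed counts of "no" responses in samples 1 and 2
    n1, n2:   sample sizes
    p1, p2:   probabilities of being prompted to reply "no" (must differ)
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ for the model to be identifiable")
    lam1, lam2 = no1 / n1, no2 / n2
    # Binomial sampling variances of the observed "no" proportions.
    v1 = lam1 * (1 - lam1) / n1
    v2 = lam2 * (1 - lam2) / n2
    d = p2 - p1
    pi_hat = (p2 * lam1 - p1 * lam2) / d
    beta_hat = (lam2 - lam1) / d
    gamma_hat = 1 - pi_hat - beta_hat
    # Each estimator is linear in lam1 and lam2, so its variance follows directly.
    se_pi = math.sqrt(p2 ** 2 * v1 + p1 ** 2 * v2) / abs(d)
    se_beta = math.sqrt(v1 + v2) / abs(d)
    se_gamma = math.sqrt((1 - p2) ** 2 * v1 + (1 - p1) ** 2 * v2) / abs(d)
    return {
        "pi": (pi_hat, se_pi),
        "beta": (beta_hat, se_beta),
        "gamma": (gamma_hat, se_gamma),
    }
```

Unlike a constrained multinomial fit, these unconstrained estimates can fall outside the interval from 0 to 1 in small samples, and they offer no direct way to test parameter restrictions across subgroups.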

RESULTS

There were sizeable differences in the proportion of men (35%; SE = 3.3) and women (10%; SE = 1.9) admitting insufficient teeth brushing behavior with direct questioning. The cheating-detection modification was therefore estimated separately by sex (Table).

TABLE. Observed (Direct Questioning) and Estimated (Randomized Response) Percentages of “Yes” and “No” Responses to the Question, “Do You Brush Your Teeth at Least Twice a Day?”

Using the cheating-detection modification, πm = 51% (SE = 3.2) of men and πf = 20% (SE = 2.7) of women reported insufficient dental hygiene. The estimates were considerably higher than the estimates with direct questioning for both men and women, indicating substantial under-reporting with direct questioning. Moreover, a substantial proportion of noncompliance with the instructions was observed for both men (γm = 10.1%; SE = 2.4) and women (γf = 13.0%; SE = 2.5).

Depending on whether noncompliant respondents were considered to have engaged in insufficient teeth brushing, the lower-bound estimate for the proportion admitting to insufficient teeth brushing was πm = 51% for men and πf = 20% for women; the respective upper-bound estimate was πm + γm = 62% for men and πf + γf = 32% for women.

DISCUSSION

Survey data may reflect what respondents want to tell the investigator, rather than their actual behavior. We used a cheating-detection modification of the randomized-response technique to improve the validity of response data on dental hygiene habits in a sample of Chinese college students. Consistent with previous studies, only 35% of men and 10% of women reported insufficient dental hygiene habits when questioned directly. When the cheating-detection modification was employed, however, the proportions increased considerably for men, and almost doubled for women. Assuming that all noncompliant respondents in fact brushed their teeth less than twice a day, the upper-bound prevalence estimate of insufficient dental hygiene habits in the present sample was 62% for men and 32% for women. Prevalence estimates of dental hygiene habits may also be positively biased in other populations. More generally, direct questioning may provide strongly distorted prevalence estimates in surveys of socially undesirable behavior. The same is also true, however, for traditional variants of the randomized-response technique not capable of detecting cheating, because the prevalence of a sensitive attribute is underestimated to the extent there is noncompliance with instructions.

Several limitations should be considered. First, randomized-response models introduce random error and induce greater sampling variance. The randomized-response technique, therefore, requires considerably larger samples than a direct question. This loss of efficiency is outweighed by a gain in validity only when the attribute under investigation is sufficiently sensitive. Second, the randomized-response technique is more complicated to administer because the respondents have to understand how the randomized-response technique protects their privacy.31 Although the randomized-response technique has been successfully used with older and less educated respondents, noncompliance rates tend to increase in such populations.14,26,32,33 Finally, as the true status of any individual remains unknown, it is difficult to compute measures of association between a randomized-response variable and other variables of interest.34–36 Such limitations notwithstanding, the cheating-detection modification provides a means to improve prevalence estimates of sensitive behaviors, and may be useful in epidemiologic surveys of sensitive behaviors.
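The efficiency loss named as the first limitation can be made concrete with a rough calculation. The sketch below is an illustration only: it assumes a simple forced-response design without cheating, in which a respondent is prompted to give the admitting response with probability p and answers truthfully otherwise. The function names and the example values are hypothetical.

```python
def direct_variance(pi, n):
    """Sampling variance of a prevalence estimate from n honest direct answers."""
    return pi * (1 - pi) / n


def forced_response_variance(pi, p_forced, n):
    """Sampling variance of the forced-response prevalence estimate.

    The observed admitting proportion is lam = p_forced + (1 - p_forced) * pi,
    and the prevalence estimate is (lam_hat - p_forced) / (1 - p_forced).
    """
    lam = p_forced + (1 - p_forced) * pi
    return lam * (1 - lam) / (n * (1 - p_forced) ** 2)


def sample_size_inflation(pi, p_forced):
    """Factor by which n must grow to match the precision of direct questioning."""
    return forced_response_variance(pi, p_forced, 1) / direct_variance(pi, 1)
```

Calling `sample_size_inflation(0.30, 0.17)` gives the factor by which the sample would have to grow under these assumed values. The factor rises with the prompting probability, which is the price paid for stronger privacy protection.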

REFERENCES

1. Pihlstrom BL, Michalowicz BS, Johnson NW. Periodontal diseases. Lancet. 2005;366:1809–1820.
2. Bader JD, Shugars DA, Bonito AJ. A systematic review of selected caries prevention and management methods. Community Dent Oral Epidemiol. 2001;29:399–411.
3. Peng B, Petersen PE, Tai BJ, Yuan BY, Fan MW. Changes in oral health knowledge and behaviour 1987–95 among inhabitants of Wuhan City, PR China. Int Dent J. 1997;47:142–147.
4. Tourangeau R, Yan T. Sensitive questions in surveys. Psychol Bull. 2007;133:859–883.
5. Manun'Ebo M, Cousens S, Haggerty P, Kalengaie M, Ashworth A, Kirkwood B. Measuring hygiene practices: a comparison of questionnaires with direct observations in rural Zaire. Trop Med Int Health. 1997;2:1015–1021.
6. Curtis V, Cousens S, Mertens T, Traore E, Kanki B, Diallo I. Structured observations of hygiene behaviours in Burkina Faso: validity, variability, and utility. Bull World Health Organ. 1993;71:23–32.
7. Adams SA, Matthews CE, Ebbelin CB, et al. The effect of social desirability and social approval on self-reports on physical activity. Am J Epidemiol. 2005;161:389–398.
8. Colon HM, Robles RR, Sahai H. The validity of drug use responses in a household survey in Puerto Rico: comparison of survey responses of cocaine and heroin use with hair tests. Int J Epidemiol. 2001;30:1042–1049.
9. Johnson T, Fendrich M. Modeling sources of self-report bias in a survey of drug use epidemiology. Ann Epidemiol. 2005;15:381–389.
10. Hebert JR, Ma Y, Clemow L, et al. Gender differences in social desirability and social approval bias in dietary self-report. Am J Epidemiol. 1997;146:1046–1055.
11. Subar AF, Kipnis V, Troiano RP, et al. Using intake biomarkers to evaluate the extent of dietary misreporting in a large sample of adults: The OPEN study. Am J Epidemiol. 2003;158:1–13.
12. Fennema JSA, van Ameijden EJC, Coutinho RA, van den Hoek JAR. Validity of self-reported sexually transmitted diseases in a cohort of drug-using prostitutes in Amsterdam: Trends from 1986 to 1992. Int J Epidemiol. 1995;24:1034–1041.
13. Warner S. Randomized-response: A survey technique for eliminating evasive answer bias. J Am Stat Assoc. 1965;60:63–69.
14. van der Heijden PGM, van Gils G, Bouts J, Hox JJ. A comparison of randomized response, computer-assisted self-interview, and face-to-face direct questioning: Eliciting sensitive information in the context of welfare and unemployment benefit. Sociol Methods Res. 2000;28:505–537.
15. Greenberg B, Abul-Ela A, Simmons W, Horvitz D. Unrelated question randomized response model: Theoretical framework. J Am Stat Assoc. 1969;64:520–539.
16. Fisher M, Kupferman LB, Lesser M. Substance use in a school-based clinic population: Use of the randomized response technique to estimate prevalence. J Adolesc Health. 1992;13:281–285.
17. Simon P, Striegel H, Aust F, Dietz K, Ulrich R. Doping in fitness sports: Estimated number of unreported cases and individual probability of doping. Addiction. 2006;101:1640–1644.
18. Weissman AN, Steer RA, Lipton DS. Estimating illicit drug use through telephone interviews and the randomized response technique. Drug Alcohol Depend. 1986;18:225–233.
19. Finkelhor D, Lewis IA. An epidemiologic approach to the study of child molestation. Ann N Y Acad Sci. 1988;528:64–78.
20. Zimmerman RS, Langer LM. Improving estimates of prevalence rates of sensitive behaviors: The randomized lists technique and consideration of self-reported honesty. J Sex Res. 1995;32:107–117.
21. Fox JA, Tracy PE. Randomized Response: A Method for Sensitive Surveys. Beverly Hills, CA: Sage; 1986.
22. Lensvelt-Mulders GJLM, Hox JJ, van der Heijden PGM, Maas CJM. Meta-analysis of randomized response research: Thirty-five years of validation. Sociol Methods Res. 2005;33:319–348.
23. Campbell A. Randomized response technique. Science. 1987;236:1049.
24. Clark SJ, Desharnais RA. Honest answers to embarrassing questions: Detecting cheating in the randomized response model. Psychol Methods. 1998;3:160–168.
25. Ostapczuk M, Moshagen M, Zhao Z, Musch J. Assessing sensitive attributes using the randomized-response-technique: Evidence for the importance of response symmetry. J Educ Behav Stat. 2009;34:267–287.
26. Ostapczuk M, Musch J, Moshagen M. A randomized-response investigation of the education effect in attitudes towards foreigners. Eur J Soc Psychol. 2009;39:920–931.
27. Hu X, Batchelder WH. The statistical analysis of general processing tree models with the EM algorithm. Psychometrika. 1994;59:21–47.
28. Batchelder WH, Riefer DM. Theoretical and empirical review of multinomial process tree modeling. Psychon Bull Rev. 1999;6:57–86.
29. Erdfelder E, Hilbig BE, Auer T, et al. Multinomial processing tree models: A review of the literature. Z Psychol J Psychol. 2009;217:108–124.
30. Moshagen M. multiTree: A computer program for the analysis of multinomial processing tree models. Behav Res Methods. 2010;42:42–54.
31. Landsheer JA, Van Der Heijden P, Van Gils G. Trust and understanding, two psychological aspects of randomized response. Qual Quant. 1999;33:1–12.
32. Böckenholt U, Van der Heijden PG. Item randomized-response models for measuring noncompliance: Risk-return perceptions, social influences, and self-protective responses. Psychometrika. 2007;72:245–262.
33. Chow LP, Gruhn W, Chang WP. Feasibility of the randomized response technique in rural Ethiopia. Am J Public Health. 1979;69:273–276.
34. Maddala GS. Limited Dependent and Qualitative Variables in Econometrics. Cambridge: Cambridge University Press; 1983.
35. Rittenhouse BE. Respondent-specific information from the randomized response interview: Compliance assessment. J Clin Epidemiol. 1996;49:545–549.
36. van den Hout A, van der Heijden P, Gilchrist R. The logistic regression model with response variables subject to randomized response. Comput Stat Data Anal. 2007;51:6060–6069.
© 2010 Lippincott Williams & Wilkins, Inc.