Surveys are an important part of health professions education (HPE) research,1 yet the HPE literature provides little information about their prevalence or response outcomes. Several studies have evaluated attending physician response rates and strategies (e.g., financial incentives) associated with improving response rates among this population,2–4 but no studies to our knowledge describe health professions trainees’ response rates or strategies to improve them.
Health professions trainees represent a unique demographic. Medical students in the United States are disproportionately Caucasian and are more educated but have lower incomes compared with their contemporaries nationally.5,6 Additionally, they are subject to the hierarchy of health professions (as subordinates in a culture that gives extensive deference to superiors),7 which may affect how they interact with surveys. Characterizing participation in surveys and understanding effective survey methods among specific populations are essential for improving survey quality8 and the quality of research based on those surveys. Likewise, characterizing nonrespondents and nonresponse bias among health professions trainees is essential to ensure that surveys accurately reflect the population of interest. Much of this information about survey-based research (nonresponse bias, response rates) is contained in paradata, or information about the survey methods, such as how the survey was delivered, whether potential respondents received prenotification, or whether the researchers offered incentives.
Because we could find no published information about how health professions trainees generally respond to surveys, we sought primarily to establish a baseline mean response rate reported in medical education journals. Our secondary end points were to evaluate (1) strategies associated with improved response rates and (2) the prevalence of nonresponse bias.
We searched all articles published in 2013 in Academic Medicine (12 issues), Medical Education (12 issues), and Advances in Health Sciences Education (5 issues) for original research articles that used any type of survey to study health professions trainees. We chose these journals because they were the three highest-impact-factor journals for general medical education in 2013.9
Definitions and survey characteristics
We defined “research articles” as full reports that presented original data, literature reviews, or meta-analyses. We explicitly excluded letters to the editor, commentaries, perspectives, conference abstracts, and journal announcements (e.g., editor explanations about a theme issue). We also excluded “Really Good Stuff” in Medical Education because this section features, by design, significantly abridged articles that did not provide all of the data necessary for our analyses.
We broadly defined “survey” as any research method that had predefined questions for respondents.1,10 We categorized a study as a “survey-based study” for purposes of analysis if it included original survey data, regardless of whether the investigator also used other methods (e.g., pre- and posttest exams) in the study. We designated studies that did not include any type of survey as “nonsurvey original research” and excluded them from the analysis.
Finally, we defined “health professions trainees” as anyone pursuing a terminal degree, preparatory degree, or technical training in health professions, including premedical students, medical students, residents, fellows, dentistry students, physical therapy students, nursing students, and students in any other allied health field.
We differentiated the method by which potential respondents were notified of the survey’s existence (“contact method”) and the method by which data were collected (“survey method”). Additionally, we defined “mixed methods” for both categories as a study using more than one contact or survey method for any reason.
We evaluated each article for how response rate was calculated, either by direct reference to the American Association of Public Opinion Research (AAPOR) definitions11 or by descriptive explanation. We did not contact authors for clarification. We attempted to interpret information presented in the methods sections of reports to categorize response rate definitions into the AAPOR definitions; however, we applied an AAPOR definition to a survey only if the report provided all necessary components. (The AAPOR definitions are the international gold standard for response rate definitions in the general survey and medical survey communities.8,12)
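To illustrate why all components must be reported before an AAPOR definition can be applied, the AAPOR response rates are simple ratios over final case dispositions. The following sketch (with hypothetical disposition counts, not data from any study we reviewed) shows the two most commonly cited definitions, RR1 and RR2:

```python
# Sketch of AAPOR response rate definitions RR1 and RR2.
# Disposition counts are hypothetical, for illustration only:
#   I  = complete responses        P  = partial responses
#   R  = refusals/break-offs       NC = noncontacts
#   O  = other eligible nonresponse
#   UH = unknown eligibility, household/unit
#   UO = unknown eligibility, other

def aapor_rr1(I, P, R, NC, O, UH, UO):
    """RR1: completes only in the numerator (most conservative)."""
    return I / ((I + P) + (R + NC + O) + (UH + UO))

def aapor_rr2(I, P, R, NC, O, UH, UO):
    """RR2: completes plus partials in the numerator."""
    return (I + P) / ((I + P) + (R + NC + O) + (UH + UO))

# Example: 100 completes, 10 partials, 20 refusals, 15 noncontacts,
# 5 other eligible nonrespondents, no unknown-eligibility cases.
counts = dict(I=100, P=10, R=20, NC=15, O=5, UH=0, UO=0)
print(round(aapor_rr1(**counts), 3))  # 0.667
print(round(aapor_rr2(**counts), 3))  # 0.733
```

A report that gives only the number of completes and the number invited, without distinguishing partials, refusals, and unknown-eligibility cases, cannot be mapped onto either definition, which is why we applied an AAPOR definition only when every component was available.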
For each report, we evaluated strategies known to be associated with increased response rates in other fields.10,13 With regard specifically to incentives, we defined them as any gift to potential or actual respondents and recorded incentives as either given with the initial notification of the survey’s existence or on completion of the survey. We further classified incentives given on survey completion as “guaranteed” or based on “lottery.” We recorded “missing data” for any report that did not explicitly state whether or not the investigators provided incentives. We also determined, based on information supplied in the report, whether investigators provided survey prenotification and which survey modality (e.g., Web or postal) they used, because both prenotification and survey modality are associated with response rate differences in other research fields.10 Finally, we evaluated survey length, measured by number of questions and by any length of time reported.
Please see List 1 for a full list of recorded survey characteristics.
Characteristics of Surveys and Survey Distribution Recorded to Evaluate Response Rate for Survey-Based Research Published in Three Medical Education Journals in 2013a
- Sampling frame (group surveyed)
- Single- or multi-institution
- Whether or not prenotification was provided
- Survey delivery modality (e.g., Web based or in person)
- Response rate
- How the response rate was calculated
- Whether or not an incentive was offered
- Description of incentive
- When the incentive was provided
- Whether the incentive was a lottery or guaranteed
- Survey length (number of items and time)
aThe three journals are Academic Medicine, Medical Education, and Advances in Health Sciences Education.
If investigators used more than one survey for a particular sampling frame (the group the survey is targeting), we included only the first survey to avoid the risk of survey fatigue contributing to a difference in response rates. We excluded all surveys for a given sampling frame if it was unclear which survey (of multiple surveys) a group received first (e.g., medical students who received one survey on behaviors and one survey on opinions, rather than clear pre/post surveys). We excluded the multiple-survey studies because survey fatigue is not well understood in public or health professions trainee populations14; we chose a conservative approach to eliminate risk from a minimally understood phenomenon. Unfortunately, we were not able to account for members of sampling frames receiving surveys from other sources (such as commercial businesses) at around the same time.
We excluded reports that used data from compulsory surveys, such as course evaluations, because these results would theoretically inflate response rates artificially. We also excluded reports that combined both health professions trainees and other respondents who were not trainees (e.g., faculty) in the reported outcomes because the sampling frame was outside the scope of our study.
When a report supplied response rates for each institution in a multi-institutional study, we calculated the mean for the various institutions to provide an overall response rate and thereby allow more appropriate comparisons among the included studies.
We excluded reports that presented information from larger studies that referred to prior publications for detailed methods to prevent repeat analysis of essentially the same study and methods.
Four of us (A.W.P., B.T.F., A.U., and A.Q.T.) evaluated the reports published in 2013 from the three journals. We made decisions by group consensus via comment posts on a shared Google spreadsheet. This approach provided real-time assurance of equal application of the rules because everyone could refer to prior and active decisions. Supplemental Digital Appendix 1 (http://links.lww.com/ACADMED/A382) includes the formal rules we used that were not addressed in the methods section. Please contact the corresponding author (A.W.P.) to obtain the file of the spreadsheet we used to track all article decisions.
We used, as appropriate, one-way analysis of variance (ANOVA), t test, chi-square, and Spearman ρ for univariate analyses. We assumed equal variances for t tests only when the Levene test for equality of variances did not indicate unequal variances. We calculated achieved power using G*Power and assumed a moderate effect size (d = 0.5 for t tests and f = 0.25 for ANOVAs) for all analyses.15
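The achieved-power calculation can be approximated without G*Power. Below is a minimal sketch using the normal approximation for a two-sided, two-sample t test; the group sizes are hypothetical (they are not the actual group splits in our data), and G*Power's exact noncentral-t computation will differ slightly:

```python
from math import sqrt
from statistics import NormalDist

def approx_power_two_sample_t(n1, n2, d=0.5, alpha=0.05):
    """Approximate power of a two-sided, two-sample t test for
    effect size d (Cohen's d), via the normal approximation."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = .05
    ncp = d * sqrt(n1 * n2 / (n1 + n2))           # noncentrality parameter
    # Ignore the negligible lower rejection tail.
    return 1 - NormalDist().cdf(z_crit - ncp)

# Hypothetical split: 12 surveys in one group vs. 34 in the other.
print(round(approx_power_two_sample_t(12, 34), 2))  # ≈ 0.32
```

With groups this small, power to detect a moderate effect falls well below the conventional 80% threshold, which is why the null findings reported below must be interpreted as inconclusive rather than as evidence of no effect.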
We entered information into a shared Google spreadsheet (Google, Inc., Mountain View, California), and we performed all statistical analyses using SPSS version 21 (IBM Corp., Armonk, New York).
Reports selected, response rates, and survey strategies
Of 732 total articles, 356 (48.6%) were reports of original research; of these 356 research reports, 185 (52.0%) used at least one survey. We excluded 89 of the 185 survey-based research reports because the sampling frame was a group other than health professions trainees. Further, we excluded 26 reports because they were multiple-survey studies (as described in Method), and we excluded 4 because the surveys were compulsory (such as a required course component). Please see Figure 1. The 66 remaining survey-based research reports (35.7% of the 185 studies using surveys) included a total of 73 different surveys to unique sampling frames; some reports included surveys to multiple sampling frames (e.g., one survey for residents and another for medical students). Table 1 describes response rates stratified by strategic variables.
Of the 73 total surveys of health professions trainees, 33 (45.2%) were from Academic Medicine, 21 (28.8%) from Medical Education, and 19 (26.0%) from Advances in Health Sciences Education. A majority of the surveys (n = 45; 61.6%) evaluated medical students, followed by residents (n = 15; 20.5%). The remaining surveys evaluated a combination sampling frame (n = 6; 8.2%) and “other” (n = 7; 9.6%), such as physical therapy or premedical students (see Figure 2). Most surveys were delivered to a single institution (47 [68.1%] of the 69 surveys that reported distribution).
Of the 44 surveys for which a delivery method was reported, exactly half (n = 22; 50%) were delivered electronically (Internet or e-mail); 15 (34.1%) were in-person (i.e., a surveyor asked questions); 6 (13.6%) were paper handouts; and 1 (2.3%) used mixed methods (electronic and telephone). Figure 3 shows percentages in the context of all 73 surveys, including those for which a delivery method was not mentioned.
Of the 73 total surveys that met inclusion criteria, we found information about incentives for only 18 (24.7%); of these, 12 surveys (66.7%) included incentives (4 lottery and 8 guaranteed). Regarding prenotification, 33 (45.2%) of the 73 total surveys were reportedly delivered after a prenotification, while the reports for the remaining 54.8% of surveys did not mention whether or not prenotification was provided.
The mean number of questions was 28.9 (standard deviation [SD]: 23.4) for the 35 (47.9%) surveys that reported the number of survey questions. Two studies reported an approximate pilot survey time of 10 minutes. An approximate mean or minimum/maximum actual survey time was reported for 12 surveys, but variation in reporting precluded calculating an overall mean because no report gave a precise mean or minimum/maximum time required to take the survey.
A response rate was provided for only 46 (63%) of the 73 surveys and ranged from 26.6% to 100%, mean 71.3% (SD = 19.5). The majority (n = 41 [89.1%]) of the 46 surveys did not include an explanation of how response rate was calculated. The 5 remaining surveys had unique, author-described definitions; no investigators explicitly reported a definition endorsed by the AAPOR.11
The only survey methodology factor that was significantly associated with a difference in response rates was single- vs. multi-institution sampling frames. The academic journal in which the survey was published was also significantly associated with a difference in response rates but is not related to survey methodology. Importantly, the null findings are in the context of underpowered univariate analyses. (Please see Table 1 for a summary of all comparisons and power analyses.)
With regard to academic journals, the mean response rates for Academic Medicine, Medical Education, and Advances in Health Sciences Education were, respectively, 63.0% (SD: 19.7%), 78.2% (SD: 17.3%), and 81.6% (SD: 19.5%). We detected a statistically significant difference among journals in our omnibus analysis F(2,43) = 5.065, P = .011. The only significant differences in our post hoc analysis (least significant difference) were Academic Medicine vs. Medical Education and vs. Advances in Health Sciences Education (95% confidence intervals [CIs], respectively, 54.5–71.5, 67.7–88.6, and 71.6–91.5). Of note, the 95% CIs overlap marginally, despite the statistically significant P values.
The only other factor associated with response rates was single- vs. multi-institution sampling frames (t(36.3) = 2.39, P = .022, assuming unequal variances based on a statistically significant Levene test for equality of variances). Mean response rates for single- vs. multi-institution surveys were, respectively, 74.6% (SD: 21.2%; 95% CI: 67–82.2) and 62% (SD: 12.8%; 95% CI 55–69).
Because data on all of our predefined variables (e.g., prenotification and survey modality) were lacking, we were limited to conducting only univariate analyses of any associations with response rate, rather than a regression analysis, as originally planned.
Four of the 73 (5.5%) health professions trainee surveys mentioned potential nonresponse bias in the limitations section, often using a different term such as “response bias” with supportive text consistent with the definition of nonresponse bias.8 No investigators performed a formal nonresponse bias analysis, so it is unknown whether survey-based research published in these three journals in 2013 suffered any nonresponse bias.
Our primary research aim was to establish a mean response rate among surveys of health professions trainees, and our secondary aims were (1) to determine strategies associated with higher response rates and (2) to estimate the presence of nonresponse bias. However, because of the pervasive lack of reported paradata (survey method details), we could not fulfill our objectives.
Differing and absent response rate definitions precluded any reliable estimate of response rates among surveys of health professions trainees. No survey-based research reports published in Academic Medicine, Medical Education, or Advances in Health Sciences Education in 2013 used the AAPOR definitions of response rate, which is standard in public opinion research8 and a recommendation for all survey-based research submissions to the Journal of the American Medical Association.16
The 71.3% mean response rate, which should be interpreted as a rough estimate in the context of our limited data, is consistent with general population response rates for government surveys in the United States (approximately 70%) and higher than other national means (50%–60% for the United Kingdom, Hungary, and Sweden).17 Further, this rate should be considered in the context of response rates dropping internationally for the general population in recent decades.18 We do not know if the mean 71.3% response rate represents a lower mean response rate from health professions trainees than in prior decades.
A response rate was reported for only slightly more than half of the health professions trainee surveys we examined, and many reports did not include additional information about the survey methods, which left our study underpowered. For example, to achieve a minimal power of 80% for an independent t test to evaluate whether incentives improved the reported response rates, we would have needed to evaluate 102 reports providing both a response rate and information on other factors. Assuming that 2013 publications are representative of other years, we would have had to evaluate an additional 6.3 years’ worth of issues in the three journals (totaling 4,601 more articles). As noted, we initially planned to conduct a multivariate regression analysis to evaluate for characteristics associated with response rates, but doing so was not feasible with such low power.
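The sample-size arithmetic behind this power target can be reproduced with the standard normal-approximation formula for a two-sample t test. The sketch below assumes equal group sizes and a two-sided test at alpha = .05, so its total differs from the 102-report figure in the text, which reflects our specific allocation assumptions; the exact noncentral-t answer is also about one subject per group higher than the approximation:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d=0.5, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample t test
    to detect effect size d (normal approximation, equal groups)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96
    z_b = NormalDist().inv_cdf(power)          # 0.842
    return ceil(2 * ((z_a + z_b) / d) ** 2)

# Moderate effect (d = 0.5): roughly 63 per group, ~126 reports total,
# before accounting for reports missing a response rate or paradata.
print(n_per_group())  # 63
```

Because barely 46 of 73 surveys reported a response rate at all, and far fewer reported the paradata needed to form comparison groups, any such target was far out of reach within a single publication year.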
Our study has nonetheless provided revealing descriptive findings. First, medical students were more commonly surveyed than residents. This may reflect greater researcher interest in or resources devoted to undergraduate medical education; alternatively, there may be a perception that medical students are a more reachable audience (despite no significant difference between medical student and resident response rates). Another interesting finding is that the majority of surveys were electronic, and only one survey used more than one survey delivery modality (e.g., electronic and paper), despite prior research in other settings that demonstrated improved response rates with mixed survey modalities.14,19
Notably, only 12 of 73 surveys (16.4%) provided incentives, which was fewer than anticipated because incentives are common in surveys outside of HPE research and have been shown to significantly improve response rates.13 Despite the low percentage of surveys reporting incentives, the mean response rate for all surveys was quite high, above 70%. This combination most likely reflects underreporting: investigators probably provided incentives without reporting them. Still, we cannot draw any definitive conclusions, given how few reports explicitly stated whether or not incentives were used. The high mean response rate may also reflect a publication bias toward reports with high response rates in our sample of research published in prestigious medical education journals.
The lack of reported nonresponse bias analyses warrants further discussion. First, public polling experts recommend evaluating for nonresponse bias in all surveys, regardless of response rate.8,20 Moreover, high response rates do not preclude nonresponse bias, and low response rates are known to be only minimally associated with nonresponse bias.20,21 With this lack of data, we cannot determine whether nonresponse bias is frequently present in health professions trainee survey research, nor can we determine the characteristics of nonrespondents that are associated with potential nonresponse bias.
The significant difference in response rates between journals likely represents different publication thresholds. No specific, validated measures are available to determine the quality of survey-based research published in different journals as there are for randomized controlled trials (i.e., CONSORT).
Finally, the overall response rate of approximately 71% should be taken in the context of simply describing the penetration depth of the survey into the sampling frame.20 Response rate is not a good predictor of nonresponse bias, but it remains an important overall descriptor of survey outcomes.20,21
Our study has several limitations, chiefly the low power achieved for analyses with the limited paradata available. The null findings for various strategies to improve response rates should not be interpreted to suggest that these strategies are ineffective among health professions trainees. Additionally, we analyzed the reports of only three highly regarded journals over the span of a single year. Another potential limitation is that we evaluated 73 surveys from 66 articles, so the paradata reporting was the same for 7 surveys; however, the reporting influences only what type of information was provided, not the characteristics of the different sampling frames in the different surveys. Finally, we did not perform a formal kappa analysis of report designation (exclusion criteria, categorization as a survey, etc.), but we believe the consensus approach better served this specific data set.
Overall, our findings must be interpreted cautiously, but they suggest that medical trainees respond reasonably well to surveys and are more likely to respond if the survey is conducted only at their local institution. The most notable conclusion we drew from this analysis, however, is the importance of fully reporting survey methods, especially response rate and how it is calculated (response rate definition), so subsequent researchers can appropriately build on prior studies. This finding is consistent with recent calls for more robust qualitative research reporting.22
On the basis of general survey literature, HPE researchers reporting surveys should consider including—at a minimum—response rate, response rate definition, an explicit statement regarding nonresponse bias analysis (whether or not they conducted this analysis and why), survey delivery modality (e.g., paper versus electronic versus mixed methods), and explicit details about any incentives offered.20 HPE journals may also consider encouraging authors to use one of the internationally accepted AAPOR response rate definitions, as is already encouraged in at least one general medicine journal.16 We believe that by doing so, the HPE research community can better understand strategies that improve response rates and nonresponse bias in surveys of health professions trainees.
Acknowledgments: The authors would like to acknowledge the contributions of Lane Library research librarian Christopher Stave, MLS, for his assistance with the literature review. They wish to thank both David Schriger, MD, MPH, of the University of California, Los Angeles, Department of Emergency Medicine, for methodology input, and Cara Phillips for her graphics assistance. They also thank Michael Davern, PhD, of National Opinion Research Center at the University of Chicago, Chicago, Illinois, for his perspectives from the general survey community.
1. Artino AR Jr, La Rochelle JS, Dezee KJ, Gehlbach H. Developing questionnaires for educational research: AMEE guide no. 87. Med Teach. 2014;36:463–474.
2. Robertson J, Walkom EJ, McGettigan P. Response rates and representativeness: A lottery incentive improves physician survey return rates. Pharmacoepidemiol Drug Saf. 2005;14:571–577.
3. Cook JV, Dickinson HO, Eccles MP. Response rates in postal surveys of healthcare professionals between 1996 and 2005: An observational study. BMC Health Serv Res. 2009;9:160.
4. Willis GB, Smith T, Lee HJ. Do additional recontacts to increase response rate improve physician survey data quality? Med Care. 2013;51:945–948.
5. Association of American Medical Colleges. FACTS: Applicants, matriculants, enrollment, graduates, MD-PhD, and residency applicants data 2016. https://www.aamc.org/data/facts/. Accessed June 28, 2016.
6. Department of Commerce. United States Census Bureau. http://www.census.gov/topics.html. Accessed November 19, 2014.
7. Lempp H, Seale C. The hidden curriculum in undergraduate medical education: Qualitative study of medical students’ perceptions of teaching. BMJ. 2004;329:770–773.
8. Tourangeau R, Plewes TJ. Nonresponse in Social Science Surveys: A Research Agenda. Washington, DC: National Academies Press; 2013.
9. Thomson Reuters. 2013 Journal Citation Report® Science Edition. No longer available.
10. Dillman DA. Mail and Internet Surveys: The Tailored Design Method. 2nd ed. New York, NY: John Wiley and Sons, Inc.; 2000.
11. American Association for Public Opinion Research. Standard definitions: Final dispositions of case codes and outcome rates for surveys. 2016. http://www.aapor.org/Standards-Ethics/Standard-Definitions-(1).aspx. Accessed June 28, 2016.
12. Johnson TP, Wislar JS. Response rates and nonresponse errors in surveys. JAMA. 2012;307:1805–1806.
13. Church AH. Estimating the effect of incentives on mail survey response rates: A meta-analysis. Public Opin Q. 1993;57:62–79.
14. Millar MM, Dillman DA. Improving response to Web and mixed-mode surveys. Public Opin Q. 2011;72:270–286.
15. Faul F, Erdfelder E, Lang AG, Buchner A. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods. 2007;39:175–191.
16. Journal of the American Medical Association. JAMA instructions for authors. 2016. http://jama.jamanetwork.com/public/instructionsForAuthors.aspx. Accessed June 6, 2016.
17. De Heer W. International response trends: Results of an international survey. J Off Stat. 1999;15:129–142.
18. de Leeuw E, de Heer W. Trends in household survey nonresponse: A longitudinal and international comparison. In: Groves RM, Dillman DA, Eltinge JL, Little RJ, eds. Survey Nonresponse. New York, NY: Wiley-Interscience; 2002.
19. Beebe TJ, Locke GR 3rd, Barnes SA, Davern ME, Anderson KJ. Mixing Web and mail methods in a survey of physicians. Health Serv Res. 2007;42(3 pt 1):1219–1234.
20. Phillips AW, Reddy S, Durning SJ. Improving response rates and evaluating nonresponse bias in surveys: AMEE guide no. 102. Med Teach. 2016;38:217–228.
21. Groves RM. Nonresponse rates and nonresponse bias in household surveys. Public Opin Q. 2006;70:646–675.
22. O’Brien BC, Harris IB, Beckman TJ, Reed DA, Cook DA. Standards for reporting qualitative research: A synthesis of recommendations. Acad Med. 2014;89:1245–1251.