Since the Association of American Medical Colleges (AAMC) began collecting data from graduating medical students in 1978, numerous studies have been published using data from the AAMC Graduation Questionnaire (GQ). Investigators have used the GQ to understand graduate career decisions and specialty choices1,2 or cited data from the GQ to document curricular deficiencies and evaluate outcomes.3,4,5
Medical schools often use the annual school report of their own GQ data to inform curricular discussions and stimulate change. Recently, data have been made available online to supplement the paper reports with graphic illustrations of multiple-year comparisons. In addition, the data can be purchased from the AAMC for analysis as SPSS data sets. When completing the questionnaire, students are asked for permission to release their names so that their GQ responses can be merged with existing school data. This merging creates a powerful curricular evaluation tool. For example, students' GQ responses to questions about their preparation for practice can be merged with data from alumni reflecting on their medical school preparation, or a school can ask whether students selecting different careers evaluate the adequacy of curricular topics differently. Questions such as these can help schools better understand the quality and effectiveness of their curricula.
Providing respondent anonymity or confidentiality is a standard practice in survey data collection. Texts typically describe anonymity as the total separation of responses from respondents, whereas confidentiality allows the researcher to link responses to a respondent (e.g., through a name or unique identifier), but the researcher promises not to do so in any way that would make an individual respondent identifiable. For example, confidential data allow the researcher to match an individual's responses on the GQ to data collected by the American Medical Association (AMA) Physician Profile through unique identifiers or respondent names, whereas anonymous data include no association between responses and respondents, rendering such a linkage impossible. Both methods are advocated to increase response rates and to improve honesty and accuracy in responding, especially to sensitive questions.6 Confidentiality, however, is considered less successful in this regard. Furthermore, those concerned with the protection of human subjects promote anonymity, while demanding confidentiality as a minimum standard.
Anonymity places a number of restrictions on researchers. It severely limits the researcher's ability to follow up and to ensure a representative response rate. In addition, it makes linking data, especially data collected at different times and in different settings, nearly impossible. For medical schools, the latter difficulty is the most problematic. Here we ask whether the perceived benefits of anonymity, particularly honesty in responding, warrant the limited use of results.
Studies of the effects of anonymity versus confidentiality on responses have been contradictory. Most studies of anonymity have concluded that ensuring anonymity does not influence response rate.7 A review of 214 studies by Heberlein and Baumgartner found no benefit in the percentage responding when anonymous procedures were used.8 Regarding anonymity's effect on subjects' survey responses, 90% of respondents in one mail study self-identified by putting their return addresses on the envelope, yet their responses were comparable to the responses of those maintaining anonymity.9 Another study, examining the effects of setting (home versus work) and anonymity (identifier versus no identifier) on teachers' response rate, rapidity of response, and responses about unions, found no effect of either condition.10 Conversely, in a study comparing college students' satisfaction with counseling services after they had completed treatment, the 25% of respondents who elected to self-identify reported higher levels of treatment satisfaction than did those remaining anonymous.11 In the medical education context, a recent study examined biasing effects of non-response in a sample of 508 residents who were asked to permit follow-up evaluations of their performances. Those granting permission had significantly higher mean scores on the Medical College Admission Test and higher medical school GPAs, indicating a clear source of non-response bias.12
Anonymity and response bias in surveys have long been a concern for investigators. In this study, we compared the scores of students who did and did not agree to release their names on the GQ to test whether GQ scores vary between the two groups of students, which would indicate bias.
A convenience sample of two medical schools provided data for this study: the University of California, San Francisco (UCSF) and the University of California, Los Angeles (UCLA). Three years of GQ data were analyzed: the most recent data available at the time (2001) and every other year preceding, for a total of three years: 2001, 1999, and 1997.
The GQ contains sections that represent various constructs; the constructs and the number of items per construct can vary from year to year. Composite scores were created by taking the mean of the items in a section of the GQ. Only educational constructs reflecting experience of the medical school curriculum or satisfaction with the school's administration were used, and each construct had to be measured in all of the study years (see Table 1). For example, items asking about students' premedical experiences were not included, nor were items about specialty choice or expected practice location. In addition, only constructs with similar numbers of items per year were included. For example, items measuring students' evaluations of their clerkship objectives, abilities to provide feedback, etc., were assessed in all three years; however, these items were asked globally across clerkships in 1997 and 1999 (11 and 13 items, respectively) but separately for each clerkship in 2001 (59 items). Because the items per construct sometimes varied even among the constructs chosen, a mean score per construct was calculated rather than a total score. This process resulted in the eight constructs listed below.
Five composite scores were created:
- 1. Basic science preparation for clerkships—“Indicate how well you think that instruction in the following sciences basic to medicine prepared you for clinical clerkships and electives,” scale: 1 = excellent, 2 = good, 3 = fair, and 4 = poor.
- 2. Evaluation of basic science course pedagogy—“Based on your experience, indicate whether you agree or disagree with the following statements about medical school,” scale: 1 = strongly agree, 2 = agree, 3 = no opinion, 4 = disagree, and 5 = strongly disagree.
- 3. Time devoted to topics—“Do you believe the time devoted to your instruction in the following areas was inadequate, appropriate, or excessive,” scale: 1 = inadequate, 2 = appropriate, and 3 = excessive.
- 4. Quality of clerkships—“Rate the quality of your educational experience in the following clinical clerkships,” scale: 1 = excellent, 2 = good, 3 = fair, and 4 = poor.
- 5. Satisfaction with school resources—“Indicate your level of satisfaction with the following,” scale: 1 = very satisfied, 2 = satisfied, 3 = no opinion, 4 = dissatisfied, and 5 = very dissatisfied.
Three additional constructs were based on single items:
- 6. Overall satisfaction with education—“Overall I am satisfied with the quality of my medical education,” scale: 1 = strongly agree, 2 = agree, 3 = no opinion, 4 = disagree, and 5 = strongly disagree.
- 7. Confidence in clinical skills—“I am confident that I have acquired the clinical skills required to begin a residency program,” scale: 1 = strongly agree, 2 = agree, 3 = no opinion, 4 = disagree, and 5 = strongly disagree.
- 8. Medical school debt.
Medical school debt was included because it was not expected to be affected by student anonymity. The internal consistency of each multi-item construct was calculated using Cronbach's alpha.
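The scoring procedure described above—a composite score as the mean of a construct's items, with internal consistency estimated by Cronbach's alpha—can be sketched in a few lines. This is an illustrative example only, not the authors' SPSS analysis; the item values shown are invented.

```python
# Illustrative sketch (invented data): composite scores as item means,
# internal consistency via Cronbach's alpha.

def composite_score(item_responses):
    """Mean of one respondent's item ratings for a construct."""
    return sum(item_responses) / len(item_responses)

def cronbach_alpha(items):
    """Cronbach's alpha for a construct.

    items: list of k item-score columns, each of length n (respondents).
    alpha = (k / (k - 1)) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    n = len(items[0])

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance

    sum_item_vars = sum(variance(col) for col in items)
    totals = [sum(items[j][i] for j in range(k)) for i in range(n)]
    return (k / (k - 1)) * (1 - sum_item_vars / variance(totals))

# Hypothetical construct with three items rated by four students:
prep_items = [[1, 2, 3, 4], [2, 2, 4, 4], [1, 3, 3, 4]]
print(round(cronbach_alpha(prep_items), 2))
```

Perfectly redundant items yield an alpha of 1.0; items that covary only weakly pull alpha down, which is why the authors report alphas per construct rather than assuming the item groupings cohere.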
A full factorial multivariate analysis of variance (MANOVA) was used to test the effects of students' releasing their names, school, and graduation year on the eight constructs. Post-hoc univariate analyses were performed to test the effects on the individual dependent variables (preparation for clerkships, basic science pedagogy, time devoted to topics, clerkship quality, satisfaction with school resources, overall satisfaction with education, confidence in skills, and medical school debt).
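The core MANOVA statistic can be illustrated in the simplest case: one two-level factor (name released or not) and two dependent variables, tested with Wilks' lambda. This is a minimal sketch with invented data, not the authors' full factorial analysis of eight dependent variables; SPSS would additionally convert lambda to an approximate F statistic.

```python
# Minimal Wilks' lambda sketch (invented data): one grouping factor,
# two dependent variables per observation.

def wilks_lambda(groups):
    """groups: list of groups, each a list of (y1, y2) observations.
    Returns Wilks' lambda = det(W) / det(W + B), where W and B are the
    within-groups and between-groups SSCP matrices (2x2 here)."""
    def mean_vec(obs):
        n = len(obs)
        return [sum(o[d] for o in obs) / n for d in (0, 1)]

    all_obs = [o for g in groups for o in g]
    grand = mean_vec(all_obs)

    W = [[0.0, 0.0], [0.0, 0.0]]  # within-groups sums of squares/cross-products
    B = [[0.0, 0.0], [0.0, 0.0]]  # between-groups sums of squares/cross-products
    for g in groups:
        m = mean_vec(g)
        for o in g:
            d = [o[0] - m[0], o[1] - m[1]]
            for i in (0, 1):
                for j in (0, 1):
                    W[i][j] += d[i] * d[j]
        dm = [m[0] - grand[0], m[1] - grand[1]]
        for i in (0, 1):
            for j in (0, 1):
                B[i][j] += len(g) * dm[i] * dm[j]

    def det2(M):
        return M[0][0] * M[1][1] - M[0][1] * M[1][0]

    T = [[W[i][j] + B[i][j] for j in (0, 1)] for i in (0, 1)]
    return det2(W) / det2(T)

# Hypothetical scores (two constructs) for name-releasers vs. non-releasers:
released = [(1.8, 2.1), (2.0, 2.3), (2.2, 1.9)]
withheld = [(2.6, 2.9), (2.8, 3.1), (3.0, 2.7)]
print(wilks_lambda([released, withheld]))
```

A lambda near 1 indicates the groups' mean vectors barely differ; a small lambda indicates a strong multivariate group effect, which is the pattern the significant name-release main effect reflects.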
The GQ completion rates were similar by school and year (1997, 1999, 2001): 98%, 96%, and 96% for UCSF and 98%, 88%, and 94% for UCLA, respectively. The percentages of students who gave permission to release their names with their GQ data varied by school and year (1997, 1999, and 2001): at UCSF 29%, 58%, and 65%, respectively, gave their permission, while at UCLA 29%, 63%, and 81% did. In 2001, approximately 70% of students nationally allowed release of their names with their GQ data.13 The reliabilities (alpha) for the constructs were: preparation for clerkships, .83; basic science pedagogy, .88; time devoted to topics, .88; quality of clerkships, .59; and satisfaction with resources, .85.
There were significant main effects for students who released their names (F = 15.7, df = 6, p < .001), school (F = 20.4, df = 6, p < .001), and year (F = 8.2, df = 12, p < .001). Only one interaction term was significant, school by year (F = 7.7, df = 12, p < .001). Table 2 gives the means for the eight constructs by (1) permission to release names and (2) survey year. All dependent variables except medical school debt differed by anonymity of response. Students who gave permission to release their names consistently rated the curriculum and school resources more positively than did students who did not. (Lower scores indicate more favorable responses, except for the time-devoted-to-topics construct, which the AAMC scaled in the reverse direction from the other items.) Only basic science pedagogy and clerkship preparation differed significantly by school. All constructs except basic science pedagogy and confidence in skills varied by year. There was a significant school-by-year interaction effect for clerkship quality and school resources.
This study examined whether anonymity on the AAMC GQ was associated with response bias; in the case of the GQ, anonymity does affect student responses. Students whose data were not anonymous (i.e., students who gave permission to release their names on the AAMC GQ) consistently rated the curriculum and school more favorably than did students whose data were anonymous. This contrasts with students' reporting of their medical school debt, which was not related to anonymity. These results suggest that the option to remain anonymous can influence individuals' responses to educational constructs.
As mentioned earlier, the GQ has been used by researchers to understand medical school graduates' career decisions and also to report curricular issues. However, using the GQ to track graduates and link graduate data to data collected at different times could lead to biased results. The views of students who perceive the curriculum more positively and are more satisfied with their medical school experience could dominate the results. Thus, caution should be exercised when linking GQ data to other data sources for research and curricular change purposes.
This study is limited to two schools, both in the University of California system. Analysis of data sets from other schools may be needed to determine whether this phenomenon generalizes to a larger group of schools. The study could not determine whether students whose data were not anonymous responded differently on other surveys. Neither study school ranks its students; therefore, it was not possible to test whether student ranking is related to student anonymity or response bias. The 2001 GQ data supplied by the AAMC no longer include student gender or ethnicity if the student did not give permission to release his or her name; therefore, it is not possible to determine whether these demographic variables are related to anonymity without further information from the AAMC. In the meantime, schools using GQ data linked to other student data should be aware of the response bias present in the results.
1. Babbott D, Baldwin DC, Killian CD, Weaver SO. Racial—ethnic background and specialty choice: a study of U.S. medical school graduates in 1987. Acad Med. 1989;64:595–9.
2. Kassebaum DG, Szenas PL. Factors influencing the specialty choices of 1993 medical school graduates. Acad Med. 1994;69:163–70.
3. Hodgson CS. Tracking knowledge growth across an integrated nutrition curriculum. Acad Med. 2000;75(10 suppl):S12–S14.
4. Moberg TF, Whitcomb ME. Educational technology to facilitate medical students' learning: background paper 2 of the Medical School Objectives Project. Acad Med. 1999;74:1146–50.
5. Lockwood JH, Danoff D, Whitcomb ME. The AAMC's 2000 Graduation Questionnaire. JAMA. 2000;284:1080.
6. Babbie E. The essential wisdom of sociology. Teaching Sociol. 1990;18:526–30.
7. Dillman DA, Singer E, Clark J, Treat J. Effects of benefits appeals, mandatory appeals, and variations in statements of confidentiality on completion rates for census questionnaires. Public Opinion Q. 1996;60:376–89.
8. Heberlein TA, Baumgartner R. Factors affecting response rates to mailed questionnaires: a quantitative analysis of the published literature. Am Sociol Rev. 1978;43:447–62.
9. Skinner SJ, Childers TL. Respondent identification in mail surveys. J Advertising Res. 1980;20:57–61.
10. Wildman RC. Effects of anonymity and social setting on survey responses. Public Opinion Q. 1977;41:74–9.
11. Rapaport RJ. A comparison of anonymous and self-identified respondents to a client satisfaction survey. Mich Personnel Guidance J. 1985;16:8–12.
12. Verhulst SJ, Distlehorst LH. Examination of non-response bias in a major residency follow-up study. Acad Med. 1993;68(2 suppl):S61–S63.
13. Lockwood JH, MCAT operations manager, Association of American Medical Colleges, Washington, DC. Personal communication, March 1, 2001.