Generalist physicians form the foundation of the health care system in the United States.1 The care that family physicians, general internists, and general pediatricians provide is both essential to maintaining the public health and cost-effective.2 Over the last decade, fewer medical students have chosen to pursue generalist careers.3–5 The situation may be reaching crisis proportions, as major shortfalls in the generalist workforce are projected for the near future.1,6 A recent study examining the potential primary care (PC) provider shortage predicted that, by 2025, there will be a deficit of 35,000 to 44,000 generalists to care for the aging U.S. adult population.7
Among a number of factors that may affect medical students' likelihood of pursuing generalist careers, one likely determinant is the medical school PC experience.8 Current evidence indicates that the quality of the PC experience varies considerably across institutions and that few students identify the experience as increasing the attractiveness of a generalist career.3 Such reports may help to explain the considerable variation among medical schools in the percentage of graduates matching into PC residency programs.9 The data from these reports also suggest that more medical students might pursue generalist careers if the PC experience were of a consistently higher quality across institutions. To achieve such consistent, high-quality experiences, medical school educators would need to understand the attributes that most strongly influence the PC medical school experience, yet surprisingly, academicians know little about those attributes. Beyond the potential utility of such information to medical school administrators (e.g., deans, generalist department chairs and division chiefs, predoctoral education directors) who seek to benchmark and improve the quality of the PC experience at their institutions, policy makers might use such information to guide broader efforts to strengthen PC education and training. Additionally, students interested in PC careers might use such information during the medical school application and interview process.
Prior studies have examined how specific aspects of the PC experience, such as clerkship experiences,8,10 the availability of PC role models,11 peer encouragement,11 and the “bad mouthing” of PC,11 have influenced students' PC attitudes, interest, and/or career choices. However, the relative importance of these previously studied aspects of the medical school PC experience is unclear. Additionally, other, as-yet-undetermined factors that influence the medical school PC experience may also exist. In this vacuum of information, the annual U.S. News & World Report (USN&WR) PC medical school ranking is the only nationally applied measure that purports to assess the quality of the medical school PC experience.12
Although the USN&WR rankings attract considerable attention among both lay and, more surprisingly, academic medical audiences,13 the ranking methodology has not been validated, and face validity problems exist with most elements (Table 1). For example, the peer (medical school administrator) and residency director assessments, which account for 40% of the USN&WR PC medical school ranking, each comprise a single question: “Please rate each school with which you are familiar on a scale from marginal (1) to outstanding (5).”12 Even if such measures were able to validly and reliably capture differences in the quality of the PC experience among medical schools, they are too global to provide useful direction to medical students, medical school administrators and educators, or policy makers.
In this study, we enlisted experts to identify key attributes that might shape medical students' PC attitudes, perceptions, and training. We also explored whether such information might inform development of a reliable and valid measure of the quality of the medical school PC experience, as an alternative to the often-cited USN&WR rankings.
We conducted this study from October 2007 through October 2008. The University of California, Davis, institutional review board provided ethical approval of the study.
We used a mixed-methods approach first to identify salient attributes affecting the medical school PC experience and then to assess their relative importance. Here, the term “PC experience” is intentionally broad, encompassing formal curricula, informal/hidden curricula, influential individuals (e.g., course instructors, clerkship directors), mentoring, career advising, and other potentially relevant exposures (e.g., PC research).
Phase 1: Identification of key attributes by opinion leaders
We identified select nationally recognized academic generalist leaders active in PC practice, education, and/or research. We identified these individuals via our ongoing involvement (e.g., scholarships, committee work) in national generalist professional organizations, such as the Robert Wood Johnson Foundation Generalist Physician Faculty Scholars Program, the Society of General Internal Medicine (SGIM), the Society of Teachers of Family Medicine, and the Academic Pediatric Association. We also contacted experts we knew through personal collaborations (e.g., mentoring relationships, educational research). We invited the leaders to participate in 30- to 60-minute semistructured interviews to elicit their views regarding the key attributes influencing the quality of the student PC experience within U.S. medical schools.
Of 48 generalist leaders invited to complete an interview, 16 (33%) agreed to participate. One of us (A.J.) conducted individual semistructured interviews (15 by telephone, one face-to-face) with each participant. The interviewer used a series of open-ended discussion questions (List 1), along with follow-up questions as indicated by interviewees' responses, to elicit participants' thoughts.
One of us (A.J.) entered interview notes and all responses into a spreadsheet, sorted them by interviewee and trigger question (i.e., question prompting the response), and then printed them onto three- by five-inch index cards. Next, we undertook a card-sorting exercise, grouping interview statement cards by thematic categories. Card sorting is an established method for examining the way experts categorize information within a particular domain14,15 that other researchers have successfully applied within the health care realm.16–18 Initially, four of us (A.J., family medicine; M.S. and R.L.K., general internal medicine; and R.J.P., general pediatrics) each independently sorted the cards into categories. Then we (the same four) met to review and discuss the individual card-sorting results and negotiate a consensus sorting.
The four of us identified 58 attributes by consensus, and these attributes anchored the survey we developed in Phase 2.
Phase 2: Importance ratings of attributes by a national sample of generalists
We then created a Web-based survey, using SurveyMonkey (www.surveymonkey.com, Portland, Oregon), for a national sample of generalist physicians active in medical school education and/or in the generalist professional societies listed above (Phase 1). We generated invitee lists for family medicine and general pediatrics, based on, as in Phase 1, our involvement in organizations, research, scholarships, and collaborations. We sent invitees up to three reminders to respond to the survey, and these respondents did not receive any incentives to complete the survey. To solicit respondents from general internal medicine, one of us (M.S.) arranged for an invitation to complete the survey to appear in the electronic newsletter sent to all faculty members of the SGIM. The window provided for SGIM members to respond was approximately three weeks. We sent the SGIM invitees no reminders to respond. We did, however, offer these respondents a $5 gift card as an incentive for completing the survey.
The survey included basic respondent demographic questions such as academic rank, medical specialty, and institution.
Respondents addressed two questions related to each of the attributes we identified in Phase 1. First, they rated the relative importance of the attributes, using a nine-point Likert-like scale (1 = of little importance, 9 = of great importance). Second, they indicated (yes/no) whether they felt external experts (e.g., deans, chairs, program directors, instructors-of-record) could validly assess the quality of the attributes at other medical schools (i.e., not their own institution). We also asked a single, open-text question—“Do you have any other comments you wish to make?”—but the few responses added nothing substantive.
We conducted our analyses using SAS (version 9.1, SAS Institute, Inc., Cary, North Carolina). Four respondents identified their specialty as combined medicine/pediatrics; these four all had joint academic appointments in internal medicine and pediatrics. One of us (A.J.) contacted all of these individuals by e-mail to ask them whether the majority of their responsibilities were in internal medicine or pediatrics. Three indicated internal medicine and one pediatrics, and we categorized respondents accordingly for analyses.
We obtained summary importance ratings for each survey item by calculating the median, mean, and standard deviation of all nonmissing ratings. To be consistent with prior studies for determining expert consensus regarding the appropriateness of various medical interventions,19,20 we considered items with median ratings of 7 or greater as “highly important.”
We also determined the degree of agreement in respondents' importance ratings by modifying another previously described method.19,20 Briefly, the method involves calculating an agreement index, the weighted sum of the median importance rating and the number 5, the latter representing the highest level of uncertainty on the nine-point importance rating scale.20 The weights depend on the variability in respondent ratings, through the mean absolute deviation in scores from the median, a nonparametric measure of variation. The effect of weighting is to move the agreement index away from the median importance rating and toward 5 when variation in ratings occurs. As an example, suppose a group of experts rates the importance of two attributes, the median importance rating for each attribute is 7, some variation exists in the individual experts' ratings of each attribute (as is typical), and the ratings of attribute 2 vary more than those of attribute 1. In this case, the agreement index for each attribute will be a number smaller than 7, and the agreement index for attribute 2 will be smaller (i.e., closer to 5) than the agreement index for attribute 1. Similar to prior investigators employing the method,20–22 we considered an agreement index greater than 6 to reflect strong group agreement that a particular attribute was highly important to the medical school PC experience.
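The agreement index calculation described above can be sketched in code. The cited method specifies only that the weights depend on the mean absolute deviation (MAD) of ratings from the median, so the linear weighting below (w = 1 − MAD/4, where 4 is the largest possible deviation on the nine-point scale) is an illustrative assumption, not necessarily the exact formula used in the study:

```python
from statistics import median

def agreement_index(ratings, midpoint=5.0, max_dev=4.0):
    """Agreement index: a weighted sum of the median rating and the
    scale midpoint (5), pulled toward 5 as rater variability grows.

    Assumed weighting (for illustration only): w = 1 - MAD/max_dev,
    where MAD is the mean absolute deviation of ratings from the
    median and max_dev (4) is the largest possible deviation on the
    1-9 scale.
    """
    med = median(ratings)
    mad = sum(abs(r - med) for r in ratings) / len(ratings)
    w = 1.0 - mad / max_dev  # weight shrinks as variation grows
    return w * med + (1.0 - w) * midpoint

# Two attributes, each with a median rating of 7, but attribute 2's
# ratings vary more; both indices fall below 7, and attribute 2's
# lies closer to 5.
attribute_1 = [7, 7, 7, 8, 6, 7]  # low variation
attribute_2 = [7, 3, 9, 7, 5, 9]  # high variation
```

Under this weighting, unanimous ratings of 7 yield an index of exactly 7, and any disagreement moves the index toward 5, matching the behavior the method describes.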
We employed two additional steps to derive a smaller roster of attributes for potential use in a future PC experience quality-rating measure. First, we discarded attributes with overall study sample agreement indices of 6 or below. Second, we discarded any remaining items for which fewer than two-thirds of respondents indicated external experts could validly assess the quality of the attribute at medical schools. In other words, we retained only those attributes for which two-thirds or more of respondents felt that an outside expert's ratings would be well grounded or justifiable, based on the information about medical schools other than his/her own that such an expert might possess. An outside expert could potentially draw on a number of sources of noninsider information about other medical schools, including published data, reputation, personal impressions, and/or valued colleagues' opinions.
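The two-step screen above reduces to a simple filter. The sketch below assumes hypothetical per-item fields (an agreement index, plus counts of "yes" validity responses and total responses); the field names and sample values are illustrative, not study data:

```python
def retain_for_rating(items, ai_cutoff=6.0, validity_threshold=2 / 3):
    """Two-step screen: keep only items whose agreement index exceeds
    the cutoff (step 1) AND for which at least two-thirds of
    respondents said external experts could validly rate the item at
    other schools (step 2)."""
    return [
        item for item in items
        if item["agreement_index"] > ai_cutoff
        and item["yes_valid"] / item["n_responses"] >= validity_threshold
    ]

# Illustrative items (names and numbers are hypothetical):
items = [
    {"name": "mentoring quality", "agreement_index": 6.8,
     "yes_valid": 90, "n_responses": 120},   # passes both screens
    {"name": "peer attitudes", "agreement_index": 6.8,
     "yes_valid": 50, "n_responses": 120},   # fails validity screen
    {"name": "elective offerings", "agreement_index": 5.9,
     "yes_valid": 100, "n_responses": 120},  # fails agreement screen
]
```

Applying both screens in sequence mirrors the study's derivation of a smaller roster of attributes suitable for an expert-rated quality measure.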
We also compared attribute importance ratings by subgroup within the following respondent characteristic categories: gender, PC specialty (family medicine, general internal medicine, or general pediatrics), academic rank (professor, associate professor, assistant professor, or other [e.g., adjunct professor]), and U.S. Census geographic region (West, Midwest, South, or Northeast). We divided agreement indices for each subgroup into ranked quartiles (ranking positions 1–14, 15–29, 30–44, and 45–58) to facilitate subgroup comparisons for attributes that at least one subgroup ranked in the first or second quartile. We defined significant disagreement between subgroups as more than one ranked quartile difference, which corresponded to less than one point absolute difference in the agreement index.
Table 2 summarizes characteristics of the 16 interview participants. Respondents were distributed relatively evenly across all four major U.S. geographic regions. Five were family physicians—one of whom was, at that time, completing a sports medicine fellowship and had an active national leadership role in PC education. Seven worked in the field of general internal medicine; of these, six were general internists, and the seventh was a PhD generalist educator and health services researcher affiliated with an internal medicine department. The remaining four participants were generalist pediatricians—one of whom was at that time completing an adolescent medicine fellowship and had an active national leadership role in PC education.
As mentioned, four of us (A.J., M.S., R.L.K., and R.J.P.) identified 58 attributes (Table 3), which, after extensive discussion, we grouped into four consensus thematic categories, also defined through extensive discussion. The informal curriculum category (23 attributes) concerns the unscripted and unplanned teaching and learning experiences in medical school that arise mostly from ad hoc interpersonal communication among and between faculty, students, and residents.23 The institutional infrastructure category (6 attributes) concerns formal medical school policies and programs that, although not ostensibly “educational,” might be expected to influence the PC experience of students. The final two thematic categories concern formal educational/curricular elements: the educational/curricular infrastructure category (6 attributes), which involves broad educational policies and requirements, and the specific educational experiences category (23 attributes).
Table 2 also summarizes characteristics of the 126 survey respondents. Respondents were again evenly distributed by national geographic region. Thirty-one of 62 family medicine invitees (50%) and 27 of 54 general pediatrics invitees (50%) completed the survey. Of the approximately 2,700 SGIM members who received the society's July 1, 2008 electronic newsletter, 64 (2%)—all of whom were academic faculty members—logged on during the three-week window and completed the survey.
Table 3 summarizes responses to the survey. Respondents rated 31 (53%) of the 58 attributes included in the survey as highly important (median rating ≥7). Importance ratings were generally high for informal curriculum items: 20 (87%) of 23 were rated as highly important. Nine (90%) of the top 10 attributes, and 17 (85%) of the top 20 attributes, were from the informal curriculum category. However, two-thirds of respondents perceived that external expert ratings (via survey) would be valid for only 4 (20%) of the 20 informal curriculum items with median importance ratings at or above 7. Respondents seemed particularly unsure of the validity of expert ratings of informal curriculum items that involved non-PC institutional leaders', faculty members', residents', and medical students' perceptions of the quality of PC departments, faculty, and residents. Ten such informal curriculum items had median importance ratings at or above 7, but for none did two-thirds or more of respondents perceive external expert ratings would be valid. By contrast, two-thirds or more of respondents felt that external experts could validly assess 10 (91%) of the 11 items from the institutional infrastructure, educational/curricular infrastructure, and specific educational experiences thematic categories that were rated as highly important (ratings of ≥7).
There were 14 items for which the agreement index was greater than 6 (i.e., strong agreement among raters that the attribute is highly important) and which two-thirds or more of respondents felt external experts could validly rate by survey (Table 3, items with dagger). Of the 14, 4 (29%) concerned informal curriculum, 2 (14%) involved institutional infrastructure, 3 (21%) educational/curricular infrastructure, and 5 (36%) specific educational experiences.
With few exceptions, only small (no more than one quartile) differences existed in attribute importance rankings among respondent subgroups. Family physicians rated the importance of an admissions process that gives fair consideration to PC-inclined students more highly than did general internists and pediatricians (three-quartile difference, both comparisons). General internists rated the importance of excellent geriatrics training more highly than did family physicians and general pediatricians (two-quartile difference, both comparisons). Finally, full professors rated the importance of both an admissions process that gives fair consideration to PC-inclined students and an excellent information technology training program more highly than did associate and assistant professors (two-quartile difference, all comparisons).
In this study, expert academic generalists converged on a manageable list of institutional attributes thought to shape the quality of the medical school PC experience. Initial semistructured interviews generated a list of 58 institutional attributes influencing the PC experience. Subsequently, a larger sample of experts from across the United States rated 31 of these attributes as highly important in influencing the PC experience (Table 3), and generally strong agreement occurred among these raters—regardless of their generalist field, gender, academic rank, or geographic region.
Applicability of the study findings
The list of highly important attributes our experts identified begins to provide much needed guidance to the various stakeholders in PC. Medical school administrators and educators interested in strengthening their PC offerings might use the findings to establish high-priority areas for improvement. Policy makers might employ these results both to begin establishing uniform criteria for assessing the quality of the medical school PC experience and to guide initiatives (e.g., federal PC training grants) aimed at addressing flagging student interest in generalist careers. Finally, medical school applicants interested in PC might investigate the status of the high-importance attributes in their preparatory research, discussions with current students, and interviews with medical school officials.
The importance of informal curriculum
Our expert respondents viewed informal curriculum attributes, such as medical students', administrators', and non-PC physicians' views regarding PC departments, clerkships, core and community-based faculty, and residents, as particularly important shapers of the PC experience. Nine (90%) of the top 10 and 17 (85%) of the top 20 most important items were from this category. This finding underscores the need to maintain a broad view of the PC experience, rather than limiting consideration of it to formal curricular elements such as specific topics and activities covered in courses and clerkships. The significance of hidden curriculum attributes that our study identified is consistent with other research documenting the strong influence of the informal curriculum in shaping the medical education experience.24
Toward a more valid PC experience ranking methodology
Our findings challenge the face validity of the “Quality assessment” element of the USN&WR PC medical school rankings (Table 1), which includes global, single-item peer administrator and residency director subjective assessments.12 Expert respondents in our study identified 31 different attributes as highly important in shaping the medical school PC experience, and these attributes touch on a wide range of areas including mentoring and career advising, institutional culture, the scope and quality of PC practices, curricular infrastructure, and specific curricular elements and topics. In this context, there are several important limitations to the single peer and residency director global rating items employed in the USN&WR PC ranking methodology. First, single-item scales are inherently unreliable.25 Second, given the high percentage of attributes in our survey that respondents felt could not be validly rated by expert opinion survey, the USN&WR global measures seem to require expert respondents to evaluate things they are not actually in a position to assess. Third, even if one assumes that the quality of a PC experience is a unidimensional construct, the USN&WR global ratings offer no actionable information to medical students, faculty, administrators, or policy makers.
A secondary aim of our study was to begin to develop a more reliable and valid way of assessing the relative quality of the PC experience at U.S. medical schools. Along these lines, a two-thirds majority of expert respondents believed that external experts could assess 14 of the 31 highly important attributes of the PC medical school experience via opinion surveys. With further validation, these 14 items might form the core of a reliable and valid expert opinion measure. Unfortunately, the validity of expert opinion survey ratings for the other highly important attributes reported here may be low. For example, our respondents indicated that none of the four attributes with the highest mean importance ratings could be validly rated in this manner. While the reasons for these findings are not fully clear, our respondents likely felt that an outside content area expert would not have the relatively detailed, insider's knowledge of other medical schools that he or she would require to validly rate a number of the survey attributes, particularly informal curriculum items. Nonetheless, if researchers could identify alternative methods of assessing these attributes, some might still eventually be included in a multicomponent PC experience rating system. For some attributes, it may be possible to identify readily accessible existing data sources that capture a reasonable amount of the variance in the quality among schools. For example, one might assess the responsiveness of institutional leaders to the needs of PC providers and departments by considering the relative proportion of tenured generalist faculty, the presence of generalist fellowships, and/or the number of advanced leadership roles (e.g., deanships) held by generalist faculty.
For other high-importance attributes not validly ratable by expert survey, existing data sources will probably not be sufficient. Measuring the status of these attributes will likely require new approaches, such as adding items to the Association of American Medical Colleges Graduation Questionnaire,26 conducting additional student, faculty, and administrative leadership surveys, and requesting summary reports of pertinent information from deans' offices. Such approaches would be particularly valuable to ascertain the views of medical students, non-PC faculty, and institutional leaders regarding PC departments, faculty, and residents.
Study strengths and limitations
A major strength of our study was the careful sampling of highly qualified, locally, regionally, and nationally recognized content area experts from all three generalist disciplines for participation in both the semistructured interviews and the national survey. Recognized differences in the correlates of student career choices among the generalist fields8 might lead some to question the advisability of combining responses across fields. Nonetheless, the generally strong agreement among our respondents regarding the relative importance of various attributes, regardless of generalist field, supports the approach.
Our study also had some limitations. Our study concerned only subjective ratings of the importance of attributes affecting the PC experience in U.S. medical schools, provided by a relatively small sample of academic generalists. As mentioned previously, one would ideally combine such information with other types of data, collected in different ways and from different sources, to create a multicomponent medical school PC experience quality-rating system. Additionally, we assessed experts' views regarding the validity of rating such attributes via expert survey, but the actual validity of such ratings remains to be examined. Further research to assess validity would be valuable. Future studies should also include osteopathic educators, who were not surveyed in the current study. Osteopathic schools traditionally produce large numbers of PC physicians, which suggests that osteopathic academic generalist leaders may have important insights not realized in this initial study limited to MD (medical degree) generalists.
Finally, because our focus was on the PC experience within medical schools, we did not directly assess the influence of factors such as the lack of equity in physician compensation by specialty27 and other broader medical practice issues. We fully recognize that such factors are likely to have an impact on student perceptions regarding their PC medical education experience. In fact, our survey item concerning high PC physician job satisfaction, which had the second-highest mean importance rating of our 58 survey items, indirectly reflects the strong influence of prevailing medical practice factors.
In conclusion, we identified 58 institutional attributes that content area experts believed influence the quality of the student PC experience in U.S. medical schools. Expert respondents rated 31 of these attributes as highly important, and informal curriculum attributes appeared frequently in this rating category. Finally, respondents felt that external experts could validly rate 14 of the 31 highly important attributes via expert opinion survey. Those 14 items might form the core of the subjective component of an eventual multicomponent PC experience rating system to supersede the USN&WR PC medical school rankings. Additional work is clearly needed to establish the actual validity of expert survey ratings of these attributes and to examine alternative approaches to assessing the status of high-importance attributes for which experts believed survey ratings to be invalid. Nonetheless, our study represents an important first step in identifying and weighting the importance of the various institutional attributes that influence the medical school PC experience, an undertaking of considerable importance to generalist educators, administrators, and policy makers seeking to reverse the decline in student interest in generalist careers3–5 and to avert looming shortfalls in the PC workforce.1,6
Partial funding for the study was provided by a University of California, Davis, Department of Family and Community Medicine research grant.
This study was approved by the institutional review board of the University of California, Davis.