Implementation science (IS) is a pragmatic science that aims to close the “know-do” gap required to bring scientific discovery to public health scale by producing evidence and strategies to address policy and program gaps. As the field evolves, stakeholders debate whether IS fulfills its mission with rigorous methods and relevant data that propel health innovations to scale.1,2 Although peer-reviewed publications and scientific outcomes are common and important measures of research impact, assessing the true impact of IS should go beyond publication metrics to understand whether and how research findings are used by stakeholders at all relevant levels of the health system.3–5 There is no agreed-upon approach or defined metrics to quantify the translational success of IS in policies, programs, society, or the broader economy, and developing them should be an imperative for the field.5–7
The United States Agency for International Development (USAID) has used operations research and IS as an integral part of effective programming.1,8–11 During the second phase of the US President's Emergency Plan for AIDS Relief (PEPFAR), IS emerged as a key strategic component, and IS investments were aligned with all program priorities and central initiatives.12,13 As a PEPFAR implementing agency, USAID programmed a meaningful part of those IS investments. Through a central PEPFAR IS initiative, USAID programmed 10 IS Annual Program Statement (APS) awards implemented between 2012 and 2017. Amid the debate on IS's mission and the need to ensure efficient and effective allocation and use of research funds, USAID began exploring ways to assess IS's impact on the areas described above. Using the 10 IS APS awards, we assessed the impact of USAID's IS investment using the Payback Framework (PF) for research utilization (RU)14,15 and propose practical ways to measure and strengthen IS impact going forward.
We selected the PF, also known as the Health Economics Research Group framework (Fig. 1), for this study because it had been used widely to assess RU, including previous assessments of government-funded portfolios.15–17 The framework covered the widest range of impact domains with multidimensional measurement, and while forward tracking, it permitted feedback loops.5,16–21 The PF defines 2 stakeholder interfaces (SIs) in which researchers interact with research users (eg, the funder, the research community, and beneficiaries) and 7 stages of research sequence covering assessment, inputs, processes, primary and secondary outputs, application, and impact (Fig. 1).9,16,18,19 The 5 categories measuring impact are: (1) knowledge production; (2) benefits to future research, including research training; (3) benefits to policy, where findings inform or influence decision makers; (4) benefits to health and the health system, such as HIV infections or deaths averted; and (5) broader economic benefits, such as a healthier workforce.15,22 We hypothesize that the feedback loop can include greater iteration of SIs throughout the research process (rather than the 2 outlined by Donovan and Hanney23). We also propose including variables within each payback impact category measuring proximal health benefits represented by operational changes at service delivery sites where the IS work was conducted and throughout the health system.
We (authors A.M., A.Y., and D.C.) adapted a PF questionnaire designed by Kwan et al17 that contained quantitative checklists and qualitative semistructured assessments (see supplementary materials for the adapted questionnaire, Supplemental Digital Content, http://links.lww.com/QAI/B392). The data collection tool was modified to measure SI at each stage in the research process described above. The questionnaire was pilot tested on other USAID-funded IS activities unrelated to the IS portfolio reported here.
The questionnaire was administered to the principal investigator (PI), or their representative, for each of 10 IS cooperative agreements funded through the same APS [same funding opportunity announcement (FOA)] between August, 2017, and February, 2018.24,25 Information was collected bidirectionally, through both document review and investigator self-report. The research team conducted desk reviews of APS-related documents stored in USAID repositories (eg, study proposals; cooperative agreements; protocols; quarterly/annual reports; program management plans; performance reports; study briefs; publications, presentations; and tools and job aids), PEPFAR repositories, and operational plans, as well as publicly accessible sources (eg, policies, guidelines, and operational plans) that attributed the IS awards. These were used in the absence of investigator responses and for triangulating responses. We abstracted relevant data to prepopulate the questionnaire, which was subsequently shared with the PI for verification and completion. The study team either facilitated completion of the tool through a 40–60-minute teleconference, or the PI independently completed and submitted the tool with additional relevant and supporting documents for verification. The desk reviews and supporting documents were also used to mitigate attribution and reporting bias. Studies were deidentified by replacing titles with letters.23 Semistructured questions allowed the investigative team to expand on or qualify quantitative responses.
A subset of USAID's IS portfolio was used illustratively to examine the modified PF. Ten APS awards were funded by USAID as part of a PEPFAR central initiative on IS. These IS awards aimed to address PEPFAR's prioritized challenges across the HIV prevention, care, and treatment continuum identified in consultation with its Scientific Advisory Board. We selected all 10 awards as they were awarded and began research implementation within a 2-year period under an FOA that aimed to: (1) identify new and effective interventions and cost-effective, efficient implementation models for proven interventions; and (2) identify approaches for adopting and integrating programs, technologies, and guidelines for optimal, timely effect and impact.24,25 Table 1 shows award administrative details, and Table 2 shows aims, design, sample size, and primary findings.
Data abstraction tables mapped to the framework categories were developed by research staff for summarizing responses and independently entered by 2 team members (A.M. and A.L.K.). Any discrepancies between data entries (ie, payback achieved vs. not achieved) were reconciled through discussion, including a third team member (D.C. or A.Y.) for tie breaking. We calculated a total and category-specific payback score using the following systematic method: (1) each subcategory defined within 1 of the 5 PF categories was assigned 1 point if RU was noted; (2) scores were summed without weighting within or between categories (eg, a study with 1 journal publication and another with 10 publications both score 1 in that subcategory); and (3) a percentage achievement score was calculated as the actual score relative to the total possible score for that category (eg, if a PI noted payback in 3 of 4 subcategories, the payback score for that category for that study would be 75%). Table 3 shows the total points possible in each payback category: knowledge production (4 points), benefits to future research (6 points), benefits to policy (4 points), benefits to health and the health system (6 points), and broader economic benefits (4 points). Qualitative data were obtained through semistructured questions that described nuances of structured payback responses (eg, HIV treatment guidelines were influenced through discussion with the Ministry of Health).
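The unweighted scoring scheme described above can be sketched as follows. This is a minimal illustration, not the study's actual analysis code: the category names mirror Table 3, and the example responses are hypothetical.

```python
# Total points possible per payback category, following Table 3.
MAX_POINTS = {
    "knowledge_production": 4,
    "benefits_to_future_research": 6,
    "benefits_to_policy": 4,
    "benefits_to_health_and_health_system": 6,
    "broader_economic_benefits": 4,
}

def payback_scores(responses):
    """Compute percentage achievement per category and overall.

    `responses` maps each category to a list of booleans, one per
    subcategory, True if research utilization (RU) was noted. Scoring is
    unweighted: a subcategory earns 1 point whether RU was noted once or
    many times (1 vs. 10 journal publications score the same).
    """
    category_pct = {}
    achieved_total = 0
    for category, max_points in MAX_POINTS.items():
        achieved = sum(bool(ru) for ru in responses.get(category, []))
        category_pct[category] = round(100 * achieved / max_points)
        achieved_total += achieved
    total_pct = round(100 * achieved_total / sum(MAX_POINTS.values()))
    return category_pct, total_pct

# Hypothetical award: RU noted in 3 of 4 knowledge-production
# subcategories yields 75% for that category.
cat_pct, total = payback_scores({
    "knowledge_production": [True, True, True, False],
    "benefits_to_future_research": [True, True, False, False, False, False],
    "benefits_to_policy": [True, False, False, False],
    "benefits_to_health_and_health_system": [False] * 6,
    "broader_economic_benefits": [False] * 4,
})
```

Because subcategories are binary, the total score rewards breadth of utilization across categories rather than volume within any one of them.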
By February 2018, 8 of 10 IS award teams returned completed questionnaires, 6 through teleconference with USAID and 2 independently. For the 2 nonresponding award teams, only data abstracted through independent document review were used. The total “payback” benefit averaged 43% (min, max: 29%, 88%), indicating that programs reported positively on less than half of all PF measures across the 5 impact categories. All awards achieved some level of knowledge production (mean = 75%; min, max: 50%, 100%) and benefits to future research (mean = 70%; min, max: 50%, 100%), and 9 of the 10 awards reported payback in the category of benefits to policy (mean = 45%; min, max: 0%, 100%) (Table 4). Benefits to health and the health sector (mean = 18%; min, max: 0%, 100%) and broader economic benefits (mean = 5%; min, max: 0%, 25%) were less common. Figure 2 displays a heat map of payback by award and category.
While knowledge production was high overall (75%), awards ranged considerably in the type and magnitude of impact. In descending order, posters (Ntotal = 58; min, max: 0, 13), oral presentations (Ntotal = 27; min, max: 0, 9), and peer-reviewed publications (Ntotal = 26; min, max: 0, 7) were most frequently generated. Five awards (A, C, D, H, and J) developed products that were used by stakeholders at the site level (eg, standard operating procedures, job aids, training guides, short message services, and appointment reminder services).
Benefits to Future Research
Benefits to future research were also high (mean = 70%) but variable by award and type of activity. While all awards reported individual capacity development (Ntotal = 47), the average number of capacitated individuals per award was low (mean = 1.23; min, max: 0, 12). The majority of individual capacity building (70%) was reported to be indigenous. Four awards (C, G, H, and I) reported actual or expected qualifications gained for 7 in-country team members, and 5 awards (A, C, G, H, and I) reported actual or expected career advancement for 25 in-country team members. Only half (N = 5) of the awards reported any kind of institutional capacity development, and 3 reported contributing to future research by others. Institutional capacity development was related to training and support of grants management, such as budget forecasting, monitoring, and reporting of expenses. Seven awards (A, C, D, E, H, I, and J) reported that the research project led to the generation of research funding, totaling over $19.35 million (USD).
Benefits to Policy
Benefits to policy (mean = 45%) were reported by 6 awards. These awards (A, B, D, E, G, and I) indicated that their findings had been used for decision-making at some level in the health system. Two awards reported that their findings influenced their respective countries' Ministries of Health policy for antiretroviral therapy (ART) delivery to pregnant women; a third award informed World Health Organization guidelines; and another claimed to influence the availability of female condoms in-country. Three awards reported that they expected their findings to influence health policy in the future. We could not corroborate most of the 6 awards' reports of influencing policy/guidelines relevant to medical professionals, health care managers, and/or health service users or the wider public. Attribution could only be triangulated for the award cited in World Health Organization guidance.
Benefits to Health and the Health System
Benefits to health and the health sector were reported to be relatively low (mean = 18%; min, max: 0%, 100%), and most awards (N = 6) reported no impact. The greatest impact was reported to be in expected qualitative improvements and effectiveness of services, but only 2 awards reported an actual increase in service delivery effectiveness, and only 1 award reported any actual health system improvements. No supporting evidence of actual or expected benefits to health or the health system was provided.
Broader Economic Benefits
Broader economic benefits were low (mean = 5%; min, max: 0%, 25%), with no awards reporting any actual economic benefits and 2 reporting expected cost reductions in service delivery.
We report on the use of a modified PF for assessing the impact of USAID's IS investment using 10 awards that were developed in direct response to policy and program gaps highlighted by PEPFAR and USAID.24–26 Not surprisingly, the commonly measured categories, knowledge production and benefits to future research (75% and 70%, respectively), had the greatest impact among the 5 categories.5 Impact in these categories can sometimes be seen before study completion, for example, through publication of preliminary or midterm findings.19 Payback was moderate for policy benefits (45%), lower for health and health system benefits (18%), and lowest for broader economic benefits (5%). These latter categories tend to require a longer passage of time for impact and are more indirect and difficult to attribute. Cruz Rivera et al5 reported that benefits to policy typically occur 1–3 years after evidence generation. The 2 awards (D and F) that reported “impact” in all 5 categories were completed in 2015 and 2016, whereas other awards completed in 2017 did not show universal impact.
Although data collection was completed through desk review with (N = 8) or without (N = 2) investigator responses, we had missing data resulting from unrecorded study attribution in policy and program reports, undocumented activities, PI nonresponse, PI inability to recall particular aspects of RU (eg, how many capacity building trainings were conducted), and PI perception that an activity was insignificant. Despite our efforts to triangulate information, there was insufficient time to independently observe RU outcomes throughout the levels of the health system, and this resulted in greater reliance on PI self-reports. Self-report may reflect numerous biases, including desirability bias27; however, we conducted this assessment after awards had ended to minimize bias. In addition, we requested evidence to support PI responses when it was not found through our independent reviews. Knowledge production, which had the largest impact, was more easily corroborated, but claims of policy, health and health sector, economic benefits, and stakeholder engagement had fewer corroborating documents. Eliciting stakeholder input, particularly from program managers, implementers, and beneficiaries, proved challenging within the assessment period. Study design (eg, randomized control trial, observational cohort, and survey) and outcomes (eg, null, positive) did not appear to influence RU. Beyond study design, we did not assess methodological rigor. Additional efforts are needed to independently assess methodological rigor using evaluations like the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) in conjunction with IS impact indicators.28 RU appeared to vary by the type of implementing partner. Implementing partners from academic institutions tended to have higher average payback benefits than nonacademic partners (eg, nongovernmental organizations).
It is noteworthy that the largest payback benefits came from knowledge production and future research, and there are academic incentives for both as part of professional promotion and grantsmanship.29 We could not examine these differences further with this very small sample. The PF and this application to the APS portfolio had several strengths. We found the PF to be most suitable for assessing IS work because of the broad range of impacts measured, particularly the emphasis on policy, health system, and broader economic impact, which is key for rigorous IS.3,27,30 In addition, the interfaces of the research sequence in the PF model provide a useful tool for researchers to plan for RU that includes iterative engagement with stakeholders, funders, researchers, providers, and beneficiaries across levels of the health system.31 The PF allowed us to track how these IS studies demonstrated RU, including stakeholder engagement from inception, along the research process, and through the postaward phase. We used a targeted subset of our IS portfolio to examine this framework. Although the sample was purposefully selected, the FOA context of these IS awards set expectations that the studies should have actual or expected policy, health system, and economic impact; yet even with self-report and the potential for reporting bias, impact in these categories was low. We expect that other IS studies with less direct expectations of impact may show even lower impact in these latter categories, and that needs to be evaluated.5,31 The exercise showed that IS impact could increase meaningfully if funders of IS studies included metrics on RU that extended beyond scholarly reporting as an incentive in grantsmanship. As IS proposals are assessed for scientific rigor, they should also be assessed for the strength of their RU plans along the entire research process.
Implementation of this framework also provided a method for consistent data collection, which allowed comparisons to be made across studies.31 This approach is too intensive for routine and frequent use, but selected indicators of RU in the following areas should be codified as part of protocols, workplans, and monitoring and evaluation plans of IS studies: expected and actual policy, health and health system, and economic benefits. For funders, demonstrated RU is an indicator that limited resources were well used and that findings made an impact. For researchers, RU demonstrates the usefulness of their studies' findings beyond publications into health practice, programs, and policies and shows funders that they made sound investments. For global policymakers (eg, multilateral institutions) and country governments (eg, ministries of health), RU ensures that studies' findings are applied to policies and programs. For beneficiaries, or the end users, RU translates evidence into real-world practice that can ultimately improve their health and quality of life.
Several limitations were noted for this exercise. The PF did not have subcategories for knowledge production. We adopted subcategories to identify and distinguish research, policy, and program outputs that would more likely be used by policy and program stakeholders.17 Despite its emphasis that payback is multidirectional, the PF suggests a linearity of payback that fails to capture the complexity of context.32 In addition to the potential reporting biases mentioned above, proving attribution is an often-cited challenge in RU assessments that is relevant to the PF.5,7,30,33,34 As early as 1979, Weiss35 proposed moving toward measuring the contribution of research to social policy instead of attribution. Morton's33 Research Contribution Framework shifts the focus from attribution—the precise identification of research impact, for example, findings cited in national policy—to contribution. The contribution approach examines the many contextual factors and actors that cause an outcome, while also identifying how research contributed to the outcome.33 Furthermore, Klautzer et al's36 previous RU assessment suggests an indirect route of research impact to policymakers; through this route, research can iteratively influence policy at any stage of both the research and policy processes. Other RU frameworks exist that emphasize policy networks, that is, how the relationships between researchers, communities, and policymakers influence incremental policy change.5 Associated methods of measurement include case studies that detail the RU process and independent semistructured interviews with research staff and end users to triangulate findings and better capture the complex pathways by which RU emerges.7,20,22,27,37 The 2 awards with payback across the 5 categories (D and F) reported consulting with policymakers at the country level, but as noted, we were unable to triangulate with end users.
Finally, the questionnaire we adapted from the study by Kwan et al17 to systematically collect and analyze the data scored each subcategory in the framework as binary (eg, one journal publication and many journal publications receive the same score) rather than continuous. This scoring system may have obscured important variations among the awards and limited representation of the magnitude of research impact. Cohen et al7 recommend addressing this challenge with a scoring system that takes into account the level and degree of corroborating evidence. The PF benefit score has many dimensions that complicate its interpretation; although there is value in quantifying certain categories, quantification adds further complexity.
The application of the PF to USAID's IS APS research portfolio demonstrated varying levels of RU across IS awards and payback categories. The PF was useful for measuring RU at stages and levels important for measuring IS impact but limited in adequately measuring stakeholder engagement. Pathways by which IS impact will be achieved include intensive stakeholder engagement. The metrics from this framework can be used early in IS development and prospectively throughout a study to create pathways for utilization and to enhance IS impact through informed decision-making.33,38 The greater benefits observed in knowledge and research may have resulted from several factors, including grantsmanship incentives, and similar incentives should be explored for policy, health, and economic benefits.5,20,31 To fill the “know-do” gap between evidence and practice, IS should emphasize RU, including stakeholder engagement. This assessment highlights the need for routine metrics to monitor IS impact at the program, policy, and economic levels, as well as stakeholder engagement.
The authors thank the PIs and the study teams that so graciously provided their time and shared information to conduct this assessment. The authors also thank stakeholders who provided valuable input for conducting this work.
The information provided does not necessarily reflect the views of USAID or the United States Government, and the contents of this manuscript are the sole responsibility of the authors.
1. Geng EH, Peiris D, Kruk ME. Implementation science: relevance in the real world without sacrificing rigor. PLoS Med. 2017;14:1–5.
2. Ridde V. Need for more and better implementation science in global health. BMJ Glob Health. 2016;1:1–3.
3. Banzi R, Moja L, Pistotti V, et al. Conceptual frameworks and empirical approaches used to assess the impact of health research: an overview of reviews. Health Res Policy Syst. 2011;9:1–10.
4. Rowe G, Frewer LJ. A typology of public engagement mechanisms. Sci Technol Hum Values. 2005;30:251–290.
5. Cruz Rivera S, Kyte DG, Aiyegbusi OL, et al. Assessing the impact of healthcare research: a systematic review of methodological frameworks. PLoS Med. 2017;14:1–24.
6. Grimshaw JM, Eccles MP, Lavis JN, et al. Knowledge translation of research findings BT—effective dissemination of findings from research. Implement Sci. 2012;7:1–17.
7. Cohen G, Schroeder J, Newson R, et al. Does health intervention research have real world policy and practice impacts: testing a new impact assessment tool. Health Res Policy Syst. 2014;13:1–12.
8. Population Council. Horizons: Global Leadership, Research and Development Responsibilities and Best Practices in HIV/AIDS: Final Narrative Report. Washington, DC: Population Council; 2010.
9. Molldrem V, Justice J. Project SEARCH End of Project Evaluation: Supporting Evaluation and Research to Combat HIV. Washington, DC: USAID; 2012.
10. The Population Council Inc. Project SOAR: Supporting Operational AIDS Research: About. 2018. Available at: http://www.projsoar.org/about/. Accessed November 8, 2018.
11. The Population Council Inc. HIVCore: What We Do. Available at: http://www.hivcore.org/what.html. Accessed August 23, 2019.
12. Padian N, Holmes C, McCoy S, et al. Implementation science for the US President's Emergency Plan for AIDS Relief (PEPFAR). J Acquir Immune Defic Syndr. 2011;56:199–203.
13. The Office of the Global AIDS Coordinator. PEPFAR Blueprint: Creating an AIDS-Free Generation. Washington, DC; 2012. Available at: http://www.pepfar.gov/documents/organization/201386.pdf. Accessed August 23, 2019.
14. Buxton M, Hanney S. How can payback from health services research be assessed? J Health Serv Res Policy. 1996;1:35–43.
15. Hanney SR, Grant J, Wooding S, et al. Proposed methods for reviewing the outcomes of health research: the impact of funding by the UK's “Arthritis Research Campaign.” Health Res Policy Syst. 2004;2:1–11.
16. Donovan C, Butler L, Butt AJ, et al. Evaluation of the impact of National Breast Cancer Foundation-funded research. Med J Aust. 2014;200:214–218.
17. Kwan P, Johnston J, Fung AYK, et al. A systematic evaluation of payback of publicly funded health and health services research in Hong Kong. BMC Health Serv Res. 2007;7:1–10.
18. Hanney S, Buxton M, Green C, et al. An assessment of the impact of the NHS health technology assessment programme. Health Technol Assess. 2007;11:1–180.
19. Scott JE, Blasinsky M, Dufour M, et al. An evaluation of the mind-body interactions and health program: assessing the impact of an NIH program using the payback framework. Res Eval. 2011;20:185–192.
20. Greenhalgh T, Raftery J, Hanney S, et al. Research impact: a narrative review. BMC Med. 2016;14:78.
21. Buxton M, Hanney S. Assessing Payback from the Department of Health Research and Development: Preliminary Report. Vol. 1. The Main Report, Health Economics Research Group: Uxbridge, London; 1994.
22. Hanney S, Greenhalgh T, Blatch-Jones A, et al. The impact on healthcare, policy and practice from 36 multi-project research programmes: findings from two reviews. Health Res Policy Syst. 2017;15:26.
23. Donovan C, Hanney S. The “payback framework” explained. Res Eval. 2011;20:181–183.
24. U.S. Agency for International Development. Implementation Science Research to Support Programs under the President's Emergency Plan for AIDS Relief (PEPFAR). 2011:1–110. Funding Opportunity/APS Number APS-OAA-11-000002.
25. U.S. Agency for International Development. Implementation Science Research to Support Programs under the President's Emergency Plan for AIDS Relief (PEPFAR)—Round 2. 2011:1–60. Funding Opportunity/APS Number: APS-OAA-11-000002 98.001.
26. Scientific Advisory Board. Summary Recommendations. 2011. Available at: http://www.pepfar.gov/documents/organization/155571.pdf. Accessed August 23, 2019.
27. Milat AJ, Bauman AE, Redman S. A narrative review of research impact assessment models and methods. Health Res Policy Syst 2015;13:1–7.
28. Guyatt G, Oxman A, Akl E, et al. GRADE guidelines: 1. Introduction—GRADE evidence profiles and summary of findings tables. J Clin Epidemiol. 2011;64:383–394.
29. De Groote SL, Shultz M, Smalheiser NR. Examining the impact of the National Institutes of Health public access policy on the citation rates of journal articles. PLoS One. 2015;10:e0139951.
30. Penfield T, Baker MJ, Scoble R, et al. Assessment, evaluations, and definitions of research impact: a review. Res Eval. 2014;23:21–32.
31. Hanney S, Packwood T, Buxton M. Evaluating the benefits from health research and development centres: a categorization, a model and examples of application. Evaluation. 2000;6:137–160.
32. Davies H, Nutley S, Walter I. Assessing the impact of social science research: conceptual, methodological and practical issues. ESRC Symposium on Assessing Non-Academic Impact of Research; May 12–13, 2005; St. Andrews, Scotland.
33. Morton S. Progressing research impact assessment: a “contributions” approach. Res Eval. 2015;24:405–419.
34. Kok MO, Schuit AJ. Contribution mapping: a method for mapping the contribution of research to enhance its impact. Health Res Policy Syst 2012;10:21.
35. Weiss CH. The many meanings of research utilization. Public Adm Rev. 1979;39:426–431.
36. Klautzer L, Hanney S, Nason E, et al. Assessing policy and practice impacts of social science research: the application of the payback framework to assess the future of work programme. Res Eval. 2011;20:201–209.
37. Stern N. Building on Success and Learning from Experience: An Independent Review of the Research Excellence Framework. 2016:56. Available at: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/541338/ind-16-9-ref-stern-review.pdf. Accessed August 23, 2019.
38. Meagher L, Lyall C, Nutley S. Flows of knowledge, expertise, and influence: a method for assessing policy and practice impacts from social science research. Res Eval. 2008;17:163–173.