Working with implementing organizations and governments in over 32 countries, the US President's Emergency Plan for AIDS Relief (PEPFAR) has contributed to the rapid acceleration of HIV treatment access, availability of care and support services, and HIV prevention interventions. In the first phase of PEPFAR, these activities were appropriately carried out in an emergency fashion with the goal of using available interventions to reduce mortality and alleviate suffering from HIV disease as quickly and effectively as possible. Many lessons have been learned through examination of programs, including simple evaluations and operations research. Commensurate with the emergency response, however, state-of-the-art monitoring, evaluation, and research methodologies were not fully integrated or systematically performed.
In the second phase of PEPFAR, characterized by an increased emphasis on sustainability, programs must demonstrate value and impact to be prioritized within complex and resource-constrained environments. In this context, there is a greater demand to causally attribute outcomes to programs. Better attribution can be used to inform midcourse corrections in the scale-up of new interventions (eg, male circumcision) or to re-evaluate investments in programs for which impact is less clear.
To meet these demands, PEPFAR is adopting an implementation science (IS) framework to improve the development and effectiveness of its programs at all levels. IS is the study of methods to improve the uptake, implementation, and translation of research findings into routine and common practices (the “know-do” or “evidence to program” gap).1,2 For example, IS was used to evaluate the routine operational effectiveness of the South African National Prevention of Mother-to-Child Transmission Programme.3 Investigators explored the survival of HIV-free infants across program sites and identified specific sources of variation such as health system factors (eg, limited antenatal visits and lack of syphilis screening) and individual behaviors (eg, breastfeeding practices). By framing the problem through IS, the study revealed opportunities for improving program performance that could be translated into immediate solutions (eg, improving quality of care, infant feeding counseling). In this way, IS proved to be a valuable tool that was used not only to improve program effectiveness, but also to explain what worked, why, and under what circumstances.
Although no less rigorous than biomedical research dictated by a static protocol with robust internal validity (ie, “proof-of-concept” research with a precisely defined and narrow objective), an IS approach represents a paradigmatic shift in emphasis to greater external validity. The IS scope is also broader, seeking to improve program effectiveness and optimize efficiency, including the effective transfer of interventions from one setting to another.1,4 The methods of IS facilitate making evidence-based choices between competing or combined interventions and improving the delivery of effective and cost-effective programs.
Development of a PEPFAR Implementation Science Framework
Components of the Implementation Science Framework
Building on successful experience with some elements of IS in the first phase of PEPFAR, the Office of the US Global AIDS Coordinator, in consultation with its US Government PEPFAR implementing agencies and a broad group of academic, programmatic, and methodological experts, is developing an IS framework. The framework will provide structure, methodological rigor and diversity, and knowledge generation to meet the needs of the program and the global community. This framework, described here, incorporates the fundamental components of IS: monitoring and evaluation, operations research, and impact evaluation (including modeling and cost-effectiveness analyses).
Monitoring and Evaluation
Monitoring is the routine, daily assessment of ongoing activities, inputs, outputs, and progress.5 By contrast, evaluation assesses what has been achieved (Table 1).5 Using prevention of mother-to-child transmission (PMTCT) as an example, monitoring and evaluation (M&E) has indicators at every step of program implementation, including inputs (eg, healthcare workers, clinic sites, laboratory supplies, antiretroviral medications, referral staff), outputs (eg, number of HIV-infected mothers and their infants who are served), outcomes (eg, percent of HIV-infected pregnant women who received antiretrovirals to reduce the risk of transmission), and impacts (eg, reduction in perinatal transmission rates).5 Note that although M&E descriptively reports on outcomes and impacts, impact evaluation (described subsequently) links changes in outcomes to a particular program through methods of causal attribution.
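As a minimal sketch of how an output indicator and an outcome indicator relate, the following Python example aggregates site tallies into a count of women served and a coverage percentage. The field names and all figures are hypothetical and chosen only for illustration; they are not PEPFAR indicator definitions or program data.

```python
from dataclasses import dataclass


@dataclass
class SiteReport:
    """Hypothetical tallies from one PMTCT site (illustrative only)."""
    site: str
    hiv_positive_pregnant_women: int  # women identified as HIV-infected
    women_given_arv: int              # of those, women who received antiretrovirals


def arv_coverage_percent(reports):
    """Outcome indicator: percent of HIV-infected pregnant women given antiretrovirals."""
    identified = sum(r.hiv_positive_pregnant_women for r in reports)
    treated = sum(r.women_given_arv for r in reports)
    if identified == 0:
        return None  # the indicator is undefined when no women were identified
    return 100.0 * treated / identified


reports = [
    SiteReport("site_a", 120, 90),
    SiteReport("site_b", 80, 70),
]

# Output indicator: number of HIV-infected mothers served across sites.
women_served = sum(r.women_given_arv for r in reports)
coverage = arv_coverage_percent(reports)
print(women_served, coverage)
```

Neither number says anything about why coverage is at its observed level or whether the program caused it; that is the role of the impact evaluation methods described later.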
Since its launch, PEPFAR has conducted extensive M&E activities. In fact, all program implementation was required to include M&E as an essential tool for tracking program performance and short-term outcome and impact results. Given the immense number of program activities supported by PEPFAR, however, use of these data was left to the individual implementing partner. No efforts were made to collect this information at US Government headquarters, with the exception of a limited set of indicators to monitor progress and to report to Congress.6 In many respects, these indicators have become the global measure of PEPFAR's success, although they document only limited components of PEPFAR's results. More formal program evaluations have been conducted (supported through the former Targeted Evaluation and Public Health Evaluation programs), although these studies have been relatively limited in number and disparate in the range of research questions. In addition, these studies were not integrated within a comprehensive evaluation framework, as proposed here.
In this new phase, M&E will be included within the more universal IS framework for evaluating all activities of PEPFAR. Importantly, this new framework will support the long-term goals of program effectiveness, efficiency, sustainability, country ownership, and program integration. This is an opportunity to establish an a priori approach to analyzing the outputs and outcomes of PEPFAR activities and to link these findings to both the operations research and impact evaluation components of IS to provide a comprehensive assessment of PEPFAR program activities.
Operations Research
Operations research (OR) focuses on increasing the efficiency of implementation and operational aspects of a particular program through the use of scientifically valid research methods (Table 2).1,7 OR allows program planners to design, implement, and test solutions to improve program delivery. Included under this umbrella are the tools drawn from the academic discipline of OR, which uses advanced mathematical techniques (eg, simulation, mathematical optimization, decision science) to improve decision-making (eg, how to optimally allocate limited resources).8
Operations research is typically not an aspect of emergency response. With a focus on increasing efficiency of implementation, OR usually assumes a program or activity has already been implemented in the field and that a baseline measure of program delivery has been established, neither of which was the case when PEPFAR was first implemented. Now, with PEPFAR's years of implementation experience as well as heightened interest in developing sustainable infrastructure for service delivery, OR is an urgent priority.
Historically, OR has been applied to pharmaceutical supply chain management (including inventory control, logistics management and storage, information and distribution systems), laboratory service infrastructure and planning, and healthcare workforce development, areas that are relevant to PEPFAR's scale-up of treatment and care.8 OR has also been used for epidemic modeling,9,10 ensuring equitable antiretroviral treatment roll-out,11 and resource allocation for HIV prevention programs for injection drug users.12
One example of OR is a 2003 study that compared various simulations to determine the best prevention packages to reduce mother-to-child transmission of HIV globally.13 The investigators found that interventions must balance prevention of mother-to-child transmission through avoidance of breastfeeding with the positive immunologic and nutritional benefits of breastfeeding for the infant. In settings where the risk of mortality from not breastfeeding HIV-uninfected infants was low (eg, there was access to clean water and appropriate feeding supplements), interventions that combined avoidance of breastfeeding with antiretroviral prophylaxis prevented the most deaths. On the other hand, in settings where the risk of mortality from not breastfeeding was high (eg, hygienic replacement feeding is more difficult), interventions that included avoidance of breastfeeding could actually result in more deaths than no intervention. The OR model provided a useful tool for determining the optimal combination of interventions in various parts of the world.
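The trade-off described above can be sketched as a toy decision model. The Python example below is not the model used in the cited study; it is a deliberately simplified illustration, and every parameter value in it is a hypothetical assumption chosen only to show how the ranking of strategies can reverse as the mortality risk from not breastfeeding rises.

```python
# All parameter values are hypothetical and chosen only to illustrate the
# trade-off; they are not taken from the cited study.
P_PERINATAL = 0.20          # transmission before or during birth, no prophylaxis
P_BREASTFEEDING = 0.15      # transmission through breastfeeding, if breastfed
ARV_EFFICACY = 0.50         # relative reduction in perinatal transmission
P_DEATH_IF_INFECTED = 0.50  # mortality among HIV-infected infants

# Each strategy is (use antiretroviral prophylaxis, avoid breastfeeding).
STRATEGIES = {
    "no_intervention": (False, False),
    "arv_only": (True, False),
    "arv_plus_no_breastfeeding": (True, True),
}


def expected_deaths(use_arv, avoid_breastfeeding, excess_mortality, births=1000):
    """Expected infant deaths per `births` infants born to HIV-infected mothers.

    `excess_mortality` is the added risk of death for an uninfected infant
    who is not breastfed (low where clean water and supplements exist).
    """
    perinatal = P_PERINATAL * ((1 - ARV_EFFICACY) if use_arv else 1.0)
    postnatal = 0.0 if avoid_breastfeeding else P_BREASTFEEDING
    infected = perinatal + (1 - perinatal) * postnatal
    deaths = infected * P_DEATH_IF_INFECTED
    if avoid_breastfeeding:
        deaths += (1 - infected) * excess_mortality
    return births * deaths


def compare(excess_mortality):
    """Expected deaths for every strategy at a given excess mortality risk."""
    return {
        name: expected_deaths(arv, avoid, excess_mortality)
        for name, (arv, avoid) in STRATEGIES.items()
    }


for label, risk in [("low-risk setting", 0.02), ("high-risk setting", 0.15)]:
    results = compare(risk)
    best = min(results, key=results.get)
    print(label, {k: round(v, 1) for k, v in results.items()}, "best:", best)
```

Under these assumed values, the strategy that combines prophylaxis with avoidance of breastfeeding gives the fewest expected deaths in the low-risk setting but more deaths than no intervention in the high-risk setting, which mirrors the qualitative pattern reported by the investigators.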
PEPFAR has supported OR through its Public Health Evaluation studies, although these studies have been relatively limited in number. One example is the PEPFAR-funded PEARL study (PMTCT Effectiveness in Africa: Research and Linkages to Care and Treatment) that randomly selected women undergoing PMTCT in 43 sites in 4 countries.14 This study used operational metrics and biomedical techniques of cord blood measurement of drug levels to assess the actual effectiveness of the PMTCT cascade (Fig. 1) and to find programmatic determinants of successful prevention of vertical transmission. The study has led to a re-evaluation of many of the elements of PMTCT programs and greater attention to predictors of program success. Moving forward, PEPFAR's unified IS strategy will guide OR efforts, linking this work to a broader range of program and impact evaluations and sharpening its focus on service delivery strategies and resource allocation for program improvement.
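The idea of a service cascade can be illustrated with a short calculation: overall coverage is the product of the proportion of women retained at each step, so modest losses at several steps compound. The sketch below is in Python. The step names and proportions are hypothetical and are not results from the PEARL study.

```python
def cascade_coverage(steps):
    """Return cumulative coverage after each (step name, proportion) pair.

    Each proportion is the share of women reaching a step who complete it.
    """
    cumulative = []
    running = 1.0
    for name, proportion in steps:
        if not 0.0 <= proportion <= 1.0:
            raise ValueError(f"proportion for {name!r} must be between 0 and 1")
        running *= proportion
        cumulative.append((name, running))
    return cumulative


def weakest_step(steps):
    """Return the step with the lowest completion proportion."""
    return min(steps, key=lambda step: step[1])


# Hypothetical cascade, for illustration only.
steps = [
    ("tested for HIV", 0.90),
    ("received test result", 0.80),
    ("mother took prophylaxis", 0.75),
    ("infant received prophylaxis", 0.50),
]

for name, coverage in cascade_coverage(steps):
    print(f"{name}: {coverage:.2f}")
print("largest drop-off at:", weakest_step(steps)[0])
```

A breakdown of this kind shows where in the cascade women are lost, which is the sort of programmatic determinant that operations research is meant to identify.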
Impact Evaluation
Impact evaluation permits causal attribution of observed changes in outcomes to a particular program by comparing these changes with what would have happened had the program not been implemented (the counterfactual scenario).15-17 Historically, the most common application of these methods has been to examine the effect of a program on its ultimate outcome of interest such as HIV incidence for prevention or survival for care and treatment programs. However, these methods can also be used on an ongoing basis to assess whether a program is on track by assessing intermediary outcomes that can be causally attributed to the program of interest and to assess the comparative efficiencies and cost-effectiveness of different programs (Table 3).
Randomized experimental designs are often considered the most rigorous methods for impact evaluation because random allocation of the treatment to individuals or communities reduces or eliminates selection bias, so that observed differences in outcomes can be attributed to the program.18 Randomization can often be achieved through “smart implementation” without the enormous costs and levels of monitoring necessary in a randomized controlled clinical trial of the kind used to support regulatory approval of a drug. The fundamental premise underlying randomized approaches to implementation is to concentrate implementation in a few sites to start (preferably selected randomly) and then to phase in other sites over time (eg, a “stepped wedge”).19,20 This approach is in contrast to simultaneous implementation across many sites and districts and capitalizes on the logistic and fiscal realities that usually make a widespread simultaneous implementation approach difficult. Because the timing of roll-out is randomized across sites, sites that have not yet received the program serve as a comparison; however, all eligible sites eventually receive the program, ensuring equity.
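A phased roll-out of this kind can be sketched in a few lines of Python. The example below randomly orders sites and assigns each to the step at which it starts the program, then builds the resulting exposure pattern. The site names, the number of steps, and the balanced allocation rule are illustrative assumptions, not a description of any PEPFAR roll-out.

```python
import random


def stepped_wedge_schedule(sites, n_steps, seed=None):
    """Randomly assign each site to the step (1..n_steps) at which it starts."""
    if n_steps < 1 or n_steps > len(sites):
        raise ValueError("n_steps must be between 1 and the number of sites")
    rng = random.Random(seed)
    order = list(sites)
    rng.shuffle(order)  # the random order determines when each site starts
    # Deal the shuffled sites across steps so the steps are as balanced as possible.
    return {site: index % n_steps + 1 for index, site in enumerate(order)}


def exposure_matrix(schedule, n_steps):
    """Map each site to a list over periods 0..n_steps: 1 if the program is running.

    Period 0 is a baseline in which no site has the program; by the final
    period every site has it.
    """
    return {
        site: [1 if period >= start else 0 for period in range(n_steps + 1)]
        for site, start in schedule.items()
    }


sites = ["site_1", "site_2", "site_3", "site_4", "site_5", "site_6"]
schedule = stepped_wedge_schedule(sites, n_steps=3, seed=2010)
for site, row in sorted(exposure_matrix(schedule, 3).items()):
    print(site, row)
```

In each period, the sites whose rows still show 0 act as the comparison group for the sites that have already switched to 1.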
In addition to randomization, the impact evaluation tool kit includes quasiexperimental methods that generate a valid counterfactual without random allocation through use of statistical methods that permit causal attribution of outcomes to the program.15,16 Modeling can also be used to integrate multiple data sources and permit inferences about impact, especially when empiric data on outcomes are limited or nonexistent.21,22 Because such methods often draw on existing data such as demographic surveys, they allow for cross-sectional or retrospective evaluations, and they are often quicker and cheaper than experimental designs. However, they can be more analytically intensive and have a greater chance of suffering from both selection bias and violation of statistical assumptions than do experimental designs.15
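One widely used quasiexperimental estimator is difference-in-differences, which the text above does not name and which is shown here only as an example of the family. It constructs the counterfactual from the change observed in a comparison group, assuming that both groups would have followed parallel trends without the program. The Python sketch below uses hypothetical district-level figures for illustration.

```python
from statistics import mean


def difference_in_differences(program_before, program_after,
                              comparison_before, comparison_after):
    """Estimate program impact as the change in program areas minus the
    change in comparison areas.

    Valid only under the parallel-trends assumption: without the program,
    both groups would have changed by the same amount.
    """
    program_change = mean(program_after) - mean(program_before)
    comparison_change = mean(comparison_after) - mean(comparison_before)
    return program_change - comparison_change


# Hypothetical HIV testing uptake (percent) by district, for illustration only.
program_before = [38, 42, 40]
program_after = [63, 67, 65]
comparison_before = [41, 43, 42]
comparison_after = [51, 53, 52]

estimate = difference_in_differences(
    program_before, program_after, comparison_before, comparison_after
)
print(estimate)
```

If the comparison districts differ systematically from the program districts in ways that affect their trends, the estimate is biased, which is the selection bias and assumption violation noted above.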
There are numerous examples of successful impact evaluations.17,23,24 For example, stepped wedge designs have been used effectively in trials assessing the impact of hepatitis B vaccination,25 nevirapine to prevent mother-to-child transmission,26 education programs to improve adherence to HIV treatment,27 and tuberculosis screening and treatment in HIV-positive men.28 In addition, the innovative methods of adaptive clinical trial designs in combination with such approaches hold promise as a way to assess multicomponent, combination HIV prevention programs.29
Application of the Implementation Science Framework to PEPFAR Programs and Research
Applying an IS framework to PEPFAR programs will sharpen our ability to support partner countries in choosing strategic programs that provide the most benefit using the most efficient methods. To accomplish these goals, the entire IS continuum encompassing monitoring and evaluation, operations research, and impact evaluation should be unified into a coherent framework, in which results across the full spectrum can support successful implementation of critical programs. All PEPFAR programs must be subject to the most rigorous IS methods and evaluations that are appropriate and feasible. Impact evaluations should be prioritized for programs of unknown or uncertain efficacy, whereas all programs should be evaluated for opportunities to improve efficiencies through operations research. Similarly, M&E should be incorporated into all programs to determine the adequacy of program implementation, coverage, and outcomes.
Existing and Future PEPFAR Programs and Research
IS-based evaluations must be added as needed to existing programs, especially where there are questions about effectiveness or efficiency. For these programs, impact evaluations may be possible through slight changes in the program (eg, randomly adding an incentive program to increase testing or adherence) and/or additional data collection to create a de novo baseline with clear outcomes. In all cases, operations research can be used to improve program efficiency, and M&E can monitor whether programs are implemented as intended.
It will also be essential for new programs to include evaluation plans designed and implemented in tandem with the program. Designs of new programs should consider the complete array of IS tools: rigorous impact evaluations (eg, stepped wedge evaluation designs incorporated into program roll-out), multiple models of service delivery and/or implementation to be explored through operations research, and a system to routinely monitor the program's inputs, outputs, and outcomes. As discussed, an IS approach incorporates strategic use of real-time data collection that permits ongoing program corrections. Programs can be improved on an ongoing basis using input from a variety of stakeholders and data collected from multiple levels of implementation, from inputs to impact. To gain further input into the adoption of the IS framework, the Office of the US Global AIDS Coordinator is forming a Scientific Advisory Board, composed of leading academic, nongovernmental, and US Government researchers, that will help shape IS priorities to maximize PEPFAR impact (potential examples are listed in Table 4).
PEPFAR has unequivocally had enormous beneficial impact. The scale and urgency of the program as developed were commensurate with the scale and urgency of the epidemic and resulted in millions of lives saved and infections averted. In the next phase of PEPFAR, emphasis must also be placed on the development and contribution of knowledge about HIV/AIDS program implementation to the global community. An IS framework will permit identification of high-priority implementation questions and development of tools with which to answer them. Acceleration of this work will permit PEPFAR to support strategic interventions and focus them where they will have the most impact while simultaneously improving implementation efficiency and sustainability.
We thank Drs Stefano Bertozzi, Nicholas Jewell, Margaret Brandeau, and Brian Lewis for helpful critiques of an early draft of this paper.
1. The Global Fund to Fight AIDS Tuberculosis and Malaria, United States Agency for International Development (USAID), World Health Organization (WHO), Special Program for Research and Training in Tropical Diseases, Joint United Nations Program on HIV/AIDS (UNAIDS), The World Bank. Framework for Operations and Implementation Research in Health and Disease Control Programs
2. Madon T, Hofman KJ, Kupfer L, et al. Public health. Implementation science. Science
3. Jackson DJ, Chopra M, Doherty TM, et al. Operational effectiveness and 36 week HIV-free survival in the South African programme to prevent mother-to-child transmission of HIV-1. AIDS
4. Hirschhorn LR, Ojikutu B, Rodriguez W. Research for change: using implementation research to strengthen HIV care and treatment scale-up in resource-limited settings. J Infect Dis. 2007;196(Suppl 3):S516-S522.
5. Joint United Nations Program on HIV/AIDS (UNAIDS), The World Bank. Monitoring and Evaluation Operations Manual. Geneva: National AIDS Councils; 2002.
6. The US President's Emergency Plan for AIDS Relief. 2009 Annual Report to Congress on PEPFAR Program Results. Washington, DC; 2010.
7. World Health Organization. Expanding Capacity for Operations Research in Reproductive Health: Summary Report of a Consultative Meeting. Geneva: WHO; 2003.
8. Xiong W, Hupert N, Hollingsworth EB, et al. Can modeling of HIV treatment processes improve outcomes? Capitalizing on an operations research approach to the global pandemic. BMC Health Serv Res
9. Sloot PMA, Ivanov SV, Boukhanovsky AV, et al. Stochastic simulation of HIV population dynamics through complex network modelling. International Journal of Computer Mathematics
10. Caulkins JP, Kaplan EH, Lurie P, et al. Can difficult-to-reuse syringes reduce the spread of HIV among injection drug users? Interfaces
11. Wilson DP, Blower SM. Designing equitable antiretroviral allocation strategies in resource-constrained countries. PLoS Med
12. Zaric GS, Brandeau ML. Optimal investment in a portfolio of HIV prevention programs. Med Decis Making
13. Bertolli J, Hu DJ, Nieburg P, et al. Decision analysis to guide choice of interventions to reduce mother-to-child transmission of HIV. AIDS
14. Stringer EM, Ekouevi DK, Coetzee D, et al. Coverage of nevirapine-based services to prevent mother-to-child HIV transmission in 4 African countries. JAMA
15. Baker JL. Evaluating the Impact of Development Projects on Poverty: A Handbook for Practitioners. Washington, DC: The World Bank; 2000.
16. Network of Networks for Impact Evaluation (NONIE). Impact Evaluations and Development. Washington, DC: NONIE; 2009.
17. Duflo E. Scaling Up and Evaluation. Annual World Bank Conference on Development Economics: The International Bank for Reconstruction and Development, The World Bank; 2004.
18. Duflo E, Kremer M. Use of Randomization in the Evaluation of Development Effectiveness. Conference on Evaluation and Development Effectiveness: World Bank Operations Evaluation Department (OED); 2003.
19. Hussey MA, Hughes JP. Design and analysis of stepped wedge cluster randomized trials. Contemp Clin Trials
20. Brown CA, Lilford RJ. The stepped wedge trial design: a systematic review. BMC Med Res Methodol
21. Hallett TB, Zaba B, Todd J, et al. Estimating incidence from prevalence in generalised HIV epidemics: methods and validation. PLoS Med
22. Hallett TB, Singh K, Smith JA, et al. Understanding the impact of male circumcision interventions on the spread of HIV in southern Africa. PLoS One
23. Parker SW, Teruel GM. Randomization and social program evaluation: the case of Progresa. Annals of the American Academy of Political and Social Science
24. Arifeen SE, Hoque DM, Akter T, et al. Effect of the integrated management of childhood illness strategy on childhood mortality and nutrition in a rural area in Bangladesh: a cluster randomised trial. Lancet
25. Hall AJ, Inskip HM, Loik F, et al. The Gambia Hepatitis Intervention Study. Cancer Res
26. Hughes J, Goldenberg RL, Wilfert CM, et al. Design of the HIV Prevention Trials Network (HPTN) Protocol 054: a cluster randomized crossover trial to evaluate combined access to nevirapine in developing countries. UW Biostatistics Working Paper Series. Working Paper 195. Seattle: The Berkeley Electronic Press; 2003.
27. Fairley CK, Levy R, Rayner CR, et al. Randomized trial of an adherence programme for clients with HIV. Int J STD AIDS
28. Grant AD, Charalambous S, Fielding KL, et al. Effect of routine isoniazid preventive therapy on tuberculosis incidence among HIV-infected men in South Africa: a novel randomized incremental recruitment study. JAMA
29. Chow S, Chang M. Adaptive Design Methods in Clinical Trials. Boca Raton, FL: Chapman & Hall/CRC; 2006.
30. Stringer EM, Chi BH, Chintu N, et al. Monitoring effectiveness of programmes to prevent mother-to-child HIV transmission in lower-income countries. Bull World Health Organ