Although not all research questions require a multisite approach, multisite trials are advantageous for studies aimed at learning about diverse experiences, estimating treatment by site interaction effects, and examining different ways of organizing care or service delivery (Weinberger et al., 2001). Multisite trials are well suited for research questions faced in nursing, public health, community settings, and translational science (e.g., What interventions are best for subpopulations of people, and how well will interventions work in actual community settings?; Lebowitz, Vitiello, & Norquist, 2003). For example, the Braden Scale, a globally used tool in nursing homes and hospitals developed to determine patient risk for pressure ulcers, was tested in a nurse-led multisite trial (Bergstrom, Braden, Kemp, Champagne, & Rudy, 1996) recognized by the National Institute of Nursing Research as one of the 10 Landmark Nursing Research Studies (National Institute of Nursing Research, 2006). The many advantages of multisite studies include the potential to recruit a large sample in less time and increase the generalizability and external validity of study results (Flynn, 2009). Multisite studies allow for subject recruitment from broader and more diverse sampling pools or geographic regions, enhancing generalizability (Flynn, 2009). Furthermore, recruiting participants over a shorter period reduces contamination effects and enhances fidelity to study procedures (Borrelli, 2011). Given the benefits to scientific rigor, reproducibility, and design, findings from multisite studies are more likely to provide evidence to transform clinical practice and influence policy.
However, despite the many benefits to conducting multisite studies, challenges exist. Advanced statistical methods are often needed to adjust for differences and the inherent complexity of sampling. Implementing protocols can be challenging, with implications for data integrity, communication, reliability, and cost. Standardization across sites may be difficult due to lack of uniformity in clinical practices, institutional traditions and routines of care, and differing clinical privileges for research staff. Hiring and training of project staff at each site may be daunting. “Intervention drift,” the gradual deviation from the intervention protocol, may occur, as well as competition from other studies and “politics” among the sites.
Although multisite studies may be less costly due to faster accrual of participants and reduced recruitment time, obtaining approval from multiple institutional review boards (IRBs) and other committees commonly takes longer, often slowing down the research process. Study-related costs should account for the geographic distance between sites; distance may increase mileage reimbursement and travel time while decreasing accessibility or timely responsiveness. Telecommunication among and across sites can be expensive; these costs must be carefully planned and factored into the study budget. Internet technologies, such as e-mail listservs, bulletin board systems, or other multimedia systems, are lower-cost options for communication (Lindquist, Treat-Jacobson, & Watanuki, 2000). Complex subcontracts are frequently used, and sometimes high indirect costs are charged by institutions. These added costs need to be considered while building the budget for the study.
Because the challenges to conducting multisite studies are often daunting, addressing these issues a priori results in a more efficient and effective study. This article provides a summary of articles on multisite trials conducted within the past 10 years to uncover common challenges in conducting these trials. To enrich the context, exemplars from the authors are included. Based on literature and experience, strategies to combat challenges are summarized; some strategies may be more applicable to specific types of trials. The strategies for addressing challenges related to multisite studies include site selection, epicenter/coordinating centers, hiring/managing staff, fidelity monitoring, human subjects review/IRBs, statistical considerations, and authorship considerations. Exemplars are discussed from three of Dr. Duffy’s studies (two smoking cessation studies, of which one was a randomized controlled trial [RCT] and the other was a quasi-experimental longitudinal study of head and neck cancer patients; Duffy, Ronis, Ewing, et al., 2016; Duffy, Ronis, Karvonen-Gutierrez, et al., 2016; Duffy et al., 2006, 2007, 2012). Additional community-based exemplars are discussed from one of Dr. Smith’s studies, which was a school-based group RCT of a physical activity intervention targeting Appalachian adolescents (Petosa & Smith, 2017; Smith & Petosa, 2016).
When selecting sites, one must consider potential accrual rates, as well as patient demographics and facility characteristics. For example, in a head and neck cancer study, a university hospital lacked racial diversity but had more women, whereas Veterans Affairs sites had more racial diversity but fewer women (Duffy et al., 2006). In a smoking study, the quasi-experimental design allowed for matching intervention and control sites based on size and numbers of minorities (Appendix A; Duffy, Ronis, Ewing, et al., 2016; Duffy, Ronis, Karvonen-Gutierrez, et al., 2016; Duffy et al., 2012). Obtaining numbers of staff and patients and demographic data a priori from clinical sites can be time-consuming. Solutions include using annual reports or other routine reporting systems (e.g., cancer registries). See Appendix B for the CONSORT statement for the multisite study.
The geographic location of the study sites is also a consideration. Sites closer to the flagship site are often easier to manage due to easier ability of investigators to make face-to-face visits and manage problems, but sites farther away may offer more generalizability. For example, an RCT enrolled smokers from Michigan, Florida, and Texas to represent a variety of regions of the country (Duffy et al., 2006). However, another smoking study included six hospitals in Michigan to integrate a smoking cessation intervention with a hospital system (Duffy, Ronis, Ewing, et al., 2016; Duffy, Ronis, Karvonen-Gutierrez, et al., 2016; Duffy et al., 2012). Depending on the goal and objectives of the study, it may be advantageous to select sites that amplify differences that may exist between types of cases (Audet & d’Amboise, 2001).
Every site must have a site principal investigator (PI); in many cases, this is a hospital or community member. There is little incentive for site investigators to participate as research often becomes an added responsibility to their already high workloads. Therefore, incentives need to be carefully considered. For example, offering participants something that the site sees as valuable or offering something of value to staff such as continuing education credits may be beneficial. Staff nurses can be involved in the research because this involvement provides evidence needed for certain certifications (e.g., Magnet designation). Negotiating with leadership to allow personnel involved in the study the opportunity to use research involvement for promotion and offering authorship to site PIs are additional incentives to consider.
Site selection also depends on a site’s ability to follow good clinical practice guidelines and adhere strictly to the study protocol. The availability of qualified and cooperative site personnel is a practical consideration of site selection (Lindquist et al., 2000). In Duffy’s quasi-experimental smoking study, one site had problems with research assistants that precluded the site from contributing data; this site could not be replaced (Duffy, Ronis, Ewing, et al., 2016; Duffy, Ronis, Karvonen-Gutierrez, et al., 2016; Duffy et al., 2012). To avoid pitfalls such as this, early engagement and involvement of leaders from each site is key to building trust and rapport (Duffy et al., 2012; Fenlon et al., 2013), retaining participants (Fiss et al., 2010), and creating a shared commitment to the completion and success of the project (Flynn, 2009).
Although time intensive, the most effective way to engage sites is through personal outreach and face-to-face contact throughout all phases of a study. Hiring study personnel local to the community or who have commonalities with the target population is recommended. For example, in the community-based school study focused on mitigating obesity in Appalachia (Smith & Petosa, 2016), the site PI grew up in Appalachia and thus gained trust more easily and developed rapport with the target population quickly. Ongoing and regular communication between team members and individual participants is essential. Site leaders and participants require clear and regular updates on study progress and expectations (Fenlon et al., 2013). Internet technologies, such as e-mail listservs, bulletin board systems, or other multimedia systems, are options for cost-effective communication (Lindquist et al., 2000). Channels of communication may vary in ways that fit the needs of each individual site. For instance, in some rural and remote Appalachian sites, Internet access is poor and website availability is limited. Therefore, participants and gatekeepers prefer “face-to-face meetings,” paper newsletters for updates, and personal communication (Petosa & Smith, 2017).
Although a lead PI assumes chief responsibility for the project, a site PI assumes responsibility for her or his site. The funding source for multisite studies may designate an epicenter or coordinating center used to plan, organize, and implement the entire study (Flynn, 2009). Epicenters fulfill many study needs, including preparing (or finalizing) study protocols, selecting clinical enrollment sites, conducting staff training, providing clinical research records storage, preparing study-related medications, providing software for data entry, randomizing sites or subjects, ensuring quality of the database, monitoring protocol adherence, ensuring the safety of the study participants, and arranging for data analyses and manuscript preparation (Fiss et al., 2010; Flynn, 2009; Lebowitz et al., 2003). Epicenters are often created by combining expertise from several sources, including contract research organizations. More recently, public companies offer epicenter functions for multisite trials; these are most often used by the pharmaceutical industry.
The Project Coordinator
When hiring a project coordinator, the main considerations are qualifications, personality, and fit with project needs. Qualifications include knowledge and experience with human studies procedures, patient recruitment, and data collection. Personality is also important because if the project coordinator cannot get along with study personnel, then site cooperation is in jeopardy. An important aspect of the project coordinator’s role is to maintain frequent contact with local site liaison persons to troubleshoot issues of concern regarding recruitment and retention and to ensure compliance with the study protocol. This close contact also builds relationships with the performance sites (Fiss et al., 2010). The project coordinator bears responsibility for fidelity oversight, monitoring compliance with approved protocols, human subjects regulations, and ethical considerations (Oermann et al., 2012). Other skills may include the ability to enhance patient recruitment and retention, data entry, simple data analyses, and scientific writing.
Research Personnel at the Performance Sites
Although selecting the research team may be somewhat time-consuming at the beginning, having the right personnel at each site will result in improved rapport, productivity, fidelity, and retention. Strategies include carefully written job descriptions that list essential tasks and duties for all personnel at each site location. Conducting semistructured interviews with potential candidates, verifying professional references, and allowing candidates to interact with the key study team at the site during the interview process are essential to uncover appropriate fit and rapport.
Several approaches may be used for hiring personnel for research and data collection. One approach is to use personnel already employed at sites and negotiate release time. Two problems may occur with this approach. First, personnel may make their priority the hospital or agency needs rather than those of the study, especially if they are involved in providing patient care. Thus, clarifying the time commitment needed for study activities is essential. Second, sites may offer the research team their most marginally performing employees, who may perform in a similar manner for the study. However, hiring existing staff can be helpful, as these staff know the institution or agency system and may already have a relationship with participants. It is helpful to identify staff who are likely to “champion” the study and facilitate solutions to potential barriers at their sites (Fiss et al., 2010). For example, in a tobacco study, staff nurses were taught to deliver the intervention, and a nurse champion for smoking cessation was identified on every unit, which helped solidify the intervention on the units (Duffy, Karvonen-Gutierrez, Ewing, Smith, & Veterans Integrated Services Network, 2010).
A second approach for hiring personnel is to allow the sites to hire research staff specifically for the project, who then report to the site PI. A third approach is for the research team to hire and employ all staff in the sites, but the result may be that clinical privileges have to be obtained from sites for all new personnel. All of these approaches have strengths and weaknesses and can work depending on the study.
Participant recruitment is a crucial activity that research staff performs; failure to recruit is a common shortcoming in research studies. Therefore, the rapport the staff has with participants is an important consideration in hiring. Perceived recruiter characteristics have been found to facilitate recruitment, such as having a pleasant personality, being approachable, and sounding competent (Chang, Hendricks, Slawsky, & Locastro, 2004). How study personnel handle or prevent loss to follow-up is also an important consideration. Many studies keep the research staff in the sites focused on recruitment but use a separate staff that focuses only on follow-up and retention.
Duffy et al. (2015) describe the methods of fidelity monitoring across seven smoking cessation studies and provide examples of fidelity assessment tools that may be adapted and used as models for researchers conducting other intervention trials. The five main components of fidelity monitoring are well established: study design, training, delivery, receipt, and enactment (Borrelli, 2011; Borrelli et al., 2005).
The design of the study can influence fidelity in several ways. For instance, randomization at the hospital level may prevent cross-contamination that can occur by randomizing at the unit level. In Smith’s school-based study, two methods of randomization were considered (Petosa & Smith, 2017). The first approach was to identify schools that met study criteria and contact those schools for possible inclusion/randomization. With this approach, schools could refuse to participate after the randomization assignment was made because of feasibility concerns based on assigned condition. A second approach was to overidentify schools that met eligibility criteria, apply randomization procedures, and then recruit schools based on assigned condition. With this approach, the uncertainty of study condition and feasibility was removed from consideration as school personnel can make better informed decisions knowing what protocols and procedures are required for participation in the study. This randomization approach improved study fidelity and reduced site attrition rates by allowing sites to know a priori to which intervention arm they were assigned (Petosa & Smith, 2017).
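The second randomization approach described above (over-identify eligible sites, assign conditions up front, then recruit with assignments known) can be sketched as a simple routine. This is an illustrative sketch only; the function name, seed, and site labels are hypothetical and not drawn from the Petosa and Smith protocol.

```python
import random

def assign_conditions(eligible_sites, seed=2024):
    """Randomize an over-identified pool of eligible sites to conditions
    up front, so each site can be recruited already knowing its arm."""
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible/auditable
    sites = list(eligible_sites)
    rng.shuffle(sites)
    half = len(sites) // 2
    return {site: ("intervention" if i < half else "control")
            for i, site in enumerate(sites)}

# Hypothetical usage: 12 candidate schools, assigned before recruitment.
assignments = assign_conditions([f"school_{i}" for i in range(12)])
```

Because the assignment is fixed before recruitment, a site that declines can simply be replaced by another site from the same arm's pool, rather than unbalancing the design.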
Standardizing training of research personnel, including interventionists and data collectors, is necessary to ensure continuity among sites. In the smoking cessation studies, a standardized presentation was developed to train staff nurses, and a Tobacco Tactics manual was developed to deliver the intervention to patients (Ewing, Karvonen-Gutierrez, Noonan, & Duffy, 2012). Although the training should be the same across sites, it is helpful to incorporate a variety of teaching methods to accommodate different learning styles. These methods may include lectures, videos, and role play; discussion of case studies; and pretests and posttests. Although training research staff at a central site at a specific time is beneficial to maintaining consistency, this approach is not always possible (Oermann et al., 2012). For example, in Duffy et al.’s smoking studies (Duffy, Ronis, Ewing, et al., 2016; Duffy, Ronis, Karvonen-Gutierrez, et al., 2016), not all of the nurses could be relieved of duties from their units for training at the same time. In addition to group training, onsite training may be needed where supplementary hands-on preparation occurs (Oermann et al., 2012). As the study progresses, “booster sessions” may become necessary to prevent intervention drift. Training and booster sessions may be in person and/or online, and a hybrid of both often works well. For example, in the smoking studies, nurses were trained partly online and partly in face-to-face sessions. Booster sessions consisted of online questions and “huddles” (brief in-person meetings of staff during a shift) for immediate feedback (Duffy, Ronis, Ewing, et al., 2016; Yu, 2017). Standardization of the study protocol increases the chances that all research participants at each site will receive the same “treatment” regardless of who is implementing the protocol.
Once staff are trained, fidelity checks are made by the research team using standardized checklists or videotaping to determine if the intervention is delivered as intended in all sites. For example, in a 20-school, group RCT conducted by Smith and Petosa (2018), fidelity was assessed via two methods. First, a research assistant observed at least half of all conducted sessions (intervention and control) and completed a fidelity assessment form for each observed session. The project director reviewed the forms for intervention fidelity. If implementation concerns were noted, then the PIs were consulted and program retraining was planned for the interventionists. Second, 20% of the sessions were videotaped and reviewed for fidelity adherence by the PIs, who found that recording 20% of the sessions allowed for adequate adherence assessment without introducing bias or being intrusive. Finally, at the conclusion of the intervention, cognitive interviews were conducted with a subset of the interventionists to assess their perceived barriers and satisfaction with the program.
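Selecting which sessions to videotape for review can be done by simple random sampling of a fixed fraction, as in the 20% approach above. A minimal sketch, assuming session identifiers are known in advance (the function name and seed are illustrative):

```python
import random

def sample_sessions_for_review(session_ids, fraction=0.2, seed=7):
    """Randomly select a fixed fraction of sessions for fidelity review."""
    rng = random.Random(seed)            # fixed seed so the audit trail is reproducible
    ids = list(session_ids)
    k = max(1, round(fraction * len(ids)))  # review at least one session
    return sorted(rng.sample(ids, k))
```

Sampling sessions at random, rather than letting interventionists choose, guards against review of only the best-delivered sessions.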
Even if delivered as intended, it is necessary to determine whether the participant received the intervention as intended. As described in the Smith, Petosa, and Shoben (2018) school-based study, videotaping a percentage of the sessions and conducting interviews with a subset of the participants at the end of the program provided rich information from the participant perspective at each site, as well as objective data from the videotaping. These were supplemented with the observational fidelity assessments completed at each site. Last, it is necessary to determine if the intervention was enacted by the participant as intended and if not, why not. For instance, a smoking cessation intervention may encourage participants to obtain nicotine patches. However, in some low-income neighborhoods, patches are not commonly sold in drug stores. Knowing this can assist with identifying barriers to implementation and prevent reporting intervention failure when in fact the intervention was not implemented as designed.
Human Subjects Review/IRBs
In the past, most multisite studies required IRB review from each site. This led to extensive human resources needed for the completion of protocols, modifications, amendments, continuing reviews, and study termination reports for each performance site; variation in procedures across sites; and restrictions on the type and method of data sharing between participating sites (Check, Weinfurt, Dombeck, Kramer, & Flynn, 2013; Fiss et al., 2010; Flynn, 2009; Winkler, Witte, & Bierer, 2015). Beginning in the mid-1990s, cooperative agreements or integrated IRB agreements have facilitated easier human subjects review (Thornquist, Edelstein, Goodman, & Omenn, 2002). All study-related reviews, renewals, amendments, and reports are submitted to the agreed-upon review body among participating sites. Such agreements reduce IRB approval times, improve study oversight, strengthen fidelity, and improve team accountability to operating procedures or protocols (Winkler et al., 2015). The Department of Veterans Affairs has a central IRB for multisite studies, although individual sites are given an opportunity to voice concerns prior to final approval (IRB, 2017).
Recently, the National Institutes of Health (NIH) published NOT-OD 17-075: The Final Policy on Use of a Single Institutional Review Board for Multisite Research (NIH, 2016). Although there are some exceptions to the policy, the NIH requires the use of a single or designated IRB for multisite research studies conducted in the United States that are funded by NIH starting in 2020. Eliminating duplicative IRB reviews is expected to reduce unnecessary administrative burdens and systemic inefficiencies, without diminishing human subjects’ protections (NIH, 2016).
A multisite study allows for recruitment of greater numbers of subjects within a given study period compared to a single-site study and thus has the advantage of improved statistical power. Sample size calculation for a multisite trial requires more careful thought than simply splitting the required sample size calculated for a single-site trial across sites. Key design considerations include the number of participants per site and the number of sites needed to achieve sufficient power while using resources optimally within and across sites. Under the optimal design formula (Raudenbush & Liu, 2000), fewer sites are needed for an efficacy trial to test a main treatment effect of a specified effect size with (a) larger optimal sample size per site, (b) smaller treatment-by-site variance, and (c) greater cost ratio of sampling sites relative to sampling within sites. However, for an effectiveness trial aiming to estimate treatment-by-site variance, the number of sites is more important than sample size per site. In other words, the main treatment effect is more important for a trial focusing on efficacy, whereas treatment-by-site variance is more important for a trial focusing on effectiveness (Kraemer, 2000). Other considerations only briefly described here include reductions in statistical power from planned subgroup analyses and data clustering and, conversely, improved statistical power with covariate adjustment. It is important to note that statistical power depends on sample size and is determined for each analysis prior to study initiation. A component of a properly planned study is the determination of a sample size that will be sufficient to address the research questions with adequate power for all planned tests and subgroup analyses.
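The trade-off between the number of sites and the sample per site can be illustrated with a normal-approximation power calculation for a balanced multisite trial with within-site randomization. The variance expression follows the standard two-level formulation (treatment-by-site variance plus four times the within-site variance over n, divided by the number of sites), with outcomes standardized so the within-site variance is 1. This is a simplified sketch, not the full optimal design machinery of Raudenbush and Liu (2000), and the function name is illustrative:

```python
from math import sqrt
from statistics import NormalDist

def multisite_power(n_sites, n_per_site, effect_size,
                    tx_by_site_var, alpha=0.05):
    """Approximate power for the main treatment effect in a balanced
    multisite trial with within-site randomization (standardized outcome)."""
    # Variance of the estimated main treatment effect across J sites:
    var = (tx_by_site_var + 4.0 / n_per_site) / n_sites
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect_size / sqrt(var) - z_crit)
```

Whenever treatment-by-site variance is nonzero, spreading the same total sample over more sites yields higher power than concentrating it in fewer sites, consistent with the guidance above for effectiveness trials: for example, multisite_power(20, 20, 0.3, 0.05) exceeds multisite_power(10, 40, 0.3, 0.05) even though both designs enroll 400 participants.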
Weighing the importance of efficacy versus effectiveness also carries important implications for sampling strategies in multisite trials. A homogenous sample is more desirable for a trial focusing on efficacy. Most pharmaceutical trials fit into this category, especially Phase III trials and those focused on relatively rare conditions. A multisite design is used in these trials to facilitate subject recruitment.
On the other hand, a heterogeneous sample offers greater generalizability for a trial focusing on effectiveness, which generally has had several efficacy studies preceding it (Weinberger et al., 2001). Multisite effectiveness studies generally have less stringent inclusion and exclusion criteria, allowing recruitment of adequate numbers of subjects who represent a diverse patient population with a wide range of sociodemographics, illness severities, and medical complexities. The advantages of a heterogeneous sample in multisite trials include (a) study findings are more generalizable with enhanced external validity, (b) results have greater potential to impact clinical practices or policies, and (c) larger sample sizes provide more diversity and greater possibilities for subgroup analysis to generate evidence for improved precision in care delivery.
Because treatment effect may differ from one site to another, analyzing site differences should be incorporated in the statistical analysis plan (Feaster, Mikulich-Gilbertson, & Brincks, 2011). Two approaches are often used for analyzing site differences: testing treatment by site interaction and site-stratified analysis. To avoid pitfalls, results from either approach need to be interpreted carefully. First, a statistically nonsignificant treatment by site interaction does not necessarily exclude the existence of site differences. A much larger sample size is typically needed to have sufficient power to detect a significant treatment by site interaction due to a greater standard error for an interaction effect (vs. that for a main effect). The lack of power is also true for site-stratified analysis due to small sample sizes in each site. Researchers often are advised to rely on the effect size for tests known to be underpowered. A cautionary note is that sites with the same treatment effect sizes do not necessarily exclude site differences. The site main effect needs to be examined in addition to site by treatment interactions to fully capture the site differences. Thus, examining the distribution of outcome measures for each trial arm by sites is recommended prior to applying advanced methods to model site differences (e.g., random effect modeling; Feaster et al., 2011).
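The larger standard error for an interaction can be seen in the simplest balanced case: two sites, two arms, equal cell sizes, and a common within-cell standard deviation. Each site-specific treatment effect is a difference of two cell means; the main effect averages the two site effects, while the interaction is their difference, which doubles the standard error (so roughly four times the sample is needed for equal power). A minimal sketch under these simplifying assumptions:

```python
from math import sqrt

def main_vs_interaction_se(sigma, n_per_cell):
    """Standard errors of the main treatment effect and the
    treatment-by-site interaction in a 2-site, 2-arm balanced design."""
    # Each site's mean difference has Var = 2*sigma^2 / n.
    var_site_effect = 2 * sigma**2 / n_per_cell
    se_main = sqrt(var_site_effect / 2)        # average of two independent site effects
    se_interaction = sqrt(2 * var_site_effect)  # difference of the two site effects
    return se_main, se_interaction
```

The ratio se_interaction / se_main is exactly 2 in this design, which is why a nonsignificant interaction test alone cannot rule out site differences.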
Clustering exists in multisite trials as participants are recruited and randomized within sites. Intraclass correlation (ICC), the proportion of total variance accounted for by clustering, quantifies the within-site clustering among observations. In multisite trials, clustering among participants within a site may occur from two possible sources. First, participants within a site may be more alike than participants across sites. For example, in the Appalachian obesity mitigation study (Smith et al., 2018), participants within a school likely come from the same neighborhood and socioeconomic status. In addition, site-level factors (e.g., school lunch program) may exert an influence on outcomes of participants. When such sources of clustering are not accounted for in the analysis, the standard error for treatment effect will be inflated by a factor of (1 − ICC)^(−1/2), resulting in reduction in power (Parzen, Lipsitz, & Dear, 1998). Therefore, it is important to identify potential sources of clustering (individual- or site-level characteristics) during the trial planning, so that they can be adequately adjusted for in the statistical analysis.
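The cost of clustering can be quantified at the planning stage with the familiar design-effect heuristic, 1 + (m − 1) × ICC for an average cluster size of m. Note that this is the standard approximation for clustered designs generally, offered here as an illustration rather than the exact expression derived in Parzen et al. (1998):

```python
def design_effect(avg_cluster_size, icc):
    """Variance inflation due to within-site clustering:
    DEFF = 1 + (m - 1) * ICC."""
    return 1 + (avg_cluster_size - 1) * icc

def effective_sample_size(n_total, avg_cluster_size, icc):
    """Number of independent observations a clustered sample is worth."""
    return n_total / design_effect(avg_cluster_size, icc)
```

Even a small ICC matters at realistic cluster sizes: with 50 students per school and an ICC of 0.02, the design effect is 1.98, so 1,000 clustered participants carry roughly the information of 505 independent ones.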
Other prerandomization or postrandomization clustering should be taken into account. For instance, in the quasi-experimental smoking study, patients were clustered on units within hospitals (Duffy et al., 2012). A typical example of postrandomization clustering is from repeated-measures trials in which longitudinal data are collected at multiple time points. An often-neglected form of postrandomization clustering occurs if treatments are assigned with different probabilities across clusters (e.g., therapists are more likely to treat patients assigned to one specific arm). Ignoring such clustering will underestimate the standard error of treatment effect and thus increase the chance of Type I error (a false positive, or identification of a significant treatment effect when in fact it was not significant; Kahan & Morris, 2013).
Large-scale multisite implementation trials often employ a random sampling strategy. Advantages include operational efficiency and accurate representation of a larger target population. For example, data were collected from successive randomly selected individuals during the preimplementation and postimplementation periods in a community trial using the multistage cluster sampling approach. However, because clustering is inherent in such an approach, survey analysis methods are needed to account for the complexity of the sampling in order to obtain approximately unbiased estimators for the target population.
Testing for heterogeneous treatment effect to examine whether the treatment works uniformly across subpopulations is important to determine optimal treatment delivery, targeting individuals who stand to benefit most from the treatment while avoiding individuals for whom the treatment is harmful or not useful. For example, in the quasi-experimental smoking cessation study, subgroup analyses were conducted based on patient comorbidity. Although subgroup analysis is the most commonly used method for analyzing heterogeneous treatment effect, there has been considerable discussion about potential problems with this method (Lagakos, 2006; Wang, Lagakos, Ware, Hunter, & Drazen, 2007). First, stratified analysis by subgroups does not answer the question of whether the magnitude of benefits (or harm) differs significantly across subgroups. Second, subgroup analyses generally suffer from lack of power (Brookes et al., 2004). Third, failure to account for multiplicity leads to increased chance of false discovery. Therefore, a strong emphasis on planned subgroup analyses and multiplicity adjustment is necessary to ensure adequate recruitment of a sample with the desired characteristics and to prevent spurious discovery from data fishing (Lindquist et al., 2000). Guidelines have been proposed to standardize the reporting of subgroup analyses (Wang et al., 2007). Results should be based on tests for interaction along with subgroup-specific point estimates and corresponding confidence intervals and be interpreted as a plausible range of the heterogeneous treatment effect estimates (Lagakos, 2006). Despite potential pitfalls, subgroup analyses can provide valuable information when properly planned, reported, and interpreted.
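The multiplicity problem above can be made concrete: with k independent subgroup tests each conducted at level α, the familywise chance of at least one false positive is 1 − (1 − α)^k, and a Bonferroni-adjusted level of α/k pulls it back under α. A small illustrative calculation:

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    return 1 - (1 - alpha) ** n_tests
```

Ten unadjusted subgroup tests at α = 0.05 carry roughly a 40% chance of at least one spurious "discovery," whereas testing each subgroup at 0.05/10 holds the familywise rate below 0.05, at the cost of further reducing the already limited power of each test.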
Authorship in Multisite Studies
Although many articles have been written on authorship guidelines, there is still much variation by discipline and journal due to varying requirements; multisite studies add another layer of complication (Fiss et al., 2010; Oermann et al., 2012). For example, almost all of the authorship guidelines suggest that the authors participate in the writing of the article. Yet many clinical personnel who were essential in conducting the study do not have the time or expertise to write the articles. Excluding study site members as authors can lead to hard feelings and preclude further research in these sites. Another challenge is that multisite studies involve many research personnel, and some journals have limits on the number of authors.
One option is to give authorship to the “main implementer” at each site. Although this may not totally comply with some authorship guidelines, the study could not have been conducted without these people. Another option is group, collaborative, corporate, or collective authorship, which usually involves multicenter study investigators, members of working groups, and official or self-appointed expert boards, panels, or committees. These groups can comprise hundreds of participants and often represent complex, multidisciplinary collaborations. Although everyone involved in a group may not actually write or author the article, group authorship credit should be based on (a) substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting the article or revising it critically for important intellectual content; and (c) final approval of the version to be published (National Library of Medicine Technical Bulletin, 2008). In many cases, naming the study personnel in the acknowledgement section of the article is sufficient. See Appendix C for examples of authorship and acknowledgments in multisite studies.
With multisite studies, decisions about authorship and manuscript submission should be made at the outset to avoid conflict among members of the team (Fiss et al., 2010; Oermann et al., 2012). Consensus must be reached regarding whether and how the newly produced evidence will be translated into practice (Flynn, 2009). All authors and persons acknowledged need to read and approve the manuscript prior to submission.
Overall, multisite studies offer numerous benefits to both the researcher and science. They yield larger, more diverse samples and may expedite recruitment. Although conducting multisite studies poses challenges, viable solutions exist. Planning should begin early, and communication among key study members should be a priority. These recommendations can help new investigators avert problems a priori, enabling more seamless implementation of multisite studies and more rigorous results.
References
Audet J., & d'Amboise G. (2001). The multi-site study: An innovative research methodology. Qualitative Report, 6, 1–18.
Bergstrom N., Braden B., Kemp M., Champagne M., & Rudy E. (1996). Multi-site study of incidence of pressure ulcers and the relationship between risk level, demographic characteristics, diagnoses, and prescription of preventive interventions. Journal of the American Geriatrics Society, 44, 22–30. doi:10.1111/j.1532-5415.1996.tb05633.x
Borrelli B. (2011). The assessment, monitoring, and enhancement of treatment fidelity in public health clinical trials. Journal of Public Health Dentistry, 71, S52–S63. doi:10.1111/j.1752-7325.2011.00233.x
Borrelli B., Sepinwall D., Ernst D., Bellg A. J., Czajkowski S., Breger R., … Orwig D. (2005). A new tool to assess treatment fidelity and evaluation of treatment fidelity across 10 years of health behavior research. Journal of Consulting and Clinical Psychology, 73, 852–860. doi:10.1037/0022-006x.73.5.852
Brookes S. T., Whitely E., Egger M., Smith G. D., Mulheran P. A., & Peters T. J. (2004). Subgroup analyses in randomized trials: Risks of subgroup-specific analyses: Power and sample size for the interaction test. Journal of Clinical Epidemiology, 57, 229–236. doi:10.1016/j.jclinepi.2003.08.009
Chang B. H., Hendricks A. M., Slawsky M. T., & Locastro J. S. (2004). Patient recruitment to a randomized clinical trial of behavioral therapy for chronic heart failure. BMC Medical Research Methodology, 4, 8. doi:10.1186/1471-2288-4-8
Check D. K., Weinfurt K. P., Dombeck C. B., Kramer J. M., & Flynn K. E. (2013). Use of central institutional review boards for multicenter clinical trials in the United States: A review of the literature. Clinical Trials, 10, 560–567. doi:10.1177/1740774513484393
Duffy S. A., Cummins S. E., Fellows J. L., Harrington K. F., Kirby C., Rogers E., … Waltje A. H. (2015). Fidelity monitoring across the seven studies in the Consortium of Hospitals Advancing Research on Tobacco (CHART). Tobacco Induced Diseases, 13, 29. doi:10.1186/s12971-015-0056-5
Duffy S. A., Karvonen-Gutierrez C. A., Ewing L. A., Smith P. M.; Veterans Integrated Services Network (2010). Implementation of the Tobacco Tactics program in the Department of Veterans Affairs. Journal of General Internal Medicine, 25, 3–10. doi:10.1007/s11606-009-1075-9
Duffy S. A., Ronis D. L., Ewing L. A., Waltje A. H., Hall S. V., Thomas P. L., … Jordan N. (2016). Implementation of the Tobacco Tactics intervention versus usual care in Trinity Health community hospitals. Implementation Science, 11, 147. doi:10.1186/s13012-016-0511-6
Duffy S. A., Ronis D. L., Karvonen-Gutierrez C. A., Ewing L. A., Hall S. V., Yang J. J., … Gray D. (2016). Effectiveness of the Tobacco Tactics program in the Trinity Health system. American Journal of Preventive Medicine, 51, 551–565. doi:10.1016/j.amepre.2016.03.012
Duffy S. A., Ronis D. L., Titler M. G., Blow F. C., Jordan N., Thomas P. L., … Waltje A. H. (2012). Dissemination of the nurse-administered Tobacco Tactics intervention versus usual care in six Trinity community hospitals: Study protocol for a comparative effectiveness trial. Trials, 13, 125. doi:10.1186/1745-6215-13-125
Duffy S. A., Ronis D. L., Valenstein M., Fowler K. E., Lambert M. T., Bishop C., & Terrell J. E. (2007). Depressive symptoms, smoking, drinking, and quality of life among head and neck cancer patients. Psychosomatics, 48, 142–148. doi:10.1176/appi.psy.48.2.142
Duffy S. A., Ronis D. L., Valenstein M., Lambert M. T., Fowler K. E., Gregory L., … Terrell J. E. (2006). A tailored smoking, alcohol, and depression intervention for head and neck cancer patients. Cancer Epidemiology, Biomarkers & Prevention, 15, 2203–2208. doi:10.1158/1055-9965.EPI-05-0880
Ewing L. A., Karvonen-Gutierrez C. A., Noonan D., & Duffy S. A. (2012). Development of the Tobacco Tactics logo: From thumb prints to press. Tobacco Induced Diseases, 10, 6. doi:10.1186/1617-9625-10-6
Feaster D. J., Mikulich-Gilbertson S., & Brincks A. M. (2011). Modeling site effects in the design and analysis of multi-site trials. American Journal of Drug and Alcohol Abuse, 37, 383–391. doi:10.3109/00952990.2011.600386
Fenlon D., Seymour K. C., Okamoto I., Winter J., Richardson A., Addington-Hall J., … Foster C. (2013). Lessons learnt recruiting to a multi-site UK cohort study to explore recovery of health and well-being after colorectal cancer (CREW study). BMC Medical Research Methodology, 13, 153. doi:10.1186/1471-2288-13-153
Fiss A. L., McCoy S. W., Bartlett D. J., Chiarello L. A., Palisano R. J., Stoskopf B., … Wood A. (2010). Sharing of lessons learned from multisite research. Pediatric Physical Therapy, 22, 408–416. doi:10.1097/PEP.0b013e3181faeb11
Flynn L. (2009). The benefits and challenges of multisite studies: Lessons learned. AACN Advanced Critical Care, 20, 388–391. doi:10.1097/NCI.0b013e3181ac228a
Institutional Review Board (2017). VA central IRB communications with investigators, other study team members, and local participating sites. Washington, DC: VA Institutional Review Board for Multisite Studies.
Kahan B. C., & Morris T. P. (2013). Assessing potential sources of clustering in individually randomised trials. BMC Medical Research Methodology, 13, 58. doi:10.1186/1471-2288-13-58
Knecht L. S., & Nahin A. M. (2008). Study collaborators included in MEDLINE®/PubMed®. NLM Technical Bulletin.
Kraemer H. C. (2000). Pitfalls of multisite randomized clinical trials of efficacy and effectiveness. Schizophrenia Bulletin, 26, 533–541. doi:10.1093/oxfordjournals.schbul.a033474
Lagakos S. W. (2006). The challenge of subgroup analyses—Reporting without distorting. New England Journal of Medicine, 354, 1667–1669. doi:10.1056/NEJMp068070
Lebowitz B. D., Vitiello B., & Norquist G. S. (2003). Approaches to multisite clinical trials: The National Institute of Mental Health perspective. Schizophrenia Bulletin, 29, 7–13. doi:10.1093/oxfordjournals.schbul.a006992
Lindquist R., Treat-Jacobson D., & Watanuki S. (2000). A case for multisite studies in critical care. Heart & Lung, 29, 269–277. doi:10.1067/mhl.2000.106939
Oermann M. H., Hallmark B. F., Haus C., Kardong-Edgren S. E., McColgan J. K., & Rogers N. (2012). Conducting multisite research studies in nursing education: Brief practice of CPR skills as an exemplar. Journal of Nursing Education, 51, 23–28. doi:10.3928/01484834-20111130-05
Parzen M., Lipsitz S. R., & Dear K. B. G. (1998). Does clustering affect the usual test statistics of no treatment effect in a randomized clinical trial? Biometrical Journal, 40, 385–402. doi:10.1002/(SICI)1521-4036(199808)40:4<385::AID-BIMJ385>3.0.CO;2-#
Petosa R. L., & Smith L. (2017). Effective recruitment of schools for randomized clinical trials: Role of school nurses. Journal of School Nursing, 1059840517717592. doi:10.1177/1059840517717592
Raudenbush S. W., & Liu X. (2000). Statistical power and optimal design for multisite randomized trials. Psychological Methods, 5, 199–213.
Smith L. H., & Petosa R. L. (2016). Effective practices to improve recruitment, retention, and partnerships in school-based studies. Journal of Pediatric Health Care, 30, 495–498. doi:10.1016/j.pedhc.2016.05.004
Smith L. H., Petosa R. L., & Shoben A. (2018). Peer mentor versus teacher delivery of a physical activity program on the effects of BMI and daily activity: Protocol of a school-based group randomized controlled trial in Appalachia. BMC Public Health, 18, 633. doi:10.1186/s12889-018-5537-z
Thornquist M. D., Edelstein C., Goodman G. E., & Omenn G. S. (2002). Streamlining IRB review in multisite trials through single-study IRB Cooperative Agreements: Experience of the Beta-Carotene and Retinol Efficacy Trial (CARET). Controlled Clinical Trials, 23, 80–86. doi:10.1016/S0197-2456(01)00187-8
Wang R., Lagakos S. W., Ware J. H., Hunter D. J., & Drazen J. M. (2007). Statistics in medicine—Reporting of subgroup analyses in clinical trials. New England Journal of Medicine, 357, 2189–2194. doi:10.1056/NEJMsr077003
Weinberger M., Oddone E. Z., Henderson W. G., Smith D. M., Huey J., Giobbie-Hurder A., & Feussner J. R. (2001). Multisite randomized controlled trials in health services research: Scientific challenges and operational issues. Medical Care, 39, 627–634.
Winkler S. J., Witte E., & Bierer B. E. (2015). The Harvard Catalyst Common Reciprocal IRB Reliance Agreement: An innovative approach to multisite IRB review and oversight. Clinical and Translational Science, 8, 57–66. doi:10.1111/cts.12202
[Table: Example of Estimated Annual Recruitment for Multisite Effectiveness Study (Duffy et al., 2012)]
Appendix C
Authorship in Multisite Studies
Example 1: The site PIs from each participating institution were included as authors (Duffy, Ronis, Ewing, et al., 2016).
Duffy, S. A., Ronis, D. L., Ewing, L. A., Waltje, A. H., Hall, S. V., Thomas, P. L., Olree, C. M., Maguire, K. A., Friedman, L., Klotz, S., Jordan, N., & Landstrom, G. L. (2016, November). Implementation of the Tobacco Tactics intervention versus usual care in Trinity Health community hospitals. Implementation Science, 11, 147.
Example 2: Investigators who participated in the writing were main authors, whereas authorship was also given to the University of Michigan Head and Neck Cancer SPORE Team; team members who contributed intellectually were named at the end of the article (Duffy et al., 2007).
Duffy, S. A., Ronis, D. L., Valenstein, M., Fowler, K. E., Lambert, M., Bishop, C., Terrell, J. E., & University of Michigan Head and Neck Cancer Team. (2007). Depressive symptoms, smoking, drinking and quality of life among head and neck cancer patients. Psychosomatics, 48(2), 142–148.
The members of the University of Michigan Head and Neck Cancer Team, all of whom are authors of this article, are Carol R. Bradford, MD, Douglas B. Chepeha, MD, Mark E. Prince, MD, Theodoros N. Teknos, MD, and Gregory T. Wolf, MD.
Example 3: Investigators from each of the seven studies who participated in the writing were main authors, whereas authorship was also given to the consortium; consortium members were named at the end of the article (Duffy et al., 2015).
Duffy, S. A., Cummins, S., Harrington, K. F., Fellows, J. L., Kirby, C., Rogers, E., Scheuermann, T. S., Tindle, H. A., Waltje, A. H., & the Consortium of Hospitals to Advance Research on Tobacco (CHART). (2015, September). Fidelity monitoring across the seven studies in the Consortium of Hospitals to Advance Research on Tobacco (CHART). Tobacco Induced Diseases, 13, 29.
For the Consortium of Hospitals to Advance Research on Tobacco (CHART)
- University of Michigan Medical Center: David Ronis, PhD, and Lee Ewing, MPH
- University of California, San Diego: Shu-Hong H. Zhu, PhD
- University of Kansas: Kimberly Richter, PhD
- New York University: Scott Sherman, MD
- University of Alabama at Birmingham: William Bailey, MD
- Kaiser Permanente Center for Health Research: Lisa Waiwaiole, PhD
- Massachusetts General Hospital: Nancy Rigotti, MD
Example 4: Acknowledgments (Duffy et al., 2010).