Clinical Research

High Methodologic Quality But Poor Applicability: Assessment of the AAOS Guidelines Using The AGREE II Instrument

Sabharwal, Sanjeeve MBBS, MRCS, MSc; Patel, Nirav K. MBBS, MRCS, MSc; Gauher, Salman MBBS, BSc; Holloway, Ian MBBS, FRCS (Orth); Athanasiou, Thanos MD, PhD, FRCS, FETCS

Clinical Orthopaedics and Related Research: June 2014 - Volume 472 - Issue 6 - p 1982-1988
doi: 10.1007/s11999-014-3530-0



Clinical practice guidelines have been defined by the Institute of Medicine as “statements that include recommendations intended to optimize patient care that are informed by a systematic review of evidence and an assessment of the benefits and harms of alternative care options” [13]. Although it is well recognized that their appropriate use improves clinical practice [31], their importance and recommendations often are questionable owing to failings in their methodologic quality [7].

The American Academy of Orthopaedic Surgeons (AAOS) was founded in 1933 and is a globally recognized organization involved in the production of musculoskeletal and orthopaedic education. One of the key ways the AAOS provides clinician education is through a spectrum of clinical practice guidelines that are freely available on the AAOS web site (http://www.aaos.org/research/guidelines/guide.asp) [2] and further disseminated through accompanying review articles [16, 22, 27]. These guidelines sometimes have been criticized [19, 21]; however, to our knowledge, an evaluation of their methodologic quality has yet to be performed.

The AGREE II (Appraisal of Guidelines for Research and Evaluation II) instrument is a validated questionnaire that is used to assess the methodologic quality of clinical practice guidelines [5]. A systematic review of 24 different appraisal tools used to assess the methodologic quality of clinical practice guidelines reported that it was the most effective system for guideline assessment [29]. AGREE II covers six domains in guideline development. The scope and purpose domain examines whether the guidelines’ objectives and patient population are explicitly defined. Stakeholder involvement examines whether the clinical practice guideline was developed by relevant professionals, whether the opinions of the target population were sought, and whether the target guideline users are stated. Rigor of development evaluates the method of developing the recommendations, and clarity of presentation focuses on whether the recommendations are clear and unambiguous. Applicability scrutinizes whether there is recognition of barriers to implementing the recommendations and guidance on how healthcare professionals should overcome them. Finally, editorial independence investigates the funding of the guideline and potential conflicts of interest. The use of AGREE II as a means to critically appraise the quality of clinical practice guidelines is increasing in the scientific literature [15, 17]. Furthermore, it has been adopted by the WHO for assessing its clinical practice guidelines [30].

The objective of our study was to evaluate the currently available AAOS guidelines using the AGREE II instrument.

Materials and Methods

Information Sources and Eligibility Criteria

Clinical practice guidelines were identified from the AAOS web site [2] based on a search performed on August 2, 2013. We included all guidelines that provided recommendations on diagnosis of disease, preventive measures, therapeutic interventions, and those that focused on training, legal issues, epidemiology, and research methods.

Data Extraction and Assessment of Guideline Quality

The following descriptive information was extracted from each guideline: year of publication, number of recommendations, guideline focus, and size of the document.

The AGREE II instrument is a tool used to assess the methodologic quality of clinical practice guidelines. The assessor must respond to 23 questions using a scale of 1 for “strongly disagree” to 7 for “strongly agree” based on examples and instructions described in the AGREE II manual [5]. It uses six domains to assess guideline quality: scope and purpose of the guideline, stakeholder involvement, rigor of development, clarity of presentation, applicability, and editorial independence. Assessment of all the guidelines was performed independently by three of the authors (SS, NP, SG). The assessors were clinicians with experience in orthopaedics and healthcare improvement. Two of the assessors are orthopaedic surgeons and the third is a public health fellow who previously worked as an orthopaedic resident. They all completed the online AGREE II overview tutorial and practice exercise. A pilot test was performed on several cardiac clinical practice guidelines before evaluation of the AAOS guidelines.
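The scoring scheme described above can be illustrated with a short sketch. It assumes the item-to-domain layout and the scaled domain score formula given in the AGREE II manual [5] (obtained score minus the minimum possible score, divided by the range of possible scores). The function name and the ratings are hypothetical and are not data from this study.

```python
# Number of items in each AGREE II domain (23 items in total),
# assumed here from the layout in the AGREE II manual.
DOMAIN_ITEMS = {
    "scope and purpose": 3,
    "stakeholder involvement": 3,
    "rigor of development": 8,
    "clarity of presentation": 3,
    "applicability": 4,
    "editorial independence": 2,
}

MIN_RATING, MAX_RATING = 1, 7  # 1 = strongly disagree, 7 = strongly agree


def scaled_domain_score(ratings):
    """Return the scaled score (0-100%) for one domain of one guideline.

    `ratings` holds one list per assessor; each list contains that
    assessor's 1-7 ratings for every item in the domain.
    """
    if not ratings or not ratings[0]:
        raise ValueError("at least one assessor and one item are required")
    n_assessors = len(ratings)
    n_items = len(ratings[0])
    for row in ratings:
        if len(row) != n_items:
            raise ValueError("every assessor must rate the same items")
        if any(r < MIN_RATING or r > MAX_RATING for r in row):
            raise ValueError("ratings must lie between 1 and 7")

    obtained = sum(sum(row) for row in ratings)
    minimum = MIN_RATING * n_items * n_assessors
    maximum = MAX_RATING * n_items * n_assessors
    return 100.0 * (obtained - minimum) / (maximum - minimum)


# Hypothetical ratings from three assessors for the three
# scope-and-purpose items of a single guideline.
example = [[5, 6, 6], [6, 6, 7], [2, 4, 3]]
print(round(scaled_domain_score(example), 1))
```

Under this formula a domain rated 1 on every item scores 0% and a domain rated 7 on every item scores 100%, so the result is not the same as a simple ratio of the obtained score to the maximum score.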

Data Analysis

Descriptive and statistical analyses were performed using SPSS 20.0 (SPSS Inc, Chicago, IL, USA). The distribution of the data was determined using a Kolmogorov-Smirnov test, and interrater reliability between the assessors was examined using Spearman's rho, comparing all the domain results. Descriptive statistics for individual and overall AAOS guideline performance were derived from the mean of the three assessors' scores for each question. An overall average score across all six domains also was calculated for each guideline. The results were presented as a percentage of the maximum possible score for each domain.
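The interrater comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study used SPSS, whereas the code below computes Spearman's rho directly as the Pearson correlation of tie-averaged ranks. The function names and the scores are hypothetical, and the sketch does not reproduce the Kolmogorov-Smirnov test or the p values.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined when one input is constant")
    return sxy / sqrt(sxx * syy)


def pairwise_rho(scores_by_assessor):
    """Return rho for every unordered pair of assessors."""
    names = sorted(scores_by_assessor)
    return {
        (a, b): spearman_rho(scores_by_assessor[a], scores_by_assessor[b])
        for a, b in combinations(names, 2)
    }


# Hypothetical domain scores (%) from three assessors; not study data.
scores = {
    "assessor_1": [90, 80, 85, 70, 40, 75],
    "assessor_2": [88, 82, 86, 72, 45, 70],
    "assessor_3": [92, 78, 84, 69, 38, 77],
}
for pair, rho in pairwise_rho(scores).items():
    print(pair, round(rho, 3))
```

Because rho is symmetric, three assessors give three distinct pairwise coefficients. In the study, each assessor's vector would contain the domain scores for all guidelines rather than the six values used here.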

Results

Fourteen guidelines were available on the AAOS web site and all 14 met our eligibility criteria (Table 1). The domain scores for all 14 guidelines (n = 84) were examined using a Kolmogorov-Smirnov test, which indicated that they were not normally distributed. Spearman's rho therefore was used to correlate the nonparametric data. Analysis of the 14 AAOS guidelines showed statistically significant correlation coefficients of 0.95 or greater among the three reviewers, with p values less than 0.001 for all six permutations on interrater comparison.

Table 1:
Summary and characteristics of 14 AAOS clinical practice guidelines [2].


The overall quality of the AAOS guidelines was, on average, high, but all the guidelines scored poorly on the domain related to clinical applicability (Table 2). Examination of the individual AGREE II domains across all the guidelines (Table 3) showed that scope and purpose performed the best (median score, 95%). In this domain, areas that were covered comprehensively by the AAOS guidelines included clearly defining the guidelines’ objectives and the target population. Rigor of development (median score, 94%), the largest AGREE II domain, also performed well. Specific aspects of this domain that received high scores included a systematic method for evidence synthesis, clear criteria for evidence selection, consideration of the health benefits and risks of each treatment, and an external peer review protocol for guideline development. Clarity of presentation (median score, 92%) was another AGREE II domain that received a high score, as the assessors found that the guidelines provided clear recommendations that were easily identifiable. Two of the other AGREE II domains, stakeholder involvement (median score, 83%) and editorial independence (median score, 79%), received slightly lower scores. Applicability (median score, 49%) received the lowest scores of all the AGREE II domains.

Table 2:
Individual AGREE II domain results for each AAOS guideline [2] and mean score of all domains.
Table 3:
Descriptive statistics summarizing the performance of the AAOS guidelines.


Discussion

Clinical practice guidelines are used by healthcare professionals to improve quality in patient care. Scientific evidence supports improved outcomes when guidelines that have been rigorously evaluated are adopted in clinical practice [14]. Despite an increasing volume of guidelines in clinical medicine during the last two decades, some studies have reported gross failings in their methods and little improvement in quality with time [3, 4, 26]. Clinician concern regarding the quality of guidelines may contribute to poor uptake, and this potentially hinders improvements in healthcare. The AAOS guidelines are recognized by and accessible to clinicians worldwide. A systematic examination of these guidelines using a validated assessment tool informs orthopaedic surgeons and other healthcare professionals about their quality. Our study showed that although the overall quality of the AAOS guidelines is high across most of the AGREE II domains, their clinical applicability is low. Guidelines that are developed systematically with clear recommendations based on the best available evidence are laudable; however, if they are not clinically applicable, they are unlikely to be implemented and therefore healthcare practices will not improve.

There are three limitations to this study that require consideration. First, the AGREE II tool allows a wide range of points to be allocated to an individual guideline component, which may suggest a subjective assessment system. Despite this view, the AGREE II manual is robust in its instructions on marking each component in the various domains. Our interrater correlation was high, which suggests the validity of AGREE II and of the process used here. Second, AGREE II is validated for use by a minimum of two assessors; however, it recommends that there be four assessors. The use of three assessors thus is a potential limitation of our method, although the assessment remains acceptable based on the validity of the tool for two assessors. Third, AGREE II assesses the methodologic quality of guideline development and presentation. Guidelines may have robust methods, yet their recommendations may be weak owing to an absence of existing evidence to answer the questions they pose. The importance of such guidelines is questionable, and assessment using AGREE II cannot highlight such a deficiency, which scientists and clinicians may view to be the most important aspect when considering the use of a guideline.

Four main factors explain why the guidelines received high scores across many of the AGREE II domains. First, the comprehensive size of each guideline meant that they were more descriptive in their methods than guidelines that are published in peer-reviewed journals and that also have been subjected to quality assessment [24]. Second, it was clear to the assessors that these guidelines had a predefined structure that reflected key aspects of the rigor of development domain. Systematic methods were used to search for evidence, clear criteria for evidence selection were presented, strengths and limitations of available studies were highlighted, and methods for formulating recommendations were applied repeatedly in a robust and uniform way. Furthermore, each guideline explicitly linked the recommendations to the supporting evidence, all the guidelines were reviewed externally, and a procedure for updating the guideline was consistently described. A common criticism of clinical practice guidelines is that they fail to cite high levels of evidence [11]; however, although the volume of Levels I and II evidence has increased in orthopaedics during the last 10 years [8], areas remain in the specialty where high-quality evidence is lacking [6]. Recommendations, although often inconclusive, were based on the best available evidence. Furthermore, the AAOS already uses appropriate use criteria, which can serve as a useful tool in recommending a treatment based on the views of a large body of relevant specialists if existing evidence on a subject is deficient. The third reason that these guidelines fared well is that large and representative bodies supported their development. The AAOS guidelines that advised on prevention of venous thromboembolism in patients undergoing elective hip or knee arthroplasties were peer-reviewed by seven large healthcare bodies that represented different areas in clinical medicine [16]. Patients with orthopaedic disorders are managed by various healthcare professionals during the course of their care. Allowing a broad range of specialists to contribute to the review process is crucial because some aspects of patient care may not be apparent to one group of specialists. The fourth reason why the AAOS guidelines performed so well is that conflicts of interest of all the guideline authors and reviewers were well documented.

Applicability was the domain in which the AAOS guidelines performed most poorly. The applicability of a guideline relates to the ability of users to implement it in clinical practice. Four components in AGREE II assess applicability: whether facilitators and barriers to implementation are described, whether advice or tools to aid implementation are provided, whether the potential resource implications of applying the guideline are considered, and whether monitoring or auditing criteria are presented. Studies of guideline quality that use AGREE II commonly cite this deficiency, which implies that guideline developers throughout medicine overlook its importance [24, 25]. Guidelines that fail to address these areas may be vulnerable to poor uptake by healthcare professionals and therefore have a limited effect on improving healthcare quality [9]. The findings of our study highlight how guidelines may be methodologically robust in all aspects except their applicability. The implication is that many clinicians are likely to be frustrated by new recommendations that they are unable to implement in their practice. Rather than leaving individual clinicians to overcome barriers to implementation, guideline developers should consider steps that they can take to make uptake of their recommendations easier.

A key way that facilitators and barriers to implementation can be identified and described is by pilot testing of guidelines. Feedback from stakeholders involved in such studies could provide invaluable information that would make uptake of a guideline easier. Furthermore, after implementation of a guideline, the development body may use medical meetings or conferences to examine audit practices, gather additional feedback from users, and educate them about steps that can be taken to manage difficulties in applying the guidelines in practice. Regarding the AAOS guidelines, the annual AAOS meetings would be an ideal platform for such a process to occur. Another tool that should be adopted by clinical practice guideline developers is barrier analysis. This is used to formulate intervention strategies to support clinical guidelines and could be performed during pilot studies or after implementation of the guidelines [12]. The involvement of health economists in guideline development has been proposed because economic evaluations of treatment often are neglected or poorly performed in guideline development [20]. Analysis by health economists of the financial implications of implementing a guideline's recommendations, together with methods that would allow users to manage these implications appropriately, would be another step toward improving guidelines that have poor applicability. The design of financial incentives to increase uptake of guidelines that will improve patient outcomes has been popular in the United Kingdom [10]. Payment by results is a scheme that provides financial incentive for hospitals that adopt practices recommended by some of the National Institute for Health and Care Excellence (NICE) guidelines [23]. Whether such steps could be implemented in the United States is debatable given that healthcare is delivered mainly through the public sector in the United Kingdom; however, consideration of all proposed points is necessary to improve the applicability of future guidelines.

Another aspect of AAOS guideline quality that warrants attention and could be improved relates to the stakeholder involvement domain. None of the AAOS guidelines stated that views of the target population were sought. Stakeholder involvement requires that the guideline developers seek the views and preferences of the target population. Although some may question the need to incorporate patient preferences in guidelines [28], patient preferences on treatment decisions throughout clinical medicine are associated with better outcomes and therefore cannot be ignored [28]. Current opinion in medicine supports treatment based on patient preferences [18], and guidelines in orthopaedic surgery should consider the views of their target population, especially if outcomes of treatment are to be measured using patient-reported outcome measures.

Interestingly, the AAOS guidelines advocate the use of the AGREE II instrument for review of guidelines submitted to them from external organizations. Furthermore, external peer review of their own guidelines includes many points taken from the AGREE II instrument [1]. The tool is not used in its entirety and perhaps this is a reason why applicability fared poorly. None of the points that are used in guideline assessment are derived from the applicability domain. Assessment of guidelines using the entire tool and its manual is paramount if a valid assessment is to be performed [5]. This point should be considered by the AAOS guideline committee when developing future guidelines.

This study showed that the overall quality of the AAOS guidelines is high; however, their applicability was found to be poor. The value of guidelines that have high quality but are difficult for clinicians to implement is questionable. Numerous suggestions have been proposed to improve applicability, including health economist involvement in guideline production, pilot studies and audit to monitor uptake of the guidelines, clinician feedback sessions, and barrier analysis studies. Future AAOS guidelines should consider and implement steps that can improve their applicability.


