In critical illness, bleeding represents a common complication with significant morbidity. Bleeding is associated with longer hospital stays (1, 2), longer duration of mechanical ventilation (3), and increased mortality (4–6). Multiple interventions, including blood product or medication administration and/or surgical procedures, are used to improve hemostasis and decrease the risk of associated morbidity.
A systematic means to uniformly and reliably depict a patient’s bleeding severity is essential for weighing the risks, as well as evaluating the benefits, of hemostatic interventions. A validated tool that assesses severity of bleeding would benefit transfusion medicine research by clearly defining study inclusion criteria, providing a monitor of bleeding severity over time, and quantifying response to treatment. Initially emerging from cancer research, the first bleeding scales included the World Health Organization (WHO) scale (7) and the National Cancer Institute Common Toxicity Criteria (8). Scores have now expanded to have broader utility. In adults, bleeding scores measure adverse events in response to treatment (5), quantify bleeding severity in patients with coagulation disorders or receiving anticoagulation therapies (9), and assess postoperative bleeding in the surgical field (10). However, most of these scales are based on adult conditions and physiologic variables and may not be relevant to critically ill children. In addition, the heterogeneity of patients encountered in pediatric critical care medicine may render the application of current bleeding scales inadequate. For example, critically ill children have a wide variation in diagnoses or support devices that may promote bleeding (such as disseminated intravascular coagulopathy, thrombocytopenia, renal failure, or need for extracorporeal life support) and may need to be measured separately.
The goal of this systematic review was to summarize current bleeding scales and their validation to evaluate their applicability to critically ill children.
This systematic review was performed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement. In adherence to these guidelines, a protocol was registered in PROSPERO (registration number CRD42017064289). A medical librarian conducted a systematic literature search on March 23, 2017, to identify studies related to bleeding scores in Ovid MEDLINE (1946 to February 2017), Ovid EMBASE (1974 to February 2017), Cochrane Library (1995 to present), and Web of Science Core Collection databases (1900 to present). No limits were placed in regard to language, publication date, or study type. The team hand-searched reference lists of included studies and systematic reviews. Supplemental Table 1 (Supplemental Digital Content 1, http://links.lww.com/PCC/A945) describes the full search strategy. To summarize, searches included all subject headings and associated keywords for “hemorrhage,” “purpura,” “hemothorax,” “petechiae,” “critical illness,” “critical care,” “intensive care units,” “bleeding score,” “bleeding assessment tool,” “bleeding survey,” “bleeding questionnaire,” “scoring system,” “bleeding severity,” and “severity of illness index.”
Both adult and pediatric studies were included. Studies lacking a measurement of bleeding severity, scores designed for surgical procedures that assessed bleeding in an open surgical field, and scores designed only for use during invasive procedures (i.e., endoscopic scores) were excluded. Two investigators reviewed citations independently in a hierarchical manner. Exclusions were based on title and abstract. Disagreement regarding eligibility was resolved by consensus and review of full text. References of full-text articles were hand-searched to ensure all relevant studies were included.
The data are summarized using descriptive statistics and described as n (%). Tables were constructed to compare study design, number of participants, patient characteristics, and description of each published bleeding scale. The inter-rater reliability was assessed in several of the studies and described by the chance-independent inter-rater agreement (φ) or the Kappa (κ) statistic. Both are calculations to describe the agreement between two assessors, and both use statistical techniques to account for chance in order to report only nonrandom agreement. Φ may be more accurate when the proportion of positive ratings is extreme or possible agreement beyond chance agreement is small (11). The interpretation of either is widely accepted as follows: 0 = poor agreement, 0 to 0.2 = slight agreement, 0.2 to 0.4 = fair agreement, 0.4 to 0.6 = moderate agreement, 0.6 to 0.8 = substantial agreement, and 0.8 to 1.0 = almost perfect agreement (12).
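As an illustration of how chance-corrected agreement is obtained, the following Python sketch computes Cohen's κ for two assessors and maps a value onto the interpretation bands listed above. It is not the code used in any of the included studies. The function names (`cohen_kappa`, `interpret_agreement`), the example ratings, and the handling of band boundaries (upper bounds treated as inclusive, values at or below 0 treated as poor) are illustrative assumptions. The φ statistic is not implemented here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def interpret_agreement(value):
    """Map an agreement statistic to the bands quoted in the text.

    Assumption: each band includes its upper bound; values <= 0 are 'poor'.
    """
    if value > 1.0:
        raise ValueError("agreement statistics cannot exceed 1")
    if value <= 0:
        return "poor"
    bands = [
        (0.2, "slight"),
        (0.4, "fair"),
        (0.6, "moderate"),
        (0.8, "substantial"),
        (1.0, "almost perfect"),
    ]
    for upper, label in bands:
        if value <= upper:
            return label


# Hypothetical ratings of ten bleeding events by two assessors.
assessor_1 = ["major"] * 5 + ["minor"] * 5
assessor_2 = ["major"] * 5 + ["minor"] * 4 + ["major"]
kappa = cohen_kappa(assessor_1, assessor_2)
print(round(kappa, 2))
print(interpret_agreement(0.79))
```

In this hypothetical example the assessors agree on nine of ten events, but half of that agreement is expected by chance alone, which is why κ is lower than the crude agreement.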
We identified 2,097 unique citations. The majority of abstracts (2,071/2,097) did not meet inclusion criteria and were excluded. Twenty-six full-text articles were reviewed and eight met eligibility criteria. Twelve were added by hand-searching references, so that twenty full-text articles were included (Fig. 1). There was substantial agreement between the two reviewers with respect to paper inclusion and exclusion; the κ statistic was 0.79 (95% CI, 0.74–0.83).
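The screening counts reported above can be cross-checked with simple arithmetic. The Python sketch below is only a consistency check under the assumption that every citation not excluded at the title and abstract stage proceeded to full-text review; the function and key names are hypothetical.

```python
def screening_flow(identified, excluded_on_abstract,
                   eligible_after_full_text, added_by_hand_search):
    """Derive the full-text and final-inclusion counts of a screening flow."""
    # Citations that survive title/abstract screening go to full-text review.
    full_text_reviewed = identified - excluded_on_abstract
    if not 0 <= eligible_after_full_text <= full_text_reviewed:
        raise ValueError("eligible articles cannot exceed those reviewed")
    # Hand-searched articles are added on top of the eligible ones.
    included = eligible_after_full_text + added_by_hand_search
    return {"full_text_reviewed": full_text_reviewed, "included": included}


# Counts taken from the text: 2,097 identified, 2,071 excluded,
# 8 eligible after full-text review, 12 added by hand-searching.
flow = screening_flow(2097, 2071, 8, 12)
print(flow)
```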
Of the selected articles, two (10%) were expert consensuses (7, 8), six (30%) were randomized controlled trials (RCTs) (13–18), eight (40%) were prospective cohorts (1, 2, 5, 19–23), three (15%) were case-control studies (24–26), and one (5%) was a retrospective review (27). Of 18 studies that included subjects, seven (39%) were pediatric-only, seven (39%) were adult-only, and four (22%) included both adults and children. Nine (50%) occurred with inpatients (two studies in critical care units), seven (39%) involved outpatients and two (11%) included both. Table 1 describes primary diagnoses of the patients. Thirty-nine percent of the scales were developed for idiopathic thrombocytopenic purpura (ITP) and only two (12%) described patients with critical illness. Supplemental Table 2 (Supplemental Digital Content 2, http://links.lww.com/PCC/A946) summarizes the studies including size, study conclusions, and description of the bleeding tools. Few scales (4/20, 20%) included physiologic variables. The majority (16/20, 80%) included need for treatment (either RBC transfusion or surgical intervention) as a part of the criteria.
Bleeding in Critically Ill Adults
Seven bleeding assessment tools (BATs) have been developed solely for adults. Only three report results of validation studies: an ancillary study to the Tasman study (16), the Bleeding Severity Measurement Scale (BSMS) (26), and the HEME bleeding measurement tool (1).
The Tasman trial was an RCT comparing low-molecular-weight heparin and standard unfractionated heparin for the treatment of deep venous thrombosis. Bleeding events were classified as major (drop in hemoglobin, leading to RBC transfusion, retroperitoneal or intracranial, or necessitating discontinuation of anticoagulant medication), minor (clinically overt bleeding not meeting criteria for major), or no bleeding. The inter-rater reliability on classification was substantial (chance-independent inter-rater agreement [φ] = 0.67; 95% CI, 0.53–0.82) and the intra-rater reliability (comparing the present classifications to those made during the RCT) was similar (φ = 0.62; 95% CI, 0.42–0.82). Clinical relevance of each bleeding event was graded using a scoring system of severity and a validated disability scale. The association between classification as major or minor and physician-graded outcomes was strong (area under the receiver operating characteristic curve 0.98; 95% CI, 0.94–1.0 for severity and 0.99; 95% CI, 0.97–1.0 for perceived disability).
The BSMS was developed for use in inpatients and outpatients with chemotherapy-induced thrombocytopenia. Bleeding episodes were classified as nonclinically significant bleeding (either trace or mild) or clinically significant bleeding (either serious, serious causing significant morbidity, or fatal). Adjudication was based on laboratory measures, level of care required, need for transfusion or surgical intervention, certain high-risk sites (such as intracranial or intraocular), and changes in vital signs. Validation occurred in 126 assessments of 78 patients. Both the inter-rater reliability and intra-rater reliability were good (intra-class correlation coefficients of 0.80 and 1.0, respectively).
The HEME bleeding measurement tool describes bleeding in critically ill adults. Items were generated from three previous studies (two involving thromboembolism and one involving gastrointestinal bleeding) and describe site, severity, duration, and clinical consequences of discrete bleeding episodes. Severity was based on associated physiologic derangements and/or site of bleeding and graded as minor, major, or fatal. Validation occurred in 100 consecutive critically ill adults with 480 discrete bleeding events. The inter-rater agreement on bleeding severity assessments was strong (φ = 0.98; 95% CI, 0.96–0.99), as was agreement between the HEME tool and the WHO scale on designation of major bleeding (φ = 0.98; 95% CI, 0.96–0.99). The classification of major bleeding using the HEME tool was compared with clinical outcomes (receipt of any blood product, duration of ICU stay, and mortality). Patients with major bleeding had more platelet, RBC, and plasma transfusions as compared with those without major bleeding (p < 0.001 for each blood product comparison) and longer length of ICU stay (p = 0.019), but no difference in mortality. No univariate or multivariate logistic regressions were reported to describe any associations.
Pediatric-only studies were limited to children with ITP or a known bleeding diathesis, except for the Neonatal BAT (NeoBAT) (23) and the BAT (2), developed to assess bleeding in neonates and critically ill children, respectively.
The items for NeoBAT were generated from review of existing bleeding scales and expert opinion. Items focus on site, severity, duration, and clinical consequences specific to bleeding in neonates. Bleeding severity is categorized into none, mild, moderate, and severe based on physiologic significance and site, similar to the HEME score. Validation occurred in 37 neonates with bleeding episodes. The inter-rater reliability was tested in the United Kingdom, as well as the United States. The respective κ statistics were 0.95 (95% CI, 0.90–1.00; n = 192) and 0.59 (95% CI, 0.36–0.83; n = 284). Correlation of the tool with clinical outcomes was not assessed.
The items for the BAT were adapted from criteria defined by the International Society on Thrombosis and Haemostasis to standardize outcomes in clinical trials of anticoagulation, with added severity and duration criteria particular to critically ill children. Items focus on site, severity, and duration. Severity is divided into minor, clinically relevant nonmajor, and major based on need for intervention, critical sites, or drop in hemoglobin. Validation occurred in 405 critically ill children. The crude inter-rater agreement was 0.81 with κ of 0.57. Clinically relevant bleeding events (nonmajor or major) were associated with longer times to discharge from the ICU (hazard ratio [HR], 0.21; 95% CI, 0.13–0.33) and from the hospital (HR, 0.49; 95% CI, 0.33–0.73) after adjusting for age and predicted risk of mortality.
This systematic review of literature on bleeding scales and their validation demonstrates that very few bleeding scales exist to adequately evaluate bleeding in critically ill children. Although several describe bleeding events in children, only one pertains to critical illness, and its descriptors are subjective, leading to inadequate reliability and precision. Nearly all existing scales include medical treatment and intervention as criteria despite variability in clinical decision making between providers and clinical scenarios. Few scales have been evaluated for reproducibility, and even fewer have been validated against clinical outcomes.
Since bleeding occurs in diverse patient populations in a wide variety of settings, it is not surprising that many different scales of bleeding have been published. An ideal score of bleeding in critically ill children (as illustrated in Fig. 2) would: 1) include physiologic variables that are appropriate for children, 2) include sites of possible bleeding that are particular to critical illness (such as endotracheal tubes and urinary catheters), and 3) be untied to interventions (such as transfusion or surgical procedures) that are physician dependent, subjective and highly variable (28–30). Perhaps most importantly, 4) a bleeding score in critically ill children must be validated against meaningful clinical outcomes such as mortality (which may be a difficult outcome given its low prevalence in pediatrics), morbidities, and healthcare costs.
Only 20% of the scales included in this review contain physiologic variables, and no variables are specifically defined for children. Only two scales specifically address critically ill patients and include sites of bleeding that are particular to this unique patient population. Likewise, only 20% of the scales are not linked to physician-dependent interventions. Associations with meaningful clinical outcomes were explored in only two scales but were not completely evaluated.
The development of similar validated scores has been successfully undertaken in critical care medicine. The Sepsis-Related Organ Failure Assessment score was developed to describe the degree of organ dysfunction/failure over time in critically ill adults (31). The ideal variables for the score were described as “objective, simple, easily available, reliable, obtained routinely and regularly in every institution, continuous, independent of the type of patients and independent of the therapeutic interventions.” These characteristics are equally applicable for elements of an ideal bleeding score. In addition, reproducibility and ease of use are essential aspects to consider.
In pediatric critical care medicine, a review of scores for organ dysfunction concluded that existing definitions were developed using diverse methodologic approaches and were never validated (32). The variables that researchers eventually tested were selected using a Delphi process by an expert panel. They excluded variables that were based on therapeutic interventions that may be influenced by practice patterns. With this approach in mind, the PEdiatric Logistic Organ Dysfunction score was developed and validated (33).
The results of this systematic review will be used to inform the development of a bleeding score in critically ill children. The variables will be developed from an international survey of critical care providers and then modified using a Delphi process with a diverse expert panel. The variables will be validated against meaningful clinical outcomes through a large observational study. This validated score of bleeding may be used in clinical trials to evaluate interventions and will assist clinicians in diagnosing bleeding events and in determining whether such events require specific tests and/or clinical interventions.
Our study has several limitations. Bleeding scales were developed in diverse patient populations and results cannot be combined. Several studies were added via hand-searching references of included articles. This was necessary as many studies of anticoagulation developed their own bleeding scales to assess safety. However, since their focus was not specific to bleeding, they did not appear in the original search. Given this less precise method of searching, it is possible that other scales were developed for other safety studies, which were not included here.
No existing bleeding scale adequately describes bleeding in critically ill children while being unlinked to therapeutic interventions and/or validated against clinical outcomes. A bleeding scale specific to critically ill children is urgently needed by both clinicians and researchers so that hemostatic products can be prescribed in a safe and efficacious manner.
1. Arnold DM, Donahoe L, Clarke FJ, et al. Bleeding during critical illness: A prospective cohort study using a new measurement tool. Clin Invest Med 2007; 30:E93–E102
2. White LJ, Fredericks R, Mannarino CN, et al. Epidemiology of bleeding in critically ill children. J Pediatr 2017; 184:114–119.e6
3. Christensen MC, Krapf S, Kempel A, et al. Costs of excessive postoperative hemorrhage in cardiac surgery. J Thorac Cardiovasc Surg 2009; 138:687–693
4. Eikelboom JW, Mehta SR, Anand SS, et al. Adverse impact of bleeding on prognosis in patients with acute coronary syndromes. Circulation 2006; 114:774–782
5. Nevo S, Swan V, Enger C, et al. Acute bleeding after bone marrow transplantation (BMT)- incidence and effect on survival. A quantitative analysis in 1,402 patients. Blood 1998; 91:1469–1477
6. Dalton HJ, Garcia-Filion P, Holubkov R, et al.; Eunice Kennedy Shriver National Institute of Child Health and Human Development Collaborative Pediatric Critical Care Research Network: Association of bleeding and thrombosis with outcome in extracorporeal life support. Pediatr Crit Care Med 2015; 16:167–174
7. Miller AB, Hoogstraten B, Staquet M, et al. Reporting results of cancer treatment. Cancer 1981; 47:207–214
9. Landefeld CS, Anderson PA, Goodnough LT, et al. The bleeding severity index: Validation and comparison to other methods for classifying bleeding complications of medical therapy. J Clin Epidemiol 1989; 42:711–718
10. Mariscalco G, Gherli R, Ahmed AB, et al. Validation of the European Multicenter Study on Coronary Artery Bypass Grafting (E-CABG) bleeding severity definition. Ann Thorac Surg 2016; 101:1782–1788
11. Cook RJ, Farewell T. Conditional inference for subject-specific and marginal agreement: Two families of agreement measures. Can J Stat 1995; 23:333–344
12. Maclure M, Willett WC. Misinterpretation and misuse of the kappa statistic. Am J Epidemiol 1987; 126:161–169
13. Blanchette VS, Luke B, Andrew M, et al. A prospective, randomized trial of high-dose intravenous immune globulin G therapy, oral prednisone therapy, and no therapy in childhood acute immune thrombocytopenic purpura. J Pediatr 1993; 123:989–995
14. Buchanan GR, Holtkamp CA. Prednisone therapy for children with newly diagnosed idiopathic thrombocytopenic purpura. A randomized clinical trial. Am J Pediatr Hematol Oncol 1984; 6:355–361
15. Connolly SJ, Ezekowitz MD, Yusuf S, et al.; RE-LY Steering Committee and Investigators: Dabigatran versus warfarin in patients with atrial fibrillation. N Engl J Med 2009; 361:1139–1151
16. Graafsma YP, Prins MH, Lensing AW, et al. Bleeding classification in clinical trials: Observer variability and clinical relevance. Thromb Haemost 1997; 78:1189–1192
17. Rebulla P, Finazzi G, Marangoni F, et al. The threshold for prophylactic platelet transfusions in adults with acute myeloid leukemia. N Engl J Med 1997; 337:1870–1875
18. Slichter SJ, Kaufman RM, Assmann SF, et al. Dose of prophylactic platelet transfusions and prevention of hemorrhage. N Engl J Med 2010; 362:600–613
19. Bolton-Maggs PH, Moon I. Assessment of UK practice for management of acute childhood idiopathic thrombocytopenic purpura against published guidelines. Lancet 1997; 350:620–623
20. Khellaf M, Michel M, Schaeffer A, et al. Assessment of a therapeutic strategy for adults with severe autoimmune thrombocytopenic purpura based on a bleeding score rather than platelet count. Haematologica 2005; 90:829–832
21. Lacey JV, Penner JA. Management of idiopathic thrombocytopenic purpura in the adult. Semin Thromb Hemost 1977; 3:160–174
22. Page LK, Psaila B, Provan D, et al. The immune thrombocytopenic purpura (ITP) bleeding score: Assessment of bleeding in patients with ITP. Br J Haematol 2007; 138:245–248
23. Venkatesh V, Curley A, Khan R, et al. A novel approach to standardised recording of bleeding in a high risk neonatal population. Arch Dis Child Fetal Neonatal Ed 2013; 98:F260–F263
24. Dean JA, Blanchette VS, Carcao MD, et al. von Willebrand disease in a pediatric-based population–comparison of type 1 diagnostic criteria and use of the PFA-100 and a von Willebrand factor/collagen-binding assay. Thromb Haemost 2000; 84:401–409
25. Rodeghiero F, Castaman G, Tosetto A, et al. The discriminant power of bleeding history for the diagnosis of type 1 von Willebrand disease: An international, multicenter study. J Thromb Haemost 2005; 3:2619–2626
26. Webert KE, Arnold DM, Lui Y, et al. A new tool to assess bleeding severity in patients with chemotherapy-induced thrombocytopenia. Transfusion 2012; 52:2466–2474; quiz 2465
27. Medeiros D, Buchanan GR. Major hemorrhage in children with idiopathic thrombocytopenic purpura: Immediate response to therapy and long-term outcome. J Pediatr 1998; 133:334–339
28. Dallman MD, Liu X, Harris AD, et al. Changes in transfusion practice over time in the PICU. Pediatr Crit Care Med 2013; 14:843–850
29. Murphy DJ, Needham DM, Netzer G, et al. RBC transfusion practices among critically ill patients: Has evidence changed practice? Crit Care Med 2013; 41:2344–2353
30. Du Pont-Thibodeau G, Tucci M, Ducruet T, et al. Survey on stated transfusion practices in PICUs. Pediatr Crit Care Med 2014; 15:409–416
31. Vincent JL, Moreno R, Takala J, et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. Intensive Care Med 1996; 22:707–710
32. Leteurtre S, Martinot A, Duhamel A, et al. Development of a pediatric multiple organ dysfunction score: Use of two strategies. Med Decis Making 1999; 19:399–410
33. Leteurtre S, Martinot A, Duhamel A, et al. Validation of the paediatric logistic organ dysfunction (PELOD) score: Prospective, observational, multicentre study. Lancet 2003; 362:192–197