Neurosurgery: April 2014 - Volume 74 - Issue 4
doi: 10.1227/NEU.0000000000000291
Research-Laboratory: Editor's Choice

Intraobserver and Interobserver Agreement in Visual Inspection for Xanthochromia: Implications for Subarachnoid Hemorrhage Diagnosis, Computed Tomography Validation Studies, and the Walton Rule

Marshman, Laurence A.G. MD, FRCSN*; Duell, Ryan MBBS*; Rudd, Donna BSc; Johnston, Ross BSc (Hons)§; Faris, Cassandra MBBS*

Author Information

*Department of Neurosurgery, Institute of Surgery, The Townsville Hospital, Queensland, Australia;

Departments of Physiology and,

§Marine and Tropical Biology, James Cook University, Queensland, Australia

Correspondence: Laurence A.G. Marshman, MD, FRCSN, Department of Neurosurgery, Institute of Surgery, IMB 20, PO Box 670, The Townsville Hospital, Douglas, Townsville 4810, Queensland, Australia. E-mail: l.a.g.marshman@btinternet.com

Received December 12, 2012

Accepted December 29, 2013

Abstract

BACKGROUND: Visual inspection for xanthochromia is used to diagnose subarachnoid hemorrhage (SAH), to validate computed tomography subarachnoid hemorrhage diagnosis and was used to determine the Walton rule. No study has assessed the reliability of xanthochromia.

OBJECTIVE: To determine intraobserver and interobserver xanthochromia agreement.

METHODS: Mock cerebrospinal fluid samples contained increasing concentrations of human oxyhemoglobin, bilirubin, and albumin. Non-color-blind observers randomly assessed samples against a white background twice under significantly differing illumination. Specimens were recorded as red, orange, yellow, or clear.

RESULTS: Results were obtained for 26 observers (11 male, 15 female observers). We found that 19.2% of samples were misclassified: red, 11.7%; orange, 28.5%; yellow, 29.6%; and clear, 22.1% (χ2 = 213.2; P < .001). Of the yellow misclassifications, 88% were misclassified as clear. Female observers correctly classified samples significantly more frequently than male observers (P = .03). Intraobserver agreement differed significantly from expected for both male (χ2 = 105.6; P < .001) and female (χ2 = 99.9; P < .001) observers regardless of illumination. Interobserver agreement was poor regardless of sex (χ2 for male observers = 176.96, P < .001; χ2 for female observers = 182.69, P < .001) or illumination (χ2 for bright = 125.64, P < .001; χ2 for dark = 148.48, P < .001). Overall, there was 75% agreement in 46% of the tests and 90% agreement in only 36% of the tests.

CONCLUSION: This simple laboratory study would be expected to maximize agreement relative to clinical practice. Although non-color-blind female observers significantly outperformed non-color-blind male observers, both intraobserver agreement and interobserver agreement for xanthochromia were prohibitively poor regardless of sex or illumination. Yellow was most frequently misclassified, 88% as clear (ie, true positives were commuted to false negatives). Xanthochromia is therefore highly unreliable for subarachnoid hemorrhage diagnosis and computed tomography validation. The Walton rule requires urgent clinical revalidation.

Abbreviation: SAH, subarachnoid hemorrhage

Xanthochromia is the yellow hue present in the cerebrospinal fluid (CSF) supernatant of a patient with a recent subarachnoid hemorrhage (SAH) after centrifugation. Visual inspection for xanthochromia, the result of in vivo bilirubin formation from catabolized extruded oxyhemoglobin, has represented a diagnostic criterion for SAH since Froin's discovery in 1903.1 No study, however, has validated the absolute reliability of xanthochromia, whether intraobserver (observer consistency within the same sample) or interobserver (perceptual consistency across observers).

Textbooks,2,3 national guidelines,4 and review articles5 continue to quote the Walton rule for the correct timing of CSF analysis after SAH. According to the Walton rule, CSF examination should be deferred for 12 hours after SAH to permit sufficient time for bilirubin formation to minimize the probability of false negatives. However, despite such continued recommendation, including that contained in the revised national (UK) guidelines for analysis of CSF for bilirubin in suspected SAH,4 the 12-hour rule is ultimately based on Walton's study6 from 1956, which used solely unvalidated xanthochromia.

Studies validating the sensitivity of computed tomography (CT) in SAH diagnosis (ie, to confirm or refute SAH when the CT is negative) also continue to use unvalidated xanthochromia as a reference standard.7 Interestingly, evidence from these and other studies8 demonstrates that up to 45% of CSF analyses are performed within 12 hours of SAH. Although seemingly dismissive of the Walton rule and general advice,2-5 it is possible that such practice nevertheless reflects Walton's study in that “most” (ie, 64%) samples in his series were xanthochromic by 4 to 6 hours.6 Unfortunately, however, the corollary is that with such early sampling, false negatives may be encountered in 36%.6

Although spectrophotometry is physically similar to xanthochromia, it involves the interpretation of transmitted spectral wavelengths, which are determined by constituent CSF concentrations after excitation by a light source. This enables numerical quantization. Considered empirically superior to xanthochromia, spectrophotometry interpretation, like xanthochromia, is nevertheless prone to human error9 and interobserver variability5 (a fact not emphasized by certain authorities4). For example, estimated tangents are extrapolated from curves of transmitted spectra. Furthermore, spectrophotometric analysis, like xanthochromia, is similarly confounded by excess free oxyhemoglobin.4,10,11 Judgment is therefore required; indeed, prior training and a practice frequency of at least 25 per year are requisites.4,9 As a consequence, whenever spectrophotometry is unavailable or expertise in its use is lacking, xanthochromia remains the diagnostic criterion for SAH. Indeed, most laboratories in North America use xanthochromia.5,12,13 It is therefore imperative that the absolute reliability of xanthochromia is firmly established.

Three extant studies have compared the relative performance of xanthochromia and spectrophotometry; however, they have provided conflicting results. Linn et al14 concluded that it was “unlikely that spectrophotometry could further improve upon xanthochromia.” The results of Perry et al8 even suggested that xanthochromia may be superior to spectrophotometry by virtue of greater xanthochromia specificity. By contrast, Petzold et al11 showed that spectrophotometry was superior to xanthochromia in 11 observers of unstated sex or color-blind status. However, significant design limitations in all 3 studies considerably restricted their conclusions.

In comparisons of the relative performance of 2 tests, the absolute reliability of at least 1 test should be known. When options limit the choice to a single test, any potential error associated with that test should also be thoroughly understood to ensure a reliable diagnosis. We present here the first study to examine absolute xanthochromia reliability by examining the uniformity of intraobserver and interobserver xanthochromia agreement.

MATERIALS AND METHODS

The major determinants of SAH CSF supernatant hue are oxyhemoglobin, bilirubin, and albumin.4,15 A laboratory study, approved by the local ethics committee, was devised to simulate xanthochromia with known ratios of these agents. We specifically avoided using human CSF specimens because albumin, as well as potentially other confounding variables (eg carotenoids), would have varied sporadically; minimizing such variation would therefore maximize intraobserver and interobserver xanthochromia agreement. Because approximately 8% (predominantly men) are classified as color blind,16 we restricted our study to subjects with normal color vision (Ishihara testing).

Mock CSF samples were prepared in physiological saline solution. The use of mock samples also avoided significant constraints on the quantity of human CSF (approximately 1.3 L) required. Concentrations of bilirubin (0-3.0 μmol/L), albumin (0.3 and 2.0 g/L), and oxyhemoglobin (0-6 g/L) were chosen to represent a range of relevant concentrations found clinically8 that could be interpreted as negative, borderline, or positive according to the UK revised national guidelines4 (in which the cutoff net bilirubin absorbance of 0.007 arbitrary units corresponds to a bilirubin concentration of 0.359 μmol/L). All samples were prepared fresh immediately before visual inspection, kept in aluminum foil-wrapped tubes (to prevent photochemical degradation of bilirubin),4 capped with minimal trapped air,17 and stored on ice pending inspection.

All observers were healthy staff members with no significant medical history who were employed in the pathology and neurosurgery departments at The Townsville Hospital. All observers were therefore familiar with xanthochromia. Observers were assigned each standard mixture, in random order, on 2 separate occasions, each in 2 separate rooms with different ambient luminosity (thus permitting assessment of intraobserver reliability under each light regimen): R1, dark (ambient luminosity, 400-430 lux; fluorescent lighting) for 2 rounds; and R2, bright (ambient luminosity, 950-1020 lux; full daylight) for 2 rounds.

Ambient luminosity in each room was measured with a light meter (Lutron LX-1102). R1 (dark) represented conditions expected in a standard laboratory with blinds closed and no daylight. R2 (bright) represented conditions with blinds open and full daylight. Visual inspection was performed against a white paper background, with each sample compared with a reference sample of distilled water, on all 4 occasions. Observers were permitted a minimum of 10 minutes to complete each round of 56 (randomly arranged) samples, ie, at least 10 seconds per sample. A minimum break of 15 minutes was permitted between each round in each room. All tests were completed within 2 to 3 hours of sample creation.
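The presentation protocol described above (56 samples, 2 rooms, 2 rounds per room, random order) could be sketched as follows. This is only an illustration under stated assumptions: the function name `presentation_schedule`, the room labels, and the seeding scheme are hypothetical and are not taken from the study, which does not describe how the random order was generated.

```python
import random

N_SAMPLES = 56                    # samples per round, as stated in the text
ROOMS = ("R1_dark", "R2_bright")  # hypothetical labels for the two rooms
ROUNDS_PER_ROOM = 2               # each room was used for 2 rounds


def presentation_schedule(observer_id, seed=0):
    """Return {(room, round): shuffled sample indices} for one observer.

    The seeding scheme is an assumption made for reproducibility only.
    """
    rng = random.Random(f"{seed}-{observer_id}")
    schedule = {}
    for room in ROOMS:
        for rnd in range(1, ROUNDS_PER_ROOM + 1):
            order = list(range(N_SAMPLES))
            rng.shuffle(order)  # independent random order for every round
            schedule[(room, rnd)] = order
    return schedule


schedule = presentation_schedule(observer_id=1)

# A 10-minute minimum per round of 56 samples gives the time budget per sample.
seconds_per_sample = 10 * 60 / N_SAMPLES
print(f"{len(schedule)} rounds, {seconds_per_sample:.1f} s per sample")
```

The per-sample time works out to just over 10 seconds, consistent with the "at least 10 seconds per sample" stated in the protocol.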

Oxyhemoglobin in solution initially appears red; however, with progressive dilution, it subsequently appears orange.6,10,11,15 In contrast, both bilirubin and albumin in solution appear yellow.6,10,11,15 Observers were asked to record 1 of 4 possible choices for each specimen, ie, red, orange, yellow, or clear, as in Walton's review.6

Statistical Analysis

Factorial analysis of variance was used to screen for any interaction between sex and light on xanthochromia determination that would preclude their independent assessment. Independent assessment for consistency of classification (intraobserver and interobserver) was thereafter investigated separately for sex and light regimen with a χ2 goodness-of-fit test. Statistical significance was considered for values of P < .05. Note that the κ statistic is inappropriate for unbalanced data (eg, different sex sample sizes) and for multiple classification analyses between subgroups.

Note also that consistency, not accuracy, was measured in our study. Because color vision is a perception, it is subjective; there is no mathematically precise definition of color that is independent of human perception. Because accuracy is determined by test deviation from a precisely defined standard, accuracy with color perception is elusive. Color perception is defined statistically in terms of what the standard observer would perceive given a range of wavelengths, luminance, etc (International Commission on Illumination). Interobserver agreement (ie, how often an individual's perception deviates from a majority of recorded values on repeated testing) is therefore a more practical measure of color perception, with direct relevance to a test scenario. Intraobserver agreement (a measure of the natural variance in individual test performance) is also a practical and directly relevant test measure.
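As an illustration of the χ2 goodness-of-fit statistic named above, the following minimal sketch compares hypothetical per-observer agreement counts with the counts expected under full agreement. The data are invented, and the exact construction of observed and expected values used in the study is not specified in the text, so this arrangement is an assumption. Only the statistic is computed; the 5% critical values are standard tabulated figures.

```python
# Tabulated upper 5% critical values of the chi-square distribution, by df.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square_gof(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: for each of 5 observers, the number of the 56 samples
# that received the same classification on both rounds.
observed = [50, 40, 31, 47, 9]
expected = [56] * len(observed)   # full agreement would give 56 for everyone

stat = chi_square_gof(observed, expected)
df = len(observed) - 1
print(f"chi2 = {stat:.2f}, df = {df}, significant at .05: {stat > CRITICAL_05[df]}")
```

With these invented counts the statistic far exceeds the critical value, which is the pattern of "significant departure from full agreement" that the analysis is designed to detect.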

RESULTS

Results were obtained for 26 non-color-blind observers (11 male, 15 female observers). There was no significant age difference between male (32.6 ± 2.7 years; range, 23-51 years) and female (36.3 ± 3.1 years; range, 21-55 years) observers (P = .40).

Overall Analysis

From a total of 4916 completed tests across all scenarios, 944 test samples (19.2%) were incorrectly classified. There was a significant trend in classification error that was dependent on sample hue (χ2 = 213.2; df = 3; P < .001); red CSF was incorrectly classified in 11.7% of trials (n = 2529), orange in 28.5% (n = 1521), yellow in 29.6% (n = 335), and clear in 22.1% (n = 521). Of yellow misclassifications, 88% were misclassified as clear.
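The per-hue figures above can be cross-checked with a short calculation. The sketch below reconstructs approximate misclassification counts from the published percentages and computes a Pearson χ2 statistic for homogeneity of error rates across the 4 hues (df = 3). It is an illustration only: the counts are back-calculated from rounded percentages, and the per-hue n values quoted in the text sum to 4906 rather than 4916, so the result should not be expected to reproduce the published χ2 of 213.2 exactly.

```python
# Per-hue number of tests and misclassification percentage, from the text.
reported = {
    "red": (2529, 11.7),
    "orange": (1521, 28.5),
    "yellow": (335, 29.6),
    "clear": (521, 22.1),
}

# Back-calculated (approximate) misclassified counts per hue.
wrong = {hue: round(n * pct / 100) for hue, (n, pct) in reported.items()}
totals = {hue: n for hue, (n, _) in reported.items()}

total_n = sum(totals.values())
total_wrong = sum(wrong.values())
overall_rate = total_wrong / total_n


def chi_square_homogeneity(wrong_counts, total_counts):
    """Pearson chi-square for a 2 x k table of wrong/correct counts."""
    pooled = sum(wrong_counts) / sum(total_counts)
    stat = 0.0
    for w, n in zip(wrong_counts, total_counts):
        for obs, exp in ((w, n * pooled), (n - w, n * (1 - pooled))):
            stat += (obs - exp) ** 2 / exp
    return stat


stat = chi_square_homogeneity(list(wrong.values()), list(totals.values()))
print(f"overall misclassification rate: {100 * overall_rate:.1f}%")
print(f"chi2 (df = 3): {stat:.1f}")
```

The reconstructed overall rate rounds to the reported 19.2%, and the statistic is far above the tabulated 5% critical value for df = 3 (7.815), in line with the reported dependence of error on sample hue.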

Across all tests, there was no statistical interaction between light regimen and observer sex (F1,46 = 0.04; P = .8). No independent effect of light regimen per se was found (F1,46 = 0.26; P = .6). However, female observers correctly classified samples significantly more frequently than male observers (F1,46 = 4.84; P = .03).

Intraobserver Agreement

Intraobserver agreement in the sample classifications differed significantly from full agreement for both male (χ2 = 105.6; df = 8; P < .001) and female (χ2 = 99.9; df = 14; P < .001) observers. Male observers demonstrated a greater variability in intraobserver agreement (range, 16%-89%) than female observers (range, 55%-91%).

For female observers, sample classification differed significantly from full agreement in both light (χ2 = 159.3; df = 14; P < .001) and dark (χ2 = 76.9; df = 14; P < .001). Similar results were found for male observers (light: χ2 = 81.8, df = 8, P < .001; dark: χ2 = 72.2, df = 8, P < .001). Variance was greater in dark than light for both sexes (eg, SE for dark = 2.85; SE for light = 2.0). However, the generally large departure from full agreement was such that no independent effect could be attributed to the light regimen.
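One simple way to express intraobserver agreement for a single observer is the fraction of samples given the same classification on both rounds in a room. The sketch below uses invented labels, and the function name `intraobserver_agreement` is hypothetical; the text does not state exactly how the reported per-observer ranges (eg, 16%-89%) were calculated.

```python
def intraobserver_agreement(round1, round2):
    """Fraction of samples given the same label in both rounds."""
    if len(round1) != len(round2) or not round1:
        raise ValueError("rounds must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(round1, round2))
    return same / len(round1)


# Hypothetical classifications of the same 8 samples by one observer.
round1 = ["red", "orange", "yellow", "clear", "yellow", "red", "orange", "clear"]
round2 = ["red", "orange", "clear", "clear", "yellow", "red", "red", "clear"]

agreement = intraobserver_agreement(round1, round2)
print(f"intraobserver agreement: {100 * agreement:.0f}%")  # 6 of 8 samples match
```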

Interobserver Agreement

Interobserver agreement was equally poor regardless of sex (χ2 for male observers, 176.96, P < .001; χ2 for female observers, 182.69, P < .001) or illumination (χ2 for bright = 125.64, P < .001; χ2 for dark = 148.48, P < .001). Overall, across all 4 test scenarios, interobserver agreement was 75% in 46% of tests, 90% in 36% of tests, and 100% in only 2% of tests. Interobserver agreement was 100% across all 4 scenarios for all female observers in 6 of 56 samples but in only 1 of 56 samples for all male observers. Only 1 of 56 samples was classified identically by all observers across all 4 scenarios.
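Summary figures of this kind (agreement of 75%, 90%, or 100% in a given share of tests) can be illustrated with a modal-agreement calculation: for each sample, the fraction of observers who chose the most common classification. The sketch below uses invented ratings and interprets each threshold as "at least" that level of agreement, which is an assumption; the text does not spell out the exact definition used.

```python
from collections import Counter


def modal_agreement(labels):
    """Fraction of observers who gave the most common label for one sample."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def share_at_or_above(per_sample_labels, threshold):
    """Share of samples whose modal agreement is at least `threshold`."""
    fractions = [modal_agreement(labels) for labels in per_sample_labels.values()]
    return sum(f >= threshold for f in fractions) / len(fractions)


# Hypothetical ratings: 4 observers classifying 4 samples.
ratings = {
    "s1": ["yellow", "yellow", "clear", "yellow"],  # modal agreement 0.75
    "s2": ["red", "red", "red", "red"],             # modal agreement 1.00
    "s3": ["orange", "red", "orange", "yellow"],    # modal agreement 0.50
    "s4": ["clear", "clear", "clear", "yellow"],    # modal agreement 0.75
}

share_75 = share_at_or_above(ratings, 0.75)
share_100 = share_at_or_above(ratings, 1.0)
print(f">=75% agreement in {100 * share_75:.0f}% of samples")
print(f"100% agreement in {100 * share_100:.0f}% of samples")
```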

DISCUSSION

Barrows et al15 noted, in a seminal spectrophotometry study of SAH CSF, that “…the sensitivity of the human retina to CSF color changes can be quite impressive.” When spectrophotometry is unavailable or expertise in its use is lacking (prior training and a practice frequency exceeding 25 per year are required4,9), xanthochromia remains the diagnostic criterion for SAH. Consequently, xanthochromia continues to be practiced by laboratories even in the developed world (especially the United States and Canada).7,12,13 Furthermore, xanthochromia continues to be used to validate CT SAH diagnosis7 and was used to determine the Walton rule for optimum SAH CSF analysis timing.6 Given the continued influence of xanthochromia, it is surprising that no study has rigorously assessed the reliability of xanthochromia.

One previous study assessed xanthochromia sensitivity. In CSF specimens spiked with bilirubin, Linn et al14 found that xanthochromia was so sensitive (99% for medical students, 100% for physicians) that it was unlikely that spectrophotometry could further improve results. However, the specimens specifically excluded blood (and therefore oxyhemoglobin after lysis). Even in older series in which xanthochromia predominated, most (ie, 76%18-92%19) SAH CSF samples were heavily bloodstained. In a recent clinical series, 79% of CSF samples positive for bilirubin on spectrophotometry were so contaminated with oxyhemoglobin that they would have appeared red or orange to a standard observer; only 21% would have appeared yellow.10 Because extravasated red cells remain viable in CSF for 4 to 20 days after SAH,20 a reservoir for oxyhemoglobin (via lysis) is thus extant. Hence, when Linn et al14 concluded that “…if CSF is crystal clear and colorless, the chance of SAH is negligible,” they referred to a clinically uncommon scenario.

Using mock SAH solutions containing bilirubin and the same observers, Petzold et al11 showed that spectrophotometry was superior to xanthochromia; however, only 11 observers of unstated sex or color vision status were used. Furthermore, ambient luminosity was measured qualitatively, not quantitatively, and with tungsten lighting (which biases perception toward spectral red).11 Although xanthochromia did not detect bilirubin under gross oxyhemoglobin contamination or high dilution, Petzold et al11 stated that it was detected “in every case” by spectrophotometry. However, this contrasts markedly with Ungerer et al,9 who found that bilirubin detection was confounded on spectrophotometry, just as with xanthochromia, by gross oxyhemoglobin contamination or high dilution (bilirubin ≤274 nmol/L). Indeed, such obscuration is explicitly acknowledged in the revised national (UK) guidelines for spectrophotometric bilirubin analysis in suspected SAH,4 in which a permitted report reads “oxyhaemoglobin is present in sufficient concentration to impair the ability to detect bilirubin: SAH not excluded.” Because xanthochromia and spectrophotometry are both subjective,8 the absolute reliability of each is paramount clinically.

Observer bias is a familiar problem with clinical measurement yet has not been considered before with xanthochromia. How consistently would the same observer(s) classify xanthochromia on repeated occasions and under different conditions? That is, what is the precision of xanthochromia? Although the mean of a set of estimations may be accurate, any major departure from 100% agreement with repeated analysis (ie, imprecision) would be unacceptable for individual SAH diagnosis clinically. False-positive xanthochromia SAH diagnosis risks potentially unacceptable morbidity and mortality from treating otherwise incidental aneurysms found on angiography. In contrast, false-negative xanthochromia SAH diagnosis risks repeated hemorrhage from untreated, but ruptured, intracranial aneurysms.

Our simple laboratory study would be expected to maximize agreement relative to clinical practice. However, 944 of 4916 test samples (19.2%) were incorrectly classified across all scenarios. Intraobserver agreement and interobserver agreement differed significantly from expected regardless of either illumination or sex (P < .001 for both). Thus, interobserver agreement was 75% in 46% of tests, 90% in 36% of tests, and 100% in only 2% of tests. Indeed, only 1 of 56 samples was classified identically by all observers across all 4 scenarios. Such results show that xanthochromia is prohibitively imprecise for SAH diagnosis. Furthermore, incorrect sample classification also varied significantly with the color tested. Red CSF was incorrectly classified in 11.7% of tests, orange in 28.5%, yellow in 29.6%, and clear in 22.1% (P < .001). Thus, yellow, the defining hue for xanthochromia, was misclassified most frequently. When yellow was misclassified, it was most frequently (88%) misclassified as clear; ie, a true positive (by xanthochromia criteria) was directly commuted to a false negative. Although the results of Perry et al8 suggested that xanthochromia had greater specificity than spectrophotometry in diagnosing SAH, our findings demonstrate that xanthochromia per se is highly unreliable.

Anecdotally, it is believed that women perceive colors more accurately than men (indeed, Jameson et al21 discuss this in relation to the possession of a fourth photopigment). In our study, despite having excluded gross color blindness on the Ishihara test (approximately 8% of men16), women correctly classified samples significantly more frequently than men (F1,46 = 4.84; P = .03). Men also demonstrated greater variability in intraobserver agreement (range, 16%-89% vs 55%-91%). Because milder color deficiencies (eg, as detected by the Farnsworth-Munsell 100-hue test) might remain undetected on the Ishihara test22 and because such deficiencies are more common in men,16,23-25 the interobserver sex inconsistency can potentially be explained. Because subjects with subnormal hue discrimination may also make inconsistent color matches on repeated testing,22 intraobserver inconsistency may also be explained. However, psychological factors such as attentional variations could also explain our results. Age, in contrast, was not a factor. Because clinically significant photodegradation takes 19 hours,26 sporadic bilirubin deterioration also could not have explained our findings. The inclusion of subjects with color-perceptual anomalies could clearly have explained the superiority of spectrophotometry over xanthochromia in a solitary previous study.11

In principle, it is possible to improve intraobserver and interobserver agreement. For example, apart from restricting analysis to a distinction between clear and yellow CSF, Linn et al14 also used clear tap water as a direct reference aid (as others had previously14). Such maneuvers probably accounted for the exceptional sensitivities that Linn et al14 observed; however, such maneuvers could influence only the minority of SAH samples with minor oxyhemoglobin contamination clinically (ie, essentially clear specimens).10 Indeed, despite such maneuvers, tubes bereft of bilirubin were erroneously classified as yellow (ie, xanthochromic) by 6% of observers.14 Given the extent of our findings, it seems unlikely that any modification could overcome the gross imprecision noted. At present, standard xanthochromia practice is to merely assess specimen hue against a white background.7,8

Empirically, ambient luminosity should significantly affect xanthochromia. Thus, peak retinal luminance sensitivity shifts toward the blue end of the visual spectrum at low ambient luminosity (the Purkinje effect), with subsequent distortion of color perception. Despite this and despite greater variance in our study for intraobserver agreement in dark than light for both sexes, luminosity effects were not significant overall. Because both intraobserver agreement and interobserver agreement were so generally poor, it is possible that true luminosity effects had been largely obfuscated by other factors. If xanthochromia agreement per se could be generally improved, luminosity effects might therefore become more significant. The latter was indeed suggested in the limited study of Petzold et al11 when observer choice was limited to yellow or clear.

Regardless of whether spectrophotometry is preferred over xanthochromia, textbooks,2,3 national guidelines,4 and review articles5 continue to recommend the Walton rule for the correct timing of CSF analysis after SAH. According to the Walton rule, CSF examination should be deferred for 12 hours after SAH to permit sufficient time for bilirubin formation and thus minimize false-negative SAH diagnoses. However, the 12-hour rule is ultimately based on Walton's 1956 study, which used solely xanthochromia on diagnostic samples (n = 286) varying from 0 to 2 hours to >2 weeks after SAH.5 Although Vermeulen27 later found that all cases positive for SAH on CT between 12 and 24 hours (n = 34) were also positive for bilirubin on CSF spectrophotometry, no cases were analyzed within 12 hours. The only studies that have analyzed samples within 12 hours have been spectrophotometric animal models28,29; these potentially do not extrapolate to humans. Walton's study is therefore the only clinical study to have assessed the first 12 hours after SAH (n = 89). The results showed that most (64%) of these samples were xanthochromic by 4 to 6 hours.6 Given the poor intraobserver and interobserver xanthochromia agreement in our study and that yellow was most frequently misclassified (typically as clear), the Walton rule requires urgent clinical revalidation, with accurate and precise measurement of CSF bilirubin.

Finally, studies that set out to validate CT sensitivity in SAH diagnosis (ie, to confirm or refute SAH when CT is negative) also continue to use xanthochromia. For example, a recent prospective multicenter Canadian study incorporating 3132 patients with suspected SAH used concurrent xanthochromia to validate CT in this way.7 The results of our study, however, suggest that some of the results of that study may have been unsafe. Interestingly, evidence from that study7 and others8 demonstrates that up to 45% of CSF analyses are performed within 12 hours of SAH. However, considering both Walton's results6 and the delays regarding enzyme induction, a significant number of specimens could have been false negative during this premature interval. This again calls for an urgent clinical revalidation of the Walton rule.

CONCLUSION

This simple laboratory study would be expected to maximize agreement relative to clinical practice. Although non-color-blind female observers significantly outperformed non-color-blind male observers, both intraobserver agreement and interobserver agreement for xanthochromia were prohibitively poor regardless of sex or illumination. Yellow was most frequently misclassified: 88% of yellow misclassifications were misclassified as clear (ie, true positives were commuted to false negatives). Xanthochromia in both SAH diagnosis and CT validation is therefore highly unreliable. The Walton rule requires urgent clinical revalidation.

Disclosure

The authors have no personal financial or institutional interest in any of the drugs, materials, or devices described in this article.

REFERENCES

1. Froin G. Inflammations méningées avec réaction chromatique, fibrineuse et cytologique du liquide céphalo-rachidien. Gaz Hôp. 1903;76:1005.

2. Longmore M, Wilkinson I, Rajagopalan S. The Oxford Handbook of Clinical Medicine. 6th ed. Longmore M, Wilkinson I, Rajagopalan S, eds. New York, NY: Oxford; 2004:362.

3. Greenberg MS. Handbook of Neurosurgery. 7th revised ed. Vol 14. New York, NY: Thieme Medical Publisher, Inc; 2010.

4. Cruikshank A, Auld P, Beetham R, et al. Revised national guidelines for analysis of cerebrospinal fluid for bilirubin in suspected subarachnoid haemorrhage. Ann Clin Biochem. 2008;45(pt 3):238–244.

5. Nagy K, Skagervik I, Tumani H, et al. Cerebrospinal fluid analyses for the diagnosis of subarachnoid haemorrhage and experience from a Swedish study: what method is preferable when diagnosing a subarachnoid haemorrhage? Clin Chem Lab Med. 2013;51(11):2073–2086.

6. Walton JN. Subarachnoid Haemorrhage. Edinburgh, UK: E & S Livingstone; 1956.

7. Perry JJ, Stiell IG, Sivilotti ML, et al. Sensitivity of computed tomography performed within six hours of onset of headache for diagnosis of subarachnoid haemorrhage: prospective cohort study. BMJ. 2011;343:d4277.

8. Perry JJ, Sivilotti MLA, Stiell IG, et al. Should spectrophotometry be used to identify xanthochromia in the cerebrospinal fluid of alert patients suspected of having subarachnoid haemorrhage? Stroke. 2006;37(10):2467–2472.

9. Ungerer JPJ, Southby SJ, Florkowski CM, George PM. Automated measurement of cerebrospinal fluid bilirubin in suspected subarachnoid hemorrhage. Clin Chem. 2004;50(10):1854–1856.

10. Petzold A, Keir G, Sharpe LT. Spectrophotometry for xanthochromia. N Engl J Med. 2004;351(16):1695–1696.

11. Petzold A, Keir G, Sharpe LT. Why human color vision cannot reliably detect cerebrospinal fluid xanthochromia. Stroke. 2005;36(6):1295–1297.

12. Judge B. Laboratory Analysis of Xanthochromia in Patients With Suspected Subarachnoidal Hemorrhage: A National Survey. Philadelphia, PA: Scientific Assembly, American College of Emergency Physicians; 2000.

13. Edlow JA, Bruner KS, Horowitz GL. Xanthochromia. Arch Pathol Lab Med. 2002;126(4):413–415.

14. Linn FH, Voorbij HA, Rinkel GJ, Algra A, van Gijn J. Visual inspection versus spectrophotometry in detecting bilirubin in cerebrospinal fluid. J Neurol Neurosurg Psychiatry. 2005;76(10):1452–1454.

15. Barrows LJ, Hunter FT, Banker BQ. The nature and clinical significance of pigments in the cerebrospinal fluid. Brain. 1955;78(1):59–80.

16. Fletcher RVJ. Defective Colour Vision. Bristol, UK: Adam Hilger; 1985.

17. Kristensen SR, Salling AM, Kristensen ST, Hansen AB. Unrecognized preanalytical problem with the spectrophotometric analysis of cerebrospinal fluid for xanthochromia. Clin Chem. 2008;54(11):1924–1925.

18. Fetter WJ. Subarachnoid haemorrhage. Penn Med J. 1943;49:949–956.

19. Taylor AB, Whitfield AGW. Subarachnoid haemorrhage: based on observations of eighty-one cases. Q J Med. 1936;5(4):461–472.

20. Richardson JC, Hyland HH. Intracranial aneurysms. In: Medicine. Vol 20. Baltimore, MD: Lippincott, Williams & Wilkins; 1941:1.

21. Jameson KA, Highnote SM, Wasserman LM. Richer color experience in observers with multiple opsin genes. Psychon Bull Rev. 2001;8(2):244–261.

22. Dain SJ. Clinical colour vision tests. Clin Exp Optom. 2004;87(4-5):276–293.

23. Cole BL. The handicap of abnormal colour vision. Clin Exp Optom. 2004;87(4-5):258–275.

24. Campbell JL, Griffin L, Spalding JA, Mir FA. The effect of abnormal colour vision on the ability to identify and outline coloured clinical signs and to count stained bacilli in sputum. Clin Exp Optom. 2005;88(6):376–381.

25. Spalding JA. Medical students and congenital colour vision deficiency: unnoticed problems and the case for screening. Occup Med (Lond). 1999;49(4):247–252.

26. Foroughi M, Parikh D, Wassell J, Hatfield R. Influence of light and time on bilirubin degradation in CSF spectrophotometry for subarachnoid haemorrhage. Br J Neurosurg. 2010;24(4):401–404.

27. Vermeulen M. Subarachnoid haemorrhage: diagnosis and treatment. J Neurol. 1996;243(7):496–501.

28. Roost KT, Pimstone NR, Diamond I, Schmid R. The formation of cerebrospinal fluid xanthochromia after subarachnoid haemorrhage. Neurology. 1972;22(9):973–977.

29. Morgan CJ, Pyne-Geithman GJ, Jauch EC, et al. Bilirubin as a cerebrospinal fluid marker of sentinel subarachnoid haemorrhage: a preliminary report in pigs. J Neurosurg. 2004;101(6):1026–1029.

COMMENTS

The authors seek to determine the reliability of visual inspection for cerebrospinal fluid (CSF) xanthochromia as it applies to the diagnosis of subarachnoid hemorrhage in the setting of a normal computed tomography (CT) scan. There are several laboratory methods for determining the presence of blood products in CSF. Strong regional differences and access to technology, rather than diagnostic accuracy, influence the dominant method in any particular hospital.1 The presence of CSF xanthochromia by lumbar puncture >12 hours after ictus, as evaluated by visual inspection, is considered positive for subarachnoid hemorrhage in much of the United States.2 Numerous studies have demonstrated the weakness inherent in xanthochromia.3-5 Spectrophotometry, considered by many to be more accurate than xanthochromia, requires expensive equipment and still relies on technician experience to detect CSF bilirubin. Regardless, the popularity of xanthochromia suggests that a study to determine the reliability of xanthochromia discrimination is of value because it has been used as the standard for CT-negative subarachnoid hemorrhage in recent large-scale publications.6

Previous studies attempting to determine xanthochromia accuracy using artificial or real CSF suffered from several methodological weaknesses, according to the authors of the present study. No previous study considered the effects of ambient lighting, sex, and presence of color-blindness among operators, and some excluded CSF contaminated with blood or oxyhemoglobin, although these are commonly encountered in clinical practice from both subarachnoid hemorrhage and traumatic lumbar puncture.

The present study uses mock CSF with various concentrations of bilirubin, albumin, and oxyhemoglobin, which result in fluid that could be interpreted as negative, borderline, or positive for xanthochromia on visual inspection. The authors then recruited 26 experienced operators to evaluate each tube of mock CSF in 2 different ambient light conditions: dark (fluorescent lighting) and light (full daylight), standardized as measured by a light meter. Importantly, all 11 male evaluators were screened for color-blindness. Each tube-light combination was evaluated twice, permitting intrarater and interrater reliability testing.
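As an illustration of what a repeated-reading design makes possible, the following minimal Python sketch computes raw percent agreement and Cohen's kappa between two passes over the same tubes. It is not the authors' analysis (the study reports χ2 statistics); kappa is swapped in here only as a common chance-corrected agreement measure. The ratings, the function names `percent_agreement` and `cohen_kappa`, and the variables `pass_one` and `pass_two` are hypothetical and are not taken from the study.

```python
from collections import Counter


def percent_agreement(first, second):
    """Fraction of samples given the same color label in both passes."""
    if len(first) != len(second) or not first:
        raise ValueError("rating lists must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


def cohen_kappa(first, second):
    """Chance-corrected agreement between two sets of categorical ratings."""
    n = len(first)
    observed = percent_agreement(first, second)
    counts_a = Counter(first)
    counts_b = Counter(second)
    labels = set(first) | set(second)
    # Agreement expected by chance, from each pass's marginal label frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both passes used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: one observer reading the same 8 tubes twice
# (intrarater), or two observers reading the same 8 tubes once (interrater).
pass_one = ["red", "orange", "yellow", "clear", "yellow", "clear", "orange", "red"]
pass_two = ["red", "yellow", "clear", "clear", "yellow", "clear", "orange", "red"]

agreement = percent_agreement(pass_one, pass_two)
kappa = cohen_kappa(pass_one, pass_two)
print(f"percent agreement: {agreement:.2f}, kappa: {kappa:.2f}")
```

The same two functions serve for either comparison: pass in two readings by one observer for intrarater agreement, or readings by two different observers for interrater agreement.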

The authors found that, far from being a gold standard, xanthochromia suffers severely from poor accuracy and reliability. Overall, 19.2% of samples were incorrectly classified; of the misclassified yellow tubes (yellow being indicative of high concentrations of bilirubin and thus a true positive for subarachnoid hemorrhage), 88% were misclassified as clear, suggesting a high false-negative rate for xanthochromia. Female raters were more accurate than male raters, and there was no effect of ambient light on the accuracy of classifications. More worrisome was the extremely poor intrarater and interrater reliability across all experimental conditions; only 1 of 56 samples was correctly identified by all observers in all tests. Unfortunately, the authors do not include spectrophotometry data of their mock CSF, preventing direct comparison with xanthochromia.

What do these results mean in the setting of widespread xanthochromia use? Delay of diagnosis of subarachnoid hemorrhage from a ruptured aneurysm is associated with high morbidity and mortality.7 It is often cited that approximately 2% of subarachnoid hemorrhages are CT negative. However, studies that attempt to determine the rate of CT-negative subarachnoid hemorrhage within a population have often used xanthochromia as the gold standard. If xanthochromia in clinical practice leads to a substantial number of false negatives, the rate of patients with actual subarachnoid hemorrhage missed by this so-called gold standard would be much higher, perhaps up to 20%. This does not appear to be the case, as is demonstrated in a recent multicenter Canadian study of clinical decision making in subarachnoid hemorrhage.6 Rather, any patient with a convincing history and borderline or inconclusive CSF results should undergo definitive cerebral angiography, as is practice in many institutions, including ours. The stakes are simply too high to trust in any single negative result.

Michael R. Levitt

Louis J. Kim

Seattle, Washington

1. Nagy K, Skagervik I, Tumani H, et al. Cerebrospinal fluid analyses for the diagnosis of subarachnoid haemorrhage and experience from a Swedish study: what method is preferable when diagnosing a subarachnoid haemorrhage? Clin Chem Lab Med. 2013;51(11):2073–2086.

2. Edlow JA, Bruner KS, Horowitz GL. Xanthochromia. Arch Pathol Lab Med. 2002;126(4):413–415.

3. Petzold A, Keir G, Sharpe TL. Why human color vision cannot reliably detect cerebrospinal fluid xanthochromia. Stroke. 2005;36(6):1295–1297.

4. Arora S, Swadron SP, Dissanayake V. Evaluating the sensitivity of visual xanthochromia in patients with subarachnoid hemorrhage. J Emerg Med. 2010;39(1):13–16.

5. Linn FH, Voorbij HA, Rinkel GJ, Algra A, van Gijn J. Visual inspection versus spectrophotometry in detecting bilirubin in cerebrospinal fluid. J Neurol Neurosurg Psychiatry. 2005;76(10):1452–1454.

6. Perry JJ, Stiell IG, Sivilotti ML, et al. Clinical decision rules to rule out subarachnoid hemorrhage for acute headache. JAMA. 2013;310(12):1248–1255.

7. Mayer PL, Awad IA, Todor R, et al. Misdiagnosis of symptomatic cerebral aneurysm: prevalence and correlation with outcome at four institutions. Stroke. 1996;27(9):1558–1563.

The authors have prepared a manuscript on intraobserver and interobserver agreement in visual inspection for “xanthochromia” and its implications for the diagnosis of subarachnoid hemorrhage. This is an interesting, well-written study with a good design. My major criticism is this: how do the authors know that artificial CSF appears to the human retina as “real” CSF does? Substances such as the carotenoids may act as “interfering variables” and may alter the results. The study could therefore be significantly improved, and might become a landmark study, if the authors added a cohort of real CSF samples analyzed with similar techniques. The additional logistical problems would be well worth the effort. It is also important to remain a clinician: missing a ruptured aneurysm is disastrous, and if clinical suspicion is high, regardless of laboratory test results, it is best to also obtain an MRA or a CTA. Xanthochromia is only one aspect of making the diagnosis, and a highly suggestive history with no xanthochromia is still not enough evidence to exclude a subarachnoid hemorrhage.

Gavin W. Britz

Houston, Texas

Keywords:

Observer agreement; Xanthochromia

Copyright © by the Congress of Neurological Surgeons
