With the looming explosion of clinical use of digital whole-slide imaging (WSI) in the United States, following the first of what will be many Food and Drug Administration approvals of scanning platforms for primary diagnostic use, an immediate set of operational questions arises: “How are pathology departments/practices going to pay for this?” “How can this technology add value?” and “What can WSI do that conventional microscopy cannot already handily address?” Given that each whole-slide image requires an average of nearly 1.6 GB of storage, and that the average surgical pathology case, including recuts, special stains, and immunohistochemistry slides, contains 12.2 slides, the result is a whopping 19.5 GB of storage per case (Parwani et al, unpublished data). The cost of maintaining WSI storage has prompted pathology organizations to promote practice guidelines that allow deletion of digital slide content used for clinical diagnosis. For example, in its most recent Laboratory Accreditation Program Checklists (2017), the College of American Pathologists requires that digital slides be kept for the required 10-year regulatory period only for those cases where the glass is not available (ANP.12500, Record Retention).1
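The per-case figure follows directly from the two averages quoted above. As a rough sketch, the arithmetic can be written out and extended to an annual volume; the function names and the 50 000 cases-per-year figure are illustrative assumptions, not values from the cited data, and the estimate ignores compression, backups, and redundancy.

```python
# Averages quoted in the text (Parwani et al, unpublished data).
GB_PER_SLIDE = 1.6
SLIDES_PER_CASE = 12.2


def storage_per_case_gb(gb_per_slide: float = GB_PER_SLIDE,
                        slides_per_case: float = SLIDES_PER_CASE) -> float:
    """Average storage footprint of one surgical pathology case, in GB."""
    return gb_per_slide * slides_per_case


def annual_storage_tb(cases_per_year: int) -> float:
    """Raw storage added per year, in TB (1 TB = 1000 GB).

    Ignores compression, backups, and redundancy.
    """
    return storage_per_case_gb() * cases_per_year / 1000


if __name__ == "__main__":
    print(f"{storage_per_case_gb():.1f} GB per case")
    # 50_000 cases per year is a hypothetical volume chosen for illustration.
    print(f"{annual_storage_tb(50_000):.0f} TB per year")
```

Under these assumptions, a practice of that hypothetical size would add on the order of a petabyte of raw image data each year, which is the scale that motivates the retention guidelines discussed next.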
While these guidelines would seem to have the desired effect of curbing exponentially expanding storage requirements, they may also impart some unfortunate and irrecoverable operational consequences. For example, the signature property of digital workflow lies in the creation of an “indelible facsimile” of the original slide that is not subject to the same vagaries as glass slides, such as being misplaced, permanently lost, or otherwise degraded with the passage of time. Indeed, an internal study carried out at Massachusetts General Hospital in 2004 regarding the availability of critical slides from select cases surfaced the concerning trend that, with increasing time, an increasing percentage of slides was found to be missing. This metric started at 4% for the initial time point of 14 days post-sign-out, increasing to 6% at 21 days, 14% at 30 days, and plateauing at a staggering 19% at 60 days (Balis et al, unpublished data). For cases where there was a smaller number of slides, with perhaps only 1 or 2 “critical slides” in terms of establishing the diagnosis (typically cancer cases), the missing rate for at least one of such critical slides was even worse at 28%. The causes for these loss rates are manifold, including attrition to personal collections, retention in pathologists’ offices, slide breakage, and slides being left in extramural locations (eg, tumor board conferences and remote send-out consultation sites). Clearly, capturing a high-quality digital image at the moment of slide creation results in a robust form of primary data at a point in time where the best possible optical morphology could be reasonably expected and while the case is still completely intact. Conversely, attempting to recollate and then scan a complex case with many slides, after a number of weeks have transpired, could prove to be an impossible task in many settings. In fact, empirical observation shows that, for many such cases, no amount of searching will surface the original full set of slides. Often in such circumstances, the most diagnostically important slides are the ones that are missing, as these are the most handled and most valued slides of the set.
Adding fuel to the fire, there are circumstances where the entire paraffin block for a case is consumed, and the only tissue available for additional studies, such as molecular analysis, is in the remaining physical slides. Extraction of the tissue from these clinical slides destroys the slides themselves, leaving WSI as the only way to fundamentally preserve the original diagnostic data. This scenario illustrates why proper slide retention policies and procedures are necessary to ensure that these critical slide assets are not accidentally deleted.
Finally, given the explosive growth of machine learning techniques being applied to histomorphology, there is significant utility in having immediate access to large image repositories, including both normal and abnormal cases. With ever-increasing computational scale available to assist in carrying out complex image analysis tasks, improvements in algorithm accuracy are unambiguously coupled to ever-increasing data set sizes. For the anticipated WSI libraries that will be in use, this equates to greater and greater numbers of cases and associated slides, together with their diagnostic metadata. Given this expanding need for primary image data (in academic centers, at a minimum), it is lamentable that investigators would consider deleting their hard-won WSI data, especially considering the difficulty associated with reconstructing cohorts of digital cases after some length of time has elapsed.
Indeed, the cost of deleting WSI content may exceed the incremental storage cost of maintaining a complete digital image library, especially in light of the effort needed to generate such data in the first place. While keeping all slides indefinitely would be ideal, given the reality of the costs associated with such an archive, 2 potential compromise scenarios emerge. The first would be the selective culling of WSI content that is of “secondary importance,” leaving only the “high-value targets.” This scenario relies exclusively on the pathologist to identify and flag the slides that contain high-value diagnostic and prognostic information, as distinct from the “uninteresting” ones that do not. Unfortunately, pathologists are not infallible in this regard. As shown in Beck and colleagues'2 seminal 2011 article, computationally extracted attributes of stroma, and not the foreground breast carcinoma, were found to be the most predictive features of patient outcome. The second scenario involves the regular off-loading of digital slides used for primary diagnosis from a more expensive, high-performance storage array to a cheaper, research-use archive. The goal in this setting would be to manage storage in a cost-conscious manner while preserving content for future use cases that have not yet been described or discovered.
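The second scenario, off-loading to a cheaper archive, can be sketched as a simple age-based rule. This is only an illustration under stated assumptions: the 90-day threshold, the tier names, and the `SlideRecord` structure are hypothetical and do not describe any particular vendor system or published policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical policy value: days a slide stays on the fast tier after sign-out.
HOT_TIER_DAYS = 90


@dataclass
class SlideRecord:
    slide_id: str
    signed_out: date
    tier: str = "clinical"  # "clinical" (fast, expensive) or "archive" (cheap, research use)


def offload_aged_slides(slides, today, hot_tier_days=HOT_TIER_DAYS):
    """Reassign slides older than the threshold to the archive tier.

    Nothing is deleted; only the storage tier changes.
    Returns the IDs of the slides that were moved.
    """
    cutoff = today - timedelta(days=hot_tier_days)
    moved = []
    for slide in slides:
        if slide.tier == "clinical" and slide.signed_out < cutoff:
            slide.tier = "archive"
            moved.append(slide.slide_id)
    return moved
```

The point of the sketch is that the rule needs no judgment about which slides are “interesting”: every slide survives, and only the cost of holding it changes.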
A hybrid of these approaches would strike a balance: some percentage of high-volume cases would be archived, including those flagged by a pathologist, while all imagery derived from low-incidence cases would be kept. This would mitigate storage costs to some degree while still capturing a cross section of WSI data that can be used for research and training. Finally, in what might be envisioned as a postmodern state, where digital pathology is more completely utilized in an automated fashion, it is possible to imagine computational pipelines that use machine vision to decide autonomously which slides most merit retention as WSI data. While such capability is not available now, it certainly could be within the decade.
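The hybrid policy can be expressed as a small decision rule. The sketch below is illustrative only: the 10% sampling fraction, the function names, and the use of a hash of the case identifier to make sampling reproducible are all assumptions introduced here, not part of any published guideline.

```python
import hashlib

# Hypothetical fraction of unflagged, high-volume cases to keep.
HIGH_VOLUME_KEEP_FRACTION = 0.10


def sample_bucket(case_id: str) -> float:
    """Map a case ID to a stable value in [0, 1) so sampling is reproducible."""
    digest = hashlib.sha256(case_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") / 2**32


def retain_case(case_id: str,
                flagged_by_pathologist: bool,
                low_incidence: bool,
                keep_fraction: float = HIGH_VOLUME_KEEP_FRACTION) -> bool:
    """Decide whether a case's WSI data is retained under the hybrid policy."""
    # Pathologist-flagged and low-incidence cases are always kept.
    if flagged_by_pathologist or low_incidence:
        return True
    # Remaining high-volume cases are kept at the configured sampling rate.
    return sample_bucket(case_id) < keep_fraction
```

Hashing the case identifier, rather than drawing a random number, means the same case always receives the same decision, which keeps the archive auditable if the policy is rerun.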
Clearly, there is still much to discover about the encoded diagnostic and prognostic data that sit fallow in WSI content. Consequently, deleting these data, en masse, is the last thing we should be considering, especially given that the “quantitative age” of anatomic pathology is about to unfold, heralding the dawn of a new golden age for morphology.