Optimal skull base neurosurgery requires personalized surgical treatment strategies based on clinical, radiographical, and pathological data. Skull base lesions are diverse and span the full pathology spectrum, including inflammatory, infectious, and neoplastic diseases. Look-alike lesions and uncommon radiographical or clinical features can lead to diagnostic errors and potentially increase surgical morbidity.1-4 In addition to tumor diagnosis, rapid microscopic assessment of tumor resection cavities for residual tumor burden could increase gross total resection and reduce tumor recurrence rates. Residual tumor burden is the major cause of tumor recurrence in both benign and malignant skull base tumors.5,6 An intraoperative pathology workflow that could provide rapid and accurate evaluation of skull base surgical specimens has the potential to guide personalized treatment strategies and improve surgical outcomes.
Our standard of care for intraoperative assessment of surgical specimens is based on hematoxylin and eosin staining of processed surgical specimens and requires interpretation by a board-certified pathologist. Tissue processing is extensive, requiring transport, staining, sectioning, and mounting of the specimen. The turnaround times for intraoperative specimen interpretation (20-90 minutes) discourage routine use in skull base neurosurgery, particularly for tumor margin assessment.7 Moreover, the pathology workforce is contracting, with an overall reduction of 18% between 2007 and 2017.8,9 In this study, we propose an alternative workflow for rapid interpretation of surgical specimens using optical imaging and artificial intelligence (AI).
Stimulated Raman histology (SRH) is a rapid, label-free, high-resolution, optical imaging method used for intraoperative evaluation of fresh, unprocessed tissue specimens.10,11 We have previously shown that SRH combined with AI models can achieve human-level performance for the intraoperative diagnosis of the most common brain tumor subtypes and recurrent primary brain tumors.12,13 Our models detect cytological and histomorphological features in brain tumors to provide near real-time diagnoses (<2 minutes) without the need for tissue processing or human interpretation.
In this study, we aim to develop an integrated computer vision system for rapid intraoperative interpretation of skull base tumors using SRH and AI. To improve on our previous methods, we applied a new AI training technique, contrastive representation learning, which boosted our model's ability to detect diagnostic features in SRH images. We show that this model can effectively segment tumor-normal margins and detect regions of microscopic tumor infiltration in grossly normal surgical specimens, allowing for robust margin delineation in meningioma surgery.
Study objectives were to (1) determine whether SRH can capture the diagnostic features of skull base tumors, (2) develop an AI-based computer vision system that combines clinical SRH and deep neural networks to achieve human-level performance on the intraoperative classification of skull base tumors, and (3) demonstrate the feasibility of using our model to detect microscopic tumor infiltration in meningioma surgery. After Institutional Review Board approval (HUM00083059), this study began on June 1, 2015. Inclusion criteria were the following: (1) patients with planned brain tumor resection, including skull base surgery, at Michigan Medicine (UM) and New York University (NYU); (2) subject or durable power of attorney able to give informed consent; and (3) subjects in whom there was additional specimen beyond what was needed for routine clinical diagnosis. We then trained and validated a benchmarked convolutional neural network (CNN) architecture (ResNets14) on the classification of fresh surgical specimens imaged with SRH. CNN performance was then tested using a held-out, multicenter (UM and NYU) prospective testing SRH data set.
All images were obtained using a clinical fiber laser–based stimulated Raman scattering (SRS) microscope.12,15 The NIO Laser Imaging System (Invenio Imaging, Inc) is delivered ready to use for image acquisition and requires a single technician with minimal training to operate. SRH images can be viewed directly in the operating room or remotely through the medical center's radiographic system or a cloud-based viewer. Fresh, unprocessed surgical specimens are excited with a dual-wavelength fiber laser as specified in our previous publications.11,12 These specifications allow for imaging at Raman shifts in the range of 2800 to 3130 cm−1. The NIO Imaging System was used to acquire all images in the testing set.12 For SRH, 2845 and 2930 cm−1 are the wavenumbers used to acquire the 2-channel images. Lipid-rich regions (eg, myelinated white matter) demonstrate high SRS signal at 2845 cm−1 because of CH2 symmetric stretching in fatty acids. Cellular regions produce high 2930 cm−1 intensity and large 2930:2845 signal ratios owing to high protein and nucleic acid content. A virtual hematoxylin and eosin color scheme is applied to transform the raw SRS images into SRH images for clinical use and pathological review.
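As a minimal illustration of this 2-channel contrast, the protein-to-lipid ratio can be computed as below (array and function names are hypothetical; this is not the NIO system's actual processing pipeline):

```python
import numpy as np

def protein_lipid_contrast(img_2930, img_2845, eps=1e-6):
    """Illustrative protein-to-lipid contrast for SRH.

    img_2930, img_2845: 2D arrays of SRS intensity at the 2930 and
    2845 cm^-1 Raman shifts, respectively. Cellular regions (high
    protein and nucleic acid content) give large ratios; lipid-rich
    regions (eg, myelinated white matter) give small ones.
    """
    return img_2930 / (img_2845 + eps)

# toy example: one lipid-rich pixel vs one cellular pixel
lipid = protein_lipid_contrast(np.array([[0.2]]), np.array([[1.0]]))
cellular = protein_lipid_contrast(np.array([[1.0]]), np.array([[0.2]]))
```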
SRH combined with AI is an off-label use of the NIO Laser Imaging System. The AI and algorithms discussed are for research purposes only and have not been reviewed or approved by the US Food and Drug Administration.
Image Data Set and Data Preprocessing
SRH imaging was completed using 2 imaging systems: a prototype clinical SRH microscope11 and the NIO Imaging System. All collected clinical specimens were imaged in the operating room using our SRH imagers. In addition, we used cadaveric specimens of normal tissue (brain, dura, and pituitary gland) to improve our classifier's ability to detect normal tissue and avoid false-positive errors. Specimens compromised by hemorrhage, excessive coagulation, or necrosis were excluded. For image preprocessing, the 2845 cm−1 image was subtracted from the 2930 cm−1 image, and the resultant image was concatenated with the 2 acquired channels to generate a 3-channel SRH image (2930 cm−1 minus 2845 cm−1, red; 2845 cm−1, green; and 2930 cm−1, blue). A 300 × 300 pixel2 nonoverlapping sliding-window algorithm was used to generate image patches. Our laboratory has previously trained a neural network model that filters images into 3 classes for automated patch-level annotation: normal brain, tumor tissue, and nondiagnostic tissue.12,13 Normal dura was included in the nondiagnostic class because it lacks cytological features (Figure 1).
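The preprocessing steps above can be sketched as follows (function names are illustrative; clipping of negative subtraction values and discarding of edge pixels by the sliding window are our assumptions, not necessarily the exact implementation):

```python
import numpy as np

def make_three_channel(img_2845, img_2930):
    """Build the 3-channel SRH input described in the text:
    red   = 2930 minus 2845 (protein/nucleic acid contrast),
    green = 2845 cm^-1 channel (lipid),
    blue  = 2930 cm^-1 channel (protein)."""
    red = np.clip(img_2930 - img_2845, 0, None)  # assumption: clip negatives
    return np.stack([red, img_2845, img_2930], axis=-1)

def patches_300(img, size=300):
    """Nonoverlapping size x size sliding-window patches
    (incomplete edge patches discarded)."""
    h, w = img.shape[:2]
    return [img[r:r + size, c:c + size]
            for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]

rgb = make_three_channel(np.zeros((1000, 1000)), np.ones((1000, 1000)))
patches = patches_300(rgb)  # 3 x 3 = 9 full patches from a 1000 x 1000 image
```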
Only tumor classes with >15 patients were included: pituitary adenomas, meningiomas, schwannomas, primary central nervous system lymphoma, and metastases. Normal classes included normal brain (gray matter and white matter) and normal pituitary gland (anterior gland and posterior gland). Six hundred patients were included in the training set. We implemented the ResNet50 CNN architecture with 25.6 million trainable parameters for our SRH feature extractor.14 Three loss functions were used for model training: supervised categorical cross-entropy, self-supervised contrastive,16 and supervised contrastive.17 The general contrastive loss function is

$$\mathcal{L}_X = -\log \frac{\exp\left(\mathrm{sim}(z_X, z_{p_X})/\tau\right)}{\exp\left(\mathrm{sim}(z_X, z_{p_X})/\tau\right) + \sum_{n \in N_X} \exp\left(\mathrm{sim}(z_X, z_n)/\tau\right)},$$

where $z_X$ is the vector representation of image X after a feedforward pass through the SRH feature extractor, $z_{p_X}$ is the representation of a positive example for image X, $N_X$ is the set of negative examples for image X, and $\tau$ is a temperature hyperparameter (Figure 1B). Positive examples can be transformations of the same image (self-supervised) or different images sampled from the same class (supervised). The feature extraction model produces a 2048-dimension feature vector for each input image, and each feature vector is further projected down to 128 dimensions before the cosine similarity metric (sim) is computed. Contrastive loss functions have some theoretical advantages over cross-entropy (ie, robustness to label noise), and we hypothesize that contrastive representation learning is ideally suited for patch-based classification. The contrastive learning models were optimized using stochastic gradient descent, and each model was trained using a batch size of 176 for 4 days on 8 Nvidia GeForce RTX 2080 Ti graphical processing units (GPUs). After the feature extraction model training was completed, these features were classified using a linear classifier trained using cross-entropy loss (Figure 1C). Each linear classification layer was trained using the Adam optimizer and a batch size of 64 for 24 hours on 2 Nvidia GeForce GPUs.
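A minimal NumPy sketch of this contrastive loss for a single anchor image follows (the temperature value and variable names are illustrative, not the exact training implementation):

```python
import numpy as np

def contrastive_loss(z_x, z_pos, z_negs, tau=0.07):
    """Contrastive (InfoNCE-style) loss for one anchor image X.

    z_x   : (d,) representation of X (eg, the 128-D projection)
    z_pos : (d,) representation of a positive example (an augmented
            view of X, or another same-class image for the supervised
            contrastive variant)
    z_negs: (n, d) representations of negative examples
    """
    def cos(a, b):  # cosine similarity, the "sim" metric
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    pos = np.exp(cos(z_x, z_pos) / tau)
    negs = sum(np.exp(cos(z_x, z_n) / tau) for z_n in z_negs)
    return -np.log(pos / (pos + negs))

rng = np.random.default_rng(0)
z = rng.normal(size=128)
# a well-aligned positive yields a lower loss than an anti-aligned one
loss_easy = contrastive_loss(z, z, rng.normal(size=(8, 128)))
loss_hard = contrastive_loss(z, -z, rng.normal(size=(8, 128)))
```

Minimizing this loss pulls same-class (positive) representations together and pushes negatives apart, which is what yields the tighter class clusters discussed in the results.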
We compared our approaches with a conventional model trained using cross-entropy and a batch size of 64 for 24 hours on 2 Nvidia GeForce GPUs.
We randomly held out 20% of our data as a testing data set consisting of 118 patients and 489 whole slides. Similar to our training data preparation, 300 × 300 pixel patches were generated from a whole-slide image, and each patch underwent a feedforward pass through our trained models to compute a probability distribution over the output classes. To compute the whole-slide–level or patient-level accuracy, we summed the patch-level probability distributions for each whole slide or patient, respectively. The aggregated probabilities were then renormalized to compute the final slide–level or patient-level class probabilities. This “soft” aggregation of the classification is superior to “hard” aggregation of the patches, such as a simple majority voting procedure, because it takes into account the full probability distribution for each patch.12
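The soft aggregation described above can be sketched in a few lines, including a toy case in which soft aggregation and hard majority voting disagree:

```python
import numpy as np

def soft_aggregate(patch_probs):
    """'Soft' aggregation: sum the per-patch class-probability
    distributions, then renormalize to obtain slide-level (or
    patient-level) class probabilities.

    patch_probs: (n_patches, n_classes); each row sums to 1.
    """
    summed = patch_probs.sum(axis=0)
    return summed / summed.sum()

# three patches over two classes: two weak votes for class 0,
# one confident vote for class 1
p = np.array([[0.60, 0.40],
              [0.60, 0.40],
              [0.05, 0.95]])
soft = soft_aggregate(p)                        # soft winner: class 1
hard = np.bincount(p.argmax(axis=1)).argmax()   # majority vote: class 0
```

Because the soft scheme keeps each patch's full probability distribution, the single confident prediction outweighs the two uncertain ones, whereas majority voting discards that confidence information.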
SRH Semantic Segmentation of Skull Base Tumors
We previously developed a method for segmenting SRH images using patch-level predictions.12,13 This technique integrates a local neighborhood of overlapping patch predictions to generate a high-resolution probability heatmap. In a previous study, we implemented a 3-channel (RGB) probability heatmap that included spatial information for tumor, normal brain, and nondiagnostic predictions. In this study, we used a novel technique that generated a 2-channel image with the predicted tumor class (eg, pituitary adenoma or craniopharyngioma) as the first channel (ie, red) and the most probable nontumor class (eg, normal pituitary, normal brain, or nondiagnostic) as the second channel (ie, blue). This method has an advantage in the setting of skull base tumors by allowing the nontumor class to vary depending on the surgical specimen. For example, it will automatically produce a meningioma-normal dura margin heatmap based on the predicted meningioma diagnosis.
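The heatmap construction can be sketched as follows (patch size and averaging of overlapping predictions are illustrative assumptions, not the exact implementation):

```python
import numpy as np

def two_channel_heatmap(h, w, patch_preds, patch_size=300):
    """Average overlapping patch predictions into a 2-channel heatmap:
    channel 0 (red)  = probability of the predicted tumor class,
    channel 1 (blue) = probability of the most probable nontumor class.

    patch_preds: dict mapping the (row, col) top-left corner of each
    patch to a (tumor_prob, nontumor_prob) pair.
    """
    heat = np.zeros((h, w, 2))
    count = np.zeros((h, w, 1))
    for (r, c), probs in patch_preds.items():
        heat[r:r + patch_size, c:c + patch_size] += probs
        count[r:r + patch_size, c:c + patch_size] += 1
    return heat / np.maximum(count, 1)  # avoid dividing uncovered pixels by 0

# two overlapping patches with different tumor probabilities
preds = {(0, 0): (0.9, 0.1), (0, 100): (0.2, 0.8)}
hm = two_channel_heatmap(400, 400, preds)
```

Pixels covered by both patches receive the average of the two predictions, which smooths the patch-level outputs into a high-resolution margin heatmap.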
The data that support the findings of this study are available from the corresponding author/authors on reasonable request.
Diagnostic Features of Skull Base Tumors
We first assessed the ability of SRH to effectively capture the diagnostic features of normal skull base parenchyma and skull base tumors. Figure 1A shows the general workflow for obtaining SRH images. Figures 2A-2C show the SRH images of normal brain, anterior pituitary gland, and skull base dura. Classic histological features are seen, including neuronal cell bodies in gray matter, acinar histoarchitecture in pituitary gland, and dense collagen extracellular matrix in dura. Meningiomas, pituitary adenomas, and schwannomas are the most common skull base tumors encountered (Figures 2D-2F). SRH captures spindle cell cytology and Antoni histoarchitectural patterns in schwannomas, monotonous hypercellularity in pituitary adenomas, and meningioma whorls. Less common and malignant tumors are shown in Figures 2G-2I. Wet keratin is well-visualized in adamantinomatous craniopharyngiomas. Bubbly physaliferous cells are abundant in clival chordomas. Chondrocytes embedded in a dense cartilaginous matrix are seen in skull base chondrosarcomas.
Automated Classification of Skull Base Tumors
After determining that SRH effectively captures the diagnostic features of skull base tumors, we trained our CNN using the 3 representation learning methods (Figure 1B). All models were trained for 4 days and then tested on our held-out multicenter data set (Table). We evaluated our model at the patch, slide, and patient levels using overall top-1 accuracy, top-2 accuracy, and mean class accuracy. Using these metrics, the model trained using supervised contrastive representation learning had the best overall performance, with top scores in all 3 metrics. Our supervised contrastive model achieves a patient-level diagnostic accuracy of 96.6% (114 of 118 patients) and a mean class accuracy of 93.4%. These results outperformed our cross-entropy model and significantly improved on our previous results (Figure 3).12 An important finding was that the metastatic tumor class was a major source of diagnostic errors for the cross-entropy model. We believe that this represents the inability of cross-entropy to effectively represent classes with highly diverse image features (eg, melanoma vs adenocarcinoma vs squamous cell carcinoma).
Model Performances on Held-Out, Multicenter SRH Testing Set
[Table rows: SSL + linear; SupCon + linear]
The bold entries signify the best-performing model in each metric.
Acc, accuracy; MCA, mean class accuracy; CE, cross-entropy; SSL, self-supervised contrastive learning; SupCon, supervised contrastive learning.
Top 2, correct class was predicted as the first or second most probable class.
Visualizing Learned SRH Representations
We aimed to qualitatively evaluate how effectively the models represented our SRH images. We used a data visualization technique called t-distributed stochastic neighbor embedding, which projects high-dimensional data onto a 2-dimensional plane by preserving the local patterns in the data. Data points with similar representations are located in close proximity, forming discrete clusters. Compared with cross-entropy or self-supervised contrastive learning, the supervised contrastive model shows the most well-formed clusters that match tumor diagnoses (Figure 4). The most salient improvement is how much more effectively the metastatic class is clustered; contrastive representation learning explicitly enforces that the model learns image features which are common to each tumor class, regardless of how diverse the underlying pathology may be (eg, melanoma vs adenocarcinoma).
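This projection can be reproduced with scikit-learn's TSNE; the 2048-dimension feature vectors and class structure below are simulated stand-ins, not our actual SRH embeddings:

```python
import numpy as np
from sklearn.manifold import TSNE

# simulated stand-in for 2048-D feature vectors from the SRH feature
# extractor: two synthetic "classes" offset from each other
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, size=(25, 2048)),
                   rng.normal(5, 1, size=(25, 2048))])

# project to 2 dimensions, preserving local neighborhood structure so
# similar representations form discrete clusters
embedding = TSNE(n_components=2, perplexity=10,
                 random_state=0).fit_transform(feats)
```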
Detection of Microscopic Tumor Infiltration in Skull Base Specimens
Using a patch-based classification method allows for a computationally efficient whole-slide SRH semantic segmentation method. SRH segmentation allows for improved image interpretation by surgeons and pathologists by providing spatial information along with the predicted diagnosis (Figure 5). Moreover, regions of microscopic tumor infiltration can be automatically detected and highlighted in SRH images. Tumor infiltration can be identified using the patch-level predictions (Figure 6).12,13 Importantly, detection of meningioma infiltration into grossly normal dura can improve extent of resection and potentially decrease recurrence rates. Our model detected microscopic tumor infiltration during skull base meningioma surgery (Figure 7). Some dural regions with contrast enhancement (ie, dural tails) did not show evidence of microscopic tumor infiltration, whereas other dural regions with no enhancement had clear evidence of meningioma involvement. These results demonstrate both the feasibility and the importance of microscopic evaluation of meningioma tumor margins.
In this study, we show that the combination of SRH and AI can provide an innovative pathway for intraoperative skull base tumor diagnosis and detection of microscopic tumor infiltration. We were able to achieve a +5.1% boost in diagnostic classification accuracy using contrastive representation learning compared with our previous AI training methods using cross-entropy. The model effectively identified regions of microscopic tumor infiltration and tumor-normal margins in meningioma SRH images.
Over the previous decade, the applications of AI in clinical medicine and neurosurgery have grown tremendously. Human-level diagnostic accuracy for image classification tasks has been achieved in multiple medical specialties, including ophthalmology,18 radiology,19 dermatology,20 and pathology.21,22 AI for intraoperative diagnostic decision support has been combined with mass spectrometry,23,24 optical coherence tomography,25 infrared spectroscopy,26,27 and Raman spectroscopy.28,29 We believe that the combination of advanced biomedical optical imaging and the latest discoveries in AI has the potential to provide accurate and real-time decision support for surgeons and pathologists.
A limitation of our study is that it included only a subset of skull base tumors, consisting of the most common skull base tumors and the most common “look-alike” lesions. We aimed to determine whether, given a sufficient amount of training data, we could develop an alternative diagnostic system using SRH and AI. As additional SRH training data become available for rare tumors, future studies will include additional skull base tumor diagnoses. Our proposed contrastive representation learning method can accommodate additional diagnostic classes without changing the training methodology described here.
Future directions include moving beyond histopathological diagnosis toward phenotypic and molecular characterization of brain tumors. The proposed model training technique is flexible, and data labels/model output can be easily changed or extended to include tumor grade, proliferation indices, and molecular diagnostic mutations. In addition, access to fresh tumor specimens provides a unique opportunity to develop optical imaging–based prognostic biomarkers that have the potential to predict response to treatment (eg, immunotherapy) and long-term clinical outcomes better than standard diagnostic methods alone.
Rapid intraoperative margin delineation in both benign tumors and malignant skull base tumors, including chordomas and sinonasal carcinomas, has the potential to improve recurrence-free and overall survival. This study demonstrated the general feasibility of using SRH and AI for the detection of microscopic tumor infiltration in real time at the surgical bedside. We applied these methods specifically to meningiomas because intraoperative Simpson grading is at risk for underestimating residual tumor, especially for grade I and II meningiomas.5 The proposed method may reduce residual tumor burden through rapid microscopic assessment of meningioma specimens.
We would like to acknowledge Tom Cichonski for his editorial contributions.
The study was partially funded by R01CA226527 (Dr Orringer).
Dr Orringer and Dr Hollon are shareholders in Invenio Imaging, Inc. Dr Pandian is an employee of Invenio Imaging, Inc. Dr Freudiger is an employee, executive, and shareholder in Invenio Imaging, Inc. The other authors have no personal, financial, or institutional interest in any of the drugs, materials, or devices described in this article. Dr Orringer has also received grants/payments from NX Development Corporation, Stryker Instruments, Designs for Vision, and DXCover (for serving on the Scientific Advisory Board).
1. Altshuler DB, Andrews CA, Parmar HA, Sullivan SE, Trobe JD. Imaging errors in distinguishing pituitary adenomas from other sellar lesions. J Neuroophthalmol. 2021;41(4):512-518.
2. Hollon T, Camelo-Piragua SI, McKean EL, Sullivan SE, Garton HJ. Surgical management of skull base Rosai-Dorfman disease. World Neurosurg. 2016;87:661.e5-661.e12.
3. Ierokomos A, Goin DW. Primary CNS lymphoma in the cerebellopontine angle. Report of a case. Arch Otolaryngol. 1985;111(1):50-52.
4. Kunimatsu A, Kunimatsu N. Skull base tumors and tumor-like lesions: a pictorial review. Pol J Radiol. 2017;82:398-409.
5. Ueberschaer M, Vettermann FJ, Forbrig R, et al. Simpson grade revisited—intraoperative estimation of the extent of resection in meningiomas versus postoperative somatostatin receptor positron emission tomography/computed tomography and magnetic resonance imaging. Neurosurgery. 2020;88(1):140-146.
6. Zhai Y, Bai J, Li M, et al. A nomogram to predict the progression-free survival of clival chordoma. J Neurosurg. 2019;134(1):144-152.
7. Novis DA, Zarbo RJ. Interinstitutional comparison of frozen section turnaround time. A College of American Pathologists Q-Probes study of 32868 frozen sections in 700 hospitals. Arch Pathol Lab Med. 1997;121(6):559-567.
8. Metter DM, Colgan TJ, Leung ST, Timmons CF, Park JY. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw Open. 2019;2(5):e194337.
9. Robboy SJ, Weintraub S, Horvath AE, et al. Pathologist workforce in the United States: I. Development of a predictive model to examine factors influencing supply. Arch Pathol Lab Med. 2013;137(12):1723-1732.
10. Freudiger CW, Min W, Saar BG, et al. Label-free biomedical imaging with high sensitivity by stimulated Raman scattering microscopy. Science. 2008;322(5909):1857-1861.
11. Orringer DA, Pandian B, Niknafs YS, et al. Rapid intraoperative histology of unprocessed surgical specimens via fibre-laser-based stimulated Raman scattering microscopy. Nat Biomed Eng. 2017;1:0027.
12. Hollon TC, Pandian B, Adapa AR, et al. Near real-time intraoperative brain tumor diagnosis using stimulated Raman histology and deep neural networks. Nat Med. 2020;26(1):52-58.
13. Hollon TC, Pandian B, Urias E, et al. Rapid, label-free detection of diffuse glioma recurrence using intraoperative stimulated Raman histology and deep neural networks. Neuro Oncol. 2020;23(1):144-155.
14. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. Paper presented at: IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016; Las Vegas, Nevada.
15. Freudiger CW, Yang W, Holtom GR, Peyghambarian N, Xie XS, Kieu KQ. Stimulated Raman scattering microscopy with a robust fibre laser source. Nat Photon. 2014;8(2):153-159.
16. Chen T, Kornblith S, Norouzi M, Hinton G. A Simple Framework for Contrastive Learning of Visual Representations; 2020. arXiv:2002.05709.
17. Khosla P, Teterwak P, Wang C, et al. Supervised Contrastive Learning; 2020. arXiv:2004.11362.
18. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410.
19. Titano JJ, Badgeley M, Schefflein J, et al. Automated deep-neural-network surveillance of cranial images for acute neurologic events. Nat Med. 2018;24(9):1337-1341.
20. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118.
21. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567.
22. Lu MY, Chen TY, Williamson DFK, et al. AI-based pathology predicts origins for cancers of unknown primary. Nature. 2021;594(7861):106-110.
23. Calligaris D, Feldman DR, Norton I, et al. MALDI mass spectrometry imaging analysis of pituitary adenomas for near-real-time tumor delineation. Proc Natl Acad Sci USA. 2015;112(32):9978-9983.
24. Santagata S, Eberlin LS, Norton I, et al. Intraoperative mass spectrometry mapping of an onco-metabolite to guide brain tumor surgery. Proc Natl Acad Sci USA. 2014;111(30):11121-11126.
25. Juarez-Chambi RM, Kut C, Rico-Jimenez JJ, et al. AI-assisted in situ detection of human glioma infiltration using a novel computational method for optical coherence tomography. Clin Cancer Res. 2019;25(21):6329-6338.
26. Hollon TC, Orringer DA. Shedding light on IDH1 mutation in gliomas. Clin Cancer Res. 2018;24(11):2467-2469.
27. Uckermann O, Juratli TA, Galli R, et al. Optical analysis of glioma: Fourier-transform infrared spectroscopy reveals the IDH1 mutation status. Clin Cancer Res. 2018;24(11):2530-2538.
28. Jermyn M, Mok K, Mercier J, et al. Intraoperative brain cancer detection with Raman spectroscopy in humans. Sci Transl Med. 2015;7(274):274ra19.
29. Kast R, Auner G, Yurgelevic S, et al. Identification of regions of normal grey matter and white matter from pathologic glioblastoma and necrosis in frozen sections using Raman imaging. J Neurooncol. 2015;125(2):287-295.