Automatic Evaluation of Histological Prognostic Factors Using Two Consecutive Convolutional Neural Networks on Kidney Samples : Clinical Journal of the American Society of Nephrology

Original Article: Glomerular and Tubulointerstitial Diseases

Automatic Evaluation of Histological Prognostic Factors Using Two Consecutive Convolutional Neural Networks on Kidney Samples

Marechal, Elise1,2,3; Jaugey, Adrien2,4; Tarris, Georges2,5; Paindavoine, Michel2,4,6; Seibel, Jean1,7; Martin, Laurent2,8; Funes de la Vega, Mathilde8; Crepin, Thomas2,3,7; Ducloux, Didier2,3,7; Zanetta, Gilbert1; Felix, Sophie5; Bonnot, Pierre Henri1; Bardet, Florian2,9; Cormier, Luc2,9; Rebibou, Jean-Michel1,2,3; Legendre, Mathieu1,2,3

CJASN 17(2):p 260-270, February 2022. | DOI: 10.2215/CJN.07830621

Introduction

Several studies have recently highlighted the importance of specific histologic criteria in the analysis of the kidney parenchyma on radical nephrectomies and pretransplant biopsies (1–4). These criteria include glomerular density, glomerular volume, vascular intimal thickness, and severity of interstitial fibrosis/tubular atrophy (IF/TA). The glomerular density, especially when extrapolated to total nephron number, has thus emerged as a major prognostic factor (5). Furthermore, glomerular hypertrophy, a classic feature of diabetes (6), hypertension (7), and obesity (8), seems to be associated with the onset of kidney failure in patients with surgical nephron mass reduction (1–3). Glomerular permeability, vascular intimal thickness, and IF/TA are also well-established prognostic markers of kidney function decrease (9–11).

Despite their interest, these markers are rarely used in common practice and/or are based on imprecise and poorly reproducible assessments. Indeed, manual tracing of the elements of interest (such as the glomerular area to calculate the glomerular density and the glomerular volume) is time consuming and tedious. Although the degree of luminal stenosis is a key element in the evaluation of vascular lesions, semiquantitative visual assessments are preferred to the exact measurements of the areas of internal and external elastic laminas. Similarly, for lack of any reliable evaluation method, IF/TA analysis relies mainly on imprecise visual scales with broad interindividual variability (11,12).

Artificial intelligence, and particularly deep learning, has led to many advances in nephropathology (12–15). Convolutional neural networks (CNN) have allowed precise segmentation of kidney structures, such as glomeruli, vessels, and tubules (16,17). These results represent significant advances but lack clinical applications (12,18). Most of the studies are concerned with descriptive histology on a previously delineated cortical area. Without such delineation, the evaluation of IF/TA in a medullary area could lead to a spurious overestimation. The development of two CNNs, to first delimit the cortical area and then the kidney structures, could allow the quantification of histologic lesions. Automated collection of histologic prognostic data would increase accuracy and reproducibility, while saving time.

This work sought to develop a free tool (composed of two CNNs) to automatically segment the cortex and then obtain glomerular volume, glomerular density, glomerular permeability, and the degree of IF/TA and intimal thickness.

Materials and Methods

Study Population

The kidney biopsies were obtained from kidney tissue analyzed between January 2010 and December 2020 at the Dijon University Hospital or between January 2016 and December 2020 at the Besançon University Hospital. The diagnoses associated with the biopsies were either minimal change disease (MCD), IgA nephropathy (without optical glomerular lesions in regions of interest; ROI), or focal interstitial nephritis (inflammatory focus at a distance from ROI). The nephrectomy samples were obtained from patients who underwent a kidney tumor resection between January 2016 and December 2018 at Dijon University Hospital. The samples were analyzed by three pathologists who confirmed the absence of any glomerular lesion.

Clinical and biologic data from patients were collected. The eGFR was estimated using the Chronic Kidney Disease Epidemiology Collaboration formula. For patients who were nephrectomized, data regarding follow-up 1 year after nephrectomy were collected. This work complied with the Declaration of Helsinki and was approved by the local ethics committee. Patients gave oral informed consent before the study.

Histologic Analyses

The kidney samples were formalin fixed and paraffin embedded, cut into sections 5 µm thick, and stained with blue (n=101) or green (n=140) Masson's Trichrome. The digitization of biopsy samples was carried out with a Nanozoomer 2.0 HT (Hamamatsu, Japan) (model C9600–12) at ×200 magnification and a 454 nm/pixel resolution. ROI were manually annotated using the ASAP annotation software. The annotations were made by two pathologists who had a degree in nephropathology. Their results were then merged.

Training, Test, and Application Cohorts

Biopsies and nephrectomies (n=241) were split into three independent cohorts (Figure 1). The Training cohort (n=65) was used to train two CNNs separately: one to detect the cortical area and a second to recognize kidney structures. The Test cohort (n=50) was used to validate the performance of the CNNs by comparing outlined ROI to predicted ones. The Application cohort (n=126) was used to evaluate prognostic data on kidney samples by comparing histopathologic assessment with CNN assessment. Cortical areas of kidney biopsies were segmented by the first CNN. These areas were then segmented by the second CNN.

Figure 1.:
Flow chart of the Training/validation, Test, and Application cohorts used to develop the convolutional neural networks. ROI, region of interest; MCD, minimal change disease.

Training and Testing the Neural Networks

We used Matterport’s implementation of Mask R-CNN, a deep neural network used to detect and classify objects in an image, while predicting a mask representing the pixels associated with each object, a task also called instance segmentation. It is directly based on the original Mask R-CNN and uses a ResNet-101-FPN backbone, which is adapted to recognize small objects that can exist at different sizes. This model extends the Faster R-CNN detector by predicting a mask for each detection (19). Images were processed on a computer equipped with a GTX 1080-Ti graphics card (11 GB VRAM). The only data modification performed in this work was a spatial augmentation, with a 50% probability at each epoch of the training.

One to nine ROI per kidney sample were processed with ×25 magnification (30,720*17,400 pixels) for cortex segmentation (first CNN) and ×200 magnification (3840*2176 pixels) for segmentation of kidney structures (second CNN). The localization of each ROI was randomly selected. The number of ROI was chosen to include ≥80 elements per class. For the first CNN, 220 ROI were used for training and 54 ROI for testing. The categories of annotations were: cortical area, medullary area, and capsule. Preprocessing of the pictures decreased the resolution to 2048*2048 pixels and cut them into four pictures of 1024*1024 pixels. The network was trained for 150 epochs. For the second CNN, 173 ROI and 67 ROI were used for training and testing, respectively. To assess the mean glomerular volume and glomerular density, glomeruli were annotated differently depending on whether the vascular pole of their tuft was visible (complete glomeruli) or not (partial glomeruli) (2,4,20). The other annotation categories were permeable glomerulus (delimited by the glomerular capsule), sclerotic glomerulus, normal tubule, atrophic tubule, artery, vein, and internal and external elastic laminas. Nonannotated cortical tissue was labeled interstitium. The numbers of annotated objects in each category are described in Supplemental Table 1. Preprocessing divided the pictures into smaller vignettes (1024*1024 pixels), with overlapping areas of ≥33% between them. In total, 2205 vignettes were used to train the CNN for 236 epochs. For both CNNs, the pictures underwent postprocessing to fuse the masks from different vignettes and filter the masks according to preestablished rules.
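The division of an ROI into overlapping vignettes can be illustrated with a minimal sketch. It is not the authors' preprocessing code: the function names `tile_origins` and `vignette_boxes` are hypothetical, and the even spacing of vignettes is an assumption. It only shows one way to cover a picture with 1024*1024 vignettes whose neighbors overlap by at least 33%, and it is not expected to reproduce the vignette count reported above.

```python
import math


def tile_origins(length, tile=1024, min_overlap=0.33):
    """Return start offsets of tiles covering `length` pixels along one axis.

    Assumption: tiles are spread evenly, and neighbors overlap by at least
    `min_overlap` of the tile size.
    """
    if length <= tile:
        return [0]
    # Largest stride that still leaves the required overlap between neighbors.
    max_stride = math.floor(tile * (1.0 - min_overlap))
    span = length - tile  # offset of the last tile
    steps = math.ceil(span / max_stride)
    return [(i * span) // steps for i in range(steps + 1)]


def vignette_boxes(width, height, tile=1024, min_overlap=0.33):
    """Return (x0, y0, x1, y1) boxes of all vignettes for one picture."""
    return [
        (x, y, x + tile, y + tile)
        for y in tile_origins(height, tile, min_overlap)
        for x in tile_origins(width, tile, min_overlap)
    ]


# Example with the ROI size used for the second CNN (3840*2176 pixels).
boxes = vignette_boxes(3840, 2176)
```

With these assumptions, a 3840*2176 ROI yields 6 columns and 3 rows of vignettes. Masks predicted in overlapping regions would then have to be fused, as described for the postprocessing step.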

Application of the Neural Networks

For the Application cohort, large kidney samples were analyzed (the whole biopsy samples and five randomly distributed rectangles of cortical area of 1.7 mm2 each of nontumoral kidney). The manual assessment was made before, and independent of, the automated one. Only the cortical area, glomerular tufts, and sclerotic glomeruli were manually outlined. Due to the large size of the samples, no performance evaluation or confusion matrix was computed in this cohort. The degree of IF/TA was visually assessed (with a 5% step). If the assessments of the two pathologists differed by 20% or more, a third nephropathologist was involved. Otherwise, the ground truth was the mean of their two assessments. For biopsies with a medium-sized cross-section artery (n=96 out of 126), an additional ROI was selected to assess luminal stenosis. Methods for calculating the prognostic morphometric data and the level of histologic lesions are detailed in Supplemental Table 2 (3,4,16,21).
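The rule used to build the visual IF/TA ground truth can be written as a short sketch. The function name `ifta_ground_truth` is hypothetical. Because the text does not state how the third nephropathologist's reading was combined with the first two, the sketch only flags the case for adjudication instead of guessing a value.

```python
def ifta_ground_truth(score_a, score_b, threshold=20.0):
    """Combine two visual IF/TA scores (in %) from two pathologists.

    Returns (ground_truth, needs_third_reader).
    If the scores differ by `threshold` points or more, no value is
    computed and the case is flagged for a third reader. How that reader's
    score is used is not specified in the text, so it is left open here.
    """
    if abs(score_a - score_b) >= threshold:
        return None, True
    return (score_a + score_b) / 2.0, False


# Two close readings are averaged; two distant readings are flagged.
agreed = ifta_ground_truth(10, 20)    # (15.0, False)
disputed = ifta_ground_truth(10, 30)  # (None, True)
```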

Kidney Computed Tomography Scans

The prenephrectomy computed tomography scans of the nontumor kidneys were used to determine the cortical kidney volume by measuring the large axis, the average cortical thickness, and the cortex/parenchyma ratio in transhilar axial cut, according to the method described by Glodny et al. (22). The cortical area enhanced in the arterial phase was delineated over a series of 20 axial cuts (Syngo.via, Siemens Healthineers). By extrapolation, the cortical volume was considered similar between the two kidneys (23). The total number of glomeruli per kidney was estimated by multiplying the glomerular density by the cortical volume (24) (Supplemental Figure 1).

Statistical Analysis

Quantitative variables were expressed as mean±SD. Performance for the detection and classification of objects was assessed by calculating Precision (the percentage of items belonging to a class among all of the items predicted to belong to it) and Recall (the percentage of items predicted to belong to a class among all of the items belonging to it). The F-score, which combines these two measures, was calculated according to the following formula: 2*(Precision*Recall)/(Precision+Recall). Intersection over union was also calculated (25). The Matthews correlation coefficient suitable for multiclass classification was calculated to evaluate the performance of the second CNN (26). The correlations between the data were assessed by Pearson or Spearman correlation tests on the basis of whether the distribution was normal. Bland–Altman analyses were used to assess biases and the accuracy of automated analysis (27). Receiver operating characteristic (ROC) curves were constructed for the prediction of clinically relevant parameters. Statistical analyses were performed with a two-sided alpha risk of 5% using GraphPad PRISM 6.01 software (GraphPad Software, La Jolla, CA) and IBM SPSS 23 (Statistical Package for the Social Sciences 23.0, IBM, Chicago).
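The Precision, Recall, and F-score definitions above translate directly into code. The following is a minimal sketch with hypothetical function names. The object counts in the example are made up for illustration and are not study data.

```python
def f_score(precision, recall):
    """F-score = 2*(Precision*Recall)/(Precision+Recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(tp, fp, fn):
    """Compute Precision, Recall, and F-score from object counts.

    tp: objects correctly predicted to belong to the class
    fp: objects wrongly predicted to belong to the class
    fn: objects of the class that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f_score(precision, recall)


# Illustrative counts only (not taken from the study).
p, r, f = precision_recall_f(tp=8, fp=2, fn=2)

# The formula can also be applied to published percentages, for example
# a Precision of 98% and a Recall of 96% give an F-score of about 0.97.
f_from_rates = round(f_score(0.98, 0.96), 2)
```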


Results

Evaluation on the Test Cohort

Both developed CNNs are freely accessible. They can be used on local computers or online. Patients’ characteristics are described in Table 1. In the Test cohort for the first network, 94% of pixels were correctly predicted to be part of the cortical area. The second network had a strong recognition of parenchymatous structures (Figure 2). Indeed, 96% of permeable glomeruli and 90% of normal tubules were correctly identified. The partial glomeruli, atrophic tubules, veins, and arteries were the structures with the lowest recognition rate. The Precision, Recall, and F-scores for object recognition are described in Table 2. The confusion matrix for the pixels of the structures detected by the second CNN is described in Figure 3. The elements were well segmented by the second CNN. For instance, the neural network correctly predicted the category for 96% of the pixels belonging to the category of normal tubules (pixels having been manually assigned to this category) and for 93% of the pixels belonging to the category of nonsclerotic glomeruli.

Table 1. - Characteristics of patients with a kidney sample included in the Test and Application cohorts
Characteristics | Test Cohort: All Patients (n=50) | Application Cohort: All Patients (n=126) | Application Cohort: Nephrectomy (n=42) | Application Cohort: Biopsy (n=84)
Age, yr | 48±20 | 51±21 | 65±11 | 43.7±20
Male sex, n (%) | 35 (70) | 88 (70) | 31 (74) | 57 (68)
Hypertension, n (%) | 26 (52) | 55 (44) | 31 (74) | 24 (29)
Diabetes, n (%) | 6 (12) | 26 (21) | 23 (24) | 3 (4)
IgA nephropathy, n (%) | 37 (74) | 24 (19) | 0 (0) | 24 (29)
Minimal change disease, n (%) | 7 (14) | 60 (48) | 0 (0) | 60 (71)
Kidney tumor resection, n (%) | 6 (12) | 42 (33) | 42 (100) | 0 (0)
eGFR at diagnosis, ml/min per 1.73 m2 | 78±29 | 82±27 | 71±28 | 88±27
eGFR 12 months after nephrectomy, ml/min per 1.73 m2 | | | 49±19 |
Chemotherapy within the year after biopsy, n (%) | | | 7 (17) |
Cortical volume of nontumoral kidney, mm3 | | | 146,779±53,168 |
Variables are expressed as mean±SD.

Figure 2.:
Kidney samples stained with Masson’s trichrome before and after convolutional neural network predictions. A region of interest (A) with the neural network prediction (B) (×200 magnification). False positives were observed with a peritubular capillary wrongly recognized as a vein (*) and interstitial tissue as an artery (¥). Internal and external elastic laminas were wrongly overexpanded in the upper vessel. A kidney biopsy of a patient with a minimal change disease (C) with the prediction of the cortical area (D) and smaller kidney elements (E) (×10 magnification). (B), (E) normal tubules (red), atrophic tubules (orange), Bowman’s capsule (yellow), nonsclerotic glomeruli (light green), and globally sclerotic glomeruli (light blue), internal elastic lamina (pink), external elastic lamina (purple), vein (deep blue). (D) Cortical area (red), capsule (deep blue), medullary area (green).

Table 2. - Convolutional neural network accuracy and recall for elements of interest (in number) in 45 regions of interest (Test cohort)
Elements of Interest | Precision, % (a) | Recall, % (b) | F-score (c) | Intersection over Union (d)
Normal tubules | 91 | 90 | 0.90 | 0.88
Atrophic tubules | 70 | 68 | 0.69 | 0.71
Nonsclerotic glomeruli | 98 | 96 | 0.97 | 0.92
Global glomeruli | 75 | 65 | 0.70 | 0.75
Partial glomeruli | 79 | 61 | 0.69 | 0.66
Globally sclerotic glomeruli | 53 | 89 | 0.67 | 0.78
Arteries | 57 | 64 | 0.61 | 0.62
Veins | 46 | 66 | 0.54 | 0.63
(a) Precision: percentage of items belonging to the interest class among items identified as belonging to the interest class.
(b) Recall: percentage of items identified as belonging to the interest class among all items belonging to the interest class.
(c) F-score: 2*(Precision*Recall)/(Precision+Recall).
(d) Intersection over union: (common area between the predicted and the annotated object)/(area of the predicted object + area of the annotated object - common area of the annotated and predicted object).
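
The intersection over union defined in the last footnote can be computed on a pair of binary masks. The sketch below is illustrative and uses a hypothetical function name, `mask_iou`. Masks are represented as nested lists of 0/1 values to keep the example free of dependencies, and the two toy masks are made up.

```python
def mask_iou(predicted, annotated):
    """Intersection over union of two binary masks of identical shape.

    IoU = common area / (predicted area + annotated area - common area),
    which equals the number of pixels set in both masks divided by the
    number of pixels set in at least one mask.
    """
    intersection = 0
    union = 0
    for pred_row, ann_row in zip(predicted, annotated):
        for p, a in zip(pred_row, ann_row):
            if p and a:
                intersection += 1
            if p or a:
                union += 1
    return intersection / union if union else 0.0


# Toy 3*3 masks: the objects share 2 pixels and cover 6 pixels in total.
predicted = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 0]]
annotated = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
iou = mask_iou(predicted, annotated)  # 2 / 6
```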

Figure 3.:
Confusion matrix per pixel to assess the performance of the Mask R-CNN neural network for multiclass segmentation in the Test cohort. For example, for pixels belonging to the category of nonsclerotic glomeruli (pixels having been manually assigned to this category), the neural network correctly predicted the right category for 93% of those pixels and predicted a wrong one (interstitium) for 7% of those pixels. The Matthews correlation coefficient was 0.85 for this confusion matrix.
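
For readers who want to reproduce this kind of summary statistic, a multiclass Matthews correlation coefficient can be derived from any confusion matrix. The sketch below uses the standard multiclass generalization of the coefficient. The function name `multiclass_mcc` and the small matrix are illustrative assumptions, not the study's pixel counts.

```python
import math


def multiclass_mcc(confusion):
    """Matthews correlation coefficient for a K*K confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    true_counts = [sum(confusion[i]) for i in range(k)]
    pred_counts = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    numerator = correct * total - sum(
        p * t for p, t in zip(pred_counts, true_counts)
    )
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    return numerator / denominator if denominator else 0.0


# Made-up 3-class matrix (rows: annotated class, columns: predicted class).
example = [[50, 3, 2],
           [4, 40, 6],
           [1, 5, 45]]
mcc = multiclass_mcc(example)
```

A value of 1 indicates perfect agreement between annotated and predicted classes, and a value near 0 indicates agreement no better than chance.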

Application of the Neural Network

A total of 84 biopsies and 42 nephrectomies were analyzed in the Application cohort (Table 1). The automated analysis (through the first and second CNN) required about 15 minutes for a kidney biopsy and 7 minutes for a rectangle of nephrectomy (second CNN only) (Figure 2, Supplemental Figure 2, Supplemental References). CNN performance is presented in Figure 4 and Table 3. The cortical area predicted by the first CNN was close to the manually outlined one (r=0.89, P<0.001) but tended to be overestimated. The algorithm derived from the second CNN provided predicted prognostic data that were strongly correlated with the expected ones. Regarding glomeruli analyses, the highest correlation was observed with glomerular volume (r=0.85, P<0.001). Glomerular density and percentage of permeable glomeruli had lower correlation coefficients (r=0.51 and r=0.36, respectively, both P<0.001). Computed evaluation of TA and IF was significantly associated with visual evaluation. The average differences between the expected and observed scores are presented in Figure 4. The algorithm had a good ability to predict TA and IF >25% (ROC curve with an area under the curve of 0.92 and 0.91, respectively, with both P<0.001).
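
The average difference between automated and visual scores corresponds to the mean bias of a Bland–Altman analysis. A minimal sketch of that calculation follows. The function name `bland_altman` is hypothetical, the differences are taken as predicted minus observed (an assumed sign convention), and the paired values are invented for illustration only.

```python
import statistics


def bland_altman(predicted, observed):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias: mean of the differences (predicted - observed)
    limits: bias +/- 1.96 * SD of the differences (95% limits of agreement)
    """
    differences = [p - o for p, o in zip(predicted, observed)]
    bias = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented paired IF scores in % (not study data).
predicted = [28.0, 32.0, 20.0, 30.0]
observed = [20.0, 20.0, 10.0, 20.0]
bias, lower, upper = bland_altman(predicted, observed)
```

With this sign convention, a positive bias means that the automated score is, on average, higher than the visual one.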

Figure 4.:
Evaluation of prognostic factors in the Application cohort. The factors assessed were mean glomerular volume (A), glomerular density (B), interstitial fibrosis (C–E), tubular atrophy (F–H), and vascular luminal stenosis through intimal thickening (I–K). The significant correlations between the observed factors and those predicted by the convolutional neural networks were assessed by Pearson or Spearman correlation tests on the basis of whether the distribution was normal (A–C), (F), (I). Bland–Altman plot showing a systematic overestimation of predicted tubular atrophy or interstitial fibrosis (D), (G) and an underestimation of luminal stenosis (J). The mean bias is represented by the big dashed black lines with the 95% limits of agreement represented by the small dashed lines. Receiver operating characteristic curves assessing the capacity of the algorithm to predict interstitial fibrosis, and tubular atrophy over 25% (E), (H) and a luminal stenosis over 50% (K).
Table 3. - Correlations between prognostic factors observed and predicted in the Application cohort
Prognostic Factors All Samples (n=126) Biopsy (n=84) Nephrectomy (n=42)
Predicted Observed r Coefficients Predicted Observed r Coefficients Predicted Observed r Coefficients
Cortical area, mm2 6.2±3.0 5.6±2.9 0.89 (P<0.001)
Mean glomerular volume, µm3×106 11.2±5.6 12.8±5.9 0.85 (P<0.001) 8.8±4.0 11.4±5.3 0.83 (P<0.001) 16.1±4.9 15.5±6.2 0.92 (P<0.001)
Glomerular density, glomeruli per cortical mm2 17.3±7.8 11.5±6.2 0.51 (P<0.001) 17.3±8.5 10.3±5.4 0.45 (P<0.001) 17.1±6.1 13.8±7.1 0.77 (P<0.001)
Permeable glomeruli, % 84.5±16.8 94.6±7.1 0.36 (P<0.001) 80.5±18.6 94.8±7.8 0.41 (P<0.001) 92.8±7.9 94.3±5.7 0.41 (P=0.01)
Glomeruli per kidney 1.3E+006±58.8E+004 1.1E+006±53.3E+004 0.75 (P<0.001)
Interstitial fibrosis, % 27.1±11.3 17.3±9.6 0.75 (P<0.001) 29.6±12.3 17.7±10.7 0.83 (P<0.001) 22.2±7.3 16.7±7.2 0.75 (P<0.001)
Tubular atrophy, % 26.9±14.3 17.3±9.6 0.71 (P<0.001) 29.9±15.7 17.7±10.7 0.77 (P<0.001) 21.2±8.8 16.7±7.2 0.68 (P<0.001)
Variables are expressed as mean±SD.

For artery-centered ROI analysis, the average percentage of luminal stenosis was 51%±16% in manual analysis and 37%±17% in automated analysis (r=0.73, P<0.001), with an underestimation of the degree of stenosis by the automated analysis. The performance of the algorithm in predicting a luminal stenosis >50% is presented in Figure 4.

Clinico-biologic Effect

Among patients who were nephrectomized, the average total number of glomeruli per kidney was 1.1E+006±53.3E+004 for the manual analysis and 1.3E+006±58.8E+004 for the automated analysis (r=0.75, P<0.001). The number of glomeruli, either manually or automatically calculated, was correlated with the eGFR on the day of the nephrectomy (r=0.43 with P=0.02 and r=0.38 with P=0.04 for manual and automated analysis, respectively) and 1 year after the nephrectomy (r=0.58 with P=0.001 and r=0.61 with P<0.001 for manual and automated analysis, respectively). The other histologic parameters did not correlate with the initial or the follow-up GFR.


Discussion

In this histologic study, we segmented kidney samples using two CNNs to obtain prognostic data. A first network isolated the cortical area, while the second segmented the structures of interest.

The number of annotations in the Training cohort was relatively small for a multiclass segmentation task, but was not smaller than in previous studies (16,17). However, a higher number of objects could have improved the performance of the CNNs. To the best of our knowledge, this work was the first to automatically isolate the cortex area before kidney segmentation. Yet cortical delimitation is mandatory to evaluate IF/TA or glomerular density and should be systematically performed in kidney segmentation with CNNs. Performance for detection and segmentation of smaller kidney structures in the Test cohort was similar to the published data. Indeed, we were able to detect 96% and 89% of the nonsclerotic and sclerotic glomeruli, whereas Hermsen et al. detected 93% and 76%, respectively (16). Jayapandian et al. obtained an F-score for the recognition of permeable glomerulus of 0.89 on trichrome staining, whereas we reached a 0.97 F-score in our work. Besides, optimal recognition of tubules and glomeruli was not obtained at the same level of magnification in their study, requiring multiple inferences if their tool were to be used in common practice (17).

We designed our CNNs to evaluate prognostic data that had never been studied with artificial intelligence. The correlation coefficients between the two modes of evaluation (manual and automatic) were >0.70 for the glomerular volume, the total nephron number, IF/TA, and the luminal stenosis estimation. The CNNs demonstrated a good predictive ability to detect a significant IF/TA and luminal stenosis, even if suboptimal identification results were obtained on atrophic tubule, vein, and artery recognition. Inaccurate classification of some tubules and veins could weaken the automated evaluation of TA and/or IF and lead to an overestimation of IF/TA. The use of a 10% corrective factor is probably necessary to compare visual and automated estimations of IF/TA and luminal stenosis.

Despite compatible orders of magnitude (4,16), comparing our data with those of the literature is difficult because of the variability in tissue-cutting and preservation techniques. The mean glomerular volume varies greatly between studies, making it difficult to build nomograms or to set a threshold defining glomerular hypertrophy. By combining the computed tomography scan data with the histologic data, we obtained two estimates (manual and automated) of the number of nephrons per kidney. These two estimates were well correlated and close to each other. The estimated number of glomeruli per kidney in our study was between 1 and 1.3 million, which was consistent with the previously published data (28). As expected, there was a high correlation between the predicted number of glomeruli and kidney function before and after nephrectomy in our study. This correlation was observed whether the count of glomeruli was manual or automated. Therefore, we believe our algorithm can help to determine the nephron reserve of patients. Although significant, our results were less convincing and therefore need to be nuanced for glomerular density and permeability. Several factors could explain these suboptimal correlations. One of the factors involved in the glomerular density estimation was the partial or complete nature of the glomerulus. Yet, the automated distinction between these two types of glomeruli was weakened because the vascular pole was sometimes confused with a merging of the tuft with the capsule. Moreover, to facilitate the cortical/medullary areas distinction, manual annotations were drawn on low-magnification images. The visual delimitation of two areas may be subjective, with a gray area at the junction, and could explain some differences between the CNN and manual evaluations of cortical area. Thus, even if our automated measure of glomerular density seems to perform well and can provide a good evaluation of the total nephron number, it is probably less reliable than a manual one.

The set of collected histologic data was chosen for its prognostic value either in kidney transplantation or after nephrectomy. In living kidney donors, glomerular volume proved useful for predicting either the onset of CKD (2,3) or proteinuria (3). Total nephron number, glomerular density, and IF/TA were able to predict the occurrence of proteinuria or CKD (2,3). Arteriosclerosis, evaluated through the luminal stenosis by intimal thickening, was predictive of new-onset hypertension (2). Similarly, glomerular volume, glomerular density, glomerular permeability, low nephron number, and IF/TA were predictive of CKD progression in patients who underwent nephrectomy for kidney cancer (1,2,7,24). Our automatically acquired histologic dataset makes it possible to assess the nephron reserve, the glomerular compartment, the tubulo-interstitial compartment, and finally the vascular compartment. It is a fast, reproducible tool that spares the need for multiple evaluations by different pathologists.

The proportion of patients with MCD or a kidney tumor resection was lower in the Test cohort than in the Application cohort. Even if the characteristics such as sex, age, and eGFR were close between groups, differences in patient distribution could lead to a bias in the interpretation of the results. Although training images covered a wide range of fibrosis severity, most biopsies in the Application cohort had few fibrotic lesions. This is probably linked to the fact that most patients were young and had an MCD. Moreover, evaluation of the correlation between the histologic criteria and kidney function of these patients did not appear to be appropriate, due to the risk of functional kidney failure secondary to nephrotic syndrome. To reduce this bias, patients who were nephrectomized were also evaluated. These patients were older and more prone to have hypertension and fibrotic lesions. In these patients, only the total nephron number was correlated with the kidney function, probably because of a relatively short follow-up. Larger cohorts with a long-term follow-up are necessary to generalize our work and to compare the prognostic performances of the automated and visual assessments.

Although Periodic Acid–Schiff or Jones silver stains may be more appropriate to evaluate glomerular and tubular basement membranes (1,2,16), we chose Masson’s trichrome because it seemed to provide the best visual assessment of fibrosis, while allowing assessment of TA and elastic membranes. This staining is widespread and performed routinely in most pathology centers. Training and evaluation focused on blue and green trichromes in two different centers, which enabled better adaptation of the CNN and facilitates further generalization. However, future validation in other centers is mandatory. Because interindividual variability in the appreciation of IF/TA is well known, we decided to merge the pathologists' evaluations instead of comparing them. We used a ×25 and then a ×200 level of magnification with a high resolution. The use of another level of magnification by the user could weaken the results.

CNNs are the most widely used and efficient technique for achieving object segmentation in the biomedical field (29–31). Most of the previous work focusing on object recognition and segmentation in kidney pathology used another neural network (U-Net network) (16,32,33). Mask R-CNN, however, enabled us to identify and separate smaller elements, which we thought was interesting for the delimitation of small structures such as tubules (34). Mask R-CNN is rarely used in kidney pathology but its effectiveness in biomedical imaging is well established (34–36). To obtain a better-generalized model, we used spatial augmentation in the training. To avoid alteration of the images, no color augmentation was performed. Nevertheless, this could have been useful to compensate for differences in the staining quality of the images, depending on the pathology center or the origin of the tissue.

There are several previously reported clinical applications of artificial intelligence on kidney tissue. Diagnostic aids are available in kidney histology (37), especially in IgA nephropathy (33) and immunofluorescence (38). In kidney transplantation, studies have been published with an automated determination of the proportion of permeable glomeruli on pretransplant biopsies (39) or algorithms to predict C4d staining detection in allograft biopsies (40). However, these initiatives are rare and mainly limited by the use of one single histologic parameter. The interest of our work lies in the assessment of several histologic parameters of acknowledged prognostic value. This assessment is fast and does not require expensive specific computer equipment because it can be used online. However, our algorithm’s performance on kidney grafts or more pathologic tissues might be less convincing and needs to be evaluated. Some other features, such as the use of other stains and the evaluation of inflammation, could be implemented to allow evaluation on frozen tissue and allograft rejection among others (12).

This work presented a freely available algorithm that enables a fast, reliable, and fully automatic assessment of glomerular volume, glomerular density, glomerular permeability, IF/TA, and intimal fibrosis.


Disclosures

J.-M. Rebibou reports having other interests/relationships with the French Society of Nephrology Dialysis and Transplantation and the International Society of Nephrology. L. Cormier reports receiving honoraria from Astellas, Ipsen, Janssen, and Sanofi. All remaining authors have nothing to disclose.


Funding

This work was funded by the NEPHRIN-APJ2019 (Appel d’offre jeunes chercheurs) GIRCI EST (47755 euros) (to M. Legendre).

Published online ahead of print. Publication date available at


Acknowledgments

We thank the University of Dijon, the University Hospital of Dijon, and the ESIREM engineering school.

Supplemental Material

This article contains the following supplemental material online at

Supplemental Figure 1. Evaluation of the cortical volume of the nontumoral kidney on a computed tomography scan.

Supplemental Figure 2. Kidney cortical samples from the Application cohort after evaluation with the convolutional neural networks.

Supplemental Table 1. Number of annotated objects in each category for Training and Test cohorts.

Supplemental Table 2. Formulas for parameters of interest.

Supplemental References


References

1. Denic A, Elsherbiny H, Mullan AF, Leibovich BC, Thompson RH, Ricaurte Archila L, Narasimhan R, Kremers WK, Alexander MP, Lieske JC, Lerman LO, Rule AD: Larger nephron size and nephrosclerosis predict progressive CKD and mortality after radical nephrectomy for tumor and independent of kidney function. J Am Soc Nephrol 31: 2642–2652, 2020
2. Issa N, Vaughan LE, Denic A, Kremers WK, Chakkera HA, Park WD, Matas AJ, Taler SJ, Stegall MD, Augustine JJ, Rule AD: Larger nephron size, low nephron number, and nephrosclerosis on biopsy as predictors of kidney function after donating a kidney. Am J Transplant 19: 1989–1998, 2019
3. Merzkani MA, Denic A, Narasimhan R, Lopez CL, Larson JJ, Kremers WK, Chakkera HA, Park WD, Taler SJ, Stegall MD, Alexander MP, Issa N, Rule AD: Kidney microstructural features at the time of donation predict long-term risk of chronic kidney disease in living kidney donors. Mayo Clin Proc 96: 40–51, 2021
4. Issa N, Lopez CL, Denic A, Taler SJ, Larson JJ, Kremers WK, Ricaurte L, Merzkani MA, Alexander MP, Chakkera HA, Stegall MD, Augustine JJ, Rule AD: Kidney structural features from living donors predict graft failure in the recipient. J Am Soc Nephrol 31: 415–423, 2020
5. Rule AD, Semret MH, Amer H, Cornell LD, Taler SJ, Lieske JC, Melton LJ 3rd, Stegall MD, Textor SC, Kremers WK, Lerman LO: Association of kidney function and metabolic risk factors with density of glomeruli on renal biopsy samples from living donors. Mayo Clin Proc 86: 282–290, 2011
6. Tonneijck L, Muskiet MHA, Smits MM, van Bommel EJ, Heerspink HJL, van Raalte DH, Joles JA: Glomerular hyperfiltration in diabetes: Mechanisms, clinical significance, and treatment. J Am Soc Nephrol 28: 1023–1039, 2017
7. Hoy WE, Bertram JF, Denton RD, Zimanyi M, Samuel T, Hughson MD: Nephron number, glomerular volume, renal disease and hypertension. Curr Opin Nephrol Hypertens 17: 258–265, 2008
8. Yang S, Cao C, Deng T, Zhou Z: Obesity-related glomerulopathy: A latent change in obesity requiring more attention. Kidney Blood Press Res 45: 510–522, 2020
9. Zhang J, Song H, Li D, Lv Y, Chen B, Zhou Y, Ding X, Chen C: Role of clinicopathological features for the early prediction of prognosis in lupus nephritis. Immunol Res 69: 285–294, 2021
10. Coppo R, D’Arrigo G, Tripepi G, Russo ML, Roberts ISD, Bellur S, Cattran D, Cook TH, Feehally J, Tesar V, Maixnerova D, Peruzzi L, Amore A, Lundberg S, Di Palma AM, Gesualdo L, Emma F, Rollino C, Praga M, Biancone L, Pani A, Feriozzi S, Polci R, Barratt J, Del Vecchio L, Locatelli F, Pierucci A, Caliskan Y, Perkowska-Ptasinska A, Durlik M, Moggia E, Ballarin JC, Wetzels JFM, Goumenos D, Papasotiriou M, Galesic K, Toric L, Papagianni A, Stangou M, Benozzi L, Cusinato S, Berg U, Topaloglu R, Maggio M, Ots-Rosenberg M, D’Amico M, Geddes C, Balafa O, Quaglia M, Cravero R, Lino Cirami C, Fellstrom B, Floege J, Egido J, Mallamaci F, Zoccali C; ERA-EDTA Immunonephrology Working Group: Is there long-term value of pathology scoring in immunoglobulin A nephropathy? A validation study of the Oxford Classification for IgA nephropathy (VALIGA) update. Nephrol Dial Transplant 35: 1002–1009, 2020
11. Ginley B, Jen K-Y, Han SS, Rodrigues L, Jain S, Fogo AB, Zuckerman J, Walavalkar V, Miecznikowski JC, Wen Y, Yen F, Yun D, Moon KC, Rosenberg A, Parikh C, Sarder P: Automated computational detection of interstitial fibrosis, tubular atrophy, and glomerulosclerosis. J Am Soc Nephrol 32: 837–850, 2021
12. Becker JU, Mayerich D, Padmanabhan M, Barratt J, Ernst A, Boor P, Cicalese PA, Mohan C, Nguyen HV, Roysam B: Artificial intelligence and machine learning in nephropathology. Kidney Int 98: 65–75, 2020
13. Xie G, Chen T, Li Y, Chen T, Li X, Liu Z: Artificial intelligence in nephrology: How can artificial intelligence augment nephrologists’ intelligence? Kidney Dis 6: 1–6, 2020
14. Hou J, Nast CC: Artificial intelligence: The next frontier in kidney biopsy evaluation. Clin J Am Soc Nephrol 15: 1389–1391, 2020
15. Burlacu A, Iftene A, Jugrin D, Popa IV, Lupu PM, Vlad C, Covic A: Using artificial intelligence resources in dialysis and kidney transplant patients: A literature review. BioMed Res Int 2020: 9867872, 2020
16. Hermsen M, de Bel T, den Boer M, Steenbergen EJ, Kers J, Florquin S, Roelofs JJTH, Stegall MD, Alexander MP, Smith BH, Smeets B, Hilbrands LB, van der Laak JAWM: Deep learning-based histopathologic assessment of kidney tissue. J Am Soc Nephrol 30: 1968–1979, 2019
17. Jayapandian CP, Chen Y, Janowczyk AR, Palmer MB, Cassol CA, Sekulic M, Hodgin JB, Zee J, Hewitt SM, O’Toole J, Toro P, Sedor JR, Barisoni L, Madabhushi A, Sedor J, Dell K, Schachere M, Negrey J, Lemley K, Lim E, Srivastava T, Garrett A, Sethna C, Laurent K, Appel G, Toledo M, Barisoni L, Greenbaum L, Wang C, Kang C, Adler S, Nast C, LaPage J, Stroger JH, Athavale A, Itteera M, Neu A, Boynton S, Fervenza F, Hogan M, Lieske J, Chernitskiy V, Kaskel F, Kumar N, Flynn P, Kopp J, Blake J, Trachtman H, Zhdanova O, Modersitzki F, Vento S, Lafayette R, Mehta K, Gadegbeku C, Johnstone D, Quinn-Boyle S, Cattran D, Hladunewich M, Reich H, Ling P, Romano M, Fornoni A, Bidot C, Kretzler M, Gipson D, Williams A, LaVigne J, Derebail V, Gibson K, Froment A, Grubbs S, Holzman L, Meyers K, Kallem K, Lalli J, Sambandam K, Wang Z, Rogers M, Jefferson A, Hingorani S, Tuttle K, Bray M, Kelton M, Cooper A, Freedman B, Lin JJ; Nephrotic Syndrome Study Network (NEPTUNE): Development and evaluation of deep learning-based segmentation of histologic structures in the kidney cortex with multiple histologic stains. Kidney Int 99: 86–101, 2021
18. Santo BA, Rosenberg AZ, Sarder P: Artificial intelligence driven next-generation renal histomorphometry. Curr Opin Nephrol Hypertens 29: 265–272, 2020
19. He K, Gkioxari G, Dollár P, Girshick R: Mask R-CNN. IEEE Trans Pattern Anal Mach Intell 42: 386–397, 2020
20. Haruhara K, Tsuboi N, Sasaki T, Amano H, Tanaka M, Koike K, Kanzaki G, Okabayashi Y, Miyazaki Y, Ogura M, Yokoo T: Volume ratio of glomerular tufts to Bowman capsules and renal outcomes in nephrosclerosis. Am J Hypertens 32: 45–53, 2019
21. Weibel ER, Gomez DM: A principle for counting tissue structures on random sections. J Appl Physiol 17: 343–348, 1962
22. Glodny B, Unterholzner V, Taferner B, Hofmann KJ, Rehder P, Strasak A, Petersen J: Normal kidney size and its influencing factors: A 64-slice MDCT study of 1.040 asymptomatic patients. BMC Urol 9: 19, 2009
23. Seibel J, Rebibou J-M, Legendre M: Can total nephron number predict progressive CKD after radical nephrectomy? J Am Soc Nephrol 32: 517, 2021
24. Sasaki T, Tsuboi N, Kanzaki G, Haruhara K, Okabayashi Y, Koike K, Kobayashi A, Yamamoto I, Ogura M, Hoy WE, Bertram JF, Shimizu A, Yokoo T: Biopsy-based estimation of total nephron number in Japanese living kidney donors. Clin Exp Nephrol 23: 629–637, 2019
25. Seo H, Badiei Khuzani M, Vasudevan V, Huang C, Ren H, Xiao R, Jia X, Xing L: Machine learning techniques for biomedical image segmentation: An overview of technical aspects and introduction to state-of-art applications. Med Phys 47: e148–e167, 2020
26. Gorodkin J: Comparing two K-category assignments by a K-category correlation coefficient. Comput Biol Chem 28: 367–374, 2004
27. Giavarina D: Understanding Bland–Altman analysis. Biochem Med (Zagreb) 25: 141–151, 2015
28. Denic A, Lieske JC, Chakkera HA, Poggio ED, Alexander MP, Singh P, Kremers WK, Lerman LO, Rule AD: The substantial loss of nephrons in healthy human kidneys with aging. J Am Soc Nephrol 28: 313–320, 2017
29. Chen C, Qin C, Qiu H, Tarroni G, Duan J, Bai W, Rueckert D: Deep learning for cardiac image segmentation: A review. Front Cardiovasc Med 7: 25, 2020
30. Zegers CML, Posch J, Traverso A, Eekers D, Postma AA, Backes W, Dekker A, van Elmpt W: Current applications of deep-learning in neuro-oncological MRI. Phys Med 83: 161–173, 2021
31. Khan MA, Sharif M, Akram T, Damaševičius R, Maskeliūnas R: Skin lesion segmentation and multiclass classification using deep learning features and improved moth flame optimization. Diagnostics (Basel) 11: 811, 2021
32. Bouteldja N, Klinkhammer BM, Bülow RD, Droste P, Otten SW, Freifrau von Stillfried S, Moellmann J, Sheehan SM, Korstanje R, Menzel S, Bankhead P, Mietsch M, Drummer C, Lehrke M, Kramann R, Floege J, Boor P, Merhof D: Deep learning-based segmentation and quantification in experimental kidney histopathology. J Am Soc Nephrol 32: 52–68, 2021
33. Zeng C, Nan Y, Xu F, Lei Q, Li F, Chen T, Liang S, Hou X, Lv B, Liang D, Luo W, Lv C, Li X, Xie G, Liu Z: Identification of glomerular lesions and intrinsic glomerular cell types in kidney diseases via deep learning. J Pathol 252: 53–64, 2020
34. Vuola AO, Akram SU, Kannala J: Mask-RCNN and U-Net ensembled for nuclei segmentation. In: 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), pp 208–212, 2019
35. Zhang Y, Chan S, Park VY, Chang K-T, Mehta S, Kim MJ, Combs FJ, Chang P, Chow D, Parajuli R, Mehta RS, Lin C-Y, Chien S-H, Chen J-H, Su M-Y: Automatic detection and segmentation of breast cancer on MRI using mask R-CNN trained on non-fat-sat images and tested on fat-sat images [published online ahead of print December 11, 2020]. Acad Radiol
36. Loh R, Yong WX, Yapeter J, Subburaj K, Chandramohanadas R: A deep learning approach to the screening of malaria infection: Automated and rapid cell counting, object detection and instance segmentation using Mask R-CNN. Comput Med Imaging Graph 88: 101845, 2021
37. Uchino E, Suzuki K, Sato N, Kojima R, Tamada Y, Hiragi S, Yokoi H, Yugami N, Minamiguchi S, Haga H, Yanagita M, Okuno Y: Classification of glomerular pathological findings using deep learning and nephrologist-AI collective intelligence approach. Int J Med Inform 141: 104231, 2020
38. Ligabue G, Pollastri F, Fontana F, Leonelli M, Furci L, Giovanella S, Alfano G, Cappelli G, Testa F, Bolelli F, Grana C, Magistroni R: Evaluation of the classification accuracy of the kidney biopsy direct immunofluorescence through convolutional neural networks. Clin J Am Soc Nephrol 15: 1445–1454, 2020
39. Marsh JN, Matlock MK, Kudose S, Liu T-C, Stappenbeck TS, Gaut JP, Swamidass SJ: Deep learning global glomerulosclerosis in transplant kidney frozen sections. IEEE Trans Med Imaging 37: 2718–2728, 2018
40. Kim Y-G, Choi G, Go H, Cho Y, Lee H, Lee A-R, Park B, Kim N: A fully automated system using a convolutional neural network to predict renal allograft rejection: Extra-validation with giga-pixel immunostained slides. Sci Rep 9: 5123, 2019

Keywords: renal pathology; deep learning; prognosis; neural networks, computer

Copyright © 2022 by the American Society of Nephrology