Using a Multi-Institutional Pediatric Learning Health System to Identify Systemic Lupus Erythematosus and Lupus Nephritis: Development and Validation of Computable Phenotypes : Clinical Journal of the American Society of Nephrology


Original Article: Glomerular and Tubulointerstitial Diseases

Using a Multi-Institutional Pediatric Learning Health System to Identify Systemic Lupus Erythematosus and Lupus Nephritis

Development and Validation of Computable Phenotypes

Wenderfer, Scott E.1; Chang, Joyce C.2; Goodwin Davies, Amy3; Luna, Ingrid Y.3; Scobell, Rebecca1,3; Sears, Cora2; Magella, Bliss4; Mitsnefes, Mark4,5; Stotter, Brian R.6; Dharnidharka, Vikas R.6; Nowicki, Katherine D.7; Dixon, Bradley P.8; Kelton, Megan9,10; Flynn, Joseph T.9,10; Gluck, Caroline11; Kallash, Mahmoud12,13; Smoyer, William E.12,13; Knight, Andrea14; Sule, Sangeeta15; Razzaghi, Hanieh3; Bailey, L. Charles3,16; Furth, Susan L.16,17; Forrest, Christopher B.3,16; Denburg, Michelle R.16,17,18; Atkinson, Meredith A.19

CJASN 17(1):p 65-74, January 2022. | DOI: 10.2215/CJN.07810621



Childhood-onset SLE is associated with higher rates of kidney involvement and mortality than adult-onset disease. Data are limited on the comparative effectiveness of most diagnostic and management approaches in childhood SLE. Up to 20% of all SLE cases are diagnosed before age 18 years (1–6), and 35%–70% develop kidney involvement (2,3,7,8). Lupus nephritis develops earlier and behaves more aggressively in childhood SLE compared with adult-onset SLE (9). The morbidity of kidney disease in childhood SLE is high, contributing 57% of SLE-related hospitalizations (10). Only half of children with lupus nephritis achieve remission of their nephritis, leaving them at high risk for kidney failure (1,11).

There are substantial barriers to conducting clinical trials in childhood SLE and lupus nephritis, including limited patient populations dispersed across numerous clinical centers, which raise recruitment costs and reduce trial feasibility. Electronic health records (EHRs) are an important tool in clinical research and can efficiently identify and study large numbers of patients with uncommon diseases (12). EHR-based clinical data research networks offer opportunities to utilize data from multiple pediatric health care systems to assess disease manifestations, natural history, practice patterns, and response to therapy. This opportunity is dependent, however, on accurate identification of affected patients.

PEDSnet, a member of the National Patient-Centered Clinical Research Network (PCORnet), is a multicenter network of pediatric health care systems containing data on >6.7 million children. PEDSnet is a national-scale learning health system that enables identification and study of patient cohorts using case-defining algorithms (13–15). It has the potential to be a useful resource for rare diseases that are challenging to study outside of specific patient registries. The objective of this study was to develop and validate the first computable phenotypes for identification of patients with childhood SLE and lupus nephritis in PEDSnet.

Materials and Methods

Data Source

Institutions that contributed data for this study included the Children’s Hospital of Philadelphia (CHOP), Cincinnati Children’s Hospital Medical Center, Children’s Hospital Colorado, Nemours Children’s Health System, Nationwide Children’s, St. Louis Children’s, and Seattle Children’s Hospital (14). PEDSnet harmonizes data from these institutions’ EHRs into a common data model. Database version 3.7 was used for this study, which includes EHR data for over 6.7 million children with at least one encounter or diagnosis between January 2009 and December 2019 (16). Visit diagnoses are standardized to the Systematized Nomenclature of Medicine, Clinical Terms (SNOMED-CT).

PEDSnet data management includes extensive quality assessment, with >1000 tests performed quarterly by the Data Coordinating Center at CHOP (17,18). The core data resource is implemented as a Health Insurance Portability and Accountability Act–limited dataset containing structured primary data from clinical care, including dates but excluding direct patient identifiers.

The study was approved by the institutional review board at CHOP (institutional review board protocol numbers 14–011242, 16–012878, and 16–013563) with a waiver of consent. A Master Reliance Agreement covered research by all other institutions.

Development of Computable Phenotypes

The study objective was to identify a cohort of children <18 years old with childhood SLE and a subcohort with an additional diagnosis of lupus nephritis. Computable phenotypes for childhood SLE and lupus nephritis were designed in an iterative manner, beginning with algorithms on the basis of ICD-9-CM codes in the published literature (19–22). Algorithms were initially developed using an existing highly phenotyped discovery cohort, consisting of a REDCap database of patients with childhood SLE and one or more outpatient visits between 2008 and 2016 at a single center (Supplemental Table 1). This discovery cohort was derived from all patients with one or more ICD-9/10-CM diagnosis codes for SLE in the EHR, subdivided into gold standard definite SLE cases, probable SLE cases, and non-SLE cases through manual chart review by a single trained study coordinator (23).

Utilization criteria on the basis of provider/clinic specialty were developed and applied to both cases and noncases to maximize specificity for both SLE and lupus nephritis algorithms. Visit types included ambulatory office encounters, inpatient hospitalizations, and infusion center appointments. The provider code set included nephrology, rheumatology, pediatric nephrology, and pediatric rheumatology codes (Supplemental Table 2).
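The utilization filter can be sketched in a few lines. This is a minimal illustration, not the PEDSnet implementation: it assumes the thresholds the article states for the denominator population (two or more nephrology or rheumatology visits and ≥60 days of follow-up), assumes follow-up is measured from the first to the last qualifying visit, and uses hypothetical specialty labels in place of the provider code set in Supplemental Table 2.

```python
from datetime import date

# Hypothetical specialty labels; the real provider code set is in Supplemental Table 2.
QUALIFYING_SPECIALTIES = {
    "nephrology",
    "rheumatology",
    "pediatric nephrology",
    "pediatric rheumatology",
}


def meets_utilization_criteria(visits, min_visits=2, min_followup_days=60):
    """Return True if one patient's visits satisfy the utilization criteria.

    visits: iterable of (visit_date, specialty) tuples.
    Assumption: follow-up is the span between the first and last qualifying visit.
    """
    dates = sorted(d for d, specialty in visits if specialty in QUALIFYING_SPECIALTIES)
    if len(dates) < min_visits:
        return False
    return (dates[-1] - dates[0]).days >= min_followup_days
```

A patient with two nephrology visits three weeks apart would fail the follow-up requirement under this reading, as would a patient with only one qualifying specialty visit.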

Optimization and testing of algorithm criteria included combinations of diagnosis, procedure, and medication codes to improve specificity compared with algorithms utilizing diagnosis alone. The SLE and lupus nephritis diagnosis code sets used codes from the SNOMED-CT terminology (Supplemental Table 2). Lupus nephritis algorithms could not include biopsy findings, as EHRs at PEDSnet institutions do not routinely codify data from pathology reports. The procedural code set for kidney biopsy used combinations of CPT4, ICD-9-CM, and SNOMED-CT codes to capture percutaneous and open biopsies performed by nephrology, radiology, or surgery (Supplemental Table 3).

Motivated by observations of how ICD-9-CM codes for lupus nephritis in the source data were mapped to SNOMED-CT terminology, the final lupus nephritis algorithm incorporated an SLE diagnosis accompanied by a kidney disease or glomerular disease code on the same date (Supplemental Table 4).
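The same-date combination rule can be illustrated as a set intersection over diagnosis dates. The sketch below is an assumption-laden simplification: the function name is hypothetical, and the inputs are taken to be lists of dates on which any code from the SLE set or the kidney/glomerular disease set (Supplemental Table 4) was recorded for one patient.

```python
from datetime import date


def ln_combination_dates(sle_dates, kidney_dates):
    """Dates on which an SLE code and a kidney/glomerular disease code co-occur.

    Both arguments are iterables of datetime.date for a single patient.
    """
    return sorted(set(sle_dates) & set(kidney_dates))
```

Each returned date would then count as one lupus nephritis "combination" code when the final algorithm tallies repeated diagnoses.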

Chart Review Evaluation

The highest-performing algorithms were tested by manual chart review at the six additional PEDSnet institutions that composed the validation cohort. Virtual training sessions for completion of standardized chart review forms were held. Chart reviews were performed by either pediatric nephrologists or study coordinators, with mechanisms in place for nephrologist review of ambiguous cases. In total, each PEDSnet institution reviewed 100 charts, including 50 putative SLE cases as identified by the final SLE algorithm, with 25 identified as putative nephritis cases using the final lupus nephritis algorithm and 25 identified as nonlupus nephritis cases. The remaining 50 charts were randomly selected noncases of SLE, defined as patients not meeting the computable phenotypes for SLE but who had ≥60 days of follow-up and two or more in-person visits with either a nephrology or rheumatology provider/clinic. This maximized the power to interrogate both algorithms using the same chart review process. Site reviewers were masked to case status.
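The per-site sampling frame described above (25 putative nephritis cases, 25 putative SLE cases without nephritis, and 50 noncases) might be drawn as follows. This is only a sketch: the function and key names are hypothetical, the article does not describe the sampling code, and the fixed seed is an assumption added for reproducibility.

```python
import random


def sample_charts_for_site(ln_cases, non_ln_sle_cases, noncases, seed=0):
    """Draw the 100 charts reviewed at one site from three lists of patient IDs.

    Raises ValueError if any stratum has fewer patients than requested.
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    return {
        "sle_with_nephritis": rng.sample(sorted(ln_cases), 25),
        "sle_without_nephritis": rng.sample(sorted(non_ln_sle_cases), 25),
        "noncases": rng.sample(sorted(noncases), 50),
    }
```

Reviewers would receive the combined list without the stratum labels, consistent with the masking described above.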

Data Analyses

For assessments involving the discovery cohort, performance statistics were calculated for all patients meeting utilization criteria between 2008 and 2016. For assessments involving the validation cohort, all data available for the cohort through 2019 were analyzed. Sensitivity, specificity, and positive and negative predictive values were calculated for each case-finding algorithm, and 95% confidence intervals were calculated using the exact binomial test.
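The four performance statistics and their exact binomial (Clopper-Pearson) intervals can be computed from a 2×2 table of chart review results. The sketch below uses only the Python standard library and finds the interval bounds by bisection; the function names are illustrative, and the article does not state which software the authors used. It is intended for chart-review-sized samples (a few hundred charts), not very large n.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials."""

    def solve(f, target):
        # f is decreasing in p on (0, 1); bisect for f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound solves P(X >= k | p) = alpha/2; upper solves P(X <= k | p) = alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


def diagnostic_metrics(tp, fp, fn, tn):
    """Return {metric: (estimate, ci_lower, ci_upper)} from 2x2 counts."""
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {name: (k / n, *exact_ci(k, n)) for name, (k, n) in pairs.items()}
```

For example, `diagnostic_metrics(45, 5, 5, 45)` (hypothetical counts) returns a point estimate of 0.9 for each metric together with its 95% interval.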


Results

There were 6,753,739 patients in the PEDSnet database, including 115,465 with two or more visits with nephrology or rheumatology and ≥60 days of follow-up (Figure 1). The latter population served as the denominator for all performance analyses on SLE and lupus nephritis algorithms. The race and ethnicity breakdown aligns with the US pediatric population. Demographic data for the full patient population and the denominator population are available from only six of seven sites. Patients with two or more nephrology or rheumatology visits were older than the full PEDSnet population and had three-fold–longer follow-up (Table 1).

Figure 1. Final computable phenotype algorithms for SLE and lupus nephritis (LN) both include utilization and diagnostic criteria. Seven-institution data are shown.
Table 1. - Characteristics of source populations (six-institution data)
Characteristic a Full PEDSnet Population Population with ≥2 Rheumatology or Nephrology Visits SLE Cases Lupus Nephritis Cases
Total count 5,926,218 108,923 1302 469
Age at initial visit, yr 3 (<1–9) 5 (<1–11) 12 (7–15) 12 (7–15)
Girls, n (%) 2,881,012 (49) 58,490 (54) 1107 (85) 374 (80)
Asian race, n (%) 205,026 (4) 2940 (3) 106 (8) 39 (8)
Black race, n (%) 963,250 (16) 17,299 (16) 384 (30) 148 (32)
Multiracial, n (%) 164,918 (3) 2383 (2) 42 (3) 11 (2)
Other race, n (%) 510,513 (9) 10,178 (9) 188 (14) 97 (21)
Unknown race, n (%) 645,136 (11) 6693 (6) 89 (7) 30 (6)
White race, n (%) 3,437,375 (58) 69,430 (64) 493 (38) 144 (31)
Hispanic ethnicity, n (%) 729,822 (12) 13,349 (12) 263 (20) 113 (24)
Unknown ethnicity, n (%) 678,304 (11) 6801 (6) 83 (6) 27 (6)
Follow-up time, yr 2 (<1–7) 6 (3–11) 7 (3–11) 6 (3–11)
aContinuous data are presented as median (interquartile range).

Comparing the demographics for SLE and lupus nephritis cases with the entire rheumatology/nephrology population in PEDSnet (Table 1), the age at initial visit was older, and higher proportions were girls, Hispanic, or patients with Asian or African ancestry. Follow-up time was roughly equal. Patients with SLE made up 1% of this subspecialty population and two per 10,000 of the full PEDSnet population.

Derivation of Case-Finding Algorithms

Algorithms for SLE and lupus nephritis were initially tested at a single institution using a highly phenotyped childhood SLE cohort (discovery), which included validated cases of SLE, incomplete SLE, and related connective tissue diseases. Algorithms for SLE utilizing diagnosis codes only were tested initially and demonstrated acceptable sensitivity (92%–95%) but lacked specificity (61%–67%). Conversely, lupus nephritis diagnostic codes only were insensitive (51%–55%). Thus, medication (hydroxychloroquine) and procedure (kidney biopsy) codes were utilized and tested iteratively to improve specificity. The algorithms tested for SLE or lupus nephritis case determination in the discovery cohort are presented in Table 2 (code sets are in Supplemental Table 2). SLE_A7 and LN_A7 were chosen as final algorithms to address different coding practices across institutions noted on data quality analyses. Complete details of algorithm development are in Supplemental Material.

Table 2. - Performance of computable phenotypes for SLE and lupus nephritis in the discovery cohort
Algorithm Name Full Algorithm Sensitivity, % Specificity, % Positive Predictive Value, % Negative Predictive Value, %
SLE algorithms
 SLE_A1 ≥2 SLE codes, a ≥60 d apart 95 61 84 86
 SLE_A2 ≥3 SLE codes, ≥30 d apart 92 67 86 79
 SLE_A3 (≥2 SLE codes, ≥60 d apart) or (≥1 SLE code + ≥1 HCQ exposure b ) 98 32 76 89
 SLE_A4 (≥3 SLE codes, ≥30 d apart) or (≥1 SLE code + ≥1 HCQ exposure) 98 32 76 89
 SLE_A5 (≥2 SLE codes, ≥60 d apart) and (≥1 HCQ exposure) 92 65 85 79
 SLE_A6 (≥3 SLE codes, ≥30 d apart) and (≥1 HCQ exposure) 89 71 87 74
 SLE_A7 (final) (≥1 HCQ exposure) and [(≥3 SLE codes, ≥30 d apart) or (≥1 SLE code + ≥1 biopsy procedure c )] 89 71 87 75
LN algorithms
 LN_A1 ≥2 LN codes, d ≥60 d apart 55 99 96 87
 LN_A2 ≥3 LN codes, ≥30 d apart 52 99 95 86
 LN_A3 (≥2 LN codes, ≥60 d) or (≥1 SLE code + ≥1 biopsy procedure) or (≥1 SLE code + ≥1 nephritis-related diagnosis) e 79 95 85 93
 LN_A4 (≥3 LN codes, ≥30 d) or (≥1 SLE code + ≥1 biopsy procedure) or (≥1 SLE code + ≥1 nephritis-related diagnosis) 79 95 85 93
 LN_A5 (≥2 LN codes, ≥60 d) or (≥1 SLE code + ≥1 biopsy procedure) 79 98 93 93
 LN_A6 (≥3 LN codes, ≥30 d) or (≥1 SLE code + ≥1 biopsy procedure) 79 98 93 93
 LN_A7 (final) (≥1 HCQ exposure) and [(≥3 LN or LN combination f codes, ≥30 d apart) or (≥1 SLE code + ≥1 biopsy procedure c )] 80 98 94 94
Bold indicates final algorithms and performance metrics. HCQ, hydroxychloroquine; LN, lupus nephritis.
aSLE inclusion diagnosis codes are listed in Supplemental Table 2.
bMedication prescription for HCQ.
cKidney biopsy procedure code set is listed in Supplemental Table 3.
dLN inclusion diagnosis codes (subset of SLE codes) are listed in Supplemental Table 2.
eNephritis-related diagnosis codes are in Supplemental Table 4.
fLN diagnosis combination = greater than or equal to one SLE code + greater than or equal to one kidney disease/glomerular disease diagnosis on the same date. These diagnosis codes are provided in Supplemental Table 4.
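The two final algorithms in Table 2 translate directly into boolean rules over per-patient event dates. The sketch below is one possible reading, not the authors' code: function names are hypothetical, and "≥3 codes, ≥30 d apart" is interpreted as at least three code dates in which each counted date falls 30 or more days after the previously counted one.

```python
from datetime import date


def count_spaced(dates, min_gap_days):
    """Greedily count dates that each fall >= min_gap_days after the last counted date."""
    count, last = 0, None
    for d in sorted(set(dates)):
        if last is None or (d - last).days >= min_gap_days:
            count += 1
            last = d
    return count


def meets_sle_a7(sle_dates, hcq_dates, biopsy_dates):
    """SLE_A7: HCQ exposure and (>=3 SLE codes >=30 d apart, or SLE code + kidney biopsy)."""
    has_hcq = len(hcq_dates) >= 1
    repeated_codes = count_spaced(sle_dates, 30) >= 3
    code_plus_biopsy = len(sle_dates) >= 1 and len(biopsy_dates) >= 1
    return has_hcq and (repeated_codes or code_plus_biopsy)


def meets_ln_a7(ln_dates, ln_combo_dates, sle_dates, hcq_dates, biopsy_dates):
    """LN_A7: as SLE_A7, but counting LN codes or same-date LN combination codes."""
    has_hcq = len(hcq_dates) >= 1
    repeated_codes = count_spaced(list(ln_dates) + list(ln_combo_dates), 30) >= 3
    code_plus_biopsy = len(sle_dates) >= 1 and len(biopsy_dates) >= 1
    return has_hcq and (repeated_codes or code_plus_biopsy)
```

Requiring hydroxychloroquine exposure in both rules is what trades some sensitivity for specificity relative to SLE_A4 and LN_A6 in the table.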

Evaluation across PEDSnet Institutions

To interrogate the final algorithms from the discovery cohort (SLE_A7 and LN_A7) (Table 2), we performed chart reviews at the remaining six PEDSnet health systems. We randomly selected 300 cases of SLE identified by the final SLE algorithm and 300 noncases from the same combined nephrology/rheumatology populations (50 per site). Chart review identified 28 false positives for SLE in total, without a single false negative. The overall sensitivity was 100%, and positive predictive value was 91% (Table 3). Across institutions, positive predictive value varied from 86% to 98%. Specificity was higher than in the discovery cohort.
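The pooled validation figures can be checked by hand. The counts below are reconstructed from the text rather than taken from a published 2×2 table: 300 putative cases with 28 false positives imply 272 true positives, and 300 noncases with no false negatives imply 300 true negatives.

```latex
\begin{aligned}
\text{Sensitivity} &= \frac{TP}{TP + FN} = \frac{272}{272 + 0} = 100\% \\
\text{Specificity} &= \frac{TN}{TN + FP} = \frac{300}{300 + 28} \approx 91.5\% \\
\text{PPV} &= \frac{TP}{TP + FP} = \frac{272}{272 + 28} \approx 90.7\% \\
\text{NPV} &= \frac{TN}{TN + FN} = \frac{300}{300 + 0} = 100\%
\end{aligned}
```

These agree with the six-site validation values reported in Table 3 after rounding.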

Table 3. - Performance characteristics of the computable phenotype algorithm across PEDSnet discovery and validation sites with two or more rheumatology or nephrology encounters
Center Sensitivity, % (95% Confidence Interval) Specificity, % (95% Confidence Interval) Positive Predictive Value, % (95% Confidence Interval) Negative Predictive Value, % (95% Confidence Interval)
SLE
PEDSnet discovery site 100 (92 to 100) 93 (82 to 98) 92 (81 to 98) 100 (93 to 100)
  2 100 (92 to 100) 89 (78 to 96) 88 (76 to 95) 100 (93 to 100)
  3 100 (93 to 100) 98 (90 to 100) 98 (89 to 100) 100 (93 to 100)
  4 100 (92 to 100) 89 (78 to 96) 88 (76 to 95) 100 (93 to 100)
  5 100 (92 to 100) 88 (76 to 95) 86 (73 to 94) 100 (93 to 100)
  6 100 (92 to 100) 88 (76 to 95) 86 (73 to 94) 100 (93 to 100)
  7 100 (93 to 100) 98 (90 to 100) 98 (89 to 100) 100 (93 to 100)
PEDSnet validation (6 sites) 100 (99 to 100) 91 (88 to 94) 91 (87 to 94) 100 (99 to 100)
 All (7 sites) 100 (99 to 100) 92 (88 to 94) 91 (87 to 94) 100 (99 to 100)
Lupus nephritis
PEDSnet discovery site 96 (80 to 100) 100 (86 to 100) 100 (86 to 100) 96 (80 to 100)
  2 81 (63 to 93) 100 (82 to 100) 100 (86 to 100) 76 (55 to 91)
  3 85 (66 to 96) 91 (72 to 99) 92 (74 to 99) 84 (64 to 95)
  4 88 (69 to 97) 88 (69 to 97) 88 (69 to 97) 88 (69 to 97)
  5 92 (75 to 99) 96 (79 to 100) 96 (80 to 100) 92 (74 to 99)
  6 95 (76 to 100) 83 (64 to 94) 80 (59 to 93) 96 (80 to 100)
  7 96 (80 to 100) 100 (86 to 100) 100 (86 to 100) 96 (80 to 100)
PEDSnet validation (6 sites) 89 (83 to 94) 92 (87 to 96) 93 (87 to 96) 89 (82 to 93)
 All (7 sites) 90 (85 to 94) 93 (89 to 97) 94 (89 to 97) 90 (84 to 94)
On the basis of six-institution data (discovery site not included) or seven-institution data, there were 50 SLE cases and 50 noncases per site, of which 25 SLE cases had nephritis and 25 SLE cases had no history of nephritis. Performance metrics for six-site and seven-site aggregate data indicated in bold.

To explore potential reasons for false positives, the performance of the final SLE algorithm was compared between patients with SLE with and without lupus nephritis (Supplemental Table 5). Although specificity was identical, positive predictive value was higher in patients with nephritis (98% versus 83%; 95% confidence interval, 95% to 99% versus 77% to 89%). Only three false-positive SLE cases were identified by the LN_A7 algorithm to have nephritis, whereas 25 patients identified by SLE_A7 who did not meet criteria for nephritis by LN_A7 were deemed false positives by chart review. Variation in coding practices may have contributed, as four PEDSnet institutions contributed six to seven false positives each, compared with one to four each from the other three institutions.

In the validation cohort, the final algorithm identified lupus nephritis with an overall 93% positive predictive value and 89% sensitivity. Algorithm performance was consistent across institutions (Table 3). Among true-positive lupus nephritis cases, 92% had a kidney biopsy procedure performed. The remainder were identified by the presence of abnormal proteinuria on chart review or had their biopsy at a center outside of PEDSnet. Of 11 false positives for lupus nephritis, three were also false positive and eight were true positive for SLE. The institution with the highest false-negative rate was found to have the most cases with lupus nephritis diagnosed clinically by urinary protein assessment (data not shown).

Applying the computable phenotypes to the full PEDSnet population (one discovery plus six validation centers), we identified 1508 cases of SLE and 537 cases of lupus nephritis (Table 4). The proportion of patients diagnosed did not change over time; rates were similar during periods from 2009 to 2012 and from 2013 to 2016. Patients with SLE were followed for a median of 7 years (interquartile range [IQR], 3–11), with a median of four (IQR, 1–6) in-person rheumatology visits per person-year. Patients with lupus nephritis made up 36% of the SLE population. Most patients with nephritis were diagnosed between 10 and 19 years of age. Patients with lupus nephritis were followed for a median of 6 years (IQR, 3–11), with a median of two (IQR, 1–5) nephrology and four (IQR, 1–7) rheumatology in-person visits per person-year.

Table 4. - Clinical and health care utilization characteristics of the cohort identified by the SLE computable phenotype algorithm
Characteristic a SLE Cases Lupus Nephritis Cases
N 1508 537
Cohort by institution, N (%)
PEDSnet discovery site 294 (20) 97 (18)
 2 209 (14) 62 (12)
 3 164 (11) 43 (8)
 4 305 (20) 117 (22)
 5 167 (11) 87 (16)
 6 163 (11) 63 (12)
 7 206 (14) 68 (13)
Age at diagnosis, yr, N (%)
 <5 11 (1) 5 (<1)
 5–9 123 (8) 49 (9)
 10–14 597 (40) 230 (43)
 15–19 669 (44) 227 (42)
 >19 105 (7) 25 (5)
 Unavailable 3 (<1) 1 (<1)
Year of diagnosis, N (%)
 <2009 256 (17) 78 (15)
 2009–2012 489 (32) 163 (30)
 2013–2016 466 (31) 175 (33)
 2017–2020 294 (20) 120 (22)
 Unavailable 3 (<1) 1 (<1)
Follow-up time since diagnosis, yr, median (IQR) 76.6 (4–11) 6 (3–10)
No. of nephrology visits per person-yr, median (IQR) 0 (0–1) 2 (<1–5)
No. of rheumatology visits per person-yr, median (IQR) 4 (1–6) 4 (1–7)
Actively followed in past 12 mo, N (%) 809 (54) 328 (61)
Clinical and health care utilization characteristics of the cohort are on the basis of seven-institution data. The sensitivity analysis on six-institution data (excludes the discovery site) showed no significant effect on cohort characteristics. IQR, interquartile range.
aContinuous data are presented as median (IQR), and categorical data are presented as N (percentage).


Discussion

We developed and evaluated the classification accuracy of EHR-based computable phenotypes to identify children with SLE with and without nephritis. This study is the first to test computable phenotypes that incorporate medication data with SNOMED-CT diagnostic codes for SLE and lupus nephritis. The algorithms performed well in the discovery cohort, with high positive predictive value, and performed better still when applied to the full PEDSnet network. Algorithms incorporating hydroxychloroquine medication further improved specificity. Because hydroxychloroquine is not prescribed to all children with SLE, its inclusion did decrease sensitivity. However, we prioritized the improved specificity in our final algorithm so that it could identify eligible pediatric patients not only for inclusion in prospective interventional clinical trials but also for comparative effectiveness research.

Computable phenotypes have been developed in other PCORnet clinical research networks for similar initiatives in nephrology, including measuring hypertension prevalence rates in adults (24). PEDSnet data have been used to identify pediatric patients with leukemia, lymphoma (25), Crohn disease (26), and, most recently, glomerular disease (16). Learning health systems are ideal for use in pragmatic clinical trials, such as the Aspirin Dosing: A Patient-Centric Trial Assessing Benefits and Long-Term Effectiveness study (27) and the Clinical Outcomes of Methotrexate Binary Therapy in Practice trial for children with inflammatory bowel disease in PEDSnet. Pragmatic trial designs attempt to mimic usual clinical practice. They assess effectiveness in real-world settings, whereas explanatory trials assess efficacy. Data on the effectiveness of lupus treatments in children are especially lacking.

The performance of our final SLE and lupus nephritis algorithms in PEDSnet is comparable with previously published algorithms (Supplemental Table 6). Previous algorithms for identification of SLE in adult cohorts have relied on ICD-9-CM codes, single-center EHRs (19,20,28,29), Medicaid databases (21), or EHR data from the Department of Veterans Affairs (VA) (30). Using ICD-9-CM codes only, sensitivity for identification of SLE cases was 41%–87%, with 92%–100% specificity (19). Algorithms including diagnosis codes and nephrology utilization codes demonstrated positive predictive values of 89%–92% for SLE identification and 79%–88% for lupus nephritis identification using single-center Medicaid data, but because chart reviews were performed only on putative cases, negative predictive value and specificity could not be assessed (21). Finally, a lupus nephritis algorithm using a reference standard of biopsy-proven nephritis demonstrated high specificity and negative predictive value (95%–99.8%) in a VA patient population; however, sensitivity and positive predictive value were variable (58%–94%) (22). The VA study highlights how infrequently kidney biopsies are performed in adult patients with SLE, which is often a barrier to clinical trial enrollment. The excellent performance of our final SLE and lupus nephritis algorithms was likely due to the availability of SNOMED-CT concept codes for diagnosis, inclusion of biopsy procedure codes, and code set combinations.

Our study is the first to measure algorithm performance for SLE specifically in children. An SLE algorithm requiring three or more ICD-9-CM codes (710.0), each ≥30 days apart, has previously been used to identify 2959 childhood SLE cases in a Medicaid database between 2000 and 2004 (3) and 682 childhood SLE cases in the Clinformatics DataMart (OptumInsight) between 2000 and 2013 (31). However, because these datasets are completely deidentified, manual chart review to assess validity of the algorithms was not possible. The estimated prevalence of childhood SLE was ten per 100,000 children in the Medicaid populations (3). Here, we report 1508 cases of SLE in PEDSnet for an estimated prevalence of 22 per 100,000 children. This more than two-fold–higher rate of childhood SLE may be due to referral patterns, as PEDSnet health systems are all tertiary centers covering large geographic regions.
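The prevalence estimate follows from counts given earlier in the article, as the short calculation below shows. The helper name is illustrative; the inputs (1508 SLE cases, 537 lupus nephritis cases, 6,753,739 patients in PEDSnet) are taken from the Results.

```python
def per_100k(cases, population):
    """Crude period prevalence expressed per 100,000 people."""
    return cases / population * 100_000


sle_prevalence = per_100k(1508, 6_753_739)  # about 22.3 per 100,000
ln_share = 537 / 1508                       # about 0.356, i.e., 36% of SLE cases
fold_vs_medicaid = sle_prevalence / 10      # about 2.2 times the Medicaid estimate
```

This is a crude ratio of identified cases to all patients in the database and is not adjusted for age, follow-up time, or referral patterns.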

Similarly, our study is the first to assess lupus nephritis algorithm performance specifically in children. The Chibnik #1 algorithm (Supplemental Table 6) (two or more nephritis/proteinuria/kidney failure ICD-9-CM codes [580–588, 630–640, and 791.0], each ≥30 days apart) (21) identified 1106 cases of pediatric lupus nephritis from 2000 to 2004 in the US Medicaid database (3) and 166 cases from 2000 to 2013 in the Clinformatics DataMart (31). The estimated prevalence of lupus nephritis in childhood SLE on the basis of these studies is 24%–37%. Here, we identified 537 cases of lupus nephritis in PEDSnet and showed that the prevalence of lupus nephritis in childhood SLE was 36% between 2009 and 2019. Although the prevalence of nephritis in our cohort was lower than some previous reports, the rates are similar to those reported in the larger epidemiologic studies.

As confirmed in our study, the incorporation of medication exposure data into algorithms as inclusion criteria for identification of SLE and lupus nephritis can improve specificity. EHR-based studies have reported frequencies of exposure to antimalarial medications in adult SLE ranging from 70% to 94%, with decreased exposure in adults managed without a rheumatologist (28,32). Because patients with childhood SLE are more likely to be followed at academic medical centers with access to both rheumatology and nephrology, the frequency of antimalarial medication codes is likely to be higher in childhood SLE.

There were differences in the performance of our SLE algorithm between discovery and validation cohorts. Because of differences in inclusion criteria (the discovery cohort only required greater than or equal to one SLE diagnosis code) and requirement for follow-up (the discovery cohort required only a single visit compared with 6 years of follow-up time in the validation cohort), there was a higher likelihood of incomplete SLE, Sjogren, and mixed connective tissue disorders among the patients in the manually assembled discovery dataset than for the larger validation cohort. This likely accounted for the lower specificity of the SLE algorithm in the discovery cohort. This highlights the importance of understanding the characteristics of the patient population before applying a new case definition.

Our approach to identification of childhood SLE and lupus nephritis has several limitations. Because the algorithms were on the basis of EHR data, they were limited to elements recorded during routine clinical care. Referral bias is possible, as the population of children in PEDSnet accesses tertiary care at children’s hospitals. However, SLE is uncommonly managed by general pediatricians, and the pediatric nephrology and rheumatology workforce is largely based at academic medical centers (33,34). Moreover, some PEDSnet health systems do include large primary care networks in addition to specialty care. Adolescents who receive care from adult providers would not be captured in the PEDSnet database. Although the algorithm demonstrated excellent classification accuracy within PEDSnet health systems, it may perform differently in other data resources. Evaluation across seven institutions with different EHR systems was a strength, and there was little variation in performance across sites.

Because our chart review limited the assessment of lupus nephritis algorithm performance to individuals who also met the case definition for SLE, our study does not report performance metrics of the LN_A7 algorithm alone. The chart review assessed SLE with and without nephritis separately, with a deliberate over-representation of lupus nephritis (50% of our putative SLE cases selected for chart review met criteria for lupus nephritis). This allowed for performance testing of algorithms to identify patients with SLE both with and without kidney involvement. The typical approach to studying new interventions in phase 3 clinical trials in adults with SLE has been to enroll patients with active SLE but without active lupus nephritis. After efficacy has been established in SLE without nephritis, follow-up trials then study the adult SLE population with active lupus nephritis. Our study provides computable phenotypes using the same approach to evaluate interventions for comparative effectiveness research in PEDSnet.

In summary, we developed and tested highly sensitive and specific computable phenotypes in PEDSnet to accurately identify the largest cohort of children with SLE and lupus nephritis to date. Our algorithms performed above the thresholds deemed necessary for rigorous observational research. This tool for rapid cohort ascertainment applied to a robust resource of multi-institutional longitudinal EHR data holds great promise to enhance and accelerate comparative effectiveness and health outcomes research. Validated algorithms as case definitions are the first step toward designing pragmatic clinical trials, where studies can be performed within the context of routine care across large, diverse populations.


Disclosures

M.A. Atkinson reports consultancy agreements with GlaxoSmithKline and financial interest in AstraZeneca. M.A. Atkinson's spouse is employed by AstraZeneca. L.C. Bailey reports receiving research funding with Bristol Myers Squibb and Jazz Pharmaceuticals. J.C. Chang reports grant support from GlaxoSmithKline. M.R. Denburg reports a consultancy agreement with Trisalus Life, receiving research funding from Mallinckrodt, serving as a scientific advisor or member of the National Kidney Foundation Delaware Valley Medical Advisory Board, other interests/relationships with the American Society of Pediatric Nephrology Research and Program Committees and the National Kidney Foundation Pediatric Education Planning Committee, and financial interest in In-Bore and Precision Guided Interventions LLC. M.R. Denburg's spouse reports consultancy agreements with Trisalus Life Sciences, ownership interest in In-Bore LLC and Precision Guided Interventions LLC, and serving as a scientific advisor or member of the Trisalus Life Sciences Scientific Advisory Board. V.R. Dharnidharka reports consultancy agreements with Atara Biotherapeutics and Medincell, receiving research funding from CareDx, receiving honoraria from CareDx, serving as a scientific advisor or member of North American Pediatric Renal Trials and Collaborative Studies, and other interests/relationships with Akebia/MedPace and the Independent Data Safety Monitoring Committee. B.P. Dixon reports consultancy agreements with Alexion Pharmaceuticals and Apellis Pharmaceuticals and receiving honoraria from Alexion Pharmaceuticals and Apellis Pharmaceuticals. J.T. Flynn reports receiving royalties from Springer, Inc. and UpToDate, Inc.; serving as an editorial board member of Blood Pressure Monitoring, an editorial board member of Hypertension, an editorial board member of Journal of Pediatrics, Editor-in-Chief of Pediatric Nephrology, and a board member of the Renal Physicians Association; and other interests/relationships with the American Society of Pediatric Nephrology, the International Pediatric Nephrology Association, and the Renal Physicians Association. C.B. Forrest does not receive any personal funding, but his employer (CHOP) receives funding that he oversees from Bayer, Lily, Sanofi, and UCB. C.B. Forrest reports patents and inventions with Johns Hopkins University. C. Gluck reports receiving honoraria from and serving as a scientific advisor or member of Retrophin and Sanofi Genzyme. M. Kallash reports receiving research funding from Duplex–Retrophin. A. Knight reports receiving honoraria from the American College of Rheumatology, CHOP, the Hospital for Special Surgery (New York), and the University of Minnesota and serving as a scientific advisor or member of the American Autoimmune Related Diseases Association, the Childhood Arthritis and Rheumatology Research Alliance, and the Lupus Foundation of America. M. Mitsnefes reports serving on the American Journal of Kidney Diseases Editorial Board, CJASN Editorial Board, and the Kidney Medicine Editorial Board. W.E. Smoyer reports consultancy agreements with Visterra; receiving research funding from Aurinia; receiving honoraria from Montefiore, the University of California Los Angeles (UCLA)–Clinical and Translational Science Awards (CTSA) External Advisory Committee, UpToDate chapter authorship, and the University of Southern California (USC)–CTSA External Advisory Committee; serving as a scientific advisor or member of the Institute for the Advancement of Clinical Trials in Children, NephCure Kidney International, and the Pediatric Nephrology Research Consortium (PNRC); and serving as a member of the board of directors of NephCure Kidney International and a member of the board of directors of PNRC. S. Sule reports consultancy agreements with Spring Nature Medicine Matters, receiving research funding from Pfizer, and serving as a scientific advisor or member of the National Institutes of Health. S.E. Wenderfer reports a consultancy agreement for an unrelated project with Bristol Myers Squibb; receiving honoraria from the Food and Drug Administration, the National Institutes of Health, and New York University; and serving as cochair of the Lupus Nephritis Working Group of the Childhood Arthritis and Rheumatology Research Alliance, on the editorial board of Pediatric Nephrology, and as cochair of the Glomerular Working Group of PNRC. All remaining authors have nothing to disclose.


Research reported in this publication was funded by the CHOP Pediatric Center of Excellence in Nephrology and the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health under Children’s Hospital of Philadelphia award P50DK114786. C.B. Forrest and PEDSnet are also supported by Patient-Centered Outcomes Research Institute grant RI-CRN-2020-007.

Published online ahead of print. Publication date available at


The authors thank the Glomerular Disease Learning Network (GLEAN) investigators and the CHOP Pediatric Center of Excellence in Nephrology for their support and guidance in the development and execution of this study, especially external advisors Dr. Alicia Neu and Dr. Michael Somers. The authors also thank Ms. Susan Hague, M.S., from the PEDSnet Data Coordinating Center for managing the data operations and ensuring the availability of the data used for analyses. PEDSnet is a Partner Network Clinical Data Research Network in PCORnet, an initiative funded by the Patient-Centered Outcomes Research Institute.

The research presented includes data from the following PEDSnet institutions: CHOP, Children’s Hospital of Colorado, Cincinnati Children’s Hospital Medical Center, Nationwide Children’s Hospital, Nemours Children’s Health System (a Delaware and Florida health system), St. Louis Children’s Hospital, and Seattle Children’s Hospital.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Supplemental Material

This article contains the following supplemental material online at

Supplemental Material. Derivation of case-finding algorithms.

Supplemental Table 1. Characteristics of the manually assembled single-center cohort for pilot algorithm review at the discovery site to determine data elements for inclusion in the computable phenotype.

Supplemental Table 2. Diagnostic and provider code sets used in computable phenotypes.

Supplemental Table 3. Procedural code sets used in computable phenotypes.

Supplemental Table 4. Kidney disease/glomerular disease code set used in computable phenotypes.

Supplemental Table 5. Performance characteristics of the SLE computable phenotype algorithm across the PEDSnet population with two or more rheumatology or nephrology encounters stratified by presence or absence of kidney involvement.

Supplemental Table 6. Performance metrics of SLE and lupus nephritis algorithms compared with historical controls.


1. Sule S, Fivush B, Neu A, Furth S: Increased risk of death in pediatric and adult patients with ESRD secondary to lupus. Pediatr Nephrol 26: 93–98, 2011
2. Sule S, Fivush B, Neu A, Furth S: Increased risk of death in African American patients with end-stage renal disease secondary to lupus. Clin Kidney J 7: 40–44, 2014
3. Hiraki LT, Feldman CH, Liu J, Alarcón GS, Fischer MA, Winkelmayer WC, Costenbader KH: Prevalence, incidence, and demographics of systemic lupus erythematosus and lupus nephritis from 2000 to 2004 among children in the US Medicaid beneficiary population. Arthritis Rheum 64: 2669–2676, 2012
4. Wenderfer SE, Ruth NM, Brunner HI: Advances in the care of children with lupus nephritis. Pediatr Res 81: 406–414, 2017
5. Smith EMD, Lythgoe H, Midgley A, Beresford MW, Hedrich CM: Juvenile-onset systemic lupus erythematosus: Update on clinical presentation, pathophysiology and treatment options. Clin Immunol 209: 108274, 2019
6. Oni L, Wright RD, Marks S, Beresford MW, Tullus K: Kidney outcomes for children with lupus nephritis. Pediatr Nephrol 36: 1377–1385, 2021
7. El-Garf K, El-Garf A, Gheith R, Badran S, Salah S, Marzouk H, Farag Y, Khalifa I, Mostafa N: A comparative study between the disease characteristics in adult-onset and childhood-onset systemic lupus erythematosus in Egyptian patients attending a large university hospital. Lupus 30: 211–218, 2021
8. Vazzana KM, Daga A, Goilav B, Ogbu EA, Okamura DM, Park C, Sadun RE, Smitherman EA, Stotter BR, Dasgupta A, Knight AM, Hersh AO, Wenderfer SE, Lewandowski LB; CARRA Registry investigators: Principles of pediatric lupus nephritis in a prospective contemporary multi-center cohort. Lupus 30: 1660–1670, 2021
9. Wenderfer SE, Lane JC, Shatat IF, von Scheven E, Ruth NM: Practice patterns and approach to kidney biopsy in lupus: A collaboration of the Midwest Pediatric Nephrology Consortium and the Childhood Arthritis and Rheumatology Research Alliance. Pediatr Rheumatol Online J 13: 26, 2015
10. Tanzer M, Tran C, Messer KL, Kroeker A, Herreshoff E, Wickman L, Harkness C, Song P, Gipson DS: Inpatient health care utilization by children and adolescents with systemic lupus erythematosus and kidney involvement. Arthritis Care Res (Hoboken) 65: 382–390, 2013
11. Rianthavorn P, Buddhasri A: Long-term renal outcomes of childhood-onset global and segmental diffuse proliferative lupus nephritis. Pediatr Nephrol 30: 1969–1976, 2015
12. Pathak J, Kho AN, Denny JC: Electronic health records-driven phenotyping: Challenges, recent advances, and perspectives. J Am Med Inform Assoc 20[e2]: e206–e211, 2013
13. Deans KJ, Sabihi S, Forrest CB: Learning health systems. Semin Pediatr Surg 27: 375–378, 2018
14. Forrest CB, Margolis PA, Bailey LC, Marsolo K, Del Beccaro MA, Finkelstein JA, Milov DE, Vieland VJ, Wolf BA, Yu FB, Kahn MG: PEDSnet: A national pediatric learning health system. J Am Med Inform Assoc 21: 602–606, 2014
15. Forrest CB, Margolis P, Seid M, Colletti RB: PEDSnet: How a prototype pediatric learning health system is being expanded into a national network. Health Aff (Millwood) 33: 1171–1177, 2014
16. Denburg MR, Razzaghi H, Bailey LC, Soranno DE, Pollack AH, Dharnidharka VR, Mitsnefes MM, Smoyer WE, Somers MJG, Zaritsky JJ, Flynn JT, Claes DJ, Dixon BP, Benton M, Mariani LH, Forrest CB, Furth SL: Using electronic health record data to rapidly identify children with glomerular disease for clinical research. J Am Soc Nephrol 30: 2427–2435, 2019
17. Khare R, Utidjian L, Ruth BJ, Kahn MG, Burrows E, Marsolo K, Patibandla N, Razzaghi H, Colvin R, Ranade D, Kitzmiller M, Eckrich D, Bailey LC: A longitudinal analysis of data quality in a large pediatric data research network. J Am Med Inform Assoc 24: 1072–1079, 2017
18. Khare R, Ruth BJ, Miller M, Tucker J, Utidjian LH, Razzaghi H, Patibandla N, Burrows EK, Bailey LC: Predicting causes of data quality issues in a clinical data research network. AMIA Jt Summits Transl Sci Proc 2017: 113–121, 2018
19. Hanly JG, Thompson K, Skedgel C: Identification of patients with systemic lupus erythematosus in administrative healthcare databases. Lupus 23: 1377–1382, 2014
20. Barnado A, Casey C, Carroll RJ, Wheless L, Denny JC, Crofford LJ: Developing electronic health record algorithms that accurately identify patients with systemic lupus erythematosus. Arthritis Care Res (Hoboken) 69: 687–693, 2017
21. Chibnik LB, Massarotti EM, Costenbader KH: Identification and validation of lupus nephritis cases using administrative data. Lupus 19: 741–743, 2010
22. Li T, Carls GS, Panopalis P, Wang S, Gibson TB, Goetzel RZ: Long-term medical costs and resource utilization in systemic lupus erythematosus and lupus nephritis: A five-year analysis of a large Medicaid population. Arthritis Rheum 61: 755–763, 2009
23. Chang JC, White BR, Elias MD, Xiao R, Knight AM, Weiss PF, Mercer-Rosa L: Echocardiographic assessment of diastolic function in children with incident systemic lupus erythematosus. Pediatr Cardiol 40: 1017–1025, 2019
24. Young DR, Waitzfelder BA, Arterburn D, Nichols GA, Ferrara A, Koebnick C, Yamamoto A, Daley MF, Sherwood NE, Horberg MA, Cromwell L, Lewis KH: The Patient Outcomes Research to Advance Learning (PORTAL) network adult overweight and obesity cohort: Development and description. JMIR Res Protoc 5: e87, 2016
25. Phillips CA, Razzaghi H, Aglio T, McNeil MJ, Salvesen-Quinn M, Sopfe J, Wilkes JJ, Forrest CB, Bailey LC: Development and evaluation of a computable phenotype to identify pediatric patients with leukemia and lymphoma treated with chemotherapy using electronic health record data. Pediatr Blood Cancer 66: e27876, 2019
26. Khare R, Kappelman MD, Samson C, Pyrzanowski J, Darwar RA, Forrest CB, Bailey CC, Margolis P, Dempsey A; PEDSnet Computable Phenotype Working Group: Development and evaluation of an EHR-based computable phenotype for identification of pediatric Crohn’s disease patients in a National Pediatric Learning Health System. Learn Health Syst 4: e10243, 2020
27. Jones WS, Mulder H, Wruck LM, Pencina MJ, Kripalani S, Muñoz D, Crenshaw DL, Effron MB, Re RN, Gupta K, Anderson RD, Pepine CJ, Handberg EM, Manning BR, Jain SK, Girotra S, Riley D, DeWalt DA, Whittle J, Goldberg YH, Roger VL, Hess R, Benziger CP, Farrehi P, Zhou L, Ford DE, Haynes K, VanWormer JJ, Knowlton KU, Kraschnewski JL, Polonsky TS, Fintel DJ, Ahmad FS, McClay JC, Campbell JR, Bell DS, Fonarow GC, Bradley SM, Paranjape A, Roe MT, Robertson HR, Curtis LH, Sharlow AG, Berdan LG, Hammill BG, Harris DF, Qualls LG, Marquis-Gravel G, Modrow MF, Marcus GM, Carton TW, Nauman E, Waitman LR, Kho AN, Shenkman EA, McTigue KM, Kaushal R, Masoudi FA, Antman EM, Davidson DR, Edgley K, Merritt JG, Brown LS, Zemon DN, McCormick 3rd TE, Alikhaani JD, Gregoire KC, Rothman RL, Harrington RA, Hernandez AF; ADAPTABLE Team: Comparative effectiveness of aspirin dosing in cardiovascular disease. N Engl J Med 384: 1981–1990, 2021
28. Jorge A, Castro VM, Barnado A, Gainer V, Hong C, Cai T, Cai T, Carroll R, Denny JC, Crofford L, Costenbader KH, Liao KP, Karlson EW, Feldman CH: Identifying lupus patients in electronic health records: Development and validation of machine learning algorithms and application of rule-based algorithms. Semin Arthritis Rheum 49: 84–90, 2019
29. Murray SG, Avati A, Schmajuk G, Yazdany J: Automated and flexible identification of complex disease: Building a model for systemic lupus erythematosus using noisy labeling. J Am Med Inform Assoc 26: 61–65, 2019
30. Li T, Lee I, Jayakumar D, Huang X, Xie Y, Eisen S, Ranganathan P: Development and validation of lupus nephritis case definitions using United States Veterans Affairs electronic health records. Lupus 30: 518–526, 2021
31. Chang JC, Mandell DS, Knight AM: High health care utilization preceding diagnosis of systemic lupus erythematosus in youth. Arthritis Care Res (Hoboken) 70: 1303–1311, 2018
32. Xiong WW, Boone JB, Wheless L, Chung CP, Crofford LJ, Barnado A: Real-world electronic health record identifies antimalarial underprescribing in patients with lupus nephritis. Lupus 28: 977–985, 2019
33. Correll CK, Ditmyer MM, Mehta J, Imundo LF, Klein-Gitelman MS, Monrad SU, Battafarano DF: 2015 American College of Rheumatology Workforce Study and Demand Projections of Pediatric Rheumatology Workforce, 2015-2030 [published online ahead of print October 27, 2020]. Arthritis Care Res (Hoboken) 10.1002/acr.24497
34. Primack WA, Meyers KE, Kirkwood SJ, Ruch-Ross HS, Radabaugh CL, Greenbaum LA: The US pediatric nephrology workforce: A report commissioned by the American Academy of Pediatrics. Am J Kidney Dis 66: 33–39, 2015

systemic lupus erythematosus; lupus nephritis; pediatrics; children; PEDSnet; learning health system; multi-institutional systems; health education

Copyright © 2022 by the American Society of Nephrology