
Original Clinical Science—General

Predicting Cellular Rejection With a Cell-Based Assay

Preclinical Evaluation in Children

Ashokkumar, Chethan PhD; Soltys, Kyle MD; Mazariegos, George MD; Bond, Geoffrey MD; Higgs, Brandon W. PhD; Ningappa, Mylarappa PhD; Sun, Qing MS; Brown, Amanda BS; White, Jaimie MS; Levy, Samantha BS; Fazzolare, Tamara MPAS, PAC; Remaley, Lisa MPAS, PAC; Dirling, Katie RN, BSN, CCTC; Harris, Patricia RN, CCTC, CRNP, DNP; Hartle, Tara RN, BSN, CCTC; Kachmar, Pamela RN, CPN, CCTC; Nicely, Megan RN, BSN, CPN; O'Toole, Lindsay RN, BSN, CPN; Boehm, Brittany CRNP; Jativa, Nicole CRNP; Stanley, Paula MSN, RN, CPN; Jaffe, Ronald MD; Ranganathan, Sarangarajan MD; Zeevi, Adriana PhD; Sindhi, Rakesh MD

doi: 10.1097/TP.0000000000001076

Predicting acute cellular rejection (ACR) accurately can enhance the safe use of immunosuppression in the rare population of children with liver transplantation (LTx) or intestine transplantation (ITx). Inadequate immunosuppression can lead to ACR in 30% to 40% of LTx recipients and 30% to 60% of ITx recipients, whereas overimmunosuppression is a leading cause of late mortality due to life-threatening infections and lymphoma.1-7 Immunosuppression dosing is based on the risk of rejection, which is assessed with a combination of clinical and laboratory findings and biopsy. These parameters lack specificity for rejection-risk. Features of ITx rejection, such as fever or diarrhea, or of LTx rejection, such as elevated liver function tests, are also seen with systemic viral illnesses. The crossmatch blood test predicts antibody-mediated rejection, but not ACR. Biopsies detect ongoing rejection but cannot predict a future episode, and they are invasive surgical procedures that can also cause bleeding or perforation.

Noninvasive prediction of rejection can add specificity to clinical rejection-risk assessment, but remains an unmet need and is challenging. Roughly 500 children receive LTx and 50 children receive ITx in the United States each year.8 These low numbers preclude powered organ-specific test evaluation, but qualify such an assay for regulatory consideration as an orphan device, because the disease condition affects 4000 or fewer patients per year.9 Augmenting analyzable subjects by combining LTx and ITx populations is a potential solution but would require a test system predicated on common mechanisms, for example, donor-specific alloresponse, a universal mechanism of transplant rejection. The humanitarian device exemption regulatory path incentivizes device development for orphan populations by requiring that such a test (1) addresses an unmet need and has no predicate for the intended use, (2) does not pose an unreasonable or significant risk of injury, and (3) demonstrates probable benefit which outweighs the risk of injury or illness related to its intended use.10 Impending regulation of in vitro diagnostics is likely to foster interest in this mechanism for rare and high-risk diseases.11,12

A prospective immune monitoring protocol at our center (National Clinical Trial 1163578) shows that allospecific T-cytotoxic memory cells (TcM) which express the inflammatory marker CD154 (CD154+TcM) predict and associate with ACR after several types of transplants, with high sensitivity and specificity in training set-validation set testing of small cohorts.13-16 As described in our previous reports, the innovations in this test system relative to others include coculture of living responder and stimulator cells prelabeled with fluorochrome-labeled antibody, inclusion of monensin and detector antibodies to CD154 in the culture medium, and prediction of rejection with CD154+TcM.13-16 The CD154+TcM are measured in recipient peripheral blood leukocytes (PBL) after overnight stimulation with donor and HLA nonidentical PBL in parallel reactions. If donor-induced CD154+TcM exceed those induced by reference PBL, the resulting ratio, termed the immunoreactivity index (IR), exceeds 1 and implies increased risk of rejection (Figure 1). An index less than 1 implies decreased risk. This concept was derived from the proliferative mixed lymphocyte culture, in which donor-specific alloreactivity was enhanced among rejection-prone children compared with those who were rejection-free.17,18 The IR is a personalized output because donor-specific CD154+TcM are normalized to those induced by a reference allostimulus for the same recipient.
Disease specificity has been established with regression models, in which CD154+TcM emerged as the best predictor of rejection from among naive and memory T-helper and T-cytotoxic cells (Tc) in independent analyses of liver, intestine, and renal allograft recipients.13-16 If donor cells are not available for extended testing, PBL from normal human subjects, which match donor at 1 antigen each at the HLA-A, -B, and -DR loci, have been used as “surrogate” donor cells in this test system without compromising rejection-risk assessment.16 Based on these data and unmet clinical needs, CD154+TcM received Humanitarian Use Device designation (08-0206) from the Food and Drug Administration (FDA) Office of Orphan Products in 2009 for the measurement of rejection risk and the management of immunosuppression in children with LTx or ITx. Here, we describe preclinical performance evaluation of this test system leading to its FDA approval.19 The additional innovations described here include a negative control reaction condition to enhance reliability of the flow cytometry gating strategy, statistical comparison of stimulated and background reaction conditions to enhance reliable detection of true-positive CD154+TcM, test standardization with current good manufacturing practices (cGMP) reagents and extensive reproducibility testing, and validation in independent samples of the test performance established in training set samples.20,21
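The immunoreactivity index introduced above can be illustrated with a minimal sketch. It assumes that donor-induced and reference-induced CD154+TcM are each expressed as a frequency for the same recipient; whether background is subtracted before the ratio is taken is not stated here, so no subtraction is applied. The function names and the label returned for an index of exactly 1 are illustrative assumptions, not part of the published test system.

```python
def immunoreactivity_index(donor_cd154_tcm: float, reference_cd154_tcm: float) -> float:
    """Ratio of donor-induced to reference-induced CD154+TcM for one recipient."""
    if donor_cd154_tcm < 0:
        raise ValueError("donor-induced CD154+TcM cannot be negative")
    if reference_cd154_tcm <= 0:
        raise ValueError("reference reaction must yield a positive CD154+TcM frequency")
    return donor_cd154_tcm / reference_cd154_tcm


def conceptual_risk(ir: float) -> str:
    """Conceptual reading of the index: above 1 is increased risk, below 1 is decreased."""
    if ir > 1:
        return "increased"
    if ir < 1:
        return "decreased"
    return "equal to reference"  # assumption: the text does not define this case


# Hypothetical frequencies (percent of T-cytotoxic memory cells), for illustration only.
ir = immunoreactivity_index(donor_cd154_tcm=6.0, reference_cd154_tcm=3.0)
print(ir, conceptual_risk(ir))
```

Because the denominator comes from the same recipient, factors that scale both reactions proportionately cancel in the ratio, which is the sense in which the output is personalized.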

FIGURE 1:
Upper panel with 4 scatterplots shows increased risk of rejection, because CD154+TcM induced by stimulation with donor PBL exceed those produced after stimulation with HLA nonidentical PBL in the reference reaction. Lower panel with 4 scatterplots shows decreased risk of rejection, because donor-induced CD154+TcM are exceeded by those in the reference reaction. The antibody to CD154 is labeled with the fluorochrome, phycoerythrin. T-cytotoxic memory cells which express CD154 (green dots) are separated from those that do not express CD154 (magenta dots) by implementing the gating strategy described in Supplementary Figure 1 in negative control reaction condition. SSC, side scatter.

MATERIALS AND METHODS

Subjects

After informed consent (University of Pittsburgh IRB 0405628; National Clinical Trial 1163578), blood samples were obtained prospectively from children younger than 21 years with LTx or ITx to determine immunoreactivity indices of CD154+TcM (IR).

Samples and Assay

Samples were obtained before (IR0) or after transplantation during the first 60 days (IR1), days 61 to 199, and at days 200 onward (IRx) at surveillance visits or “for cause” biopsies. Ficoll-purified PBL from 3 to 5 mL whole blood were deidentified and cryopreserved in liquid nitrogen for batched analysis of allospecific CD154+TcM with flow cytometry after overnight 16-hour culture with donor cells and HLA nonidentical human cells in parallel reactions, as described previously (Figure 1).13 Because recipients return to referring facilities during days 61 to 199, sample collection was inconsistent during this period. Therefore, these samples were not analyzed. Samples in which stimulation with donor and HLA nonidentical PBL failed to generate increased CD154+TcM cell counts over background (P ≤ 0.05, Poisson test) were not analyzed.20 Samples with less than 0.45 million viable PBL after thawing were inadequate for assay setup and were discarded.
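The two exclusion rules above can be sketched as follows. The specific Poisson test used in the study is given in the cited reference and is not reproduced here; this sketch substitutes one standard exact comparison of two Poisson counts, which assumes that the stimulated and background reactions were acquired with equal numbers of events. Under that assumption, and conditional on the total count, the stimulated count follows a Binomial(n, 0.5) distribution under the null hypothesis. All function names are hypothetical.

```python
from math import comb

MIN_VIABLE_PBL = 450_000  # 0.45 million viable PBL after thawing (from the text)


def poisson_increase_p_value(stimulated_count: int, background_count: int) -> float:
    """One-sided exact P value that the stimulated count exceeds background.

    Assumption: equal acquisition in both reactions, so the stimulated count is
    Binomial(n, 0.5) given the total n = stimulated + background under the null.
    """
    n = stimulated_count + background_count
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(stimulated_count, n + 1))
    return tail / 2 ** n


def sample_is_adequate(viable_pbl: int) -> bool:
    """Samples with fewer than 0.45 million viable PBL are inadequate for setup."""
    return viable_pbl >= MIN_VIABLE_PBL


# Hypothetical CD154+TcM event counts, for illustration only.
p = poisson_increase_p_value(stimulated_count=30, background_count=10)
print(p <= 0.05, sample_is_adequate(600_000))
```

The sketch evaluates a single stimulated reaction against background; how the donor and reference reactions were combined into one pass or fail decision is not specified in this section and is left out.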

Endpoints and Terminology

The ACR within the 60-day period after sampling or after transplantation was the study endpoint. Biopsy-proven rejection was confirmed by re-review of all biopsies by 1 of 2 senior pathologists (R.J. or S.R.) using established criteria.21 In some LTx recipients who could not be biopsied, elevated liver function tests and absence of bile duct dilatation on ultrasound implied rejection. Subjects with and without ACR in the 60-day postsampling period were termed rejectors and nonrejectors, respectively.

Study and Assay Design

The test system was evaluated in 3 phases between 2006 and 2012: on training set subject samples, on normal human PBL for assay standardization and precision testing, and on validation set subject samples.

Deidentified training set samples were analyzed with research-grade fluorochrome-labeled antibodies and the LSRII flow cytometer (BD Biosciences, San Jose, CA) between 2006 and 2010. Test results were merged with outcomes. Threshold IR values which predicted rejection within 60 days after the sample were established with training set samples. A separate threshold was developed for pretransplant IR0, when no immunosuppression is used. Posttransplant IR1 and IRx samples were analyzed together because they were obtained from immunosuppressed subjects. Only 1 sample was used in the pretransplant or posttransplant periods from any given subject, so that only independent measurements existed within the respective posttransplant and pretransplant models. To capture as many early rejection events as possible in these rare subjects, the IR1 sample was used preferentially over the IRx sample if both were available from a recipient. The general approach to training set/validation set testing is illustrated in Figure 2.

FIGURE 2:
Flow chart with timelines for testing of training set samples, assay standardization and precision testing, and testing of validation set samples.

Before testing the performance of predictive IR thresholds in validation set samples, a standardized test format was developed between 2011 and 2012 using assays between HLA-mismatched PBL from normal human subjects. Test reproducibility was established per guidelines of the National Committee for Clinical Laboratory Standards.22 These assays used cGMP-synthesized versions of the antibodies used previously, conjugated to brighter fluorochromes (BD Biosciences, San Jose, CA), and the FDA-approved FACS-CANTO flow cytometer (BD Biosciences). Stimulator and responder PBL were prelabeled with an identical clone of anti-Tc antibody conjugated to 2 different tandem dyes to distinguish responder from stimulator (Figure S1, SDC, http://links.lww.com/TP/B235). The brighter tandem dyes, allophycocyanin-H7 (APCH7, catalog number 641409) for responder Tc and phycoerythrin-cyanin-7 (catalog number 335805) for stimulator Tc, prevented loss of cell counts due to dye quenching, and confirmed that the tandems did not dissociate and stain other cells in the culture. Other reagents included the viability dye 7-aminoactinomycin-D (catalog number 559925) and fluorochrome-labeled antibodies to the T-cell marker CD3-FITC (fluorescein isothiocyanate, catalog number 349201) and the memory marker CD45RO-APC (allophycocyanin, catalog number 340438) (Figure S1, SDC, http://links.lww.com/TP/B235). No change was made to (a) the anti-CD154 antibody (catalog number 555700), which is custom conjugated to the fluorochrome phycoerythrin (PE) for our purposes under cGMP conditions by BD Biosciences, San Jose, or (b) the cell culture medium, which consisted of Roswell Park Memorial Institute medium (Invitrogen, catalog number 22400-089), fetal calf serum (Invitrogen, catalog number 10082-147), and monensin (Golgi stop, BD Biosciences, catalog number 5544724).

In the final assay used for reproducibility studies, recipient PBL prelabeled with anti–CD8-allophycocyanin-H7 were incubated without (negative control) or with anti–CD154-PE (background) in culture medium. For the variability studies, prelabeled recipient PBL were also incubated 1:1 with HLA nonidentical PBL prelabeled with anti–CD8-PECy7 (stimulated). The stimulated reaction was replaced with the donor and reference reactions in assays performed in subject samples. The donor and reference reactions consisted, respectively, of prelabeled recipient PBL incubated 1:1 with prelabeled donor PBL (donor) and prelabeled HLA nonidentical PBL (reference). Figure S1 (SDC, http://links.lww.com/TP/B235) describes the gating strategy for the test system. The preset acceptable upper limit of mean coefficient of variation (%CV) for CD154+TcM induced by stimulation was 20%.
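The acceptance criterion above can be expressed as a short sketch. It assumes that %CV is computed per sample as the sample standard deviation (n - 1 denominator) divided by the mean of replicate %CD154+TcM values, and that the mean %CV is the average across samples; the text does not spell out either choice, so both are assumptions, as are the function names and example values.

```python
from statistics import mean, stdev

MAX_ACCEPTABLE_MEAN_CV = 20.0  # preset upper limit of mean %CV (from the text)


def percent_cv(replicates):
    """Coefficient of variation (%) of replicate measurements of one sample."""
    m = mean(replicates)
    if m == 0:
        raise ValueError("mean of replicates is zero; %CV is undefined")
    return 100.0 * stdev(replicates) / m


def mean_percent_cv(replicate_sets):
    """Average %CV across samples, each given as a list of replicates."""
    return mean(percent_cv(r) for r in replicate_sets)


# Hypothetical replicate %CD154+TcM values for three samples.
runs = [[9.0, 11.0], [4.8, 5.2], [7.0, 7.0]]
result = mean_percent_cv(runs)
print(round(result, 1), result <= MAX_ACCEPTABLE_MEAN_CV)
```

The same helper applies to any of the reproducibility conditions (cryopreservation, same-day duplicates, storage or shipment), since each compares replicate measurements of stimulated CD154+TcM against the same 20% limit.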

Validation set samples consisted of archived subject samples with 2 million or greater total cells, which were not tested with or were accrued after testing of training set samples. These samples were obtained between 2009 and 2012, deidentified by study coordinator (A.B.), and analyzed with the standardized test format between 2012 and 2013. Test results were linked to subject identity and outcomes by the statistician (B.H.), performance determined by applying training set rejection-risk thresholds, and results communicated to senior author (R.S.).

Overlap in Training and Validation Set Periods

To use resources efficiently, testing of some samples obtained during the accrual period for the training set (2006-2010) was deferred pending availability of additional samples from the same subject or stimulator cells from the appropriate normal human donor. These samples made up the validation set along with those collected after the training set collection period (2009-2012), resulting in overlapping periods for the 2 sample sets (Table 1A). There was no contamination of samples between the training and validation data sets for a particular period, pretransplant or posttransplant.

TABLE 1A:
Subject and sample characteristics

Statistical Analysis

Logistic regression was used to define respective IR thresholds for pretransplant and posttransplant training set samples at or above which rejection was predicted within the 60-day period after sampling.23,24 To evaluate factors confounding prediction of ACR, covariates in the logistic model included age, sex, race (white vs nonwhite), type of stimulator cell (actual donor or surrogate donor), organ transplant type (liver, intestine, combined liver-intestine, or combined liver-kidney), tacrolimus whole blood concentrations (FKWBC), induction (rabbit antihuman thymocyte globulin [Genzyme, Cambridge, MA], Campath [alemtuzumab, Genzyme], or none), and time between transplantation and outcome. The IR of CD154+TcM was log10 transformed to reduce the effect of skewness (rejectors: >1-46, Table S1, SDC, http://links.lww.com/TP/B235; and nonrejectors: 0-7) and achieve normality. Test performance was calculated as sensitivity, specificity, positive predictive values (PPV), and negative predictive values (NPV) with 95% confidence intervals, as well as area under the receiver operating characteristic (ROC) curve. For the ROC analysis, we weighted the sensitivity and specificity equally and selected the cut-point that maximized both of these parameters simultaneously. The pretransplant and posttransplant logistic regression models, both stratified by and including all covariates described above, were compared with the single CD154+TcM IR variable models for predicting training set samples. All analyses were conducted in the R statistical programming environment.25
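The performance measures and cut-point selection described above can be sketched as follows. The study's analyses were run in R; this Python sketch is illustrative only. It assumes that "maximizing sensitivity and specificity simultaneously with equal weight" means maximizing their sum (the Youden criterion), which is one common reading and not confirmed by the text. The cut-point is searched on the raw IR scale, which gives the same ranking as the log10 scale because the transform is monotonic. The data and function names are hypothetical.

```python
def performance(scores, outcomes, threshold):
    """Sensitivity, specificity, PPV, and NPV when score >= threshold predicts rejection.

    outcomes: 1 for rejector, 0 for nonrejector.
    """
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)

    def ratio(num, den):
        # Undefined proportions (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def best_cut_point(scores, outcomes):
    """Observed score that maximizes sensitivity + specificity (assumed criterion).

    Assumes both rejectors and nonrejectors are present in the data.
    """
    def youden(threshold):
        p = performance(scores, outcomes, threshold)
        return p["sensitivity"] + p["specificity"]

    return max(sorted(set(scores)), key=youden)


# Hypothetical IR values and outcomes, for illustration only.
ir_values = [0.2, 0.5, 0.9, 1.0, 1.2, 1.5, 3.0, 8.0]
rejection = [0, 0, 0, 0, 1, 0, 1, 1]
cut = best_cut_point(ir_values, rejection)
print(cut, performance(ir_values, rejection, cut))
```

Confidence intervals and the area under the ROC curve are omitted from the sketch because the interval method used in the study is not stated in this section.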

RESULTS

Patients

Test performance was evaluated in 280 total samples from 214 subjects. The training set included 158 samples from 127 subjects (Table 1A). After excluding 11 samples, which failed stimulation, 147 samples from 120 subjects were analyzed. Samples were evenly distributed in pretransplant or IR0, and the 2 posttransplant periods, IR1 and IRx. The validation set of 122 samples from 87 subjects was similarly reduced to 97 analyzable samples from 72 subjects after excluding 9 samples with inadequate cell counts and 16 samples for failed stimulation. Fewer actual donor cells were used as stimulators in the validation cohort because of fewer living donor LTx in this period. The FKWBC were also lower in the validation set. Fewer small bowel–containing allograft recipients were present in the validation set. The groups were similar in all other respects. Sampling occurred at a mean interval of 2 weeks before a biopsy in either cohort. Differences in donor-recipient HLA matching between rejectors and nonrejectors did not achieve statistical significance (Table 1B). Three subjects who provided an analyzable pretransplant (IR0) training set sample also provided an analyzable validation set IRx sample late after transplantation (Figure S2, SDC,http://links.lww.com/TP/B235).

TABLE 1B:
Differences in HLA match at the HLA-A, -B, and -DR loci between rejectors and nonrejectors for pretransplant and posttransplant samples in the training and validation sets

Immunosuppression

The relative distribution of induction and maintenance immunosuppression among analyzable pretransplant and posttransplant samples in the training and validation sets is shown in Table 1C. Induction was performed with rabbit antihuman thymocyte globulin (Genzyme) or alemtuzumab (Campath, Genzyme, Cambridge, MA) in all intestine recipients and some liver recipients. A subset of liver recipients did not receive induction therapy. Maintenance immunosuppression was started after transplantation and consisted of tacrolimus or rapamycin as the primary agent. Steroids and CellCept were used as adjunctive maintenance agents. Three liver recipients, 2 in the training set and 1 in the validation set, were free of maintenance immunosuppression. Fewer samples were obtained after Campath induction in the validation set compared with the training set because of fewer recipients of small bowel allografts in the validation set.

TABLE 1C:
Distribution of induction and maintenance immunosuppressants

Diagnoses

The diagnoses leading to end-stage organ disease and transplantation of liver- or intestine-containing allografts are shown in Table 2.

TABLE 2:
Causes of end-organ disease requiring liver or intestine transplantation in 214 study subjects

Test Standardization

Using PBL from normal human subjects, we first confirmed that manufacturer-recommended concentrations of each of the abovementioned fluorochrome-labeled antibodies and 7-aminoactinomycin-D met or exceeded the minimum concentration needed to detect the highest percentage of positive cells.26 Next, we established the specificity of each antibody in the cocktail by measuring the variation in frequencies of CD8+ cells or Tc upon adding each antibody alone and in combination with others. The %CV in the frequency of Tc in PBL from 3 normal human subjects was 3.5% to 12.2% with successive addition of each antibody, except anti-CD154 (Table 3). The acceptable %CV for this and all other phases of reproducibility testing shown below is 20% or less. When anti-CD154-PE was added to the remaining fluorochrome-labeled antibodies, the %CV in Tc frequency ranged from 1.04% to 5.9%. Two lots of each antibody were tested for their variability in detecting the respective target marker using PBL from 3 normal human subjects. The %CV ranged from 0.9% to 15.3%.

TABLE 3:
Effect of multiple antibodies on %CD8+ cells labeled with anti-CD8-APCH7

Reproducibility testing studies were conducted using PBL from normal human subjects, because our clinical subjects, many of whom are 6 months in age and weigh 4 kg, cannot provide the blood sample volume needed for multiple replicates. The mean CV in allospecific CD154+TcM which were induced by stimulation was evaluated in each study. In addition to the 3 reproducibility studies described below, reproducibility was also evaluated for samples tested on 3 different flow cytometers by 3 different operators (n = 21; CV, 8.2 ± 4.8%, Table S2, SDC, http://links.lww.com/TP/B235), and for samples tested by 2 different technicians (n = 5; CV, 4.8 ± 3%, Table S3, SDC, http://links.lww.com/TP/B235).

Effect of Cryopreservation

Because test performance was established in cryopreserved archived subject samples, variation due to cryopreservation was established in assays between 20 HLA-mismatched unique pairs of PBL from normal human subjects before and 30 days after cryopreservation. Stimulated CD154+TcM before and after cryopreservation demonstrated an acceptable mean %CV of 8.9%, which was below the prespecified 20% limit (Tables 4A and 4B).

TABLE 4A:
Mean %CD154+TcM in 20 normal human blood samples tested before and 30 days after cryopreservation
TABLE 4B:
Mean %CV for CD154+TcM in 20 normal human blood samples tested before and after 30-day cryopreservation

Same-Day Duplicate Testing

Assays between 20 unique pairs of HLA-mismatched PBL from normal human subjects were performed in duplicate (a and b) in each of 2 runs (runs 1 and 2) on the same day to determine within-run (a vs b within runs 1 and 2) and between-run (all replicates) variability in CD154+TcM generated in the stimulated reaction. Stimulated CD154+TcM in all replicates of each sample demonstrated an acceptable mean %CV of 6.0%, which was below the prespecified 20% limit (Tables 5A and 5B).

TABLE 5A:
Mean %CD154+TcM for 20 duplicate assays (a and b) in each of the 2 runs (1 and 2)
TABLE 5B:
Mean %CV for %CD154+TcM within each of the 2 runs, 1 and 2, and for all replicates performed in both runs

Day-to-Day Variation

Real-life patient samples can be tested on the same day (condition 1a), after 24-hour storage at ambient temperature in a reference laboratory if the samples arrive late in the day from a local hospital (condition 1b), or after overnight shipment at ambient temperature (condition 1c). Five unique pairs of HLA-mismatched PBL from normal human subjects were tested under each condition. Stimulated CD154+TcM in all replicates of each sample demonstrated an acceptable mean %CV of 3.2%, which was below the prespecified 20% limit (Tables 6A and 6B).

TABLE 6A:
Mean %CD154+TcM for each condition of storage/shipment of 5 samples in day-to-day variation testing
TABLE 6B:
%CV for %CD154+TcM between 3 conditions of storage/shipment for 5 samples in day to day variation testing

Development of Multivariate (Optimal) and Single-Variable Predictive Models in Training Set

For 98 analyzable posttransplant training set samples, the IR of CD154+TcM (P = 0.0008), organ transplant type (P = 0.019), and FKWBC (P = 0.004) emerged as significant covariates in logistic regression analysis. Stepwise (exhaustive) regression identified the most predictive, yet parsimonious model. The optimal model contained the 5 variables: time between transplantation and assay (P = 0.061), race (P = 0.053), organ transplant type (P = 0.0028), FKWBC (P = 0.0025), and IR of CD154+TcM (P = 0.0003). For 49 analyzable pretransplant training set samples, the IR of CD154+TcM (P = 0.0041) emerged as the most significant covariate in logistic regression. In stepwise regression, the optimal model contained the 4 variables: organ (P = 0.16), sex (P = 0.026), race (P = 0.076), and IR of CD154+TcM (P = 0.002). For either pretransplant or posttransplant models, the cut point was identified as the optimal level of both sensitivity and specificity from the ROC curve of this training set predicting training set (ie, optimal true-positive and true-negative values). To identify the tradeoff in predictive accuracy between the optimal model with multiple variables and a model with the single most overall predictive variable, the IR of CD154+TcM, performance of these 2 logistic regression models was compared in the training set (Tables S4, SDC,http://links.lww.com/TP/B235). For the single variable posttransplant or IR1+IRx model, the cut point was determined at a raw IR value of 1.10. The raw IR value for the single variable pretransplant or IR0 model was 1.23. The ROC curves for the single variable model for training and validation set pretransplant and posttransplant samples are shown in Figure 3.

FIGURE 3:
The ROC curves for posttransplant (IR1+IRx) training (left panel) and validation (middle panel) and pretransplant (right panel) data sets using single variable IR value. Each plot also shows ROC curves for corresponding early posttransplant (IR1) and late posttransplant (IRx) samples. TP rate, true-positive rate; FP rate, false-positive rate; AUC, area under the receiver-operating characteristic curve (Table S6, SDC, http://links.lww.com/TP/B235 for additional details).

Model Stability

Given the modest number of rejection events, for example, 25 in the posttransplant training set samples, model overfitting is a distinct possibility.27 The coefficient of the IR variable in the posttransplant training set samples was 3.41 in the multivariate model and 3.31 in the single variable model based on the IR alone—a difference of approximately 3% (Table S5, SDC,http://links.lww.com/TP/B235). The error term for this coefficient goes from 0.93 in the multivariate model to 0.77 in the single variable model—a difference of approximately 18%. This result and the reproducibility of predictive performance in an independent validation set reassure us that this model is in fact stable and predictive. Additionally, beyond adjusting for potentially confounding variables, we have performed multiple stratified analyses, where the performance of the single variable model is evaluated in subjects subgrouped by the various covariates. The results of stratified subanalyses are shown for the covariates type of organ transplanted, type of induction, whether actual or surrogate donor stimulators were used, and whether rejection or nonrejection were diagnosed by “for-cause” or surveillance biopsy or clinically (Tables S6-S9, SDC,http://links.lww.com/TP/B235). These analyses also confirm good stability in model performance.

Replication of Test Performance in Validation Samples and Final Model Selection

The optimal models for pretransplant and posttransplant samples, which incorporated multiple covariates, demonstrated inferior performance when applied to corresponding validation set samples (Tables S4a and S4b, SDC, http://links.lww.com/TP/B235). The single variable model demonstrated consistent performance for predicting rejection in the training and validation sets. An IR of 1.1 or greater in posttransplant samples demonstrated sensitivity and specificity of 92% and 84%, respectively, in the training set and 84% and 80%, respectively, in the validation set (Table 7A). An IR of 1.23 or greater in pretransplant samples showed lower sensitivity of 57% in the validation set compared with 80% in the training set (Table 7B). However, the respective 95% confidence intervals showed overlap, 30% to 81% versus 59% to 92%, and test specificity, PPV, and NPV were similar.
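Applying the single variable model reduces to comparing a sample's raw IR with the cut point for its sampling period. A minimal sketch, using the cut points reported above (1.10 posttransplant, 1.23 pretransplant) and the "or greater" rule; the function name and period labels are hypothetical.

```python
# Raw IR cut points from the single variable training set models (from the text).
CUT_POINTS = {"pretransplant": 1.23, "posttransplant": 1.10}


def predicts_rejection(ir: float, period: str) -> bool:
    """True if the IR is at or above the cut point for the given sampling period."""
    if period not in CUT_POINTS:
        raise ValueError(f"unknown period: {period!r}")
    return ir >= CUT_POINTS[period]


# The same IR can be read differently before and after transplantation.
print(predicts_rejection(1.15, "posttransplant"))  # at or above 1.10
print(predicts_rejection(1.15, "pretransplant"))   # below 1.23
```

The prediction refers to the 60-day period after sampling, as defined by the study endpoint, and is intended as an adjunct to other clinical and laboratory information.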

TABLE 7:
Performance of single variable posttransplant (A, upper table) and pretransplant (B, lower table) models based on IR of CD154+TcM in training and validation sets

Additional Analyses to Test the Effect of Confounders

Comparable test performance within the range seen in overall training and validation set samples was also seen in samples subgrouped by time of sampling after transplantation, the type of stimulator-actual or surrogate donor, organ transplant type, type of induction immunosuppression, and whether rejection or nonrejection were diagnosed by for-cause or surveillance biopsy or clinically (Tables S6-S10, SDC,http://links.lww.com/TP/B235). Performance estimates are less likely to be meaningful for those subgroups with small numbers.

Adverse Events

No adverse events were encountered due to phlebotomy.

DISCUSSION

Our study shows that a “fine” functional T-cell subset, allospecific CD154+TcM, predicts ACR in the rare population of children with LTx or ITx and addresses the unmet need for noninvasive rejection-risk assessment. Developed in samples from 127 children, test performance is replicated in blinded samples from 87 subjects. Test sensitivity, specificity, PPV, and NPV of 92%, 84%, 65%, and 97%, respectively, in posttransplant training set samples, and 84%, 80%, 64%, and 92%, respectively, in blinded independent posttransplant validation set samples, which were tested 18 months later with a standardized assay format with cGMP reagents and instruments, represent true replication. Significant attributes of the test system include actionable results after overnight culture, and the potential for indefinite testing with “surrogate” donor stimulators without compromising rejection-risk determination (Table S8, SDC, http://links.lww.com/TP/B235). Other advantages are a personalized test output, the IR, and prediction of early rejection with pretransplant samples. The lower sensitivity of 57% for test predictions with pretransplant validation set samples is noteworthy compared with 80% sensitivity in the training set. The smaller number of rejectors in the validation set compared with the training set, 14 versus 25, and the overlap in the respective 95% confidence intervals, 30% to 81% versus 59% to 91%, offer reassurance that the actual sensitivity may lie within these estimates. This performance is reasonable given that there is no other noninvasive predictor of cellular rejection for this rare population.
The confidence intervals for pretransplant sensitivity also encompass the performance of the Enzyme-Linked ImmunoSpot in predicting renal transplant rejection and suggest that lower predictive sensitivity is a feature of pretransplant samples.28 Enhanced donor-specific alloreactivity, the mechanism underlying ACR in a variety of organ transplants, and its measurement with CD154+TcM, the parameter used to measure rejection-risk makes this test system potentially adaptable to other types of organ transplants. Finally, the test is highly reproducible, with CV of 10% or less in simulated daily testing, and after 24-hour storage or overnight shipment.

Several factors may affect test performance. The type of cell stimulator, whether surrogate or actual donor cell, was not a significant covariate in the regression analysis, which established predictive thresholds. This is consistent with previously reported stability in rejection-risk assessment in samples tested with both types of stimulators.16 As added evidence, reasonable test performance is also seen in subjects subgrouped further by surrogate donor or actual donor stimulator cells (Table S6, SDC, http://links.lww.com/TP/B235), and by various other confounders (Tables S7-S10, SDC, http://links.lww.com/TP/B235). Further, optimal predictive models, which incorporated the covariate organ type and several other covariates such as type of stimulator, tacrolimus whole blood levels, race, time between transplantation and sample, and type of induction treatment, demonstrated inferior performance when applied to validation set samples. In contrast, the single variable model based on the IR of CD154+TcM performed consistently in training and validation sets. There are 2 possible reasons. First, compared with other T-cell subsets, the alloresponse of CD154+TcM has shown specificity for rejection after 3 different types of transplants including those evaluated here. Second, reporting test results as an index, which uses a reference alloresponse to normalize donor-induced CD154+TcM from the same patient, likely negates the effect of these confounders, which are expected to affect either reaction proportionately.

The effect of opportunistic tissue-invasive infections with cytomegalovirus and Epstein-Barr virus on rejection-risk assessment with CD154+TcM remains unknown. These infections were absent in all but 1 subject at the time when analyzable blood samples were obtained, likely due to preemptive treatment of viremia with evolving surveillance protocols in most centers. This subject experienced Epstein-Barr viral enteritis in the intestine allograft. The posttransplant sample from this subject obtained during this episode failed allostimulation. Therefore, no result could be generated. Test formats and thresholds for PCR-based viral load monitoring changed throughout the 6-year study period, precluding reliable assessments of the effect of viremia on test performance during this preclinical evaluation. Early performance evaluation (unpublished) during clinical use of this test system in 63 children with LTx or ITx has shown that test predictions have not been confounded by infections. This cohort includes 20 children, who were evaluated in the preclinical phase and retested as a component of clinical care, and 43 new subjects. Among these 63 children, 11 had an infection: 1 experienced biopsy-proven cholangitis, 1 experienced adenoviral allograft enteritis, and 9 demonstrated Epstein-Barr virus replication without tissue-invasive disease, with a mean Epstein-Barr virus viral load of 10 926 copies per mL (SEM, 4472; range, 120-31 000) at the time of sampling. No differences were seen between children with and without infection in test sensitivity (3/4 or 75% vs 18/21 or 86%; P = 0.527; NS, Fisher exact test) or specificity (6/7 or 86% vs 31/31 or 100%; P = 0.184, NS). Cytomegalovirus viremia was not reported or detected in this clinical cohort on the day of sampling. An expanded clinical evaluation will be the subject of a follow-up report.
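The two Fisher exact comparisons reported above can be checked with a short sketch. It assumes the standard two-sided definition (the summed probability of all tables, with fixed margins, that are no more likely than the observed table) and arranges the reported counts into 2 × 2 tables as correct versus incorrect predictions in children with and without infection; that arrangement is a reading of the text, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Sensitivity: 3 of 4 rejectors detected with infection, 18 of 21 without.
print(round(fisher_exact_two_sided(3, 1, 18, 3), 3))
# Specificity: 6 of 7 nonrejectors correct with infection, 31 of 31 without.
print(round(fisher_exact_two_sided(6, 1, 31, 0), 3))
```

With these tables the sketch returns 0.527 and 0.184, matching the P values reported in the text.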

Because the determination of rejection-risk is central to the daily management of a transplant recipient, clinical situations most suited for this test system are likely to vary. Our early experience suggests that the adjunctive information provided by noninvasive rejection-risk assessment is likely (i) to assist clinical decision-making when minimization of immunosuppression is being considered earlier than indicated by the prevailing clinical protocol; and (ii) to better assess the clinical significance of indeterminate, borderline or nonspecific inflammatory changes in late surveillance biopsies.29 Additional analysis of data obtained during clinical use will determine whether the test is being used in this way.

In summary, allospecific TcMs fulfill an unmet need for personalized prediction of ACR in the rare and high-risk population of children with LTx or ITx, with clinically acceptable and reproducible performance. The potential benefit of risk-based optimization of immunosuppression with adjunctive information provided by this first-in-class flow cytometric test outweighs the risks of phlebotomy. The additional risks of undetected false-positive and false-negative results are minimized by using test results as an adjunct to all available clinical and laboratory information, in a manner consistent with current clinical practice.

ACKNOWLEDGMENTS

The authors thank Plexision, Inc, Pittsburgh, PA, for assay standardization and data analyses. The authors also thank Ms Dale Zecca for manuscript preparation.

REFERENCES

1. Shepherd RW, Turmelle Y, Nadler M, et al. Risk factors for rejection and infection in pediatric liver transplantation. Am J Transplant. 2008;8:396–403.
2. Reyes J, Mazariegos GV, Abu-Elmagd K, et al. Intestinal transplantation under tacrolimus monotherapy after perioperative lymphoid depletion with rabbit anti-thymocyte globulin (thymoglobulin). Am J Transplant. 2005;5:1430–36.
3. Nayyar N, Mazariegos G, Ranganathan S, et al. Pediatric small bowel transplantation. Semin Pediatr Surg. 2010;19:68–77.
4. Sudan DL, Shaw BW Jr, Langnas AN. Causes of late mortality in pediatric liver transplant recipients. Ann Surg. 1998;227:289–95.
5. Fridell JA, Jain A, Reyes J, et al. Causes of mortality beyond 1 year after primary pediatric liver transplant under tacrolimus. Transplantation. 2002;74:1721–24.
6. Soltys KA, Mazariegos GV, Squires RH, SPLIT Research Group. Late graft loss or death in pediatric liver transplantation: an analysis of the SPLIT database. Am J Transplant. 2007;7:2165–71.
7. Abu-Elmagd KM, Costa G, Bond GJ, et al. Five hundred intestinal and multivisceral transplantations at a single center: major advances with new challenges. Ann Surg. 2009;250:567–81.
8. UNOS OPTN Annual Report. 2008 Annual Report of the U.S. Organ Procurement and Transplantation Network and the Scientific Registry of Transplant Recipients: Transplant Data 1998-2007. Department of Health and Human Services, Health Resources and Services Administration, Healthcare Systems Bureau, Division of Transplantation, Rockville, MD; United Network for Organ Sharing, Richmond, VA; University Renal Research and Education Association, Ann Arbor, MI.
9. Designating Humanitarian Use Devices. http://www.fda.gov/ForIndustry/DevelopingProductsforRareDiseasesConditions/DesignatingHumanitarianUseDevicesHUDS/default.htm. Accessed January 23, 2015.
10. Eydelman MB, Chen EA. The FDA's humanitarian device exemption program. Health Aff (Millwood). 2011;30:1210–12; author reply 1212. doi: 10.1377/hlthaff.2011.0550.
11. Sharfstein J. FDA regulation of laboratory-developed diagnostic tests: protect the public, advance the science. JAMA. 2015;313:667–68.
12. Evans JP, Watson MS. Genetic testing and FDA regulation: overregulation threatens the emergence of genomic medicine. JAMA. 2015.
13. Ashokkumar C, Talukdar A, Sun Q, et al. Allospecific CD154+ T cells associate with rejection risk after pediatric liver transplantation. Am J Transplant. 2009;9:179–91.
14. Ashokkumar C, Gupta A, Sun Q, et al. Allospecific CD154+ T cells identify rejection-prone recipients after pediatric small-bowel transplantation. Surgery. 2009;146:166–73.
15. Sindhi R, Ashokkumar C, Higgs BW, et al. Allospecific CD154+T-cytotoxic memory cells as potential surrogate for rejection risk in pediatric intestine transplantation. Pediatr Transplant. 2012;16:83–91. Pediatr Transplant. 2012;16:913.
16. Ashokkumar C, Shapiro R, Tan H, et al. Allospecific CD154+ T-cytotoxic memory cells identify recipients experiencing acute cellular rejection after renal transplantation. Transplantation. 2011;92:433–38.
17. Sindhi R, Magill A, Bentlejewski C, et al. Enhanced donor-specific alloreactivity occurs independently of immunosuppression in children with early liver rejection. Am J Transplant. 2005;5:96–102.
18. Khera N, Janosky J, Zeevi A, et al. Persistent donor-specific alloreactivity may portend delayed liver rejection during drug minimization in children. Front Biosci. 2007;12:660–63.
19. FDA U.S. Food and Drug Administration, Recently-Approved Devices. http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfTopic/pma/pma.cfm?num=h130004.
20. Roederer M. How many events is enough? Are you positive? Cytometry A. 2008;73:384–85.
21. National Committee for Clinical Laboratory Standards (NCCLS). Evaluation of Precision Performance of Quantitative Measurement Methods; Approved Guideline—Second Edition. NCCLS document EP5-A2, 2004.
22. International Panel: Demetris AJ, Batts KP, Dhillon AP, et al. Banff schema for grading liver allograft rejection: An international consensus document. Hepatology. 1997;25:658–63.
23. Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York, NY: John Wiley & Sons Inc;2000.
24. Fawcett T. ROC Graphs: Notes and Practical Considerations for Researchers. March 16, 2004:1–38. http://binf.gmu.edu/mmasso/ROC101.pdf. Accessed December 29, 2014.
25. R Development Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
26. Hulspas R. Titration of fluorochrome-conjugated antibodies for labeling cell surface markers on live cells. In: Robinson JP, ed. Current Protocols in Flow Cytometry. 2010;54:Unit 6.29:6.29.1–6.29.9. doi: 10.1002/0471142956.cy0629s54.
27. van der Ploeg T, Austin PC, Steyerberg EW. Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints. BMC Med Res Methodol. 2014;14:137.
28. Augustine JJ, Siu DS, Clemente MJ, et al. Pre-transplant IFN-gamma ELISPOTs are associated with post-transplant renal function in African American renal transplant recipients. Am J Transplant. 2005;5:1971–75.
29. Ranganathan S, Celik N, Mazariegos G, et al. Liver allograft fibrosis and minimization of immunosuppression. Pediatr Transplant. 2015;19:667–68.

Copyright © 2017 Wolters Kluwer Health, Inc. All rights reserved.