A cochlear implant (CI) is an implantable device that can partially restore the hearing ability of patients with severe sensorineural hearing loss. Although the speech perception capabilities of patients with CIs have improved dramatically over the years, their speech outcomes have been quite unpredictable and variable (van Dijk et al. 1999; Turner et al. 2002; van Eijl et al. 2017). An important factor that affects the speech outcomes of patients with CIs is the condition of the auditory nerve. The neural responses generated by auditory nerve fibers (ANFs) can be evaluated by measuring electrically evoked compound action potentials (eCAPs) in patients with CIs (Fayad & Linthicum 2006; Kim et al. 2010; Garadat et al. 2012; He et al. 2017). The eCAP is typically assessed by examining its amplitude, namely the difference between the first negative peak (N1) and the first positive peak (P1) (e.g., Goldstein & Kiang 1958; Lai & Dillier 2000; Kim et al. 2010). This amplitude is thought to be approximately proportional to the number of ANFs that responded to the stimulus pulse (e.g., Westen et al. 2011; van Gendt et al. 2019).
Fig. 1.: Extraction of the temporal firing properties of excited auditory nerve fibers from eCAPs, based on an iterative deconvolution method proposed by Dong et al. (2020, 2021). In this method, an eCAP (A, blue line) was calculated by convolving a human unitary response (UR) (B) and a parameterized CDLD (C), optimized to match a recorded eCAP (A, green line), and iteratively minimizing the fitting error. This CDLD (C) consists of early and late Gaussian components; the parameters of the early component (α1, μ1, and σ1) and the late component (α2, μ2, and σ2) reflect the temporal firing properties. CDLD indicates compound discharge latency distribution; eCAP, electrically evoked compound action potential; E-Gauss, early Gaussian component; L-Gauss, late Gaussian component; P-eCAP, predicted eCAP; R-eCAP, recorded eCAP.
$$\mathrm{CDLD} = \alpha_1 \cdot N(\mu_1, \sigma_1) + \alpha_2 \cdot N(\mu_2, \sigma_2) \qquad (1)$$
where N represents the Gaussian distribution; the variables α1, μ1, and σ1 belong to the early Gaussian component (in time), and the variables α2, μ2, and σ2 belong to the late Gaussian component. α1 and α2 are the peak amplitudes; μ1 and μ2 are the peak latencies, representing the average firing latencies of excited ANFs; and σ1 and σ2 are the peak widths, which indicate the degree of synchronicity in excited ANFs. The early and late components of CDLDs may be attributed to the excitation of the proximal and peripheral axonal processes of ANFs, respectively (e.g., Stypulkowski & van den Honert 1984; Lai & Dillier 2000; Strahl et al. 2016; Dong et al. 2020), or to separate neural responses of part of the ANF population (Ramekers et al. 2015; Konerding et al. 2022). The CDLD can be used to reveal eCAP characteristics, in terms of the number and temporal firing properties of excited ANFs (Fig. 1). Specifically, α1 and α2 indicate the neural firing density. These parameters are highly related to the number of excited ANFs and the eCAP amplitude (Strahl et al. 2016; Dong et al. 2020). The number of excited ANFs could be estimated with the area under the CDLD (AUCD) more accurately than with the eCAP amplitude (Dong et al. 2020). Similar to the eCAP amplitude growth function (AGF), the AUCD growth function (AUGF) can be calculated by plotting the AUCD as a function of the stimulus level. The slope of the AUGF indicates the rate of increase in the estimated number of excited ANFs with rising stimulus levels. Previous studies have not considered these temporal firing properties in explorations of whether speech perception was associated with eCAPs after cochlear implantation.
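To make the parameterization of Eq. (1) concrete, the following Python sketch evaluates a two-component CDLD on a time axis. It assumes that N is a Gaussian scaled to unit peak height, so that α1 and α2 act as peak amplitudes as described above. The function names and the parameter values (loosely based on the group means in Table 2) are illustrative and are not taken from the original MATLAB implementation.

```python
import math

def gaussian(t, mu, sigma):
    """Unit-peak Gaussian (assumption: alpha is then the peak amplitude)."""
    return math.exp(-0.5 * ((t - mu) / sigma) ** 2)

def cdld(t, a1, mu1, s1, a2, mu2, s2):
    """Two-component CDLD: early plus late Gaussian component."""
    return a1 * gaussian(t, mu1, s1) + a2 * gaussian(t, mu2, s2)

# Illustrative parameters (latencies and widths in ms), not patient data.
params = dict(a1=710.0, mu1=0.37, s1=0.076, a2=553.0, mu2=0.59, s2=0.16)

times_ms = [i * 0.01 for i in range(151)]  # 0 to 1.5 ms in 0.01-ms steps
values = [cdld(t, **params) for t in times_ms]
peak_time = times_ms[values.index(max(values))]
print(f"CDLD peaks at about {peak_time:.2f} ms")
```

With these values, the late component overlaps the early one, so the overall maximum lies slightly later than μ1.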
The inconsistent results reported from previous studies on the relationship between eCAPs and speech perception (e.g., Franck & Norton 2001 ; van Eijl et al. 2017 ; He et al. 2017 ) may have been due to suboptimal outcome measures and the inappropriate use of the statistical methodology. For instance, Franck & Norton (2001 ) averaged the slope across individual AGFs, and subsequently, carried out linear regression to examine the association between the AGF slope and individual speech perception scores. However, potential confounders, such as the use of different contacts along the electrode array, the implant design, the age at implantation, and the duration of deafness, were not considered (e.g., Van der Beek et al. 2012 ; van de Heyning et al. 2016 ; He et al. 2017 ). We have used linear-mixed modeling (LMM) because it supports the inclusion of confounding factors.
In the present study, we aimed to determine to what extent speech perception performance in individuals with CIs can be explained by the temporal firing properties of excited ANFs that are represented in eCAPs. To that end, the CDLD was determined from intraoperatively recorded eCAP waveforms, based on an iterative deconvolution method (Dong et al. 2021). We investigated whether the six parameters of Eq. (1), the AUCD, and the slope of the AUGF were correlated with speech perception in individuals after cochlear implantation. To facilitate comparisons with existing literature, we also compared the predictive value of these eight parameters with the predictive values determined with conventional methods, based on the eCAP amplitude and the AGF slope. The results might provide a novel clinical predictor of ANF survival and reveal the predictive value of CDLDs for postoperative speech perception performance.
MATERIALS AND METHODS
Patient Population
This retrospective study included AGF recordings from 134 adult patients with postlingual deafness that had undergone CI implantation at the Leiden University Medical Center between June 2012 and March 2019. The AGF was recorded as part of the standard clinical routine for assessing CI function intraoperatively. All patients received unilateral implants with a HiRes90K device, with either a HiFocus-1J or a HiFocus Mid-Scala (MS) electrode array (Advanced Bionics, Valencia, CA). These electrode arrays consisted of 16 electrode contacts (numbered from 1 to 16, in apical to basal order). The MS array has a pre-curved design favoring a midscalar position, whereas the 1J array’s curvature is less pronounced for outer wall positioning. As a result of their different designs, the MS array is positioned closer to the modiolus than the 1J electrode array, especially in the basal region (Van der Jagt et al. 2016 ). According to the inclusion criteria of eCAPs, 10 patients were excluded (see Data Recordings). Therefore, the remaining 124 patients were included in the analysis. Table 1 shows the characteristics of the included patients.
TABLE 1. Characteristics of patients with cochlear implants due to post-lingual deafness

| Characteristic | Patients (n = 124) |
| --- | --- |
| Sex | |
| Male | 49 |
| Female | 75 |
| Cochlear implant type | |
| HiRes90K 1J | 19 |
| HiRes90K Mid-Scala | 105 |
| Mean age at implantation, years | 61.3 ± 19.2 |
| Mean duration of deafness, years | 14.1 ± 13.7 |
| Etiology | |
| Ototoxic medication | 3 |
| Meniere’s disease | 2 |
| Meningitis | 8 |
| Otosclerosis | 7 |
| Usher syndrome | 2 |
| Congenital/hereditary (nonspecified) | 37 |
| Other/unknown | 65 |
| Monosyllabic word scores at 1 year, % correct | 60.8 ± 21.1 |

Values are the number of patients or mean ± SD, as indicated.
Data Recordings
Test Procedure for AGFs
The AGFs were recorded on all odd electrode contacts with the forward-masking paradigm provided in the Research Studies Platform Objective Measures software program (Advanced Bionics, Sylmar, CA). The electrical stimulus for the masker and probe was a monopolar, cathodic-first, charge-balanced, biphasic pulse (32 μs/phase without interphase-gap). The interval between the masker and probe pulses was fixed at 400 μs. The eCAP response was recorded at a sampling rate of 56 kHz and a gain of 300. For each eCAP, 32 averages were performed. Each AGF was based on 10 different current levels, ranging from 50 to 500 clinical units (CUs). The stimulus level was measured in CUs, where CU equals pulse duration (µs) ⋅ amplitude (µA)/78.7. The number 78.7 is a unitless correction factor defined by the manufacturer (e.g., De Jong et al. 2020 ). Additional details on the recordings were described previously (Biesheuvel et al. 2018 ; Dong et al. 2020 ).
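As a worked example of the CU definition above, the following Python sketch converts between clinical units and current amplitude. It assumes that the "pulse duration" in the formula is the 32-µs phase duration; the function names are illustrative.

```python
CU_FACTOR = 78.7  # manufacturer-defined correction factor quoted in the text

def to_clinical_units(pulse_duration_us, amplitude_ua):
    """CU = pulse duration (us) * amplitude (uA) / 78.7."""
    return pulse_duration_us * amplitude_ua / CU_FACTOR

def to_amplitude_ua(clinical_units, pulse_duration_us):
    """Inverse of to_clinical_units for a fixed pulse duration."""
    return clinical_units * CU_FACTOR / pulse_duration_us

PHASE_US = 32.0  # assumption: the phase duration is the relevant pulse duration

for cu in (50, 500):  # lower and upper ends of the stimulus range in the text
    print(cu, "CU ->", round(to_amplitude_ua(cu, PHASE_US), 1), "uA")
```

Under this assumption, the 50 to 500 CU range corresponds to roughly 123 to 1230 µA.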
The N1 and P1 peaks of eCAP waveforms were defined as the minimum and maximum amplitudes, respectively, measured across the 180 to 490 μs and the 470 to 980 μs intervals after the end of stimulation. The eCAP amplitude was defined as the voltage difference between P1 and N1 (mV). The noise level of the recording was determined from the last 30 samples (approximately the last 0.5 ms) of the recording, under the assumption that no remaining neural response or stimulus artifact was present in this section (for details, see Dong et al. 2021 ). The signal-to-noise ratio of the eCAP was calculated as the eCAP amplitude divided by the root mean square of the noise segment. Valid eCAPs were selected using a semiautomatic method programmed in MATLAB (Mathworks 2019, Natick, MA), which included two criteria: the eCAP amplitude had to be larger than 25 μV, and the signal-to-noise ratio had to exceed +15 dB. eCAPs that did not meet both criteria were excluded. As a result, we included 5612 eCAPs obtained from 920 AGFs originating from 124 patients (3588 recordings were excluded) for further analysis.
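The peak-picking and validity criteria described above can be sketched in Python as follows. This is a minimal illustration, not the semiautomatic MATLAB procedure used in the study. It assumes that sample 0 coincides with the end of stimulation, that the trace is in µV, and that the signal-to-noise ratio in dB is 20·log10(amplitude / noise RMS); the synthetic trace is invented for demonstration only.

```python
import math

FS_HZ = 56_000  # sampling rate reported in the text

def window(trace, start_us, stop_us, fs_hz=FS_HZ):
    """Return the samples whose time stamps fall within [start_us, stop_us]."""
    first = math.ceil(start_us * 1e-6 * fs_hz)
    last = math.floor(stop_us * 1e-6 * fs_hz)
    return trace[first:last + 1]

def ecap_metrics(trace_uv, n_noise=30):
    """N1-P1 amplitude (uV) and SNR (dB) of one eCAP trace."""
    n1 = min(window(trace_uv, 180, 490))   # most negative value in the N1 window
    p1 = max(window(trace_uv, 470, 980))   # most positive value in the P1 window
    amplitude_uv = p1 - n1
    noise = trace_uv[-n_noise:]            # last 30 samples (about 0.5 ms)
    noise_rms = math.sqrt(sum(v * v for v in noise) / len(noise))
    snr_db = 20.0 * math.log10(amplitude_uv / noise_rms)  # assumed dB convention
    return amplitude_uv, snr_db

def is_valid(amplitude_uv, snr_db):
    """Both inclusion criteria from the text must hold."""
    return amplitude_uv > 25.0 and snr_db > 15.0

def synthetic_trace(n=96):
    """Invented trace: negative peak near 300 us, positive peak near 650 us."""
    trace = []
    for i in range(n):
        t_us = i / FS_HZ * 1e6
        n1 = -100.0 * math.exp(-0.5 * ((t_us - 300.0) / 60.0) ** 2)
        p1 = 60.0 * math.exp(-0.5 * ((t_us - 650.0) / 120.0) ** 2)
        noise = 1.0 if i % 2 == 0 else -1.0  # deterministic stand-in for noise
        trace.append(n1 + p1 + noise)
    return trace

amplitude_uv, snr_db = ecap_metrics(synthetic_trace())
print(f"amplitude = {amplitude_uv:.1f} uV, SNR = {snr_db:.1f} dB, "
      f"valid = {is_valid(amplitude_uv, snr_db)}")
```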
The eCAP amplitude measured at saturation stimulation levels provides the best estimate of the number of excitable ANFs. However, saturation levels were not applied intraoperatively because of safety limitations. Thus, the AGF slope was investigated instead, as an alternative metric for the number of surviving ANFs. We performed linear regression on the AGF data to extract the slope of the best-fit regression line (µV/CU). The intercept of the line with the x-axis was defined as the eCAP threshold (for details, see Biesheuvel et al. 2018). An example of an AGF and its underlying recordings is shown in Fig. 2.
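The slope and threshold extraction can be illustrated with an ordinary least-squares fit. The Python sketch below uses invented data points that lie exactly on a line; the function names are illustrative, and the selection of the linear part of the AGF is assumed to have been done beforehand.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def agf_slope_and_threshold(levels_cu, amplitudes_uv):
    """AGF slope (uV/CU) and eCAP threshold (x-axis intercept, in CU)."""
    slope, intercept = linear_fit(levels_cu, amplitudes_uv)
    return slope, -intercept / slope

# Invented example: amplitudes grow by 1 uV per CU above a 150-CU threshold.
levels = [200, 250, 300, 350, 400]
amplitudes = [50, 100, 150, 200, 250]
slope, threshold = agf_slope_and_threshold(levels, amplitudes)
print(f"slope = {slope:.2f} uV/CU, threshold = {threshold:.0f} CU")
```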
Fig. 2.: Example of an AGF from the subject S225, obtained at electrode 9. The AGF (left ) shows the eCAP amplitude as a function of stimulus intensity. The corresponding eCAPs (right ) are plotted from low (bottom ) to high (top ) stimulus intensity. Data points that did not show true eCAP responses are shown in red, and points included in the AGF are shown in blue. Error bars reflect the variance in eCAP amplitude. AGF indicates amplitude growth function; eCAP, electrically evoked compound action potential .
Extraction of the Temporal Firing Properties in eCAPs
To deduce the temporal firing properties of excited ANFs from eCAPs, we calculated CDLDs from eCAP waveforms with an iterative deconvolution method (for details see Dong et al. 2020 , 2021 ). Before we calculated CDLDs, the eCAP waveforms of AGFs were preprocessed. First, the baseline was corrected to zero, with the noise level as a reference. Second, because a convolution can introduce distortions at the leading and trailing ends of a finite-length signal, 50 additional samples were added to the start and end of the recorded eCAP waveforms by linear extrapolation to zero. Then, the preprocessed eCAPs were entered as input into the iterative deconvolution procedure to obtain CDLDs. Specifically, we simulated the eCAPs as the convolution of the human unitary response calculated by Dong et al. (2020 ) with a parameterized CDLD (Eq. 1), with a deconvolution fitting error minimization routine (Fig. 1 ). In this routine, the human unitary response was constant and the simulated eCAP was optimized by iteratively adjusting the variables in the parameterized CDLD, until the simulated eCAP converged to the recorded eCAP. We validated the goodness of fit by calculating the normalized root mean square error. Then, the temporal firing properties were revealed, based on the CDLD parameter values, as shown in Eq. (1).
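The forward model at the core of this procedure (a parameterized CDLD convolved with a fixed unitary response, then scored against the recorded eCAP) can be sketched in Python as follows. Several elements are assumptions made for illustration: the unitary response is a hypothetical biphasic placeholder and not the human unitary response of Dong et al. (2020); the Gaussians are scaled to unit peak height; and the fit score follows the convention 1 − ‖recorded − predicted‖ / ‖recorded − mean(recorded)‖, in which 1 denotes a perfect fit. The optimizer that iteratively adjusts the six parameters is omitted.

```python
import math

DT_MS = 1000.0 / 56_000  # sample period in ms at the 56-kHz sampling rate

def unit_gauss(t, mu, sigma):
    return math.exp(-0.5 * ((t - mu) / sigma) ** 2)

def cdld(n, p):
    """Sampled two-Gaussian CDLD; p = (a1, mu1, s1, a2, mu2, s2), times in ms."""
    a1, mu1, s1, a2, mu2, s2 = p
    return [a1 * unit_gauss(i * DT_MS, mu1, s1) + a2 * unit_gauss(i * DT_MS, mu2, s2)
            for i in range(n)]

def placeholder_ur(n):
    """Hypothetical biphasic unitary response (negative, then positive phase)."""
    out = []
    for i in range(n):
        t = i * DT_MS
        out.append(0.1 * ((t - 0.25) / 0.1) * unit_gauss(t, 0.25, 0.1))
    return out

def convolve(x, h):
    """Full discrete convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def predict_ecap(p, n_cdld=60, n_ur=40):
    """Simulated eCAP: CDLD convolved with the (fixed) unitary response."""
    return convolve(cdld(n_cdld, p), placeholder_ur(n_ur))

def fit_score(recorded, predicted):
    """Assumed goodness-of-fit convention; 1.0 means a perfect fit."""
    mean = sum(recorded) / len(recorded)
    num = math.sqrt(sum((r - q) ** 2 for r, q in zip(recorded, predicted)))
    den = math.sqrt(sum((r - mean) ** 2 for r in recorded))
    return 1.0 - num / den

true_p = (710.0, 0.37, 0.076, 553.0, 0.59, 0.16)    # illustrative values
guess_p = (600.0, 0.40, 0.10, 500.0, 0.65, 0.20)    # a deliberately poor guess
recorded = predict_ecap(true_p)                     # stands in for a recording
score_true = fit_score(recorded, predict_ecap(true_p))
score_guess = fit_score(recorded, predict_ecap(guess_p))
print(score_true, score_guess)
```

An optimization routine would repeat the `predict_ecap` and `fit_score` steps while adjusting the six CDLD parameters until the score stops improving.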
The AUCD was calculated as a parameter reflective of the number of excited ANFs by taking the integral of the CDLDs as a function over time. We applied linear regression techniques to the AUCD data and extracted the slope of the AUGF (estimated number of fibers/CU) from the best-fit regression line. All signal processing was performed off-line, with MATLAB.
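The two steps described above can be sketched in Python: numerical integration of a CDLD over time, followed by a least-squares slope of the AUCD against stimulus level. The sketch assumes unit-peak Gaussians, for which the area has the closed form √(2π)·(α1σ1 + α2σ2), used here only as a check. The scaling of the area to an estimated fiber count is not reproduced, and all parameter values and stimulus levels are invented.

```python
import math

def cdld(t, a1, mu1, s1, a2, mu2, s2):
    """Two-Gaussian CDLD with unit-peak Gaussians (assumption)."""
    early = a1 * math.exp(-0.5 * ((t - mu1) / s1) ** 2)
    late = a2 * math.exp(-0.5 * ((t - mu2) / s2) ** 2)
    return early + late

def aucd(params, t_start=0.0, t_stop=2.0, n=2001):
    """Area under the CDLD by the trapezoidal rule (time in ms)."""
    dt = (t_stop - t_start) / (n - 1)
    ys = [cdld(t_start + i * dt, *params) for i in range(n)]
    return dt * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

def analytic_area(params):
    """Closed-form area for unit-peak Gaussians, used as a sanity check."""
    a1, _, s1, a2, _, s2 = params
    return math.sqrt(2.0 * math.pi) * (a1 * s1 + a2 * s2)

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

params = (710.0, 0.37, 0.076, 553.0, 0.59, 0.16)  # illustrative values
area = aucd(params)

# Invented growth function: the CDLD amplitudes grow with stimulus level.
levels_cu = [200, 300, 400]
aucd_values = [aucd((a, 0.37, 0.076, 0.8 * a, 0.59, 0.16))
               for a in (300.0, 500.0, 700.0)]
augf_slope = slope(levels_cu, aucd_values)
print(f"AUCD = {area:.1f}, AUGF slope = {augf_slope:.3f} per CU")
```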
Evaluation of Speech Perception
Speech perception was evaluated at predetermined intervals during a standard clinical follow-up. In this study, we analyzed the word recognition score, obtained in a quiet environment, at 1 year after implantation. Speech material comprised the standard Dutch speech test of the Dutch Society of Audiology. It consisted of phonetically balanced monosyllabic (CVC) word lists (Bosman & Smoorenburg 1995 ), presented at 65 dB SPL in a quiet listening environment. To enhance test reliability, four lists (44 words) per condition were performed. All speech testing was conducted in a soundproof room using a calibrated loudspeaker (Yamaha monitor speaker model MSP5A) placed at 1.0 meter in front of the participant. All patients used the HiRes processing strategy from Advanced Bionics.
Statistical Analysis
LMMs were constructed with the lme4 package (Bates et al. 2015 ) in R (R version 3.6.1, The R Foundation for Statistical Computing 2020). Word recognition outcomes were assumed to be the sum of fixed and random effects. Random effects can introduce correlations between cases and should therefore be taken into account when statistically testing fixed effects for population effects. The LMM allowed the inclusion of potential confounding factors (Brauer & Curtin 2018 ; Bolker et al. 2009 ). Moreover, the LMM design accounted for missing data (Fitzmaurice et al. 2004 ; Netten et al. 2017 ).
LMMs were used to test the relationship between the word recognition score and the metrics based on CDLDs obtained from Eq. (1), the AUCD, and the slope of AUGF. Our dataset included only a single word recognition score per patient, but multiple eCAP measurements were obtained in each patient. Therefore, each of the eight CDLD-related metrics was entered as the dependent variable in a separate LMM. In each of these models, the word recognition score was entered as a fixed covariate. Five additional fixed factors were included that could potentially affect the word recognition score and the CDLD-related parameters, including (1) the implant design, (2) the contact location along the electrode array, (3) the current level, (4) the age at implantation, and (5) the duration of deafness. The duration of deafness was defined as the time, in years, between the age at implantation and the age at which patients had experienced severe hearing loss, either in both ears or in the second ear. Data on the duration of deafness were available for 93 patients. The subject IDs were entered as random categorical variables, including a random intercept (Brauer & Curtin 2018 ). A p value <0.05 was considered to reflect a statistically significant difference.
To compare the CDLD-related parameters with the eCAP amplitude and the AGF slope in their abilities to explain the variance in word recognition scores, the corresponding R2 was required. However, the LMMs did not produce an R2 estimate. Thus, we performed separate simple linear regression analyses, based on CDLD parameters averaged across all odd electrodes and suprathreshold current levels within each patient, as described in previous studies (e.g., Franck & Norton 2001; He et al. 2017). In this way, the coefficient of correlation could be calculated (Neter et al. 1996; Khan et al. 2005). This approach is inferior to that followed with the LMMs, because using all repeated measures (current levels, electrode locations) greatly increases power. Yet, to extract approximate R2 values, averaged values are an appropriate alternative.
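The averaging-then-regression step can be illustrated with a short Python sketch. The patient identifiers, parameter values, and word scores below are invented; for simple linear regression, R2 equals the squared Pearson correlation, which is what the function computes.

```python
from collections import defaultdict

def per_patient_mean(records):
    """Average repeated measurements; records are (patient_id, value) pairs."""
    groups = defaultdict(list)
    for patient_id, value in records:
        groups[patient_id].append(value)
    return {pid: sum(vals) / len(vals) for pid, vals in groups.items()}

def r_squared(x, y):
    """R2 of a simple linear regression (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)

# Invented data: several measurements per patient, one word score per patient.
records = [("P1", 400), ("P1", 600), ("P2", 650), ("P2", 750),
           ("P3", 800), ("P3", 1000)]
word_scores = {"P1": 40, "P2": 60, "P3": 70}

means = per_patient_mean(records)
patients = sorted(means)
r2 = r_squared([means[p] for p in patients], [word_scores[p] for p in patients])
print(f"R2 = {r2:.3f}")
```

Averaging discards the within-patient variation across electrodes and current levels, which is why the text treats this analysis as an approximation to the LMMs.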
To provide visual representations, word recognition scores were plotted against the corresponding CDLD-related parameters, the eCAP amplitude, and the AGF slope, which were averaged across electrodes and current levels within each patient. These plots did not completely match the analyses performed with LMMs, because the models took into account missing data points and random effects.
RESULTS
Derivation of CDLDs
We derived the CDLD from each eCAP waveform. Figure 3 shows three eCAP waveforms, representing a standard eCAP with a single negative peak (A), an intermediate waveform with a visible, yet inconspicuous second negative peak (B), and an eCAP with two clearly distinguishable peaks (C). Panels D to F show the corresponding fitted CDLDs, each with two Gaussian components. Overall, the 95% confidence intervals of the goodness of fit (i.e., the normalized root mean square error) ranged from 0.91 to 0.96. Table 2 shows the mean values (with standard deviations) of the CDLD parameters.
TABLE 2. CDLD parameters for eCAPs in patients with cochlear implants

| Parameter | α1 | α2 | μ1 (ms) | μ2 (ms) | σ1 (ms) | σ2 (ms) | AUCD (estimated number of excited fibers) | AUGF slope (estimated number of excited fibers/CU) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mean | 710 | 553 | 0.37 | 0.59 | 0.076 | 0.16 | 484 | 0.52 |
| SD | 46 | 69 | 0.049 | 0.1 | 0.03 | 0.066 | 142 | 0.07 |

Means represent values averaged over all electrodes or over all stimulation levels, and for all patients.
AUCD indicates area under the CDLD curve; AUGF, the AUCD growth function; CDLD, compound discharge latency distribution; eCAPs, evoked compound action potentials.
Fig. 3.: Examples of eCAPs with different morphologies (upper row ) and corresponding CDLDs (lower row ). CDLD indicates compound discharge latency distribution; eCAP, electrically evoked compound action potential ; E-Gauss, early Gaussian component; L-Gauss, late Gaussian component. NRMSE, normalized root mean square error; P-eCAP, predicted eCAP; R-eCAP, recorded eCAP.
Relationship between CDLDs and Speech Perception
At 1 year of follow-up, the average monosyllabic word score for the 124 adult patients with CIs was 60.8% ± 21.1% correct. The primary goal of this study was to determine whether the CDLD-related parameters might be related to speech perception in CI recipients. Table 3 shows the parameter estimates for the eight LMMs, with the word recognition score as the independent variable and the CDLD parameters as dependent variables.
TABLE 3. Parameter estimates from LMMs, with the word recognition score as the independent variable, and the CDLD parameters as dependent variables

| Dependent variable | Estimate | SD | F | p |
| --- | --- | --- | --- | --- |
| α1 | +18 | 5.6 | 8.7 | 0.003* |
| α2 | +13 | 5.4 | 5.6 | 0.01* |
| μ1 | –0.011 | 0.057 | 0.07 | 0.82 |
| μ2 | –0.07 | 0.06 | 1.02 | 0.2 |
| σ1 | –0.09 | 0.039 | 6.5 | 0.01* |
| σ2 | –0.1 | 0.052 | 3.5 | 0.06 |
| AUCD | +15 | 5.1 | 8.1 | 0.005* |
| AUGF slope | +0.18 | 0.06 | 8.7 | 0.004* |

* Significant difference
AUCD indicates area under the CDLD curve; AUGF, the AUCD growth function; CDLD, compound discharge latency distribution; LMM, linear mixed model; SD, standard deviation.
Figure 4 plots the word recognition scores as a function of α1 and α2. Both α1 and α2 showed a significant positive relationship with the word recognition score determined with LMM analyses [Fig. 4A; F (1, 117.1) = 8.7, p = 0.003; Fig. 4B; F (1, 117) = 5.6, p = 0.01, respectively]. These outcomes indicated that patients with a higher word recognition score tended to have larger α1 and α2 values. Among the remaining factors, the implant design, current level, and contact location showed a significant effect on α1 and α2. Specifically, the α1 and α2 recorded through 1J electrode arrays were significantly larger than those obtained with MS implants [F (1,128) = 4.9, p = 0.029; F (1,128.5) = 12.2, p < 0.001]. Both α1 and α2 significantly increased with current level [F (8,4593) = 558.3, p < 0.0001; F (8,4594) = 4624, p < 0.0001], while they decreased significantly from apical to basal electrode locations [F (7,4596) = 165.8, p < 0.001; F (7,4597.6) = 154.6, p < 0.001]. The duration of deafness and age at implantation did not significantly affect α1 (p = 0.07; p = 0.25) or α2 (p = 0.17; p = 0.51).
Figure 5A shows word recognition scores plotted as a function of the AUCD, an estimate of the number of excited ANFs in each recorded eCAP. The AUCD was significantly correlated with the word recognition score [F (1,122.1) = 8.0, p = 0.005]. This result indicated that when more ANFs were excited, better speech perception was achieved. The implant design, current level, and contact location showed significant effects on the AUCD. Specifically, the AUCDs recorded through the MS electrode array were significantly smaller than those recorded through the 1J electrode array [F (1,132.7) = 9.3, p = 0.00028]. The AUCD significantly increased with increasing current level [F (8,4607) = 657.2, p < 0.00001]. The AUCDs recorded at the apical contacts were significantly larger than those recorded at the basal contacts [F (8,4607.8) = 657.2, p < 0.0001]. However, the duration of deafness and age at implantation did not affect the AUCD (all p > 0.2).
Fig. 4.: Correlations between word recognition scores and firing density parameters. The percentage of words recognized by each individual patient is plotted against the corresponding α1 (A) and α2 (B) values, averaged across all contacts and all current levels. R2 values are derived from the linear regressions (dotted lines).
Fig. 5.: Correlations between word recognition scores and the estimated number of ANFs and AUGF slope. The percentage of words recognized by each individual patient are plotted against the corresponding AUCD (A) and AUGF slope (B), averaged across all contacts or all current levels. R2 values are derived from the linear regressions (dotted lines). ANF indicates auditory nerve fiber; AUCD, area under the CDLD curve; AUGF, the AUCD growth function; CDLD, compound discharge latency distribution.
Figure 5B shows word recognition scores plotted as a function of the slope of AUGF, an estimate of the rate of increase in the estimated number of excited ANFs as a function of stimulus level. Steeper slopes of the AUGF were significantly associated with better word recognition scores [F (1,122.1) = 8.7, p = 0.004]. The AUGF slope decreased significantly from apical to basal electrode locations [F (7,735.7) = 32.8; p < 0.001] and did not significantly depend on any of the other factors (all p > 0.05).
We found that μ1 and μ2, which reflect the average firing latencies of excited ANFs, were not significantly associated with the word recognition score [F (1,116) = 0.87, p = 0.82; F (1,113.6) = 1.6, p = 0.2, respectively]. The μ1 and μ2 recorded at the basal contacts were significantly longer than those recorded at the apical contacts [F (7,4601) = 40, p < 0.001; F (7,4600) = 10.9, p < 0.001]. The age at implantation had a significant effect on μ2 (p = 0.02), indicating that patients implanted at an older age showed longer firing latencies. This effect was not observed for μ1 (p = 0.53). The duration of deafness and the current level did not significantly affect μ1 (p = 0.17 and p = 0.06, respectively) or μ2 (p = 0.3 and p = 0.09, respectively).
σ1 and σ2 represent the degree of neural synchronicity. Figure 6 plots word recognition scores as a function of σ1 and σ2, respectively. σ1 showed a significant negative relationship with the word recognition score [Fig. 6A; F (1,107.7) = 5.5, p = 0.02]. However, σ2 was not significantly associated with the word recognition score [Fig. 6B; F (1,113) = 3.5, p = 0.06]. The implant design, current level, electrode location, and deafness duration showed significant effects on σ1 and σ2 (all p < 0.05). The age at implantation showed a significant effect on σ2 (p < 0.01), but not on σ1 (p = 0.06).
Fig. 6.: Correlations between word recognition scores and neural synchronicity parameters. The percentage of words recognized by each individual patient is plotted against the corresponding σ1 (A) and σ2 (B) values, averaged across all contacts and all current levels. R2 value is derived from the linear regression (dotted line).
Due to the unequal sample sizes of the MS and 1J groups (Table 1 ), the difference in speech perception between the two groups was examined with the Welch t-test and no significant difference was observed (p = 0.57).
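For readers unfamiliar with the Welch t-test, the following Python sketch computes its test statistic and the Welch-Satterthwaite degrees of freedom for two groups of unequal size. The scores are invented and do not represent the MS and 1J groups; the p value, which would be obtained from the t distribution with the computed degrees of freedom, is not calculated here.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Invented word scores (% correct) for two groups of unequal size.
group_a = [55, 60, 65, 70, 50]
group_b = [58, 62, 66]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Unlike the pooled-variance t-test, this statistic does not assume equal variances, which makes it a common choice when group sizes differ markedly.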
Abilities of CDLD Parameters, eCAP Amplitude, and AGF Slope to Explain the Variance in Speech Perception
We performed simple linear regression analyses to determine whether CDLD-related parameters allow more precise predictions about the word recognition score than eCAP amplitude and slope of the AGF by comparing R2 values (Table 4 ). For these analyses, the CDLD parameters were calculated for each individual patient as the average of all available eCAPs, across different electrode contacts and current levels. α 1 and α 2 showed R2 values of 0.102 and 0.05, respectively (Fig. 4 ). The AUCD showed an R2 value of 0.12 (Fig. 5A ). The AUGF slope showed an R2 value of 0.09 (Fig. 5B ). μ 1 and μ 2 revealed small R2 values, 0.0009 and 0.015, respectively. σ 1 showed a moderately high R2 of 0.09 (Fig. 6A ), but σ 2 showed a low value of 0.04 (Fig. 6B ).
TABLE 4. Comparison of the abilities of different parameters to explain (R2) the variance in speech performance in patients with cochlear implants

| Parameters | R2 |
| --- | --- |
| α1 | 0.1 |
| α2 | 0.05 |
| μ1 | 0.0009 |
| μ2 | 0.015 |
| σ1 | 0.09 |
| σ2 | 0.04 |
| AUCD | 0.12 |
| AUGF slope | 0.09 |
| eCAP amplitude | 0.06 |
| AGF slope | 0.07 |

R2 values are derived from the linear regressions.
AGF indicates the eCAP amplitude growth function; AUCD, the area under the CDLD curve; AUGF, the AUCD growth function; CDLD, compound discharge latency distribution; eCAP, electrically evoked compound action potential.
The eCAP amplitude, calculated for each individual patient as the average of all available eCAPs across different electrode contacts and current levels, showed an R2 of 0.06 (Fig. 7A ). The AGF slope showed an R2 of 0.07 (Fig. 7B ). It was calculated for each patient as the average of all available AGFs across different contacts. The differences between the R2 of the eCAP outcome measures and those of the CDLD outcome measures were investigated based on the cocor test (Diedenhofen & Musch 2015 ). However, no significant differences were observed (all p > 0.05).
Fig. 7.: Correlations between word recognition scores and eCAP parameters. The percentage of words recognized by each individual patient are plotted against the corresponding eCAP amplitude (A) and AGF slope (B), averaged across all electrodes or current levels. R2 values are derived from the linear regressions (dotted lines). AGF indicates amplitude growth function; eCAP, electrically evoked compound action potential .
DISCUSSION
This study was the first to test whether speech understanding was correlated with the CDLD (i.e., the number and the temporal firing properties of excited ANFs in human eCAPs). We showed that speech recognition performance was significantly associated with the CDLD parameters related to the number of excited ANFs (α 1 , α 2 , AUCD), with the AUGF slope (i.e., the speed of the increase of the number of excited ANFs with increasing current level), and with early neural synchronicity (σ 1 ). The other three parameters (μ 1 , μ 2 , and σ 2 ) were not significantly correlated with speech recognition.
Results from postmortem studies have suggested that patients with a greater number of surviving ANFs tended to perform better in speech recognition tests (e.g., Otte et al. 1978; Kawano et al. 1998; Seyyedi et al. 2014; Kamakura & Nadol 2016). After studies showed that eCAPs could be indicative of neural survival, interest increased in using eCAP measurements to evaluate correlations with speech perception (e.g., Shepherd & Javel 1997; He et al. 2017). However, a direct comparison between the eCAP amplitude and the number of surviving ANFs in individuals with CIs evidently cannot be made. In this study, the CDLD metrics extracted from eCAPs (α1, α2, AUCD) provided a more accurate estimate of the number of excited ANFs than the eCAP amplitude (Dong et al. 2020), and the AUGF slope provided a more accurate estimate of the rate of increase in the number of excited ANFs with increasing stimulus level than the AGF slope. The significant associations between the word recognition score and these four metrics (Table 3) supported the notion that more excited ANFs would provide better speech perception. According to the results in the present study, combined with those in previous studies (e.g., Schvartz-Leyzac & Pfingst 2018), we conclude that the number of excited ANFs played a significant role in speech perception performance. In other words, a larger number of healthy spiral ganglion cells could potentially lead to higher speech perception scores after cochlear implantation.
Earlier studies have suggested that a decline in the synchronicity of the auditory neural response might adversely influence speech understanding (e.g., Hellstrom & Schmiedt 1990 ; Pichora-Fuller et al. 2007 ). This theoretical expectation was substantiated for the first time in our study. Specifically, we showed that the σ 1 was negatively associated with speech perception (Table 3 ), that is, a more synchronous ANF response in the early CDLD peak (smaller σ 1 ) was associated with better speech understanding. Moreover, although the σ 2 was not significantly associated with speech perception (p = 0.06), a similar trend was observed (Fig. 6B ). Our findings are consistent with previous findings showing that a decline in the synchronicity of excited ANFs was associated with different factors, such as the duration of deafness, auditory nerve abnormalities, and myelin disorders (Shepherd & Javel 1997 ; Rance 2005 ). In turn, these factors may lead to a deterioration in CI speech outcomes. Our analysis of σ suggests that eCAP waveforms with different morphology can have different clinical implications for neural synchrony and speech perception performance. That is, patients with narrower eCAP waveforms tended to have greater neural synchrony and better speech perception performance than those with wider eCAP waveforms.
To our knowledge, no previous study has reported that the peak latency of eCAPs was associated with speech perception performance in patients with CIs. Likewise, in our study, we did not observe significant associations between the average firing latencies of excited ANFs in CDLDs (μ1 and μ2) and speech perception outcomes, indicating that the firing latencies of excited ANFs had little effect on speech perception (Table 3).
Previous studies have reported that patients with larger eCAP amplitudes and steeper AGF slopes tended to show better speech perception than their counterparts (e.g., Brown et al. 1990; Kim et al. 2010; DeVries et al. 2016). In line with their findings, we found that larger eCAP amplitudes and steeper AGF slopes were significantly associated with speech perception (Fig. 7). Compared with the eCAP amplitude and AGF slope, a similar proportion of the variance in speech perception could be explained by α1, AUCD, AUGF slope, and σ1 (Table 4); however, because of their higher significance levels (Table 3 and Fig. 7), α1, AUCD, AUGF slope, and σ1 might be better predictors of CI outcomes than the traditionally used eCAP amplitude and AGF slope. In addition, no outliers were observed among this large group of subjects showing a combination of a low word score with either a high number of estimated ANFs or a steep AUGF slope (Fig. 5). This also supports the notion that the AUCD and AUGF slope have predictive value for speech recognition performance after implantation.
The low coefficients of determination of the correlation between the CDLD parameters and speech recognition performance indicated that much of the variance in performance remained unexplained, suggesting that high ANF survival rates and high neural synchronicity alone are not sufficient to guarantee good CI outcomes. Modeling work by Oxenham (2016 ) indeed indicates that large losses in ANFs are associated with relatively small changes in auditory psycho-acoustic outcomes. Other factors also play a role in speech recognition, including demographic factors, such as cognitive abilities (e.g., Fayad & Linthicum 2006 ; He et al. 2017 ; Pisoni et al. 2017 ).
The reliability of the temporal firing properties of ANFs derived from eCAPs depends strongly on the shape of the human unitary response, as stated in Dong et al. (2020). The unitary response has not been recorded in humans, and the one used in this study was estimated with iterative deconvolution by Dong et al. (2020, 2021) from eCAP recordings (Fig. 1B). In addition, the CDLD provides a valid estimate of the number of excited ANFs only when the two components of the CDLD originate from two different groups of ANFs. However, this issue remains controversial, because the two CDLD components may, to some extent, originate from the same group of spiral ganglion cells (Ramekers et al. 2015; Konerding et al. 2022). For instance, the early component of the CDLD may be attributable to direct excitation of the axonal process in the modiolus proximal to the spiral ganglion cell, and the late component may be attributable to activation of the axonal process peripheral to the soma of the bipolar ganglion neuron (e.g., Stypulkowski & van den Honert 1984; Lai & Dillier 2000). For these reasons, we report the effects of CDLD parameters on speech perception only qualitatively. Quantitative effect size estimates will have to wait until the human unitary response and the CDLD have been better defined quantitatively and, ideally, physiologically. Further anatomical and electrophysiological studies are therefore warranted. Such knowledge can provide a better understanding of how the two CDLD components affect speech perception performance in individuals with CIs.
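As a minimal sketch of the forward model behind this discussion (Fig. 1), the Python snippet below builds a two-Gaussian CDLD and convolves it with a unitary response to obtain a predicted eCAP. The unitary response used here is a hypothetical biphasic placeholder, not the one estimated by Dong et al. (2020, 2021). The parameter values, the treatment of α as a peak height, and all function names are illustrative assumptions.

```python
import math

def gaussian(t, alpha, mu, sigma):
    """Gaussian with peak height alpha, mean latency mu (ms), and width sigma (ms)."""
    return alpha * math.exp(-0.5 * ((t - mu) / sigma) ** 2)

def cdld(times, early, late):
    """Two-component CDLD: early + late Gaussian, each given as (alpha, mu, sigma)."""
    return [gaussian(t, *early) + gaussian(t, *late) for t in times]

def placeholder_ur(times, t0=0.3, width=0.12):
    """Hypothetical biphasic unitary response: negative lobe first, then positive.
    This shape is an assumption for illustration only."""
    return [(t - t0) / width * math.exp(-0.5 * ((t - t0) / width) ** 2) for t in times]

def convolve(a, b, dt):
    """Causal discrete convolution of two equal-length lists, scaled by the time step."""
    n = len(a)
    return [dt * sum(a[j] * b[i - j] for j in range(i + 1)) for i in range(n)]

dt = 0.01                                   # time step in ms
times = [i * dt for i in range(200)]        # 0 to 1.99 ms
early = (1.0, 0.4, 0.08)                    # illustrative alpha1, mu1, sigma1
late = (0.5, 0.8, 0.15)                     # illustrative alpha2, mu2, sigma2

dist = cdld(times, early, late)
ecap = convolve(placeholder_ur(times), dist, dt)

area = sum(dist) * dt                       # numerical area under the CDLD
amplitude = max(ecap) - min(ecap)           # P1 minus N1 of the predicted eCAP
print(f"area under CDLD: {area:.4f}, predicted eCAP amplitude: {amplitude:.4f}")
```

Because the predicted eCAP is the convolution of the two inputs, any change in the assumed unitary response shape changes the CDLD parameters needed to match a given recording, which is why the derived parameters are sensitive to that shape.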
Other factors may have affected the CDLD parameters and, consequently, the association between these CDLD parameters and speech perception. First, the CDLD parameters relating to the number of excited ANFs were significantly dependent on the current level and contact location. However, because we only had a single speech score available for each patient, we were unable to determine which electrodes and current levels were optimal for eCAP recording and CDLD derivation to predict speech perception performance. Second, alternative methods to determine the slope of the AGF can be considered to improve the relatively low correlation between speech perception and AGF slope found in this study. Conventional linear regression fitting, as used in the present report, has been used most often in the literature (e.g., Abbas et al. 1999; Stronks et al. 2019). This approach requires the removal of data points in the nonlinear part of the AGF, which in the present paper was based on visual inspection (Biesheuvel et al. 2018). Other proposed fitting methods, notably sigmoidal fitting, eliminate this requirement (Van de Heyning et al. 2016) and potentially allow for more reliable curve fitting, because more data points are retained. The disadvantage of this method is that the noise floor, and hence the number of averages, ultimately determines the threshold (Stronks et al. 2019). Other linear fitting approaches include preprocessing steps such as excluding outliers (Schvartz-Leyzac et al. 2020) or smoothing the data (Hughes et al. 2001), which may enhance the reliability of fitting as well. Very recently, a novel method involving resampling of the AGF and slope estimation based on a windowing method has been proposed that outperformed other methods in terms of its correlation with spiral ganglion neuron density (Skidmore et al. 2022).
Third, earlier studies have shown that CI design can affect the angular insertion depth, the distance from the electrode contacts to the modiolus, and the auditory neural response threshold (e.g., Gordon & Papsin 2013; van der Jagt et al. 2016). Nevertheless, these studies do not report any significant effects of array type on speech perception or behavioral thresholds. Our present data corroborate these findings.
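The conventional linear AGF fit discussed above can be illustrated with a short sketch. The Python snippet below fits a least-squares line to an AGF after discarding points near the noise floor, and compares the result with a fit to all points. The data are synthetic, and the fixed noise-floor cutoff is an assumption that stands in for the visual inspection used in this study; the extrapolation of the fitted line to zero amplitude is shown as one common way to derive a threshold.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic AGF: stimulus level (arbitrary current units) vs eCAP amplitude (µV).
levels = [100, 150, 200, 250, 300, 350, 400]
amplitudes = [5, 6, 50, 100, 150, 200, 250]

# Assumed cutoff replacing visual inspection: drop points at or below the noise floor.
noise_floor = 20
kept = [(x, y) for x, y in zip(levels, amplitudes) if y > noise_floor]
kept_levels = [x for x, _ in kept]
kept_amplitudes = [y for _, y in kept]

slope_lin, intercept_lin = linear_fit(kept_levels, kept_amplitudes)
slope_all, _ = linear_fit(levels, amplitudes)

# Threshold estimate: level at which the fitted line crosses zero amplitude.
threshold = -intercept_lin / slope_lin

print(f"slope (linear part only): {slope_lin:.3f} µV per unit")
print(f"slope (all points):       {slope_all:.3f} µV per unit")
print(f"extrapolated threshold:   {threshold:.1f} units")
```

In this synthetic example, including the near-floor points flattens the estimated slope, which illustrates why the choice of points entering the fit matters for the AGF slope.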
To date, eCAP measurements have proven to be useful in diagnosing and managing CI failures, although some discrepancies have been reported (Gantz et al. 1988; Hughes et al. 2004; DeVries et al. 2016; van Eijl et al. 2017; He et al. 2017). Our results demonstrated that the extraction of CDLDs from eCAP waveforms can provide additional clinical information, including an estimation of the number and synchronicity of excited ANFs and how they affect speech understanding after cochlear implantation. Therefore, integrating the extraction of CDLDs into eCAP measurements may provide a potential predictor of CI outcomes.
CONCLUSIONS
The results of this study showed that in individuals with CIs, speech perception after implantation was significantly associated with the number and synchronicity of excited ANFs, as estimated from the CDLDs. We found that the CDLD-related parameters and the eCAP-based parameters (amplitude and AGF slope) explained a similar proportion of the variance (R²) in speech perception. We conclude that eCAP-derived CDLD measurements, which reflect the temporal features of excited ANFs, could potentially serve as additional predictors of speech perception performance in individuals with CIs.
ACKNOWLEDGMENTS
Y.D. was financially supported by the China Scholarship Council. There are no conflicts of interest, financial or otherwise.
REFERENCES
Abbas P. J., Brown C. J. (1991). Electrically evoked auditory brainstem response: refractory properties and strength-duration functions. Hear Res, 51, 139–147.
Abbas P. J., Brown C. J., Shallop J. K., Firszt J. B., Hughes M. L., Hong S. H., Staller S. J. (1999). Summary of results using the nucleus CI24M implant to record the electrically evoked compound action potential. Ear Hear, 20, 45–59.
Bates D., Mächler M., Bolker B., Walker S. (2015). Fitting linear mixed-effects models using lme4. J Stat Softw, 67, 1–48.
Diedenhofen B., Musch J. (2015). cocor: a comprehensive solution for the statistical comparison of correlations. PLoS One, 10, e0121945.
Biesheuvel J. D., Briaire J. J., Frijns J. H. M. (2018). The precision of eCAP thresholds derived from amplitude growth functions. Ear Hear, 39, 701–711.
Bolker B. M., Brooks M. E., Clark C. J., Geange S. W., Poulsen J. R., Stevens M. H., White J. S. (2009). Generalized linear mixed models: a practical guide for ecology and evolution. Trends Ecol Evol, 24, 127–135.
Bosman A. J., Smoorenburg G. F. (1995). Intelligibility of Dutch CVC syllables and sentences for listeners with normal hearing and with three types of hearing impairment. Audiology, 34, 260–284.
Brauer M., Curtin J. J. (2018). Linear mixed-effects models and the analysis of nonindependent data: A unified framework to analyze categorical and continuous independent variables that vary within-subjects and/or within-items. Psychol Methods, 23, 389–411.
Brown C. J., Abbas P. J., Gantz B. (1990). Electrically evoked whole-nerve action potentials: data from human cochlear implant users. J Acoust Soc Am, 88, 1385–1391.
Cosetti M. K., Shapiro W. H., Green J. E., Roman B. R., Lalwani A. K., Gunn S. H., Roland J. T. Jr, Waltzman S. B. (2010). Intraoperative neural response telemetry as a predictor of performance. Otol Neurotol, 31, 1095–1099.
de Jong M. A. M., Briaire J. J., Biesheuvel J. D., Snel-Bongers J., Böhringer S., Timp G. R. F. M., Frijns J. H. M. (2020). Effectiveness of phantom stimulation in shifting the pitch percept in cochlear implant users. Ear Hear, 41, 1258–1269.
DeVries L., Scheperle R., Bierer J. A. (2016). Assessing the electrode-neuron interface with the electrically evoked compound action potential, electrode position, and behavioral thresholds. J Assoc Res Otolaryngol, 17, 237–252.
Dong Y., Briaire J. J., Biesheuvel J. D., Stronks H. C., Frijns J. H. M. (2020). Unravelling the temporal properties of human eCAPs through an iterative deconvolution model. Hear Res, 395, 108037.
Dong Y., Stronks H. C., Briaire J. J., Frijns J. H. M. (2021). An iterative deconvolution model to extract the temporal firing properties of the auditory nerve fibers in human eCAPs. MethodsX, 8, 101240.
Dong Y., Briaire J. J., Christiaan Stronks H., Frijns J. H. M. (2022). Short- and long-latency components of the eCAP reveal different refractory properties. Hear Res, 420, 108522.
Fayad J. N., Linthicum F. H. Jr. (2006). Multichannel cochlear implants: relation of histopathology to performance. Laryngoscope, 116, 1310–1320.
Fitzmaurice G. M., Laird N. M., Ware J. H. (2004). Linear mixed effects models. In: Applied Longitudinal Analysis. Hoboken, NJ: John Wiley & Sons, Inc., 187–236.
Franck K. H., Norton S. J. (2001). Estimation of psychophysical levels using the electrically evoked compound action potential measured with the neural response telemetry capabilities of Cochlear Corporation’s CI24M device. Ear Hear, 22, 289–299.
Gantz B. J., Tyler R. S., Knutson J. F., Woodworth G., Abbas P., McCabe B. F., Hinrichs J., Tye-Murray N., Lansing C., Kuk F. (1988). Evaluation of five different cochlear implant designs: audiologic assessment and predictors of performance. Laryngoscope, 98, 1100–1106.
Garadat S. N., Zwolan T. A., Pfingst B. E. (2012). Across-site patterns of modulation detection: relation to speech recognition. J Acoust Soc Am, 131, 4030–4041.
Goldstein M. H., Kiang N. Y. S. (1958). Synchrony of neural activity in electric responses evoked by transient acoustic stimuli. J Acoust Soc Am, 30, 107–114.
Gordon K. A., Papsin B. C. (2013). From nucleus 24 to 513: changing cochlear implant design affects auditory response thresholds. Otol Neurotol, 34, 436–442.
Hall R. D. (1990). Estimation of surviving spiral ganglion cells in the deaf rat using the electrically evoked auditory brainstem response. Hear Res, 49, 155–168.
He S., Teagle H. F. B., Buchman C. A. (2017). The electrically evoked compound action potential: from laboratory to clinic. Front Neurosci, 11, 339.
Hellstrom L. I., Schmiedt R. A. (1990). Compound action potential input/output functions in young and quiet-aged gerbils. Hear Res, 50, 163–174.
Hughes M. L., Vander Werff K. R., Brown C. J., Abbas P. J., Kelsay D. M., Teagle H. F., Lowder M. W. (2001). A longitudinal study of electrode impedance, the electrically evoked compound action potential, and behavioral measures in nucleus 24 cochlear implant users. Ear Hear, 22, 471–486.
Hughes M. L., Brown C. J., Abbas P. J. (2004). Sensitivity and specificity of averaged electrode voltage measures in cochlear implant recipients. Ear Hear, 25, 431–446.
Kawano A., Seldon H. L., Clark G. M., Ramsden R. T., Raine C. H. (1998). Intracochlear factors contributing to psychophysical percepts following cochlear implantation. Acta Otolaryngol, 118, 313–326.
Khan A. M., Handzel O., Burgess B. J., Damian D., Eddington D. K., Nadol J. B. Jr. (2005). Is word recognition correlated with the number of surviving spiral ganglion cells and electrode insertion depth in human subjects with cochlear implants? Laryngoscope, 115, 672–677.
Kim J. R., Abbas P. J., Brown C. J., Etler C. P., O’Brien S., Kim L. S. (2010). The relationship between electrically evoked compound action potential and speech perception: a study in cochlear implant users with short electrode array. Otol Neurotol, 31, 1041–1048.
Kamakura T., Nadol J. B. Jr. (2016). Correlation between word recognition score and intracochlear new bone and fibrous tissue after cochlear implantation in the human. Hear Res, 339, 132–141.
Konerding W., Arenberg J. G., Kral A., Baumhoff P. (2022). Late electrically-evoked compound action potentials as markers for acute micro-lesions of spiral ganglion neurons. Hear Res, 413, 108057.
Lai W. K., Dillier N. (2000). A simple two-component model of the electrically evoked compound action potential in the human cochlea. Audiol Neurootol, 5, 333–345.
McKay C. M., Chandan K., Akhoun I., Siciliano C., Kluk K. (2013). Can ECAP measures be used for totally objective programming of cochlear implants? J Assoc Res Otolaryngol, 14, 879–890.
Miller C. A., Abbas P. J., Rubinstein J. T. (1999). An empirically based model of the electrically evoked compound action potential. Hear Res, 135, 1–18.
Nadol J. B. Jr, Burgess B. J., Gantz B. J., Coker N. J., Ketten D. R., Kos I., Shallop J. K. (2001). Histopathology of cochlear implants in humans. Ann Otol Rhinol Laryngol, 110, 883–891.
Neter J., Kutner M. H., Nachtsheim C. J., Wasserman W. (1996). Applied Linear Statistical Models. Chicago, IL: Richard D. Irwin, Inc.
Netten A. P., Dekker F. W., Rieffe C., Soede W., Briaire J. J., Frijns J. H. (2017). Missing data in the field of otorhinolaryngology and head & neck surgery: need for improvement. Ear Hear, 38, 1–6.
Otte J., Schuknecht H. F., Kerr A. G. (1978). Ganglion cell populations in normal and pathological human cochleae. Implications for cochlear implantation. Laryngoscope, 88(8 Pt 1), 1231–1246.
Oxenham A. J. (2016). Predicting the perceptual consequences of hidden hearing loss. Trends Hear, 20, 2331216516686768.
Pfingst B. E., Sutton D., Miller J. M., Bohne B. A. (1981). Relation of psychophysical data to histopathology in monkeys with cochlear implants. Acta Otolaryngol, 92, 1–13.
Pichora-Fuller M. K., Schneider B. A., Macdonald E., Pass H. E., Brown S. (2007). Temporal jitter disrupts speech intelligibility: a simulation of auditory aging. Hear Res, 223, 114–121.
Pisoni D. B., Kronenberger W. G., Harris M. S., Moberly A. C. (2017). Three challenges for future research on cochlear implants. World J Otorhinolaryngol Head Neck Surg, 3, 240–254.
Ramekers D., Versnel H., Strahl S. B., Klis S. F., Grolman W. (2015). Recovery characteristics of the electrically stimulated auditory nerve in deafened guinea pigs: relation to neuronal status. Hear Res, 321, 12–24.
Rance G. (2005). Auditory neuropathy/dys-synchrony and its perceptual consequences. Trends Amplif, 9, 1–43.
Schvartz-Leyzac K. C., Pfingst B. E. (2018). Assessing the relationship between the electrically evoked compound action potential and speech recognition abilities in bilateral cochlear implant recipients. Ear Hear, 39, 344–358.
Schvartz-Leyzac K. C., Colesa D. J., Buswinka C. J., Rabah A. M., Swiderski D. L., Raphael Y., Pfingst B. E. (2020). How electrically evoked compound action potentials in chronically implanted guinea pigs relate to auditory nerve health and electrode impedance. J Acoust Soc Am, 148, 3900.
Seyyedi M., Viana L. M., Nadol J. B. Jr. (2014). Within-subject comparison of word recognition and spiral ganglion cell count in bilateral cochlear implant recipients. Otol Neurotol, 35, 1446–1450.
Shepherd R. K., Javel E. (1997). Electrical stimulation of the auditory nerve. I. Correlation of physiological responses with cochlear status. Hear Res, 108, 112–144.
Skidmore J., Ramekers D., Colesa D. J., Schvartz-Leyzac K. C., Pfingst B. E., He S. (2022). A broadly applicable method for characterizing the slope of the electrically evoked compound action potential amplitude growth function. Ear Hear, 43, 150–164.
Strahl S. B., Ramekers D., Nagelkerke M. M. B., Schwarz K. E., Spitzer P., Klis S. F. L., Grolman W., Versnel H. (2016). Assessing the firing properties of the electrically stimulated auditory nerve using a convolution model. Adv Exp Med Biol, 894, 143–153.
Stronks H. C., Biesheuvel J. D., de Vos J. J., Boot M. S., Briaire J. J., Frijns J. H. M. (2019). Test/retest variability of the eCAP threshold in advanced bionics cochlear implant users. Ear Hear, 40, 1457–1466.
Stypulkowski P. H., van den Honert C. (1984). Physiological properties of the electrically stimulated auditory nerve. I. Compound action potential recordings. Hear Res, 14, 205–223.
Turner C., Mehr M., Hughes M., Brown C., Abbas P. (2002). Within-subject predictors of speech recognition in cochlear implants: a null result. Acoust Res Lett Online, 3, 95–100.
van de Heyning P., Arauz S. L., Atlas M., Baumgartner W. D., Caversaccio M., Chester-Browne R., Skarzynski H. (2016). Electrically evoked compound action potentials are different depending on the site of cochlear stimulation. Cochlear Implants Int, 17, 251–262.
van den Honert C., Stypulkowski P. H. (1984). Physiological properties of the electrically stimulated auditory nerve. II. Single fiber recordings. Hear Res, 14, 225–243.
van der Beek F. B., Briaire J. J., Frijns J. H. (2012). Effects of parameter manipulations on spread of excitation measured with electrically-evoked compound action potentials. Int J Audiol, 51, 465–474.
van der Jagt M. A., Briaire J. J., Verbist B. M., Frijns J. H. (2016). Comparison of the Hifocus mid-scala and Hifocus 1J electrode array: angular insertion depths and speech perception outcomes. Audiol Neurootol, 21, 316–325.
van Dijk J. E., van Olphen A. F., Langereis M. C., Mens L. H., Brokx J. P., Smoorenburg G. F. (1999). Predictors of cochlear implant performance. Audiology, 38, 109–116.
van Eijl R. H., Buitenhuis P. J., Stegeman I., Klis S. F., Grolman W. (2017). Systematic review of compound action potentials as predictors for cochlear implant performance. Laryngoscope, 127, 476–487.
van Gendt M. J., Briaire J. J., Frijns J. H. M. (2019). Effect of neural adaptation and degeneration on pulse-train ECAPs: a model study. Hear Res, 377, 167–178.
Versnel H., Prijs V. F., Schoonhoven R. (1992). Round-window recorded potential of single-fibre discharge (unit response) in normal and noise-damaged cochleas. Hear Res, 59, 157–170.
Westen A. A., Dekker D. M., Briaire J. J., Frijns J. H. (2011). Stimulus level effects on neural excitation and eCAP amplitude. Hear Res, 280, 166–176.