
Original Articles

Temporal Speech Parameters Indicate Early Cognitive Decline in Elderly Patients With Type 2 Diabetes Mellitus

Imre, Nóra MA*; Balogh, Réka MA*; Gosztolya, Gábor PhD; Tóth, László PhD; Hoffmann, Ildikó PhD‡,§; Várkonyi, Tamás MD, PhD; Lengyel, Csaba MD, PhD; Pákáski, Magdolna MD, PhD*; Kálmán, János MD, PhD, DSc*

Alzheimer Disease & Associated Disorders: April–June 2022 - Volume 36 - Issue 2 - p 148-155
doi: 10.1097/WAD.0000000000000492


Increasing evidence confirms the heightened risk of cognitive disorders in elderly patients living with type 2 diabetes mellitus (T2DM), compared with nondiabetic individuals.1,2 T2DM not only doubles the odds of Alzheimer’s disease (AD) and vascular dementia (VD),3 but also increases the incidence of mild cognitive impairment (MCI), the clinical condition between healthy aging and dementia.4 MCI patients experience subtle cognitive symptoms (eg, deficits in language and executive functions, attention, or memory), which can cause problems with more complex activities of daily living but do not interfere with basic everyday functioning.5 This association with cognitive decline poses a significant risk worldwide, as the global prevalence of T2DM is more than 9.3% of all adults today.6 Although the exact pathophysiological pathways are under investigation, diabetes has been reported to accelerate the aging process of the brain through alterations in the metabolism of glucose, insulin, and amyloid, which can act as serious biological risk factors for dementia.7 Cognition in T2DM was found to be impaired in several domains, such as learning, verbal memory, attention, executive functions, processing and psychomotor speed, and language.8

Decline in language functions has been found to be one of the earliest signs of cognitive deterioration.9 In particular, the temporal (time-based) organization of speech reflects the functioning of several underlying cognitive processes, including the planning of speech production, the access to vocabulary, working memory, and, depending on the specific task, even episodic memory.10 Studies using temporal analysis of speech found increased signs of disfluency (eg, word-finding delays) or decreased speech rate in cognitively impaired individuals (eg, patients with AD or MCI).11–13 An increased number or duration of pauses in speech is hypothesized to reflect the increased cognitive load required for maintaining one’s train of thought14 and the general slowing down of word retrieval.9

Since temporal analyses of speech provide highly valuable information regarding cognitive processes, and there is a strong association between cognitive deficits and T2DM, it is of great significance to explore temporal speech characteristics among a high-risk group, the elderly with T2DM. In the present study, an automated speech analysis method, the Speech-Gap Test (S-GAP Test), was applied to speech recordings of T2DM participants. This method, built on automatic speech recognition (ASR) techniques, has proved sensitive enough to distinguish between MCI patients and elderly individuals with healthy cognition (HC), both for Hungarian15–19 and for English native speakers.20

The objectives of the present study were (1) to explore whether elderly HC individuals with and without T2DM differ in temporal speech characteristics, which may reflect subtle differences in cognition as well; and (2) to examine how the same temporal speech characteristics compare between MCI patients with and without T2DM.



Based on the initial inclusion criteria, a total of 160 individuals were enrolled. After the exclusion process (Fig. 1), 100 of them were eligible for participation. Data collection took place at 2 departments of the Albert Szent-Györgyi Health Center, University of Szeged, Hungary: (1) T2DM patients were recruited at the Division of Diabetology of the Department of Internal Medicine, while (2) nondiabetic subjects were studied at the Memory Clinic of the Department of Psychiatry. The investigation took place within a 25-month time frame between 2018 and 2020.

FIGURE 1 - Demonstration of the inclusion/exclusion process, and the final sample sizes of the 4 study groups: HC with and without T2DM; MCI with and without T2DM. HC indicates healthy cognition; MCI, mild cognitive impairment; T2DM, type 2 diabetes mellitus.

Participation was voluntary after giving written informed consent. Participants did not receive financial compensation. The study was approved by the Regional Human Biomedical Research Ethics Committee of the University of Szeged, Hungary (231/2017-SZTE). The study was conducted in compliance with the principles of the Declaration of Helsinki.

All participants were evaluated by means of a neuropsychological battery (described in detail under Study Protocol). The battery included the Mini-Mental State Examination (MMSE),21 which served as the measure of objective cognitive status. Based on the MMSE, participants were classified as either HC (28 to 30 points) or as having MCI (25 to 27 points). Finally, 4 groups emerged: HC with T2DM (n=39), HC without T2DM (n=34), MCI with T2DM (n=12), and MCI without T2DM (n=15).

Inclusion and Exclusion Process

Diabetes-related Criteria

In the T2DM sample, medical diagnosis of T2DM was the initial inclusion criterion. Diagnosis was based on current international guidelines of the American Diabetes Association.22 Patients with type 1 diabetes mellitus, prediabetes, or chronic hyperglycemia of any other etiology were not enrolled. Average duration of diabetes was 11.4 years (SD=8.08); treatment was either oral medication (50.9%; n=26), insulin (25.5%; n=13), combined oral medication and insulin (17.6%; n=9), or only diet (5.9%; n=3).

Other Criteria

For all participants, initial inclusion criteria were a minimum age of 50 years, a minimum of 8 years of formal education, and Hungarian as native language. Exclusion criteria included the following: major hearing problems/deafness, acute depression, dementia, history of substance use disorder, head injuries, major neuropsychiatric disorders, previous computed tomography/magnetic resonance imaging showing evidence of significant abnormality suggesting another potential etiology for MCI (eg, prior macrohemorrhage/microhemorrhages, lacunar infarcts or single large infarct), evidence of cerebral contusion, encephalomalacia, aneurysm, vascular malformations or clinically significant space-occupying lesions. Finally, individuals whose speech could not be properly recorded due to technical errors were also excluded from further analysis (Fig. 1).

To check all inclusion and exclusion criteria, patient history was gathered from an initial interview and from available medical records. Furthermore, dementia and depression were screened on-site at the beginning of the protocol. The MMSE was used for dementia screening, and patients with a score under 25 were excluded. The presence/absence of acute depressive symptoms was evaluated by applying the 15-item Geriatric Depression Scale (GDS-15),23 with a cut-off score of 6 above which individuals were not considered eligible.
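The screening and grouping rules described above can be summarized in a short sketch. The function name `classify_participant` is hypothetical, and the reading of the GDS-15 cut-off (scores strictly greater than 6 lead to exclusion) is an assumption based on the wording "above which"; the other eligibility criteria (age, education, medical history) are not modeled here.

```python
from typing import Optional


def classify_participant(mmse: int, gds15: int) -> Optional[str]:
    """Return 'HC', 'MCI', or None when the participant is screened out.

    Hypothetical helper: only the MMSE and GDS-15 rules from the text are modeled.
    """
    if not 0 <= mmse <= 30 or not 0 <= gds15 <= 15:
        raise ValueError("score outside the valid range of the instrument")
    if mmse < 25:
        # Dementia screen: an MMSE score under 25 leads to exclusion.
        return None
    if gds15 > 6:
        # Assumption: "above" the cut-off of 6 means strictly greater than 6.
        return None
    # HC: 28 to 30 points; MCI: 25 to 27 points.
    return "HC" if mmse >= 28 else "MCI"
```

For instance, under these assumptions a participant with MMSE 26 and GDS-15 score 2 would be assigned to the MCI sample, while one with MMSE 24 would not be enrolled.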

Study Protocol

Neuropsychological Tests

Following a brief demographic and eligibility interview, a neuropsychological test sequence comprising 8 instruments was administered. These included 3 test batteries measuring current cognitive state: MMSE, Clock Drawing Test (CDT),24 and Alzheimer’s Disease Assessment Scale-Cognitive Subscale (ADAS-Cog)25; 4 tests measuring working memory and executive functions: digit span test forward and backward,26 nonword repetition test,27 and listening span test28; and one scale for measuring current depressive symptoms: GDS-15. The test order was fixed for all participants and had been assembled to ensure that tasks requiring the same cognitive function were separated (eg, working memory tasks did not directly follow each other).

Speech Task

A speech task was also administered to collect spontaneous (unplanned) speech samples for the temporal speech analysis. This task was chosen because it requires both working and episodic memory, allows remote and repeated testing, and was found to be sensitive in discriminating between MCI and controls.19 To prevent fatigue, this speech task was administered at approximately the 15-minute mark of the 1-hour protocol. Speech was elicited in the following manner: the lead investigator (Investigator 1) told the participant that another researcher (Investigator 2), who was in a different room, would call them on a mobile phone and provide instructions for a new task. Following this cue, Investigator 2 called the participant and, after a brief introduction, asked them to talk about their previous day. The standardized instruction was: “Please tell me about your previous day in as much detail as you can.” Following the instruction, both Investigator 1 (in the room) and Investigator 2 (on the phone) remained silent until the participant finished the task. The elicited monologue was recorded by a call recorder application installed on the mobile phone.

Speech Sample Preparation and Analysis

The obtained speech recordings were independently screened before analysis by 2 investigators: a linguist specialized in language pathologies (I.H.) screened the overall quality of the recording, while a researcher of computational speech analysis (G.G.) provided technical control. Those recordings that were not of suitable quality (n=4 in the T2DM, and n=2 in the nondiabetic groups) were excluded (Fig. 1). The remaining 100 recordings were converted into an uncompressed PCM mono, 16-bit wave format with a sampling rate of 8000 Hz, and were edited in the beginning and at the end so that only the participants’ speech remained; the opening/closing formulas and the instructions were removed.
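As an illustration of the target audio format, the following minimal sketch uses Python's standard `wave` module to check that a file is uncompressed PCM, mono, 16-bit, with an 8000 Hz sampling rate. The helper names `is_target_format` and `make_silent_wav` are hypothetical, and the sketch does not reproduce the conversion or trimming tools actually used in the study, which are not specified in the text.

```python
import io
import wave


def is_target_format(wav_file) -> bool:
    """Check for uncompressed PCM, mono, 16-bit (2 bytes per sample), 8000 Hz."""
    with wave.open(wav_file, "rb") as w:
        return (
            w.getnchannels() == 1
            and w.getsampwidth() == 2
            and w.getframerate() == 8000
            and w.getcomptype() == "NONE"
        )


def make_silent_wav(seconds: float, rate: int = 8000) -> io.BytesIO:
    """Create an in-memory mono 16-bit WAV file containing silence (demo input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    buf.seek(0)
    return buf
```

A file path can be passed to `is_target_format` in place of the in-memory buffer.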

After these preparations, ASR techniques were employed to identify pauses, both silent and filled, in each recording. Pauses were defined as the interruption of speech by either complete silence (silent pause) or by filler words like “um” or “er” (filled pause) lasting longer than 30 ms. The acoustic model was trained on a subset of the BEA audio corpus29 that consisted of spontaneous speech, as this type of speech is expected to contain filled pauses (for the training of the ASR system, see Gosztolya et al20). For training, the speech of 116 speakers was utilized, which amounted to ~44 hours of recordings. This ASR model performed phone-level recognition, labeling the input signal (with filled pauses treated as a special “phoneme”) and producing a phonetic segmentation as output. Based on the raw parameters from the ASR output, 15 temporal speech parameters were extracted using simple calculations established in previous works of our research group.16,20 The calculations and definitions of the parameters are available as supplements (Supplemental Digital Content 1).
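To make the idea of deriving temporal parameters from a phonetic segmentation more concrete, here is a minimal sketch for the pause-related measures. The exact definitions are given in the supplement and are not reproduced in this text, so the formulas below are assumptions: duration rate as pause time divided by utterance length, frequency as pause count divided by utterance length, and average duration as pause time divided by pause count. The `Segment` type and the labels `"sil"` and `"filled"` are hypothetical stand-ins for the ASR output.

```python
from dataclasses import dataclass
from typing import Dict, List

MIN_PAUSE_S = 0.030  # pauses must last longer than 30 ms


@dataclass
class Segment:
    label: str    # hypothetical labels: "sil", "filled", or any phone symbol
    start: float  # seconds
    end: float    # seconds


def pause_parameters(segments: List[Segment], kind: str) -> Dict[str, float]:
    """Pause measures for kind in {'sil', 'filled', 'total'} (assumed definitions)."""
    utterance_length = segments[-1].end - segments[0].start
    wanted = {"sil", "filled"} if kind == "total" else {kind}
    durations = [
        s.end - s.start
        for s in segments
        if s.label in wanted and (s.end - s.start) > MIN_PAUSE_S
    ]
    total = sum(durations)
    count = len(durations)
    return {
        "duration_rate_pct": 100.0 * total / utterance_length,
        "frequency_per_s": count / utterance_length,
        "average_duration_s": total / count if count else 0.0,
    }


# Toy segmentation (invented values, not study data).
example = [
    Segment("a", 0.0, 0.1),
    Segment("sil", 0.1, 0.6),
    Segment("b", 0.6, 0.8),
    Segment("filled", 0.8, 1.0),
    Segment("c", 1.0, 1.5),
    Segment("sil", 1.5, 1.52),  # 20 ms: too short to count as a pause
    Segment("d", 1.52, 2.0),
]
silent = pause_parameters(example, "sil")
total = pause_parameters(example, "total")
```

Under these assumed definitions, the duration rate equals the frequency multiplied by the average duration (as a percentage), which is one way to sanity-check extracted values.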

Statistical Analysis

Descriptive statistical data are expressed as means, medians, and SD for each group. The Shapiro-Wilk test demonstrated non-normality of data in most scale variables; therefore, the Mann-Whitney U test was employed to assess between-group differences in demographic data, neuropsychological test scores, and temporal speech parameters. For categorical variables, the Fisher exact test was applied. To further examine the ability of each speech parameter to identify T2DM patients, receiver operating characteristic (ROC) analysis was applied. Sensitivity and specificity (true positive and true negative rate) were calculated using threshold values that yielded the highest possible sensitivity (while keeping specificity above 50%). The level of significance was set at P<0.05 for all statistical tests. Analyses were performed using IBM SPSS 24.0 (SPSS Inc., Chicago, IL).
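The threshold rule described above (highest sensitivity while keeping specificity above 50%) can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not the SPSS procedure used in the study: the function names are hypothetical, higher values are assumed to indicate the positive (T2DM) group, and the toy values are invented. For a parameter on which the positive group scores lower (such as utterance length), the comparison direction would have to be reversed.

```python
def u_statistic(pos, neg):
    """Count (pos, neg) pairs with pos > neg; ties contribute 0.5."""
    return sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )


def auc(pos, neg):
    """Area under the ROC curve via its equivalence with the scaled U statistic."""
    return u_statistic(pos, neg) / (len(pos) * len(neg))


def best_threshold(pos, neg, min_specificity=0.5):
    """Pick the threshold with the highest sensitivity among those whose
    specificity exceeds min_specificity. Values >= threshold count as positive.
    Returns (threshold, sensitivity, specificity) or None."""
    best = None
    for t in sorted(set(pos) | set(neg)):
        sensitivity = sum(v >= t for v in pos) / len(pos)
        specificity = sum(v < t for v in neg) / len(neg)
        if specificity > min_specificity and (best is None or sensitivity > best[1]):
            best = (t, sensitivity, specificity)
    return best


# Toy pause durations in seconds (invented, not study data).
with_t2dm = [0.62, 0.55, 0.70, 0.48, 0.66]
without_t2dm = [0.45, 0.50, 0.41, 0.58, 0.39]

print(auc(with_t2dm, without_t2dm))             # 0.88
print(best_threshold(with_t2dm, without_t2dm))  # (0.48, 1.0, 0.6)
```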


Demographic and Neuropsychological Characteristics

Demographic characteristics and neuropsychological test scores of the HC and MCI groups are presented in Table 1. Within the HC sample, participants with and without T2DM did not differ statistically significantly in any of the demographic factors or neuropsychological tests. However, within the MCI sample, digit span (backward) performance was significantly lower among the T2DM patients than among the nondiabetic participants.

TABLE 1 - Descriptive and Comparative Statistics of the Demographic Characteristics and Neuropsychological Test Scores in the HC With and Without T2DM, and the MCI With and Without T2DM Groups, Using the Mann-Whitney U Test or Fisher Exact Test (in Italics)
HC With T2DM (n=39) vs. HC Without T2DM (n=34); Mann-Whitney U Test/Fisher Exact Test

| Variable | With T2DM: M | Mdn | SD | Without T2DM: M | Mdn | SD | U | Z | P |
|---|---|---|---|---|---|---|---|---|---|
| Sex (male/female) | 13/26 | | | 9/25 | | | | | *0.613* |
| Age (y) | 65.31 | 66.00 | 8.059 | 67.74 | 68.00 | 6.934 | 548.000 | −1.273 | 0.203 |
| Education (y) | 13.03 | 12.00 | 2.748 | 13.29 | 12.00 | 2.505 | 609.500 | −0.608 | 0.543 |
| MMSE | 28.72 | 29.00 | 0.647 | 29.00 | 29.00 | 0.778 | 531.000 | −1.582 | 0.114 |
| CDT | 7.62 | 9.00 | 3.159 | 7.50 | 9.00 | 3.077 | 612.000 | −0.584 | 0.559 |
| ADAS-Cog | 7.08 | 6.15 | 2.989 | 6.61 | 6.95 | 2.608 | 607.500 | −0.435 | 0.664 |
| Digit span: forward | 5.56 | 5.00 | 0.995 | 5.85 | 5.50 | 1.158 | 579.500 | −0.975 | 0.330 |
| Digit span: backward | 4.13 | 4.00 | 0.894 | 4.18 | 4.00 | 0.999 | 642.000 | −0.243 | 0.808 |
| Nonword repetition | 5.18 | 5.00 | 1.715 | 4.74 | 5.00 | 1.620 | 552.000 | −1.275 | 0.202 |
| Listening span | 2.53 | 2.60 | 0.583 | 2.75 | 2.85 | 0.602 | 504.500 | −1.782 | 0.075 |
| GDS-15 | 2.00 | 1.00 | 1.717 | 2.00 | 2.00 | 1.595 | 645.000 | −0.205 | 0.838 |

MCI With T2DM (n=12) vs. MCI Without T2DM (n=15); Mann-Whitney U Test/Fisher Exact Test

| Variable | With T2DM: M | Mdn | SD | Without T2DM: M | Mdn | SD | U | Z | P |
|---|---|---|---|---|---|---|---|---|---|
| Sex (male/female) | 2/10 | | | 5/10 | | | | | *0.408* |
| Age (y) | 70.42 | 73.50 | 9.120 | 72.60 | 74.00 | 6.311 | 83.500 | −0.318 | 0.755 |
| Education (y) | 11.17 | 11.50 | 2.855 | 11.73 | 12.00 | 2.865 | 76.000 | −0.712 | 0.516 |
| MMSE | 26.17 | 26.00 | 0.835 | 26.27 | 26.00 | 0.799 | 84.000 | −0.315 | 0.792 |
| CDT | 5.50 | 4.50 | 3.529 | 7.33 | 8.00 | 2.870 | 64.000 | −1.281 | 0.217 |
| ADAS-Cog | 9.38 | 9.00 | 2.070 | 10.61 | 10.60 | 3.104 | 64.000 | −1.271 | 0.217 |
| Digit span: forward | 5.00 | 5.00 | 1.128 | 5.33 | 5.00 | 0.617 | 60.500 | −1.668 | 0.152 |
| Digit span: backward | 3.25 | 3.00 | 0.754 | 3.93 | 4.00 | 0.799 | 49.000 | −2.161 | **0.047** |
| Nonword repetition | 3.58 | 5.00 | 2.575 | 3.67 | 4.00 | 1.718 | 81.000 | −0.450 | 0.683 |
| Listening span | 2.32 | 2.15 | 0.476 | 2.23 | 2.30 | 0.434 | 87.000 | −0.151 | 0.905 |
| GDS-15 | 1.92 | 2.00 | 1.505 | 2.53 | 2.00 | 1.187 | 62.000 | −1.436 | 0.183 |
The P-values indicating statistically significant differences (at the P<0.05 level) are in bold.
ADAS-Cog indicates Alzheimer’s Disease Assessment Scale-Cognitive Subscale; CDT, Clock Drawing Test; GDS-15, 15-item Geriatric Depression Scale; HC, healthy cognition; M, mean; MCI, mild cognitive impairment; Mdn, median; MMSE, Mini-Mental State Examination; T2DM, type 2 diabetes mellitus.

Temporal Speech Parameters in the HC and MCI Groups According to Diabetic Status

Comparisons between the T2DM and the nondiabetic groups were performed both within the HC and within the MCI samples. In the HC sample (Table 2), 5 of the 15 parameters differed significantly: the HC with T2DM group had shorter utterance length, higher duration rate of silent pause and total pause, and higher average duration of silent pause and total pause, compared with the HC without T2DM group.

TABLE 2 - Descriptive and Comparative Statistics of the HC With and Without T2DM Groups Using the Mann-Whitney U Test
HC With T2DM (n=39) vs. HC Without T2DM (n=34); Mann-Whitney U Test

| Temporal Speech Parameters | With T2DM: M | Mdn | SD | Without T2DM: M | Mdn | SD | U | Z | P |
|---|---|---|---|---|---|---|---|---|---|
| Utterance length (s) | 114.00 | 93.36 | 68.274 | 205.68 | 151.88 | 235.281 | 407.000 | −2.831 | **0.005** |
| Articulation tempo (1/s) | 9.27 | 9.49 | 1.907 | 9.65 | 9.68 | 2.001 | 602.000 | −0.675 | 0.500 |
| Speech tempo (1/s) | 10.05 | 10.30 | 1.872 | 10.46 | 10.48 | 1.850 | 597.000 | −0.730 | 0.465 |
| Occurrence rates of pauses | | | | | | | | | |
| Silent pause (%) | 5.55 | 5.35 | 1.562 | 5.29 | 4.83 | 2.458 | 536.000 | −1.404 | 0.160 |
| Filled pause (%) | 2.57 | 2.15 | 1.613 | 3.09 | 2.56 | 2.123 | 573.000 | −0.995 | 0.320 |
| Total pause (%) | 8.11 | 7.32 | 2.642 | 8.38 | 7.41 | 4.268 | 639.000 | −0.265 | 0.791 |
| Duration rates of pauses | | | | | | | | | |
| Silent pause (%) | 32.16 | 29.40 | 10.991 | 25.79 | 24.13 | 10.850 | 429.000 | −2.588 | **0.010** |
| Filled pause (%) | 5.81 | 5.04 | 4.054 | 6.92 | 6.03 | 3.940 | 556.000 | −1.183 | 0.237 |
| Total pause (%) | 37.97 | 37.90 | 11.495 | 32.71 | 30.79 | 12.700 | 474.000 | −2.090 | **0.037** |
| Frequency of pauses | | | | | | | | | |
| Silent pause (1/s) | 0.53 | 0.53 | 0.101 | 0.52 | 0.48 | 0.142 | 580.000 | −0.918 | 0.359 |
| Filled pause (1/s) | 0.24 | 0.23 | 0.140 | 0.30 | 0.27 | 0.150 | 516.000 | −1.626 | 0.104 |
| Total pause (1/s) | 0.78 | 0.74 | 0.174 | 0.82 | 0.78 | 0.241 | 620.000 | −0.476 | 0.634 |
| Average durations of pauses | | | | | | | | | |
| Silent pause (s) | 0.62 | 0.55 | 0.248 | 0.50 | 0.46 | 0.169 | 453.000 | −2.322 | **0.020** |
| Filled pause (s) | 0.22 | 0.20 | 0.072 | 0.22 | 0.22 | 0.056 | 590.500 | −0.802 | 0.423 |
| Total pause (s) | 0.50 | 0.45 | 0.164 | 0.41 | 0.37 | 0.128 | 419.000 | −2.698 | **0.007** |
The P-values indicating statistically significant differences (at the P<0.05 level) are in bold.
HC indicates healthy cognition; M, mean; Mdn, median; T2DM, type 2 diabetes mellitus.

A subsequent ROC analysis was performed to explore whether HC participants with T2DM could be discriminated from HC participants without T2DM on the basis of their temporal speech parameters. The same 5 parameters demonstrated significant classification potential, with utterance length having the highest area under the curve (AUC; 0.693) and the average duration of total pause yielding the highest sensitivity (79.5%). Sensitivity and specificity measures of temporal parameters were derived from ROC analysis; parameters with an AUC above 0.600 are shown in Table 4.

In the MCI sample (Table 3), however, no statistically significant differences could be detected between the subgroups with and without T2DM. This was corroborated by the subsequent ROC analysis, which revealed that none of the 15 temporal parameters had a statistically significant ability to discriminate MCI with T2DM from MCI without T2DM participants. Nevertheless, parameters concerning filled pauses produced the highest AUCs. As for the HC sample, parameters with an AUC above 0.600 are shown in Table 4.

TABLE 3 - Descriptive and Comparative Statistics of the MCI With and Without T2DM Groups Using the Mann-Whitney U Test
MCI With T2DM (n=12) vs. MCI Without T2DM (n=15); Mann-Whitney U Test

| Temporal Speech Parameters | With T2DM: M | Mdn | SD | Without T2DM: M | Mdn | SD | U | Z | P |
|---|---|---|---|---|---|---|---|---|---|
| Utterance length (s) | 119.50 | 80.10 | 93.150 | 131.70 | 79.40 | 139.058 | 83.000 | −0.342 | 0.755 |
| Articulation tempo (1/s) | 9.26 | 9.64 | 2.644 | 8.76 | 8.20 | 1.703 | 76.000 | −0.683 | 0.516 |
| Speech tempo (1/s) | 9.99 | 10.37 | 2.555 | 9.57 | 9.09 | 1.582 | 77.000 | −0.634 | 0.548 |
| Occurrence rates of pauses | | | | | | | | | |
| Silent pause (%) | 5.77 | 5.74 | 2.504 | 5.73 | 5.47 | 1.841 | 88.000 | −0.098 | 0.943 |
| Filled pause (%) | 2.19 | 2.74 | 1.344 | 3.13 | 2.72 | 2.009 | 67.000 | −1.122 | 0.277 |
| Total pause (%) | 7.97 | 7.98 | 3.445 | 8.85 | 8.63 | 3.272 | 77.000 | −0.634 | 0.548 |
| Duration rates of pauses | | | | | | | | | |
| Silent pause (%) | 33.94 | 32.68 | 16.602 | 31.93 | 28.69 | 7.933 | 89.000 | −0.049 | 0.981 |
| Filled pause (%) | 4.41 | 5.03 | 2.883 | 6.84 | 7.65 | 4.474 | 62.000 | −1.366 | 0.183 |
| Total pause (%) | 38.35 | 36.75 | 17.231 | 38.77 | 36.57 | 10.476 | 86.000 | −0.195 | 0.867 |
| Frequency of pauses | | | | | | | | | |
| Silent pause (1/s) | 0.52 | 0.53 | 0.128 | 0.53 | 0.54 | 0.135 | 88.000 | −0.098 | 0.943 |
| Filled pause (1/s) | 0.20 | 0.19 | 0.129 | 0.28 | 0.27 | 0.152 | 62.000 | −1.366 | 0.183 |
| Total pause (1/s) | 0.73 | 0.77 | 0.204 | 0.81 | 0.78 | 0.214 | 73.000 | −0.830 | 0.427 |
| Average durations of pauses | | | | | | | | | |
| Silent pause (s) | 0.64 | 0.56 | 0.255 | 0.62 | 0.62 | 0.164 | 82.000 | −0.390 | 0.719 |
| Filled pause (s) | 0.21 | 0.21 | 0.041 | 0.24 | 0.23 | 0.097 | 76.000 | −0.683 | 0.516 |
| Total pause (s) | 0.53 | 0.47 | 0.210 | 0.49 | 0.49 | 0.099 | 87.000 | −0.146 | 0.905 |
The P-values indicating statistically significant differences (at the P<0.05 level) are in bold.
M indicates mean; MCI, mild cognitive impairment; Mdn, median; T2DM, type 2 diabetes mellitus.

TABLE 4 - Accuracy Measures of Temporal Parameters With AUC Above 0.600 in the HC and the MCI Samples, Respectively (Containing Both the “With T2DM” and “Without T2DM” Subgroups), Using Receiver Operating Characteristic (ROC) Analysis
HC Groups (With vs. Without T2DM): Accuracy Measures

| Temporal Speech Parameters | P | AUC | 95% CI− | 95% CI+ | Threshold Value | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|---|---|
| Utterance length (s) | **0.005** | 0.693 | 0.572 | 0.815 | 131.845 | 74.4 | 61.8 |
| Total pause average duration (s) | **0.007** | 0.684 | 0.560 | 0.808 | 0.374 | 79.5 | 55.9 |
| Silent pause duration rate (%) | **0.010** | 0.676 | 0.553 | 0.800 | 24.192 | 74.4 | 52.9 |
| Silent pause average duration (s) | **0.020** | 0.658 | 0.532 | 0.785 | 0.471 | 74.4 | 55.9 |
| Total pause duration rate (%) | **0.037** | 0.643 | 0.514 | 0.771 | 31.705 | 66.7 | 55.9 |
| Filled pause frequency (1/s) | 0.104 | 0.611 | 0.481 | 0.740 | 0.246 | 61.5 | 58.8 |

MCI Groups (With vs. Without T2DM): Accuracy Measures

| Temporal Speech Parameters | P | AUC | 95% CI− | 95% CI+ | Threshold Value | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|---|---|
| Filled pause duration rate (%) | 0.172 | 0.656 | 0.446 | 0.865 | 6.754 | 83.3 | 53.3 |
| Filled pause frequency (1/s) | 0.172 | 0.656 | 0.443 | 0.868 | 0.229 | 66.7 | 60.0 |
| Filled pause occurrence rate (%) | 0.262 | 0.628 | 0.408 | 0.848 | 2.715 | 50.0 | 53.3 |
The P-values indicating statistically significant classification abilities (at the P<0.05 level) are in bold.
AUC indicates area under the curve; CI, confidence interval; HC, healthy cognition; MCI, mild cognitive impairment; ROC, receiver operating characteristic; T2DM, type 2 diabetes mellitus.
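The AUC values in Table 4 can be cross-checked against the Mann-Whitney U statistics in Tables 2 and 3, because the AUC equals U divided by the product of the two group sizes. The sketch below assumes that the reported U is the smaller of the two possible U values and that each AUC is oriented to lie at or above 0.5; the function name `auc_from_u` is hypothetical.

```python
def auc_from_u(u: float, n1: int, n2: int) -> float:
    """AUC implied by a Mann-Whitney U statistic for groups of size n1 and n2.

    Assumption: the AUC is reported in the direction that is >= 0.5, so the
    larger of U and (n1 * n2 - U) is used.
    """
    pairs = n1 * n2
    return max(u, pairs - u) / pairs


# U values taken from Table 2 (HC: n=39 and n=34) and Table 3 (MCI: n=12 and n=15).
print(round(auc_from_u(407.0, 39, 34), 3))  # utterance length, HC: 0.693
print(round(auc_from_u(419.0, 39, 34), 3))  # total pause average duration, HC: 0.684
print(round(auc_from_u(62.0, 12, 15), 3))   # filled pause duration rate, MCI: 0.656
```

The recomputed values agree with the AUCs listed in Table 4 to three decimal places.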

Correlations of Temporal Speech Parameters With Age and Education

Regarding the relationship between age and the 15 temporal speech parameters across the 4 groups, correlation was statistically significant for articulation tempo (HC with T2DM: τb=−0.221, P=0.050), for speech tempo (HC with T2DM: τb=−0.229, P=0.042), and for silent pause frequency (MCI without T2DM: τb=0.390, P=0.046). With regard to education, weak to moderate but statistically significant correlations were found with utterance length (HC without T2DM: τb=0.269, P=0.035; MCI with T2DM: τb=0.478, P=0.044), articulation tempo (MCI with T2DM: τb=0.478, P=0.044), speech tempo (MCI with T2DM: τb=0.546, P=0.021), filled pause occurrence rate (HC with T2DM: τb=0.274, P=0.022), filled pause duration rate (HC with T2DM: τb=0.268, P=0.025; MCI without T2DM: τb=0.596, P=0.004), silent pause average duration (MCI with T2DM: τb=−0.580, P=0.014), filled pause average duration (MCI without T2DM: τb=−0.618, P=0.003), and total pause average duration (MCI with T2DM: τb=−0.615, P=0.010). The comprehensive table containing all correlations is available as a supplement (Supplemental Digital Content 1).
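The τb notation suggests that the coefficients are Kendall's tau-b, a rank correlation that corrects for ties (common in variables such as years of education). As a minimal sketch of how such a coefficient is computed, assuming this is indeed the statistic used, the function below counts concordant and discordant pairs directly. The name `kendall_tau_b` and the toy values are illustrative only, not study data, and no P-value is computed.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty)),
    where Tx and Ty count pairs tied only in x or only in y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from all counts
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = sqrt(
        (concordant + discordant + ties_x) * (concordant + discordant + ties_y)
    )
    return (concordant - discordant) / denom if denom else 0.0


# Toy data: years of education and speech tempo (invented values).
education = [8, 12, 12, 16, 17]
speech_tempo = [8.5, 9.0, 10.1, 10.4, 9.8]
tau = kendall_tau_b(education, speech_tempo)  # 7 concordant, 2 discordant, 1 tie in x
```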


To the best of our knowledge, this was the first study to investigate the speech of T2DM patients with the purpose of detecting signs of subtle cognitive deficits that can manifest as changes in the temporal characteristics of speech. The major finding was that the speech of elderly HC individuals with T2DM performed significantly worse on several temporal characteristics than that of age-matched and education-matched HC individuals without T2DM.

First, we studied the temporal speech characteristics of elderly T2DM patients who had been classified as HC based on traditional neuropsychological screening. Our results showed that their speech contains more signs of subtle, underlying cognitive deficits than that of the HC subjects without T2DM. Namely, 5 of 15 temporal speech parameters showed statistically significant differences between the diabetic and nondiabetic groups: HC patients with T2DM had shorter utterance length, higher duration rate of silent pause and total pause, and higher average duration of silent pause and total pause compared with HC participants without T2DM. [Although it was not the focus of the present study, it is interesting to note that the temporal speech parameters that differentiated between the HC with/without T2DM groups also showed different mean/median values within the nondiabetic sample, between HC and MCI (Table 2 vs. Table 3). This further highlights that, of the full set of 15 parameters, these would have the most discriminative potential in future clinical applications.]

These differences are in agreement with the results of previous studies using the S-GAP Test and other speech analysis methods: in earlier works, more or longer pauses (signs of disfluency, word-finding difficulties, and decreased lexical access) had been reported in the speech of patients with varying levels of cognitive impairment, for example, due to AD,11,30,31 MCI,12,13 or Parkinson disease.32,33 These results, now complemented by the findings of the present study, confirm that pauses in speech provide a highly valuable source of information regarding language functions and thus cognitive state, especially in the early stages of neurocognitive disorders, when other cognitive domains measured by traditional test batteries have not yet deteriorated enough to be detected. In the case of T2DM patients, these subtle cognitive changes may be explained by diabetes-associated changes in the brain, such as impaired insulin signaling, neuronal insulin resistance, inflammation, mitochondrial dysfunction, vascular damage, or disturbances in synaptic plasticity, all of which can lead to an onset of cognitive decline.7,34,35

Furthermore, we also compared the temporal speech characteristics of MCI patients with and without T2DM. No significant differences could be detected in any of the 15 analyzed temporal speech parameters, suggesting that these two groups performed similarly. A possible explanation for this could be that the pathophysiological processes in the brain are facilitated by T2DM and, as a consequence, cognitive abilities gradually deteriorate. According to current medical protocol, MCI diagnosis is only given when, besides fulfilling other criteria, cognitive symptoms reach a measurable level and can be confirmed by an objective evaluation tool.36,37 However, it has been reported that the underlying cognitive deterioration is usually present for a longer period, more or less without clinical symptoms.38 It could be argued that in the case of T2DM patients, the onset of the latent phase of transitioning from HC to MCI might take place earlier, and speech disfluencies might precede the more robust symptoms by a longer period of time than in the case of nondiabetic subjects. Our results also indicate that the temporal speech characteristics of T2DM and nondiabetic subjects tend to be similar when the cognitive deterioration reaches the level of MCI, which would suggest that once the transition to MCI has manifested, the presence of T2DM may not necessarily exacerbate the already deteriorated temporal speech symptoms. It would be of high clinical interest to further explore the effects of T2DM on cognition from a longitudinal viewpoint and to study whether temporal speech features differ in the next stage of cognitive decay, dementia with T2DM.

Regarding the relationship between demographics and temporal speech characteristics, age showed a statistically significant, weak correlation with 3 parameters: a negative correlation with articulation tempo and speech tempo, and a positive correlation with silent pause frequency. Education weakly to moderately correlated with 8 parameters: positively with utterance length, articulation tempo, speech tempo, filled pause occurrence rate, and filled pause duration rate; and negatively with silent pause average duration, filled pause average duration, and total pause average duration. Careful examination of the positive and negative directions of the statistically significant correlations reveals that the increased presence of silent pauses (higher frequency or average length) was aligned with the demographic risk factors of cognitive decline (lower education, higher age39,40). In contrast, the ability to produce more and faster speech (longer utterance length, higher articulation and speech tempo) was associated with lower dementia risk (such as higher education and lower age39).

Limitations of the present study include the small number of MCI individuals, which might reduce the statistical power of the comparisons and could therefore contribute to the lack of between-group differences within the MCI sample. As this research was a pilot study for identifying speech parameters with the highest differentiating potential for future telemedicine-based assessments, correction for multiple testing was not applied to the statistical comparisons. This needs to be taken into account when interpreting the results. On another note, even though the sampling rate used for speech recording (8000 Hz) might seem relatively low, the S-GAP Test was specifically intended to be applied in real-life settings, potentially in the form of a mobile application. A minimum sampling rate of 8000 Hz is available on most mobile phone devices, enabling wider adoption of the technology. Future works could also involve more diabetes-related medical characteristics, which could enable the creation of subgroups based on, for example, diabetes severity, glycemic control, or insulin levels.

The utilization of telemedicine in the management of diabetes is a dynamically emerging area; however, to date no such technique has been used for the cognitive examination of diabetic patients. A subtle speech deficiency detected by the S-GAP Test could be an indication for a thorough medical and neuropsychological examination to search for possible underlying causes or for monitoring the patient more closely, for example, with frequent check-ups. Remote assessment is gaining increasing significance in light of the current COVID-19 pandemic, with every medical field facing restrictions on face-to-face appointments. The S-GAP Test is currently being developed in a mobile application format, which could serve as a rapid, cost-effective, noninvasive, and no-contact form of cognitive screening for the elderly and, according to the present results, could be implemented for monitoring T2DM patients as well.


We explored the speech of T2DM patients, building on the shared pathophysiology of T2DM and neurocognitive disorders, as well as the strong association between cognitive deterioration and speech deficits. Even though T2DM patients classified as HC and matched nondiabetic subjects performed similarly on global cognitive and traditional neuropsychological tests, we demonstrated that the speech of T2DM patients contained proportionally more and longer silent pauses compared with the nondiabetic group. Therefore, we suggest that temporal analysis of speech offers a sensitive and ecologically valid tool for monitoring cognitive state in the early stages of cognitive impairment, and that it could be useful for identifying the T2DM individuals who are more at risk of developing manifest MCI or, later, dementia.


1. Cukierman T, Gerstein HC, Williamson JD. Cognitive decline and dementia in diabetes—systematic overview of prospective observational studies. Diabetologia. 2005;48:2460–2469.
2. Sadanand S, Balachandar R, Bharath S. Memory and executive functions in persons with type 2 diabetes: a meta-analysis. Diabetes Metab Res Rev. 2016;32:132–142.
3. Ahtiluoto S, Polvikoski T, Peltonen M, et al. Diabetes, Alzheimer-disease, and vascular dementia: a population-based neuropathologic study. Neurology. 2010;75:1195–1202.
4. Degen C, Toro P, Schönknecht P, et al. Diabetes mellitus type II and cognitive capacity in healthy aging, mild cognitive impairment and Alzheimer’s disease. Psychiatry Res. 2016;240:42–46.
5. Jekel K, Damian M, Wattmo C, et al. Mild cognitive impairment and deficits in instrumental activities of daily living: a systematic review. Alzheimers Res Ther. 2015;7:1–20.
6. Saeedi P, Petersohn I, Paraskevi S, et al. Global and regional diabetes prevalence estimates for 2019 and projections for 2030 and 2045: Results from the International Diabetes Federation Diabetes Atlas, 9th edition. Diabetes Res Clin Pract. 2019;157:107843.
7. Biessels GJ, Staekenborg S, Brunner E, et al. Risk of dementia in diabetes mellitus: a systematic review. Lancet Neurol. 2006;5:64–74.
8. McCrimmon RJ, Ryan CM, Frier BM. Diabetes and cognitive dysfunction. Lancet. 2012;379:2291–2299.
9. Szatloczki G, Hoffmann I, Vincze V, et al. Speaking in Alzheimer’s disease, is that an early sign? Importance of changes in language abilities in Alzheimer’s disease. Front Aging Neurosci. 2015;7:1–7.
10. Mortensen L, Meyer AS, Humphreys GW. Age-related effects on speech production: a review. Lang Cogn Process. 2006;21:238–290.
11. Hoffmann I, Nemeth D, Dye CD, et al. Temporal features of spontaneous speech in Alzheimer’s disease. Int J Speech Lang Pathol. 2010;12:29–34.
12. Roark B, Mitchell M, Hosom JP, et al. Spoken language derived measures for detecting mild cognitive impairment. IEEE Trans Audio Speech Lang Processing. 2011;19:2081–2090.
13. Meilán JJG, Martínez-Sánchez F, Martínez-Nicolás I, et al. Changes in the rhythm of speech difference between people with nondegenerative mild cognitive impairment and with preclinical dementia. Behav Neurol. 2020;2020:4683573.
14. König A, Satt A, Sorin A, et al. Automatic speech analysis for the assessment of patients with predementia and Alzheimer’s disease. Alzheimers Dement. 2015;1:112–124.
15. Tóth L, Gosztolya G, Vincze V, et al. Automatic detection of mild cognitive impairment from spontaneous speech using ASR. Proceedings of Interspeech, Dresden, Germany. 2015: 2694–2698.
16. Tóth L, Hoffmann I, Gosztolya G, et al. A speech recognition-based solution for the automatic detection of mild cognitive impairment from spontaneous speech. Curr Alzheimer Res. 2018;15:130–138.
17. Gosztolya G, Tóth L, Grósz T, et al. Detecting mild cognitive impairment from spontaneous speech by correlation-based phonetic feature selection. Proceedings of Interspeech, San Francisco, USA. 2016:107–111.
18. Gosztolya G, Vincze V, Tóth L, et al. Identifying mild cognitive impairment and mild Alzheimer’s disease based on spontaneous speech using ASR and linguistic features. Comput Speech Lang. 2019;53:181–197.
19. Vincze V, Szatlóczki G, Tóth L, et al. Telltale silence: temporal speech parameters discriminate between prodromal dementia and mild Alzheimer’s disease. Clin Linguist Phon. 2021;35:727–742.
20. Gosztolya G, Balogh R, Imre N, et al. Cross-lingual detection of mild cognitive impairment based on temporal parameters of spontaneous speech. Comput Speech Lang. 2021;69:101215.
21. Folstein MF, Folstein SE, McHugh PR. “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975;12:189–198.
22. American Diabetes Association. Diagnosis and classification of diabetes mellitus. Diabetes Care. 2014;37:S81–S90.
23. Sheikh JI, Yesavage JA. Geriatric Depression Scale (GDS)—recent evidence and development of a shorter version. Clin Gerontol. 1986;5:165–173.
24. Shulman KI, Shedletsky R, Silver IL. The challenge of time: clock-drawing and cognitive function in the elderly. Int J Geriatr Psychiatry. 1986;1:135–140.
25. Rosen WG, Mohs RC, Davis KL. A new rating scale for Alzheimer’s disease. Am J Psychiatry. 1984;141:1356–1364.
26. Wechsler D. The Wechsler Adult Intelligence Scale—Revised. New York, NY: The Psychological Corporation; 1981.
27. Gathercole SE, Willis CS, Baddeley AD, et al. The children’s test of nonword repetition: a test of phonological working memory. In: Gathercole SE, McCarthy RA, eds. Memory Tests and Techniques. Hove: Lawrence Erlbaum Associates; 1994:103–127.
28. Daneman M, Carpenter PA. Individual differences in working memory and reading. J Verbal Learning Verbal Behav. 1980;19:450–466.
29. Neuberger T, Gyarmathy D, Gráczi TE, et al. Development of a large spontaneous speech database of agglutinative Hungarian language. In: Sojka P, Horák A, Kopeček I, eds. Text, Speech and Dialogue. Brno, Czech Republic: Springer; 2014:424–431.
30. López-de-Ipiña K, Alonso J-B, Travieso CM, et al. On the selection of non-invasive methods based on speech analysis oriented to automatic Alzheimer disease diagnosis. Sensors (Basel). 2013;13:6730–6745.
31. Martínez-Sánchez F, Meilán JJG, García-Sevilla J. Oral reading fluency analysis in patients with Alzheimer disease and asymptomatic control subjects. Neurología. 2013;28:325–331.
32. Hlavnička J, Čmejla R, Tykalová T, et al. Automated analysis of connected speech reveals early biomarkers of Parkinson’s disease in patients with rapid eye movement sleep behaviour disorder. Sci Rep. 2017;7:1–12.
33. Alvar AM, Lee J, Huber JE. Filled pauses as a special case of automatic speech behaviors and the effect of Parkinson’s disease. Am J Speech Lang Pathol. 2019;28(2S):835–843.
34. Bello-Chavolla OY, Antonio-Villa NE, Vargas-Vázquez A, et al. Pathophysiological mechanisms linking type 2 diabetes and dementia: review of evidence from clinical, translational and epidemiological research. Curr Diabetes Rev. 2019;15:456–470.
35. Stranahan AM, Arumugam TV, Cutler RG, et al. Diabetes impairs hippocampal function through glucocorticoid-mediated effects on new and mature neurons. Nat Neurosci. 2008;11:309–317.
36. Petersen RC, Smith GE, Waring SC, et al. Mild cognitive impairment: clinical characterization and outcome. Arch Neurol. 1999;56:303–308.
37. Petersen RC, Doody R, Kurz A, et al. Current concepts in mild cognitive impairment. Arch Neurol. 2001;58:1985–1992.
38. Albert M, Zhu Y, Moghekar A, et al. Predicting progression from normal cognition to mild cognitive impairment for individuals at 5 years. Brain. 2018;141:877–887.
39. Luck T, Luppa M, Briel S, et al. Incidence of mild cognitive impairment: a systematic review. Dement Geriatr Cogn Disord. 2010;29:164–175.
40. Patterson C, Feightner J, Garcia A, et al. General risk factors for dementia: a systematic evidence review. Alzheimers Dement. 2007;3:341–347.

Key Words: mild cognitive impairment; type 2 diabetes mellitus; cognitive screening; neuropsychology; early detection; cognitive dysfunction; language functions; speech analysis; temporal speech characteristics; automatic speech recognition

Copyright © 2022 The Author(s). Published by Wolters Kluwer Health, Inc.