Hearing is unique among the human senses in its ability to sustain our contact with and awareness of the nonspeech sound environment that surrounds us, and to enable communication through the use of spoken language (American Medical Association 2001). This description captures the functional aspects of hearing in general terms. More specifically, functional hearing abilities include sound detection, sound recognition, sound localization, and speech communication both in quiet and in noise (e.g., Soli & Vermiglio 1999; Soli, Reference Note 1; Cook & Hickey 2003; Tufts et al. 2009; Vaillancourt et al. 2010).
These functional hearing abilities are vitally important in daily life, especially for public safety, law enforcement, aviation, and military personnel (Goldberg et al. 2001; Cook & Hickey 2003; Soli & Friesen 1998; Soli & Vermiglio 1999; Tufts et al. 2009; Punch et al. 1996; Laroche et al. 2003, 2008; Giguère et al. 2008; Laroche et al. 2014; Brungart 2014; Harkins et al. 2017). The job tasks performed by personnel in these occupations have several things in common with respect to functional hearing ability. Many essential tasks are hearing-critical, in that adequate functional hearing ability is necessary for their safe and effective performance. Hearing-critical tasks must often be performed in real-world environments where noise levels cannot be controlled. Noise is defined here generally as any sound other than the message-carrying speech that makes speech communication difficult. Although both speech and nonspeech signals may be relevant to functional hearing, this article focuses on speech communication. Effective speech communication has almost universally been identified as the most important functional hearing ability (Goldberg et al. 2001; Cook & Hickey 2003; Soli & Friesen 1998; Soli & Vermiglio 1999; Tufts et al. 2009; Punch et al. 1996; Laroche et al. 2003, 2014; Harkins et al. 2017). This is especially true for law enforcement officers, who may need to communicate face-to-face with suspects in noisy settings where vision is limited by low illumination, and with other officers via radio (e.g., Cook & Hickey 2003).
These considerations point to the need for a valid means of evaluating functional hearing abilities that can be objectively linked to the performance levels required in hearing-critical tasks. Meeting this need is important for a number of reasons. First, effective hearing screening must be adequate to distinguish candidates whose functional hearing ability is sufficient to safely perform the job from those whose functional hearing ability is not. Second, the inability of public safety and law enforcement personnel to effectively communicate with speech can directly affect their safety, the safety of their coworkers, and the safety of the general public. Third, employment standards in many jurisdictions and countries are subject to legislative provisions. For example, the Americans with Disabilities Act (ADA) [Equal Employment Opportunity Commission (EEOC) 1992] requires that screening and inclusion/exclusion criteria for employment must be job related and consistent with business necessity. Likewise, the Canadian Human Rights Commission Tribunal (CHRCT) requires that screening and exclusion criteria meet the bona fide occupational requirements for the job (e.g., Laroche et al. 2003).
These reasons distinguish occupational hearing screening from other types of hearing assessment. For example, diagnostic audiometry assesses hearing impairment, defined as a hearing function outside the range of normal (Ward 1983). Ward (1983) also defined hearing handicap as the disadvantage from impairment sufficient to affect the efficiency of daily activities. Note that neither of these terms refers to the explicit job-related linkage required by ADA and CHRCT. In this context, a disabling hearing handicap is one that prevents an individual from performing essential hearing-critical job tasks, and that cannot be ameliorated with some form of reasonable accommodation. Note that this definition of hearing disability is job specific. The challenge for occupational hearing screening is to validate the objective relationship between evidence of hearing handicap and the resulting ability or inability to perform specific essential hearing-critical job tasks.
Purpose and Overview
This article and the companion article (Soli et al. 2018) describe a method of evidence-based occupational hearing screening using the extended speech intelligibility index (ESII) (Rhebergen & Versfeld 2005 ; Rhebergen et al. 2006 , 2008 , 2010; Rhebergen et al., Reference Note 1). To validate this method, it is first necessary to use the ESII to model the effects of real-world noise environments on effective speech communication. Next, it is necessary to validate this model for individuals with normal hearing and with hearing loss, the subject of the companion article.
This article begins with a review of efforts to identify essential hearing-critical tasks and the real-world noise environments where they occur for a number of public safety and law enforcement jobs. A total of five studies performed over the past 17 years are reviewed. Two methods of modeling the effects of these noise environments on speech communication have been used in these studies. Both methods are reviewed, with the primary emphasis on the second method based on the ESII. These reviews establish the rationale for the present study.
Following these reviews, the method for selection of hearing-critical tasks and noise environments is described, as well as the ESII modeling of sound recordings from these environments. A means of defining the likelihood of effective speech communication is also described. Results are reported in a form that provides a normative characterization of each noise environment in terms of its impact on the likelihood of effective speech communication, taking into consideration variations in vocal effort, communication distance, and repetition.
Identification of Essential Hearing-Critical Tasks and Real-World Noise Environments
One of the first efforts to define the essential hearing-critical tasks for law enforcement officers was undertaken by the California Peace Officers’ Standards and Training (POST) Commission in 1979, the POST study (Soli & Friesen 1998 ; Soli & Vermiglio 1999 ; Goldberg et al. 2001). The purpose of this study was to validate hearing screening criteria for individuals seeking employment as law enforcement officers. Task analysis surveys were the primary means of characterizing hearing-critical tasks. These surveys were updated in 1984 and again in 1998. The surveys revealed that effective speech communication was essential and occurred most often during radio transmissions and face-to-face communications in loud or very loud noise environments. Typical noise sources included urban traffic, airports, factories, crowds, sirens during pursuits and emergency responses, and jail inmates (Soli & Friesen 1998).
A similar effort was undertaken for the Canadian Coast Guard and the Conservation and Protection agencies of the Department of Fisheries and Oceans (DFO), Canada, the DFO study (Laroche et al. 2005). Subject matter experts identified 34 essential tasks involving face-to-face speech communication as hearing-critical. Most communications occurred on the decks or within cabins on Canadian Coast Guard vessels and fishing vessels, on rescue vessels, or in aircraft where noise sources were again loud or very loud (Laroche et al. 2003 , 2005 ; Giguère et al. 2008).
The Corrections Standards Authority (CSA) of the California Department of Corrections and Rehabilitation, now the Board of State and Community Corrections, conducted a study to update its 1992 hearing standard for entry-level corrections officers, the CSA study (Montgomery et al. 2011). A large sample of incident reports was reviewed, and subject matter experts were interviewed to identify essential hearing-critical job tasks. More than half of routine tasks, as well as those occurring in response to incidents, required effective speech communication, and over 90% of the noise levels were rated as medium or loud.
The Ontario Ministry of Community Safety and Correctional Services in Canada conducted a similar study to update its hearing standard for individuals seeking jobs as Ontario constables, the Ontario constables (OC) study (Laroche et al. 2014). Police constables were asked to identify essential hearing-critical tasks. Their reports identified a total of 139 such tasks, representing 78% of all tasks. Speech communication was again identified as the most important functional hearing ability in most of these tasks. Environments where effective speech communication was important and most challenging included bars and restaurants, urban streets, pursuit situations with sirens and radio communications, large crowds, and highway roadsides (Laroche et al. 2014).
Finally, the U.S. Federal Bureau of Investigation (FBI) conducted a series of studies to set standards for a number of its jobs that require functional hearing and vision abilities (Harkins et al. 2017). The process used to identify essential hearing-critical tasks for these jobs was as follows. First, an initial task list was developed. Next, an online survey of incumbents was used to obtain ratings of the difficulty, importance, and frequency (DIF) of each job task (the DIF rating), as well as the number of incumbents who perform the task. Next, use of each of the four functional hearing abilities was evaluated for each task and determined to be either “unused,” “used,” or “used and essential for completion of the task” (Harkins et al. 2017). In this manner, essential hearing-critical abilities were objectively determined.
The work environments where essential tasks are performed were also identified for each job, and many were visited. These environments were objectively characterized in terms of complicating factors that may affect the performance of essential hearing-critical tasks. The most frequent and important complicating factor was noise interference.
To summarize, a consistent picture of the essential hearing-critical tasks and the real-world environments in which they take place emerges from these studies. The primary functional hearing ability was consistently speech communication, and the primary complicating factor was consistently noise interference. The consistency of these noise environments is seen in the LAeq values obtained from the recordings and measurements made for each study. LAeq is defined as the sound level in decibels equivalent to the total A-weighted sound energy measured over a specified period of time.
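The LAeq definition above can be illustrated with a minimal sketch. It assumes a calibrated waveform that has already been A-weighted and is expressed in pascals; the function name `laeq` and the example tone are illustrative assumptions and are not taken from the studies reviewed here.

```python
import math

P_REF = 20e-6  # reference pressure in pascals (20 micropascals) for dB SPL


def laeq(a_weighted_pressure):
    """Equivalent continuous level (dB) of an A-weighted pressure waveform in Pa."""
    if not a_weighted_pressure:
        raise ValueError("waveform must contain at least one sample")
    # Mean squared pressure is proportional to the average sound energy.
    mean_square = sum(p * p for p in a_weighted_pressure) / len(a_weighted_pressure)
    return 10.0 * math.log10(mean_square / P_REF ** 2)


# Example: 1 s of a 1 kHz tone with an RMS pressure of 1 Pa, sampled at 48 kHz.
# A-weighting has unity gain at 1 kHz, so no extra filtering is needed here.
fs = 48000
tone = [math.sqrt(2.0) * math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(fs)]
level = laeq(tone)  # close to 94 dB, the level of 1 Pa RMS
```

Because the average is taken over squared pressure rather than over decibel values, brief loud events contribute more to LAeq than they would to an arithmetic mean of short-term levels.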
Figure 1 displays both the discrete and cumulative distributions of LAeq values for 49 noise environments from the three studies. Over 80% of the noise environments had LAeq values of 70 dB or higher, demonstrating the importance of assessing speech communication in noise. This highlights the challenge of characterizing real-world noise environments in a manner that models their impact on effective speech communication.
Characterization of Real-World Noise Environments for Occupational Hearing Screening
Characterization of real-world noise environments has two aspects. The first is characterization of the noise, which has been long studied and is well understood. The average equivalent continuous level of the A-weighted noise, LAeq, and its average spectrum are statistics often used for this purpose (e.g., Harris 1998 ; Bies & Hansen 2009). The second is characterization of the speech signal together with the effects of the noise on its intelligibility. This challenge has been addressed using two different methodologies. Average levels of the speech and noise have been used to define the signal to noise ratio (SNR) and its effects on speech intelligibility (e.g., Hawkins & Stevens 1950 ; Plomp 1978 , 1986 ; Dreschler & Plomp 1980 , 1985 ; Duquesnoy 1983 ; Duquesnoy & Plomp 1983; Festen & Plomp 1983, 1986 ; Plomp & Mimpen 1979 ; Smoorenburg 1992). More recently, either average or time-varying SNRs within critical or 1/3 octave bands, together with band importance functions (BIFs) to calculate the speech intelligibility index (SII) or the ESII, have been used to estimate intelligibility (American National Standards Institute 2017 ; Rhebergen & Versfeld 2005 ; Rhebergen et al. 2006 , 2008 , 2010).
The POST study and the DFO study used methods based on SNR measurements and calculations, as well as psychometric data. The CSA study, the OC study, and the FBI study used methods based on ESII calculations and predictions, which are validated in the companion article (Soli et al. 2018).
The rationale for using SNR calculations has its roots in the early work of Hawkins and Stevens (1950) and Plomp (1978 , 1986). Hawkins and Stevens showed that the speech reception threshold (SRT) in noise is determined by the SNR rather than the absolute levels of the speech and noise over a fairly wide range of levels. Based on these findings, Plomp developed a model that predicts the SRT in quiet or in any level of noise based on two parameters, which he termed audibility and distortion (Plomp 1986). Plomp proposed that speech communication handicap is the elevation of the SRT in quiet or noise over that of individuals with normal hearing. However, these considerations leave unanswered how much handicap results in the inability of an individual to perform essential hearing-critical tasks in real-world noise environments, thus constituting a hearing disability.
Acoustic measurements were made of real-world noise environments where essential hearing-critical job tasks are performed by law enforcement personnel for the POST study (Soli & Vermiglio 1999). LAeq measurements of noise levels were recorded every 5 sec with a logging sound level meter. Speech levels were estimated based on Pearsons et al. (1977).
The distribution of LAeq values for the 5-sec time intervals in each noise recording was formed using 5-dB-wide bins. Estimated speech levels, together with the measured noise levels, allowed the SNR and percent intelligibility for each bin to be estimated, providing a normative reference for the intelligibility estimates at each SNR.
The sum of the products of the intelligibility estimates and time proportions for the bins in a recording provided the distribution of expected intelligibility levels in that noise environment. These calculations were repeated for each noise environment and for different predicted intelligibility (PI) functions whose positions were determined by various SRT elevations. These analyses quantified the effects of speech communication handicap.
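The bin-weighted calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the logistic shape of the PI function, its default SRT and slope, the LAeq readings, and the speech level are all hypothetical and are not the values used in the POST study.

```python
import math
from collections import Counter


def predicted_intelligibility(snr_db, srt_snr_db=-2.6, slope_per_db=0.10):
    """Hypothetical logistic PI function: 0.5 at the SRT, given slope at the midpoint."""
    k = 4.0 * slope_per_db  # for a logistic curve the midpoint slope equals k / 4
    return 1.0 / (1.0 + math.exp(-k * (snr_db - srt_snr_db)))


def expected_intelligibility(laeq_values, speech_level_db, bin_width_db=5.0, **pi_kwargs):
    """Sum over noise-level bins of (time proportion) x (intelligibility at the bin SNR)."""
    bins = Counter(math.floor(v / bin_width_db) for v in laeq_values)
    total = len(laeq_values)
    expected = 0.0
    for index, count in bins.items():
        bin_center = (index + 0.5) * bin_width_db  # representative noise level of the bin
        snr = speech_level_db - bin_center
        expected += (count / total) * predicted_intelligibility(snr, **pi_kwargs)
    return expected


# Illustrative 5-sec LAeq readings (dB) and an assumed speech level of 70 dB
readings = [62.0, 63.5, 66.0, 71.0, 72.5, 74.0, 68.0, 69.5]
normal = expected_intelligibility(readings, speech_level_db=70.0)
# Shifting the PI function by a 4 dB SRT elevation models a speech communication handicap
elevated = expected_intelligibility(readings, speech_level_db=70.0, srt_snr_db=1.4)
```

Repeating the call with different `srt_snr_db` values mirrors the repetition of the analysis for PI functions positioned by various SRT elevations.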
This SNR method has several limitations. Analyses were not based on the short-term temporal and spectral characteristics of the noise within the 5-sec analysis intervals. In addition, the stationary hearing in noise test (HINT) noise, rather than the actual characteristics of real-world noise, was used to estimate intelligibility in the noise environments. The use of estimated speech levels is also a limitation, as well as the inability to address informational masking (Brungart 2001 ; Brungart et al. 2001).
The DFO study addressed these limitations by establishing the relationship between the PI function measured with the stationary HINT noise and PI functions measured in real-world noise environments (Laroche et al. 2003 , 2005 ; Giguère et al. 2008). Sound recordings at real-world noise environments were made, and segments of these recordings were used as masking noise in laboratory intelligibility tests. The separation in dB between the SRTs for each noise environment and the SRT from the HINT norms defined location-specific and language-specific offsets that could be applied to an individual’s HINT SRT to predict performance in a specific noise environment for English and Canadian French.
The PI functions obtained in each real-world noise environment capture the effects of the short-term temporal and spectral variations in the real-world noises. In addition, the use of actual PI functions measured in each real-world noise environment obviates the need to use the reference PI function. The use of estimated speech levels remains a limitation; however, the inability to deal with informational masking is addressed because the empirically derived offsets for each noise environment include the effects of both informational and energetic masking.
The benefits of the DFO method come at a cost. PI functions must be obtained with behavioral testing for each noise environment. Giguère et al. (2008) suggested that it may be possible to use objective analyses of the noise recordings based on the SII (American National Standards Institute 2017) to estimate the offsets. Philippon (2004) compared SII-based estimates of location offsets with empirically determined offsets and found that both methods exhibited essentially equal accuracy.
The rationale for calculation of the SII is described in an American National Standards Institute/Acoustical Society of America standard (American National Standards Institute 2017). Speech and noise spectrum levels in frequency bands whose importance to speech intelligibility can vary from band to band are used. These levels, together with audibility thresholds and noise levels within each band, determine the proportion of audible speech in each band. These proportions are multiplied by the importance weight for the band and summed to produce the SII. SII calculations are based on average speech and noise levels. The SII standard specifies average speech spectrum levels for each frequency band with respect to normal, raised, loud, and shouted vocal effort.
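A simplified sketch of this calculation is given below. It assumes the basic band audibility rule in which speech is fully audible when its spectrum level is 15 dB or more above the disturbance and inaudible when it is 15 dB or more below; it omits the spread-of-masking and level distortion terms of the standard, and the three bands, spectrum levels, and importance weights are made-up illustrative values rather than those in the standard.

```python
def band_audibility(speech_db, noise_db, threshold_db):
    """Proportion of the 30 dB speech dynamic range that is audible in one band."""
    disturbance = max(noise_db, threshold_db)  # whichever limits audibility in the band
    return min(1.0, max(0.0, (speech_db - disturbance + 15.0) / 30.0))


def simplified_sii(speech_levels, noise_levels, thresholds, importance):
    """Importance-weighted sum of band audibilities (weights must sum to 1)."""
    if abs(sum(importance) - 1.0) > 1e-6:
        raise ValueError("band importance weights must sum to 1")
    return sum(
        w * band_audibility(s, n, t)
        for s, n, t, w in zip(speech_levels, noise_levels, thresholds, importance)
    )


# Three illustrative bands with hypothetical spectrum levels (dB) and weights
speech = [40.0, 35.0, 30.0]
noise = [45.0, 35.0, 10.0]
quiet_thresholds = [5.0, 0.0, 0.0]
weights = [0.3, 0.4, 0.3]
sii = simplified_sii(speech, noise, quiet_thresholds, weights)
```

In this example the first band is mostly masked, the second is half audible, and the third is fully audible, so each band contributes a different share of its importance weight to the total.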
Development of the ESII model by Rhebergen and his colleagues (Rhebergen & Versfeld 2005 ; Rhebergen et al. 2006 , 2008 , 2010; Rhebergen et al., Reference Note 1) has enabled SII calculations to be made in a manner that captures the effects on speech intelligibility of the spectral and temporal variations common to real-world noise environments. Briefly, the SII is calculated from noise spectrum levels determined in each of many partially overlapping intervals of the noise using frequency-dependent time windows and speech spectrum levels specified in the standard. These SII values are averaged to produce the ESII for the nonstationary noise.
There are other modeling options in addition to the ESII. Examples include the multiresolution speech-based envelope power spectrum model (Jørgensen et al. 2013) and the extended short-time objective intelligibility model (Jensen & Taal 2016). Both of these models can account for nonlinear signal processing effects used in hearing aids. However, such effects are not at issue in the current research. The accuracy of the ESII model in linear and reverberant conditions, such as those found in the real-world noise environments used in the present study, has been demonstrated for individuals with normal and impaired hearing (e.g., Rhebergen et al. 2010). The ESII model is also based on the SII standard, while the other models are not standard based. For these reasons, the ESII model was selected for use in the CSA, OC, and FBI studies and forms the basis for the current approach to evidence-based occupational hearing screening.
By using ESII calculations and standardized speech spectrum levels, it is possible to estimate the effects of a real-world noise environment on the ability of an otologically normal individual to perform essential hearing-critical tasks that require effective speech communication. The validity of this method of estimation, first demonstrated by Giguère et al. (2008), has been further established in independent studies reported in the companion article (Soli et al. 2018) and by Rhebergen et al. (2006 , 2008).
The three remaining occupational hearing screening studies—the CSA study, the OC study, and the FBI study—have all used ESII methods to characterize the real-world noise environments. The same methods for recording and analyzing the real-world noise environments were used in the three studies, allowing the results to be combined to provide a characterization of the effects of each noise environment on the likelihood of effective speech communication by otologically normal individuals. These methods are described in the following section.
MATERIALS AND METHODS
Selection of Tasks and Noise Environments
Recordings of real-world noise environments from the CSA, OC, and FBI studies were used for the ESII analyses. Selection of noise environments was determined by the locations where essential hearing-critical tasks are performed. Panels of subject matter experts identified these tasks and environments using the methods described earlier.
In the CSA study, the subject matter experts determined that all of the tasks of a corrections officer working in a prison in close contact with inmates throughout a typical day are both essential and hearing-critical. Thus, all of the locations within the prisons where corrections officers work were selected for recordings. In the OC and FBI studies, the subject matter experts selected the noisiest locations with the most complicating factors where essential hearing-critical tasks with the highest DIF ratings are performed.
Recording Noise Environments
All recordings for the CSA and FBI studies were made using a professional hand-held stereo digital audio recorder (the Roland Edirol R-09HR). Deviations from a flat microphone frequency response of approximately 6 dB/octave between 4.0 and 8.0 kHz were equalized as part of the calibration process. Recordings were stored on an SD memory card and later transferred to a personal computer for processing and analysis. Recordings for the OC study were made using a B&K 2250 type 1 sound level meter and B&K 4189 microphone, stored on an SD memory card, and transferred to a personal computer as well.
A sound calibrator signal was recorded daily for calibration purposes during the OC study. Given the flat microphone response of the B&K microphone, the root mean square (RMS)-to-dB SPL calibrations were applied at all frequencies for the ESII calculations, and a standard A-weighting filter with unity gain at 1 kHz was used to filter the waveforms for calculation of LAeq values. In the case of the Roland Edirol R-09HR, frequency-dependent correction factors were applied to the data to compensate for the nonflat response of the microphone. This was done by presenting calibration tones over a range of frequencies from 100 Hz to 8 kHz at 80 dB SPL using a Fonix 7000 Hearing Aid Analyzer. These tones were recorded before and after the field recordings were made. This allowed frequency-dependent RMS-to-dB SPL calibrations to be obtained. The calibrations were used for the ESII calculations and to design a filter that both equalized the microphone response and applied A-weighting for calculation of LAeq values.
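The RMS-to-dB SPL calibration can be sketched as a single additive offset derived from a recorded tone of known level. The function names and the synthetic tone below are illustrative assumptions; for a microphone with a nonflat response, one such offset would be derived per calibration frequency.

```python
import math


def rms(samples):
    """Root mean square of a sequence of digital sample values."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def calibration_offset_db(calibration_samples, known_level_db_spl):
    """dB to add to 20*log10(digital RMS) to obtain dB SPL."""
    return known_level_db_spl - 20.0 * math.log10(rms(calibration_samples))


def level_db_spl(samples, offset_db):
    """Level of a recorded segment in dB SPL, given the calibration offset."""
    return 20.0 * math.log10(rms(samples)) + offset_db


# A synthetic stand-in for a recorded 80 dB SPL calibration tone (digital units),
# and a second signal with twice its amplitude
cal_tone = [0.1 * math.sin(2.0 * math.pi * n / 48) for n in range(4800)]
offset = calibration_offset_db(cal_tone, 80.0)
louder = [2.0 * x for x in cal_tone]
louder_level = level_db_spl(louder, offset)  # about 6 dB above the calibration level
```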
All recordings were manually edited using Audacity (version 1.2.6) to remove spoken comments by the individuals making the recordings. This left only the background noise, which often included some voices in addition to the other sound sources typically present in each environment, for subsequent analysis.
SII and ESII calculations are based on the BIF for speech, which specifies the relative importance of speech information in each frequency band (American National Standards Institute 2017). The BIF for short passages of easy material was used in the analyses. The SII standard also specifies speech spectrum levels in each band as a function of vocal effort, defined as normal, raised, loud, or shouted. The spectrum level of speech information in a band in relation to the spectrum level of noise in the same band, together with the BIF for the speech information, was used to calculate the SII. The spectrum level of the noise for each band was calculated by filtering the noise recordings using an 18-band 1/3 octave filter bank with center frequencies ranging from 160 to 8000 Hz with equal logarithmic spacing. The Butterworth infinite impulse response filter bank was designed using a MATLAB program developed according to specifications in the standard for fractional octave-band digital filters (American National Standards Institute 2009).
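The layout of the 18 bands can be illustrated with a short sketch. It assumes the base-10 definition of 1/3 octave bands, in which exact center frequencies are 1000 × 10^(n/10) Hz and the familiar labels (160, 200, ..., 8000 Hz) are rounded nominal values. The function names are hypothetical, and the sketch covers only the band layout, not the Butterworth filter design itself.

```python
def third_octave_centers(first_index=-8, last_index=9):
    """Exact base-10 1/3 octave center frequencies, f = 1000 * 10**(n/10) Hz."""
    return [1000.0 * 10.0 ** (n / 10.0) for n in range(first_index, last_index + 1)]


def band_edges(center_hz):
    """Lower and upper band-edge frequencies of a base-10 1/3 octave band."""
    half_band = 10.0 ** (1.0 / 20.0)  # half of one 1/3 octave step on a log scale
    return center_hz / half_band, center_hz * half_band


centers = third_octave_centers()      # 18 bands, nominally 160 Hz to 8000 Hz
low_edge, high_edge = band_edges(centers[8])  # edges of the 1000 Hz band
```

Equal logarithmic spacing means that each center frequency is a fixed ratio (about 1.26) above the previous one, so the bands become wider in hertz as frequency increases.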
Unlike SII calculations, ESII calculations specify the duration for each of the 18 frequency-dependent time windows, with the window for the lowest frequency band having the longest duration (35 msec) and the window for the highest frequency band having the shortest duration (9.4 msec) (Rhebergen & Versfeld 2005). These windows are aligned at their offsets and spaced every 9.4 msec, causing the windows for low-frequency bands to overlap substantially.
Rectangular frequency-dependent time windows of appropriate lengths were applied to each filtered time waveform every 9.4 msec, and the RMS level for each window was calculated. This process produced slightly more than 100 RMS values per second for each of the filter bank outputs. These RMS values were converted to band levels expressed in dB SPL using the calibration information for each band described earlier, and then to spectrum levels by applying the bandwidth adjustment values given in the standard (American National Standards Institute 2017).
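The windowing step can be sketched for a single filter-bank output as follows. This is a minimal illustration, not the MATLAB implementation used in the studies: the sampling rate and test signal are assumptions, and truncating the first windows at the start of the recording is one possible convention that the text does not specify.

```python
import math


def windowed_rms(band_signal, fs_hz, window_ms, hop_ms=9.4):
    """RMS of rectangular windows that END at successive hop times (aligned at offsets)."""
    window_len = max(1, round(window_ms * fs_hz / 1000.0))
    values = []
    k = 1
    while True:
        end = round(k * hop_ms * fs_hz / 1000.0)  # sample index where window k ends
        if end > len(band_signal):
            break
        start = max(0, end - window_len)  # assumed: early windows are truncated
        segment = band_signal[start:end]
        values.append(math.sqrt(sum(x * x for x in segment) / len(segment)))
        k += 1
    return values


# One second of a constant-amplitude test signal at an assumed 10 kHz sampling rate
fs = 10000
signal = [0.5] * fs
low_band = windowed_rms(signal, fs, window_ms=35.0)  # longest window (lowest band)
high_band = windowed_rms(signal, fs, window_ms=9.4)  # shortest window (highest band)
```

Because every band uses the same 9.4 msec step, all bands yield the same number of RMS values per second regardless of window length; only the amount of overlap between successive windows differs.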
The noise spectrum levels for the 18 bands, expressed every 9.4 msec, together with the speech spectrum levels and the BIF for short passages of easy material from the standard, were used to calculate slightly more than 100 SII values per second of recorded background noise. These calculations were performed with a series of MATLAB programs developed by Muesch (2005). The ESII specifies that these short-term SII values be averaged over the time interval of interest to obtain a single estimate of the ESII for that interval (Rhebergen & Versfeld 2005). Rather than using the entire duration of the recording as the interval of interest, a shorter interval during which a typical brief two-way communication might occur was defined as 4 sec. Thus, the average ESII was calculated for all 4-sec intervals in each recording, with 425 SII values in each 4-sec interval. Note that these intervals are not exactly 4 sec in duration because there is no integer multiple of 9.4 msec whose product is exactly 4 sec.
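The averaging of short-term SII values into interval ESII values can be sketched as below. The sketch assumes non-overlapping intervals, which the text does not state explicitly, and derives the number of values per interval from the step size rather than fixing it; the function name and the alternating test sequence are illustrative only.

```python
def esii_per_interval(short_term_sii, hop_ms=9.4, interval_s=4.0):
    """Average consecutive short-term SII values over non-overlapping intervals."""
    per_interval = int(interval_s * 1000.0 / hop_ms)  # whole steps that fit in the interval
    esii = []
    for start in range(0, len(short_term_sii) - per_interval + 1, per_interval):
        block = short_term_sii[start:start + per_interval]
        esii.append(sum(block) / per_interval)
    return per_interval, esii


# Illustrative sequence: about 8 sec of short-term SII values alternating 0.2 and 0.4
values = [0.2 if i % 2 == 0 else 0.4 for i in range(851)]
n_per_interval, esii_values = esii_per_interval(values)
```

Any values left over at the end of a recording that do not fill a complete interval are discarded in this sketch.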
The ESII calculation process was repeated 16 times for the data from each location using the four levels of vocal effort specified for a 1 m distance in the standard (normal, 62.35 dB SPL; raised, 68.34 dB SPL; loud, 74.85 dB SPL; and shouted, 82.30 dB SPL) and four communication distances (0.5, 1, 5, and 10 m). (Note that the precision of these values from the standard is used in the ESII calculations, although this precision may exceed the accuracy of measured values obtained in nonstationary real-world noise environments.) Speech levels at each communication distance were calculated using a MATLAB program posted on the web page for the standard (www.sii.to), which assumes free field propagation in which the level decreases by 6.0 dB for each doubling of propagation distance (Muesch 2005).
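The 16 combinations of vocal effort and distance can be tabulated with a short sketch. It applies the free-field rule to the overall speech level only and is not the posted MATLAB program; the dictionary and function names are illustrative.

```python
import math

# Overall speech levels at 1 m for each vocal effort, as given in the standard (dB SPL)
VOCAL_EFFORT_DB_AT_1M = {"normal": 62.35, "raised": 68.34, "loud": 74.85, "shouted": 82.30}
DISTANCES_M = (0.5, 1.0, 5.0, 10.0)


def speech_level_at_distance(level_at_1m_db, distance_m):
    """Free-field (inverse-square) level: about 6 dB lower per doubling of distance."""
    return level_at_1m_db - 20.0 * math.log10(distance_m)


# One entry per (vocal effort, distance) condition
conditions = {
    (effort, d): speech_level_at_distance(level, d)
    for effort, level in VOCAL_EFFORT_DB_AT_1M.items()
    for d in DISTANCES_M
}
```

Under this rule, moving from 1 m to 10 m lowers the speech level by 20 dB, which is more than the difference between normal and shouted vocal effort.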
These calculations did not include the potential effects of reverberation, as proposed by Rhebergen et al. (Reference Note 1), for several reasons. First, the method for including the effects of reverberation has yet to be standardized. Second, the real-world noise levels were all quite high, as seen in Figure 1, suggesting that real-world noise interference may have a greater effect on speech communication than reverberation, especially at negative SNRs. Third, most of the communication distances, except 10 m in some settings, were within the critical distance, and thus the speech would be dominated by the direct path sound energy. Finally, approximately 80% of the performance environments were either outdoors or in large enclosures such as public transportation terminals where the power of indirect path reverberation would be low compared with the power of the direct path signal.
The final step in processing the 16 ESII data sets from each location was to cast each data set into discrete and cumulative frequency distributions. Once in this form, it was possible to determine the proportion of 4-sec intervals in which the ESII exceeded a specified criterion value for each level of vocal effort and communication distance. The ESII step size for the frequency distributions was set at 0.03, which is the approximate change in ESII corresponding to 1.0 dB change in SRT for an otologically normal individual.
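A minimal sketch of this step is shown below. The binning convention (bins that start at multiples of the step size), the function names, the example ESII values, and the criterion of 0.30 used in the example are illustrative assumptions.

```python
def esii_distributions(esii_values, step=0.03):
    """Discrete and cumulative relative-frequency distributions over ESII bins."""
    n_bins = int(round(1.0 / step)) + 1  # enough bins to cover ESII values from 0 to 1
    counts = [0] * n_bins
    for v in esii_values:
        index = min(n_bins - 1, max(0, int(v / step)))
        counts[index] += 1
    total = len(esii_values)
    discrete = [c / total for c in counts]
    cumulative = []
    running = 0.0
    for p in discrete:
        running += p
        cumulative.append(running)
    return discrete, cumulative


def proportion_exceeding(esii_values, criterion):
    """Proportion of intervals whose ESII exceeds the criterion value."""
    return sum(1 for v in esii_values if v > criterion) / len(esii_values)


# Hypothetical ESII values for eight 4-sec intervals at one location and condition
interval_esii = [0.10, 0.22, 0.28, 0.31, 0.35, 0.41, 0.47, 0.52]
discrete, cumulative = esii_distributions(interval_esii)
likelihood = proportion_exceeding(interval_esii, 0.30)
```

In practice the same two functions would be applied to each of the 16 data sets for a location, giving one exceedance proportion per combination of vocal effort and communication distance.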
Rationale for Defining the Likelihood of Effective Speech Communication
To describe the process by which criterion ESII values are defined and applied, it is first necessary to consider the relationship between SRTs, ESII values, speech intelligibility, and the likelihood of effective speech communication in complex, fluctuating background noise environments. The HINT (Nilsson et al. 1994 ; Soli & Nilsson 1994) sentences and masking noise were used for these calculations, because the HINT was also used to validate the ESII model in the companion article. HINT SRTs (Vermiglio 2008), which correspond to 50% sentence intelligibility, were related to ESII (and SII) values by applying the eighteen 1/3 octave filter band analysis to the reference stationary HINT noise scaled to a sound pressure level of 65.0 dBA, the noise presentation level used during testing. The filter outputs for the HINT noise were converted to spectrum levels and combined with the standard speech spectrum levels for normal vocal effort and the BIF to obtain the SII.
This analysis showed that the SII for the HINT noise and normal vocal effort is 0.34. The HINT noise front condition most closely approximates the assumptions used for the SII calculation. The norm for individuals with normal functional hearing ability in this condition is an SRT of 62.4 dBA (Soli & Wong 2008 ; Vermiglio 2008), closely approximating the standard speech spectrum level for normal vocal effort. The SII for the noise front norm is 0.35. Other investigators have also found the SII at the SRT to be approximately 0.34 for meaningful sentences in stationary noise (e.g., Houtgast & Festen 2008), a nonsignificant difference.
These speech spectrum levels from the standard for normal vocal effort can be compared with the speech spectrum levels of the HINT sentences at the noise front threshold. The average spectrum level difference across the eighteen 1/3 octave bands was 1.0 dB, with the average HINT speech spectrum levels slightly higher. However, the average spectrum levels for the range of 1/3 octave bands from 315 to 3150 Hz, which contributes 82% of the overall band importance, did not differ significantly.
Percent word intelligibility at the HINT SRT is approximately 70% (Nilsson et al. 1994 ; Vermiglio 2008). The slope of the function relating percent intelligibility to presentation level is approximately 10% per dB (Vermiglio 2008). Thus, increasing the presentation level by 3 dB should result in approximately 100% intelligibility. The SII (and ESII) at this presentation level is 0.44, which corresponds closely to the value given as the minimum SII for acceptable intelligibility, 0.45, in the standard (American National Standards Institute 2017).
Neither the SII nor the ESII considers listening conditions in which speech and noise sources originate from different locations. In these conditions, the binaural auditory system allows the individual to listen selectively and improve the SRT due to spatial release from masking (SRM) (e.g., Carhart et al. 1967; Bronkhorst & Plomp 1988). By equally weighting the SRTs obtained with 90° separation of the speech and masker (best-case conditions) and those obtained with 0° separation (worst-case conditions), an overall estimate of the SRT across a variety of listening conditions is obtained. Note that this estimate is not a measure of SRM or an analog of SRM. Instead, it takes into consideration the potential benefits of SRM in some real-world noise environments. The published norm for this estimate is 58.6 dBA, corresponding to an SNR of −6.4 dB (Vermiglio 2008). The ESII corresponding to this level is approximately 0.25, or 0.10 units lower than the value calculated under the assumptions in the standard. These considerations suggest that the minimum ESII and SII for acceptable intelligibility is also 0.10 units lower than the value stated in the standard, that is, 0.35 instead of 0.45, when SRM is considered.
Another consideration is that effective speech communication, especially in situations where sentences can be repeated, may not necessarily require 100% intelligibility of the initial sentence. For example, if an ESII corresponding to 80% intelligibility is specified, 80% of the time communication is effective and 20% of the time it is not. If communication is not effective and the sentence is repeated, the likelihood that the repetition will also not be effective is also 20%, assuming the two attempted communications are independent, an assumption validated for sentence communications in nonstationary noise environments (Laroche et al. 2003; Larose et al. 2006). The joint probability that both communications will be ineffective is the product of the two probabilities, 0.20 × 0.20 = 0.04, and the probability of an effective communication after one repetition is 1.00 − 0.04 = 0.96. Thus, when a single repetition is allowed, nearly perfect communication can occur when the likelihood of effective speech communication without repetition is approximately 0.80.
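The repetition argument generalizes to any number of repetitions under the same independence assumption. The following is a minimal sketch; the function name and the extension beyond a single repetition are illustrative and were not part of the analyses reported here.

```python
def likelihood_with_repetition(p_single, repetitions=1):
    """Probability that at least one attempt is effective.

    p_single    -- likelihood that a single attempt is effective (0 to 1)
    repetitions -- number of repeats allowed after the first attempt

    Assumes attempts are statistically independent, as stated in the text.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if repetitions < 0:
        raise ValueError("repetitions must be non-negative")
    attempts = repetitions + 1
    # All attempts fail with probability (1 - p)^attempts.
    return 1.0 - (1.0 - p_single) ** attempts


if __name__ == "__main__":
    print(round(likelihood_with_repetition(0.80, repetitions=1), 2))
```

With a single-attempt likelihood of 0.80 and one repetition, this reproduces the value of 0.96 derived above.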
The benefits of repetition had to be considered in defining the criterion ESII value required for effective communication. The job content experts in all five studies who identified the essential hearing-critical tasks that require effective speech communication frequently noted that repetition was commonly used to communicate effectively, both face-to-face and by radio.
The ESII corresponding to 80% intelligibility is 0.40. If the prior reasoning that weights best- and worst-case scenarios equally is applied, the ESII value for effective speech communication is reduced by 0.10 to 0.30. Thus, an ESII of 0.30 serves as a conservative criterion for evaluation of the cumulative frequency distributions to determine the proportion of 4-sec intervals in which the ESII exceeds this criterion value. This proportion defines the likelihood of effective speech communication with repetition for individuals fluent in the test language(s), assuming normal functional hearing, no informational masking, and no significant effects of reverberation.
Tasks and Noise Environments
The performance environments with the most complicating factors where essential hearing-critical tasks take place were selected for the jobs analyzed in the CSA, OC, and FBI studies. The most dominant and important complicating factor for each of these locations was real-world noise that could not be controlled.
A total of 174 recordings were used for the CSA study. A total of 221 recordings were made for the OC study, of which 72 were used. Multiple recordings were made at some of the performance environments. These recordings were sampled to provide a more manageable, yet representative set of recordings for the current analyses. A total of 24 recordings were made for the FBI study, of which 22 were used.
Tables 1 and 2 provide summary descriptions of the sets of recordings from each performance environment that were analyzed. Table 1 describes performance environments relevant to law enforcement personnel, and Table 2 describes environments relevant to corrections personnel. This distinction was made because law enforcement performance environments can exist almost anywhere, while corrections performance environments exist only within prisons.
The law enforcement performance environments consisted mainly of public spaces, for example, malls, bars, restaurants, transportation terminals, onboard public transportation, and urban and suburban streets and highways. Primary noise sources were voices, traffic and street noise, and amplified sound sources. The corrections performance environments consisted of the various locations and activities within a prison. Primary noise sources were voices, the sounds of keys, chains, doors closing, and amplified sound sources. These environments are sorted in each table by the average LAeq for the set of recordings from the environment.
The geographical locations for each set of performance environment recordings are also given in Table 1, as well as the number of different recordings analyzed for each performance environment and the total duration of each set of recordings. The LAeq for each recording in a set was calculated. The simple average of these individual LAeq values for each recording is reported for each performance environment, together with the maximum and minimum LAeq values within each set. Most of the average LAeq values for the law enforcement performance environments range from 70 to 80 dBA.
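The summary statistics reported for each performance environment can be sketched as follows. The LAeq values in the example are hypothetical and are not taken from Tables 1 or 2. The energy average is included only for contrast with the simple (arithmetic) average that is reported here; it was not used in the present analyses.

```python
import math


def summarize_laeq(laeq_values_dba):
    """Summarize a set of per-recording LAeq values (in dBA)."""
    if not laeq_values_dba:
        raise ValueError("at least one LAeq value is required")
    n = len(laeq_values_dba)
    # Simple arithmetic average of the dB values, as reported in the tables.
    simple_mean = sum(laeq_values_dba) / n
    # Energy average, shown for contrast only (not used in the article).
    mean_energy = sum(10.0 ** (level / 10.0) for level in laeq_values_dba) / n
    energy_mean = 10.0 * math.log10(mean_energy)
    return {
        "simple_mean": simple_mean,
        "energy_mean": energy_mean,
        "max": max(laeq_values_dba),
        "min": min(laeq_values_dba),
    }


if __name__ == "__main__":
    hypothetical_set = [68.0, 74.0, 80.0]  # illustrative values only
    print(summarize_laeq(hypothetical_set))
```

Because the louder recordings dominate an energy average, it is never lower than the simple average of the same values, which is worth keeping in mind when comparing the tabled averages with LAeq values computed over pooled recordings.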
The number and duration of recordings for the CSA study are substantially larger than for the other two studies because a complete set of recordings was obtained at each of the seven prisons that participated in the study. Five of the performance environments had average LAeq from approximately 70 to 80 dBA, similar to the measures obtained for law enforcement environments. However, the ranges of individual LAeq values within the corrections environments were consistently larger.
Likelihood of Effective Speech Communication
ESII analyses were performed on all 4-sec intervals in all recordings. For each recording, discrete distributions of ESII values ranging from 0.00 to 1.00 in intervals of 0.03 units were formed. Cumulative distributions of ESII values spanning the same range were derived from the discrete distributions. These distributions give the proportion of 4-sec intervals with ESII values that exceed any selected criterion ESII value, for example, 0.30. The cumulative proportions were then averaged across the recordings in the set for each performance environment. In this manner, the average proportion of ESII values exceeding the criterion ESII value for the performance environment was determined for each level of vocal effort and each communication distance. This sequence of analyses was repeated for each performance environment. The numerical distributions of average cumulative proportions for all of the performance environments are found in the Excel tables of the Supplemental Digital Content 1, http://links.lww.com/EANDH/A405.
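The final steps of this sequence can be sketched as follows, assuming the ESII value of each 4-sec interval has already been computed for a given vocal effort and communication distance. The function names and the example values are hypothetical, the binning into discrete distributions is omitted, and values equal to the criterion are counted as meeting it.

```python
from statistics import mean


def exceedance_proportion(esii_values, criterion=0.30):
    """Proportion of 4-sec intervals whose ESII equals or exceeds the criterion."""
    if not esii_values:
        raise ValueError("a recording must contain at least one interval")
    return sum(value >= criterion for value in esii_values) / len(esii_values)


def environment_likelihood(recordings, criterion=0.30):
    """Average the per-recording proportions for one performance environment.

    recordings -- list of recordings, each a list of per-interval ESII values
    """
    return mean(exceedance_proportion(r, criterion) for r in recordings)


if __name__ == "__main__":
    # Hypothetical ESII values for two recordings from one environment.
    recordings = [
        [0.10, 0.25, 0.35, 0.50],
        [0.05, 0.40, 0.45, 0.60, 0.20],
    ]
    print(environment_likelihood(recordings, criterion=0.30))
```

Averaging the per-recording proportions gives each recording equal weight regardless of its duration, which matches the description above; pooling all intervals before computing the proportion would weight longer recordings more heavily.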
Examples of the cumulative distributions are shown in Figure 2. Two sets of distributions for the four levels of vocal effort at a communication distance of 1 m in two performance environments are displayed. Vertical dotted lines have been drawn at the criterion ESII value of 0.30 in each chart. The ordinate value at which a cumulative distribution intersects this line corresponds to the proportion of intervals exceeding the criterion value. This proportion defines the likelihood of effective speech communication for the specified level of vocal effort.
For example, in the shopping mall environment, the predicted likelihood of effective speech communication using normal vocal effort is 0.06. With raised vocal effort, the likelihood increases to 0.67, and with loud or shouted vocal effort, the likelihood is 1.00. This pattern of likelihood contrasts with that of the urban street by freeway environment, where the predicted likelihoods of effective speech communication using normal or raised vocal effort are 0.00 and 0.03, respectively. Loud and shouted vocal effort are predicted to be more effective, with likelihoods of 0.41 and 0.94, respectively.
The values of these likelihoods for the four levels of vocal effort and the four communication distances in each performance environment are displayed graphically in Figures 3–5. (The numerical data used to generate these graphs are found in the Excel tables, Supplemental Digital Content 1, http://links.lww.com/EANDH/A405.) A consistent pattern of likelihoods is seen for many of the performance environments. Normal and raised levels of vocal effort are often effective less than half the time, even at communication distances of 0.5 and 1.0 m, and they may be totally ineffective for greater communication distances. Loud and shouted levels of vocal effort are usually effective at the shorter communication distances but are effective less than half the time at longer distances. Noise levels are generally somewhat lower in the performance environments where corrections personnel work, and as a result, the likelihood of effective speech communication is greater than in the environments where public safety and law enforcement personnel work.
The current research has shown that ESII modeling of nonstationary real-world noise environments can provide an objective means of characterizing noise environments in terms of their impact on the likelihood of effective speech communication. This method of modeling is based on parameters specified in the SII standard (American National Standards Institute 2017), and on a body of research that allows the ESII values required for effective speech communication by the average individual with normal functional hearing ability to be defined (e.g., Plomp 1986; Houtgast & Festen 2008; Soli & Wong 2008; Vermiglio 2008). Thus, ESII modeling can produce results that are both standard based and norm referenced.
The development of ESII modeling has been motivated by the need to evaluate the hearing abilities of individuals who seek to perform jobs that include essential hearing-critical tasks requiring the ability to communicate effectively with speech, especially in noisy environments. Unfortunately, this need is not adequately met with traditional diagnostic measures of hearing impairment. Such traditional measures may be unable to establish an objective, evidence-based linkage between the screening criteria and the specific essential hearing-critical tasks of the job, as required by the ADA in the United States and the CHRCT in Canada. Similar requirements may exist in other countries as well. This linkage requires knowledge of the hearing-critical job tasks, the performance environments where they take place, and the impact of real-world noise and other complicating factors on the ability to perform these tasks in these environments. The first two requirements are met with systematic job analysis. The remaining requirement can be met using ESII modeling of the relevant noise environments.
To understand the impact of real-world noise on hearing-critical job performance, it is first necessary to determine its impact on the performance of otologically normal individuals (normals), thus providing a normative reference for use in evaluation of individuals who may have impaired functional hearing ability. In the present study, the SII (and ESII) value at the average SRT for normals was used as the criterion value defining normal performance. The proportion of 4-sec intervals exceeding this value, 0.30, defines the likelihood of effective speech communication for normals. If likelihood values ≥0.90 are used to define the ability to communicate effectively, such communication at a distance of 1 m without shouting occurred in only six of the 16 public safety and law enforcement performance environments, and in only three of the eight corrections performance environments. These findings demonstrate the importance of norm-referenced screening criteria rather than criteria based on ideal absolute likelihood values that even otologically normal individuals cannot achieve.
If norm-referenced screening criteria based on ESII modeling of real-world noise environments are to be established and used in occupational medicine, the concept “likelihood of effective speech communication” that underlies this approach must be adequately defined. This construct is based on a criterion value of the ESII that is conservatively predicted to result in near-perfect speech intelligibility for the average individual with normal functional hearing ability. In a complex real-world noise environment with time-varying temporal and spectral characteristics, some portion of the time the characteristics of the noise will be such that, during a 4-sec interval, the ESII value will equal or exceed this criterion value, enabling effective speech communication. This definition is based on the assumption that 4 sec is adequate for a brief one- or two-way communication, although longer intervals may be needed for more complex communications, and shorter intervals may provide “glimpses” of speech allowing partial communications (Cooke 2006). Further study of various interval lengths, including ESII reanalyses of the current noise recordings using different interval lengths, could shed additional light on this issue.
Use of the proportion of intervals with ESII values that equal or exceed the criterion value to define the likelihood of effective speech communication also deserves further consideration. Such use applies only to truly hearing-critical tasks. If other modalities can be used to supplement hearing and achieve effective speech communication, such communications may occur at lower ESII values. Thus, the proportion of intervals with such values would be larger. The use of proportions to define likelihoods also does not consider the temporal distribution of intervals within a time-varying noise that exceed the criterion value. Such intervals may not occur at the time an individual needs to communicate effectively with speech, necessitating a pause in communication which may or may not be important. Given these considerations, the use of proportions to estimate the likelihood of effective speech communication should only be applied to truly hearing-critical tasks involving speech communication, and should be regarded as conservative best-case estimates. If an objective norm-referenced characterization of the effects of real-world noise environments on the likelihood of effective speech communication is to be used for occupational hearing screening for jobs that include essential hearing-critical tasks, these are important considerations. Methods based on these considerations for objective screening of functional hearing ability in individuals with normal or impaired functional hearing ability are presented in the companion article (Soli et al. 2018). Briefly, these methods provide an estimate of the ESII value required for the individual to communicate effectively in the workplace noise environments where essential hearing-critical tasks are performed that can be compared with the normative references for these environments.
The authors gratefully acknowledge Aram Glorig, MD, House Clinic; Robert Goldberg, MD, Chief Medical Officer for Los Angeles County; and Shelly Spillberg, PhD, California Peace Officers’ Standards and Training (POST) Commission, for their support of the POST study. The authors also gratefully acknowledge Sharon Robertson, Joanne Jankun, Mal Farquhar, Phil Murdock, and Stephen Peck, Department of Fisheries and Oceans (DFO), Canada, for their support of the DFO study. The authors also gratefully acknowledge Shelley Montgomery and Evonne Garner, California Corrections Standards Authority (CSA), for their support of the CSA study. The authors also gratefully acknowledge Phillip Spottswood, JD, U.S. Office of Personnel Management; David Wade, MD, former Chief Medical Officer for the Federal Bureau of Investigation (FBI); James Yoder, MD, former occupational health physician for the FBI; and Diane Vogelei, FBI, RN, for their support of the FBI study.
S.D.S. contributed to the design of each of the five studies, the real-world noise recordings, and the data analyses for each study. He also wrote portions of the article. C.G., C.L., and V.V. contributed to the design, real-world noise recordings, and the data analyses for the DFO and OC studies. They also wrote portions of the article. W.A.D. and K.S.R. assisted with the analysis and interpretation of the extended speech intelligibility index (ESII) modeling. K.H., M.R., and P.R. performed the job task analyses and assisted in the identification of essential hearing- and vision-critical tasks for the FBI study. L.S.M. and his students performed the job task analysis and assisted in the identification of essential hearing-critical tasks for the CSA study.
American Medical Association. (2001). Guides to the Evaluation of Permanent Impairment (5th ed.). Milwaukee, WI: American Medical Association.
American National Standards Institute. (2009). Specifications for Octave-Band and Fractional-Octave-Band Analog and Digital Filters. ANSI S1.11–1986 (R2009). New York, NY: American National Standards Institute.
American National Standards Institute. (2017). Methods for Calculation of the Speech Intelligibility Index. ANSI/ASA S3.5-1997 (R2017). New York, NY: American National Standards Institute.
Bies D. A., Hansen C. H. (2009). Engineering Noise Control: Theory and Practice (4th ed.). London, United Kingdom: Spon Press.
Bronkhorst A. W., Plomp R. (1988). The effect of head-induced interaural time and level differences on speech intelligibility in noise. J Acoust Soc Am, 83, 1508–1516.
Brungart D. S. (2001). Informational and energetic masking effects in the perception of two simultaneous talkers. J Acoust Soc Am, 109, 1101–1109.
Brungart D. S., Simpson B. D., Ericson M. A., et al. (2001). Informational and energetic masking effects in the perception of multiple simultaneous talkers. J Acoust Soc Am, 110(5 Pt 1), 2527–2538.
Carhart R., Tillman T. W., Johnson K. R. (1967). Release of masking for speech through interaural time delay. J Acoust Soc Am, 42, 124–138.
Cook L. E., Hickey M. J. (2003). Hearing. In Edwards F., McCallum R. I., Taylor P. J. (Eds.), Fitness for Work: The Medical Aspects (pp. 67–89). Oxford, United Kingdom: Oxford University Press.
Cooke M. (2006). A glimpsing model of speech perception in noise. J Acoust Soc Am, 119, 1562–1573.
Dreschler W. A., Plomp R. (1980). Relation between psychophysical data and speech perception for hearing-impaired subjects. I. J Acoust Soc Am, 68, 1608–1615.
Dreschler W. A., Plomp R. (1985). Relations between psychophysical data and speech perception for hearing-impaired subjects. II. J Acoust Soc Am, 78, 1261–1270.
Duquesnoy A. J. (1983). Effect of a single interfering noise or speech source upon the binaural sentence intelligibility of aged persons. J Acoust Soc Am, 74, 739–743.
Duquesnoy A. J., Plomp R. (1983). The effect of a hearing aid on the speech-reception threshold of hearing-impaired listeners in quiet and in noise. J Acoust Soc Am, 73, 2166–2173.
Equal Employment Opportunity Commission (EEOC). (1992). A Technical Assistance Manual on the Employment Provisions (Title 1) of the Americans With Disabilities Act (Report no. EEOC-M-1A). Washington, DC: U.S. Government Printing Office.
Festen J. M., Plomp R. (1983). Relations between auditory functions in impaired hearing. J Acoust Soc Am, 73, 652–662.
Festen J. M., Plomp R. (1986). Speech-reception threshold in noise with one and two hearing aids. J Acoust Soc Am, 79, 465–471.
Giguère C., Laroche C., Soli S. D., et al. (2008). Functionally-based screening criteria for hearing-critical jobs based on the hearing in noise test. Int J Audiol, 47, 319–328.
Goldberg R. L., Dirks D., Kramer M., et al. (2001). Hearing Guidelines. Report prepared for the California Peace Officers’ Standards and Training Commission.
Harkins K., Ruckstuhl M., Soli S. D., et al. (2017). Analysis of Job Tasks, Critical Abilities, and Work-Place Environments for a Federal Law Enforcement Agency. McLean, VA: Harkcon, Inc.
Harris C. M. (1998). Handbook of Acoustical Measurements and Noise Control (3rd ed.). Woodbury, NY: Acoustical Society of America.
Hawkins J. E., Stevens S. S. (1950). The masking of pure tones and of speech by white noise. J Acoust Soc Am, 22, 6–13.
Houtgast T., Festen J. M. (2008). On the auditory and cognitive functions that may explain an individual’s elevation of the speech reception threshold in noise. Int J Audiol, 47, 287–295.
Jensen J., Taal C. (2016). An algorithm for predicting the intelligibility of speech masked by modulated noise maskers. IEEE/ACM Trans Audio, Speech, Language Process, 24, 1–14.
Jørgensen S., Ewert S. D., Dau T. (2013). A multi-resolution envelope-power based model for speech intelligibility. J Acoust Soc Am, 134, 436–446.
Laroche C., Giguère C., Vaillancourt V., et al. (2005). Development and validation of hearing standards for Canadian Coast Guard Seagoing Personnel and C&P and land-based personnel. Phase II. Final report to the Department of Fisheries and Oceans under Contract No. F7053-000009.
Laroche C., Giguère C., Vaillancourt V., et al. (2014). Review and update of constable selection system hearing standards (Reference number OSS_00219841). Final report to the Ontario Ministry of Community Safety and Correctional Services.
Laroche C., Soli S., Giguère C., et al. (2003). An approach to the development of hearing standards for hearing-critical jobs. Noise Health, 6, 17–37.
Larose R., Mercille I., Giguère C., et al. (2006). The effect of sentence repetition on speech intelligibility in noise. Can Acoust, 34, 106–107.
Montgomery S., Soli S. D., Meyers L. S., et al. (2011). Hearing Standard for Selection of Entry-Level Correctional Officers. Sacramento, CA: State of California Corrections Standards Authority, California Department of Corrections and Rehabilitation.
Muesch H. (2005). Methods for calculation of the speech intelligibility index (SII). Acoustical Society of America Working Group S3-79 support site for the speech intelligibility index (SII). Retrieved November 12, 2017, from www.sii.to
Nilsson M., Soli S. D., Sullivan J. A. (1994). Development of the hearing in noise test for the measurement of speech reception thresholds in quiet and in noise. J Acoust Soc Am, 95, 1085–1099.
Pearsons K. S., Bennett R. L., Fidell S. (1977). Speech Levels in Various Noise Environments. EPA-600/1-77-025. Washington, DC: U.S. Environmental Protection Agency.
Philippon B. (2004). Objective predictions of the performance intelligibility function for speech perception in different noise environments using the SII. Technical report, Hearing Research Laboratory, University of Ottawa.
Plomp R. (1978). Auditory handicap of hearing impairment and the limited benefit of hearing aids. J Acoust Soc Am, 63, 533–549.
Plomp R. (1986). A signal-to-noise ratio model for the speech-reception threshold of the hearing impaired. J Speech Hear Res, 29, 146–154.
Plomp R., Mimpen A. M. (1979). Improving the reliability of testing the speech reception threshold for sentences. Audiology, 18, 43–52.
Punch J. L., Robinson D. O., Katt D. F. (1996). Development of a hearing performance standard for law enforcement officers. J Am Acad Audiol, 7, 113–119.
Rhebergen K. S., Lyzenga J., Dreschler W. A., et al. (2010). Modeling speech intelligibility in quiet and noise in listeners with normal and impaired hearing. J Acoust Soc Am, 127, 1570–1583.
Rhebergen K. S., Versfeld N. J. (2005). A speech intelligibility index-based approach to predict the speech reception threshold for sentences in fluctuating noise for normal-hearing listeners. J Acoust Soc Am, 117(4 Pt 1), 2181–2192.
Rhebergen K. S., Versfeld N. J., Dreschler W. A. (2006). Extended speech intelligibility index for the prediction of the speech reception threshold in fluctuating noise. J Acoust Soc Am, 120, 3988–3997.
Rhebergen K. S., Versfeld N. J., Dreschler W. A. (2008). Prediction of the intelligibility for speech in real-life background noises for subjects with normal hearing. Ear Hear, 29, 169–175.
Smoorenburg G. F. (1992). Speech reception in quiet and in noisy conditions by individuals with noise-induced hearing loss in relation to their tone audiogram. J Acoust Soc Am, 91, 421–437.
Soli S. D., Friesen L. (1998). Expert panel on hearing guidelines for entry-level peace officers. Technical report prepared for the California Peace Officers’ Standards and Training Commission.
Soli S. D., Amano-Kusumoto A., Clavier O. (2018). Evidence-based occupational hearing screening II: Validation of a screening methodology using measures of functional hearing ability. Int J Audiol, 57, 323–334.
Soli S. D., Nilsson M. J. (1994). Assessment of communication handicap with the HINT. Hear Instrum, 45, 12–16.
Soli S. D., Vermiglio A. J. (1999). Assessment of functional hearing abilities for hearing-critical jobs in law enforcement. Report for the California Peace Officers Standards and Training Commission.
Soli S. D., Wong L. L. (2008). Assessment of speech intelligibility in noise with the hearing in noise test. Int J Audiol, 47, 356–361.
Tufts J. B., Vasil K. A., Briggs S. (2009). Auditory fitness for duty: A review. J Am Acad Audiol, 20, 539–557.
Vaillancourt V., Laroche C., Giguère C., et al. (2010). Basic concepts in functional hearing assessment. Can Hear Rep, 5, 30–37.
Vermiglio A. J. (2008). The American English hearing in noise test. Int J Audiol, 47, 386–387.
Ward W. D. (1983). The American Medical Association/American Academy of Otolaryngology formula for determination of hearing handicap. Audiology, 22, 313–324.
Rhebergen K. S., Schoonhoven J., Dreschler W. A. (2014). Towards measuring the Speech Transmission Index in fluctuating noise: Impulse response measurements. Poster presented at the IHCON 2014 meeting, Lake Tahoe.
Soli S. D. (2003). Hearing and job performance. Paper commissioned by the Committee on Disability Determination for Individuals with Hearing Impairment, National Research Council, National Academy of Sciences.
Fit for duty; Real-world listening environments; Speech communication; Speech Intelligibility Index
Copyright © 2018 Wolters Kluwer Health, Inc. All rights reserved.