
Hearing Technology Special Feature

Performance and Potential of Machine Learning Audiometry

Barbour, Dennis L. MD, PhD; Wasmann, Jan-Willem A.

doi: 10.1097/01.HJ.0000737592.24476.88

A common refrain nowadays in many fields is that massive data accumulation provides the raw material for advanced algorithms to improve decision making. The core premise behind precision medicine, for example, is that ever-increasing genetic, physiologic, comorbidity, environmental, and lifestyle data about patients leads to more refined assessments of their conditions and better outcomes as a result.1

Figure 1:
Comparison of conventional and machine learning audiometry procedures in progress. Asterisk indicates an initial tone delivered to a patient at 1 kHz, 70 dB HL. Letters indicate a subsequent conventional staircase tone sequence. Numbers indicate an equal-length machine learning tone sequence for the same ear. Blue bold = heard tones. Dashed and solid lines indicate interim and final machine learning threshold estimates, respectively. Interim threshold estimates represent hypotheses to be tested using successive tones. Except for “25 dB HL at 1 kHz,” interim conventional threshold estimates are nonexistent at this point in the test and therefore provide little ability to direct subsequent tone selection.6,25,26,27
Table 1:
Audiologist-selectable options for machine learning audiometry. To demonstrate test flexibility for different clinical scenarios that can be chosen at test time, default options for (1) rapid screening for hearing loss in a new patient, (2) change of hearing status in a returning patient, and (3) detailed diagnostic procedure with minimal assumptions are indicated by the corresponding text modifiers. Audiologists will use their clinical judgment to determine the most appropriate scenario and interpret the test results, while the machine learning algorithm will handle the low-level but mathematically demanding task of selecting appropriate test stimuli and constructing an accurate hearing model from the recorded data.

Largely absent from this narrative of Big Data, however, is that Big Inference has just as great a potential for modernizing decision making.2 Much as directed design can deliver computer hardware capable of more computations per watt of power,3 machine learning principles can be applied to extract more inference from each data point, irrespective of data set size.


Consider a cornerstone of hearing assessments: the pure tone audiogram. Because a conventional audiogram tests one frequency at a time, it functions much like a behavioral test battery of independent tests with little ability to share inference across variables (i.e., distinct frequencies). Imagine a new patient's response to an initial 1 kHz, 70 dB HL test tone. Little may be known about this patient's hearing in absolute terms, but relatively more is known about 1 kHz than any other frequency. The very last frequency one would want to sample next from an information efficiency standpoint would be 1 kHz. Unfortunately, repeated sampling of this sort in a fixed-frequency staircase is the standard procedure (Fig. 1). The resulting inefficiency compounds as other frequencies (and other tests) are added to the patient's workup, resulting in lengthy testing times and limited inference. On the other hand, a machine learning audiogram can use information theory to determine the most informative next tone every step of the way, culminating in a considerably faster audiogram procedure.4-7
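The information-efficiency argument can be made concrete with a minimal sketch. The code below is illustrative, not any published implementation: it keeps a discrete Bayesian posterior over the threshold at each frequency (independent per frequency for brevity, whereas a machine learning audiogram shares inference across frequencies, e.g., via a Gaussian process prior) and selects the next tone whose heard/not-heard outcome is most uncertain, and therefore most informative. All parameter values are assumed for illustration.

```python
import itertools
import math

FREQS = [250, 500, 1000, 2000, 4000, 8000]   # audiometric frequencies, Hz
LEVELS = list(range(-10, 101, 5))            # candidate tone levels, dB HL
THRESH_GRID = list(range(-10, 101, 5))       # hypothesized thresholds, dB HL

def p_hear(level, thresh, slope=0.2, lapse=0.02):
    """Logistic psychometric likelihood that a tone is heard."""
    core = 1.0 / (1.0 + math.exp(-slope * (level - thresh)))
    return lapse + (1.0 - 2.0 * lapse) * core

# One discrete posterior over threshold per frequency (uniform prior).
posterior = {f: [1.0 / len(THRESH_GRID)] * len(THRESH_GRID) for f in FREQS}

def predictive_entropy(freq, level):
    """Uncertainty (in nats) of the predicted heard/not-heard outcome."""
    p = sum(w * p_hear(level, t) for w, t in zip(posterior[freq], THRESH_GRID))
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def next_tone():
    """Select the most informative next tone: the one we can least predict."""
    return max(itertools.product(FREQS, LEVELS),
               key=lambda fl: predictive_entropy(*fl))

def update(freq, level, heard):
    """Bayes-update the threshold posterior at one frequency after a response."""
    like = [p_hear(level, t) if heard else 1.0 - p_hear(level, t)
            for t in THRESH_GRID]
    joint = [w * lk for w, lk in zip(posterior[freq], like)]
    z = sum(joint)
    posterior[freq] = [j / z for j in joint]
```

After each response, `update` sharpens the posterior and `next_tone` steers the procedure toward whatever remains unknown, rather than re-sampling a frequency that has already been characterized.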

This ability to extract more information from each tone is just the tip of the iceberg for machine learning audiometry. First, the machine learning audiometric model is fully predictive of the underlying physiological phenomena, delivering psychometric curves at each frequency instead of merely thresholds. Psychometric curves include information about the internal noise of hearing processes8 but are rarely used in clinical applications because of the five to 10 hours per ear required to estimate them conventionally.9 Full psychometric models also have other advantages over the threshold-only models in current use. Each new data point can be used either to update a full model or to validate it against previous data, for example. Self-validating tests of this sort are useful for determining the influence of potentially confounding variables.
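To illustrate what a psychometric curve adds over a bare threshold, consider this hypothetical sketch: a logistic psychometric function whose spread parameter reflects internal noise, fit by a simple grid maximum-likelihood search to simulated heard/not-heard trials. A threshold-only procedure would return one number; the fitted curve also recovers the spread. The function names, trial data, and grid are assumptions for demonstration only.

```python
import math

def psychometric(level, thresh, spread, guess=0.02, lapse=0.02):
    """P(heard) as a logistic in level; 'spread' reflects internal noise."""
    core = 1.0 / (1.0 + math.exp(-(level - thresh) / spread))
    return guess + (1.0 - guess - lapse) * core

def fit_curve(trials):
    """Grid maximum-likelihood fit of (threshold, spread) to (level, heard) trials."""
    best, best_ll = None, -math.inf
    for thresh in range(0, 81):
        for spread in (1.0, 2.0, 4.0, 8.0, 16.0):
            ll = sum(math.log(psychometric(lv, thresh, spread) if heard
                              else 1.0 - psychometric(lv, thresh, spread))
                     for lv, heard in trials)
            if ll > best_ll:
                best, best_ll = (thresh, spread), ll
    return best

# Simulated responses around a ~40 dB HL threshold; the single reversal
# near threshold (heard at 38, missed at 40) mimics internal noise.
trials = [(20, False), (30, False), (35, False), (38, True), (40, False),
          (42, True), (45, True), (50, True), (60, True), (70, True)]
```

In practice the estimation is done with far more efficient Bayesian machinery, which is what brings the conventional five-to-10-hour procedure down to clinical time scales.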

More exciting are audiogram extensions made possible by machine learning that allow inference to be shared in real-time across more variables. Delivering multiple test stimuli concurrently would generate uninterpretable results for a human tester, but not for a properly designed machine learning algorithm. Audiogram testing time, in this case, is shortened by a factor equivalent to the number of concurrent tones.10 Extending models with additional variables allows tests to be stacked together. For example, conjoining individual models of both ears allows them to be evaluated simultaneously (i.e., in parallel). The mutually conjoint bilateral audiogram can be estimated in the same amount of time needed to estimate one ear unilaterally.11,12 Going further, interaural attenuation can be added to the bilateral model, which allows necessary contralateral masking to be computed automatically for highly asymmetric hearing. In contrast to lengthy conventionally masked audiograms, automasked audiograms take no longer to complete than unmasked audiograms.13
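One way to see how conjoining the two ears lets them share inference is through the model's covariance (kernel) function. The sketch below is a hypothetical kernel over (log-frequency, ear) pairs: tones close in frequency are correlated within an ear, and an assumed inter-ear correlation term lets a response in one ear inform the estimate for the other. The `ear_corr` value is illustrative, not taken from the cited work.

```python
import math

def conjoint_kernel(x1, x2, length=1.0, ear_corr=0.7):
    """Covariance between two tones described as (log-frequency, ear) pairs.

    Nearby frequencies are correlated within an ear; a nonzero ear_corr
    lets observations in one ear inform estimates in the other.
    """
    lf1, ear1 = x1
    lf2, ear2 = x2
    k_freq = math.exp(-((lf1 - lf2) ** 2) / (2.0 * length ** 2))
    k_ear = 1.0 if ear1 == ear2 else ear_corr
    return k_freq * k_ear
```

With `ear_corr = 0` the model reduces to two independent unilateral audiograms; as `ear_corr` grows, every tone effectively tests both ears at once, which is why the conjoint bilateral test takes no longer than a unilateral one.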


Incorporating additional variables can automatically expand machine learning models beyond simple air-conduction thresholds if useful for the selected clinical scenario (Table 1). Audiogram fine structure can be tested as needed. A patient's deviant responses (intentional or accidental) are readily detectable. With proper transducer design, automated bone conduction can be included. Ipsilateral masking will accommodate challenging acoustic environments. Suprathreshold testing such as noisy speech perception can be included to create a simultaneous multitest, providing a much richer assessment of hearing in much less time than would be possible with sequential conventional tests. Multitests incorporating passively collected biological data, such as auditory brainstem responses and distortion product otoacoustic emissions, along with actively collected behavioral data will improve overall efficiency and diagnostic power. Some of these possibilities have been anticipated,14-19 but the combined implementation of them all requires the more advanced framework on which machine learning audiometry was constructed.

Previously collected data can further accelerate machine learning audiometry by providing informative initial guesses for constructing individual models.20 Conventional staircase methods do not exploit prior beliefs for improved accuracy or efficiency, leaving the incorporation of any additional knowledge about a patient solely to the clinician. With active machine learning, prior beliefs can be rigorously postulated as specific hypotheses (i.e., particular physiologic or pathophysiologic models) among which the algorithm decides optimally. Active model selection directly addresses the types of diagnostic questions most commonly asked by clinicians.21 Especially relevant for this kind of testing are ongoing hearing surveillance programs in which the prior belief is the patient's most recent audiogram.22 In such cases, an active model selection test delivers only the tones needed to verify whether current hearing matches the hearing status determined from the previous test, resulting in an extremely rapid procedure.23

Machine learning audiometry is an example of Big Inference: obtaining more actionable information per carefully selected data point. When these data reflect a wide variety of biological processes and are acquired outside the clinic, a paradigm shift in patient-centric clinical care emerges. Data collection can be directed by clinicians toward either screening or diagnostic procedures using algorithms that rigorously incorporate and efficiently acquire information from various sources.24 Multiple lines of evidence can therefore be brought to bear on particular clinical questions, transforming and expanding hearing health care in the process.25


References
1. Barbour, D. L. Precision medicine and the cursed dimensions. npj Digital Medicine 2, 4 (2019).
2. Barbour, D. L. Formal idiographic inference in medicine. JAMA Otolaryngology-Head & Neck Surgery 144, 467-8 (2018).
3. Engheim, E. Why Is Apple's M1 Chip So Fast? Medium (2020).
4. Song, X. D., Sukesan, K. A. & Barbour, D. L. Bayesian active probabilistic classification for psychometric field estimation. Atten Percept Psychophys 80, 798-812 (2018).
5. Song, X. D., Garnett, R. & Barbour, D. L. Psychometric function estimation by probabilistic classification. J Acoust Soc Am 141, 2513-2525 (2017).
6. Song, X. D. et al. Fast, Continuous Audiogram Estimation Using Machine Learning. Ear Hear 36, e326-335 (2015).
7. Barbour, D. L. et al. Online Machine Learning Audiometry. Ear Hear 40, 918-926 (2019).
8. Buss, E., Hall III, J. W. & Grose, J. H. Development and the role of internal noise in detection and discrimination thresholds with narrow band stimuli. J Acoust Soc Am 120, 2777-2788 (2006).
9. Buss, E., Hall, J. W., 3rd & Grose, J. H. Psychometric functions for pure tone intensity discrimination: Slope differences in school-aged children and adults. J Acoust Soc Am 125, 1050-8 (2009).
10. Gardner, J. M., Song, X. D., Cunningham, J. P., Barbour, D. L. & Weinberger, K. Q. Psychophysical testing with Bayesian active learning. in Uncertainty in Artificial Intelligence 286-295 (Morgan Kaufmann Publishers Inc., 2015).
11. Barbour, D. L. et al. Conjoint psychometric field estimation for bilateral audiometry. Behav Res Meth (2018) doi:10.3758/s13428-018-1062-3.
12. Heisey, K. L., Buchbinder, J. M. & Barbour, D. L. Concurrent Bilateral Audiometric Inference. Acta Acustica united with Acustica 104, 762-765 (2018).
13. Heisey, K. L., Walker, A. M., Xie, K., Abrams, J. M. & Barbour, D. L. Dynamically Masked Audiograms With Machine Learning Audiometry. Ear Hear. Nov/Dec 2020;41(6):1692-1702.
14. Bastianelli, M. et al. Adult validation of a self-administered tablet audiometer. J Otolaryngol Head Neck Surg 48, 59 (2019).
15. Margolis, R. H., Glasberg, B. R., Creeke, S. & Moore, B. C. AMTAS: Automated method for testing auditory sensitivity: Validation studies. International Journal of Audiology 49, 185-194 (2010).
16. Swanepoel de, W., Matthysen, C., Eikelboom, R. H., Clark, J. L. & Hall, J. W., 3rd. Pure-tone audiometry outside a sound booth using earphone attenuation, integrated noise monitoring, and automation. Int J Audiol 54, 777-85 (2015).
17. van Tonder, J., Swanepoel, W., Mahomed-Asmail, F., Myburgh, H. & Eikelboom, R. H. Automated Smartphone Threshold Audiometry: Validity and Time Efficiency. J Am Acad Audiol 28, 200-208 (2017).
18. Cox, M. & de Vries, B. A Bayesian binary classification approach to pure tone audiometry. arXiv:1511.08670 [stat] (2015).
19. Schlittenlacher, J., Turner, R. E. & Moore, B. C. J. Audiogram estimation using Bayesian active learning. The Journal of the Acoustical Society of America 144, 421-430 (2018).
20. Larsen, T., Malkomes, G. & Barbour, D. Accelerating Psychometric Screening Tests with Prior Information. in Explainable AI in Healthcare and Medicine: Building a Culture of Transparency and Accountability (eds. Shaban-Nejad, A., Michalowski, M. & Buckeridge, D. L.) 305-311 (Springer International Publishing, 2021). doi:10.1007/978-3-030-53352-6_29.
21. Gardner, J. et al. Bayesian active model selection with an application to automated audiometry. in Adv Neural Inf Process Syst 2377-2385 (Morgan Kaufmann, 2015).
22. Brungart, D. et al. Using tablet-based technology to deliver time-efficient ototoxicity monitoring. Int J Audiol 57, S25-S33 (2018).
23. Larsen, T. J., Malkomes, G. & Barbour, D. L. Accelerating Psychometric Screening Tests With Bayesian Active Differential Selection. arXiv preprint arXiv:2002.01547 (2020).
24. Wasmann, J.-W. A. & Barbour, D. L. Emerging Hearing Assessment Technologies for Patient Care. Hearing Journal. 2021;74(3):44-45.
25. Wasmann, J.-W. et al. Computational Audiology: New Approaches to Advance Hearing Health Care in the Digital Age. Ear Hear. (2021). DOI: 10.1097/AUD.0000000000001041. In press.
26. Bonauria. Bonauria Audiogram Demo. (2015).
27. Bonauria. Hughson-Westlake Audiogram. (2015).

Copyright © 2021 Wolters Kluwer Health, Inc. All rights reserved.