The Limitations of Pure-Tone Audiometry (as the Gold Standard Test of Hearing) That are Worthy of Consideration : Indian Journal of Otology



Zakaria, Mohd Normani

Indian Journal of Otology 27(1):p 1-2, Jan–Mar 2021. | DOI: 10.4103/indianjotol.indianjotol_11_21

In many scientific studies and clinical settings, pure-tone audiometry (PTA) is regarded as the gold standard test for hearing diagnosis. By measuring air and bone conduction thresholds with calibrated audiometers, it provides comprehensive information on the severity and type of hearing loss at specific frequencies. Tuning fork tests, on the other hand, have been widely used by otorhinolaryngologists for hearing diagnosis in various otological cases. Although they provide only qualitative information on hearing, their clinical value has long been recognized.

For clinical and research purposes, the results of tuning fork tests are typically compared with those of PTA. In general, the diagnostic accuracy of tuning fork tests has been found to be good, depending on the type of test, the frequency of the tuning fork, and other factors.[1] However, it is worth noting that the highest reported sensitivity of the commonly used Weber test was only 78%.[2] It is therefore of interest to know whether the low sensitivity of the Weber test reflects its own diagnostic limitation or whether PTA itself may not serve as a “good” gold standard test of hearing. Based on the accumulated literature, PTA does have limitations that are worthy of consideration.[1,3,4,5,6] As such, treating PTA as the gold standard test can be misleading in certain situations, and the true diagnostic accuracy of tuning fork tests would not be revealed.
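The question raised above can be made concrete with a small calculation. The sketch below is not taken from the cited studies: it uses the standard relation for the apparent sensitivity of an index test judged against an imperfect reference, and it assumes that the errors of the two tests are conditionally independent given the true hearing status. The function name and every number are hypothetical and serve only to illustrate the reasoning.

```python
def apparent_sensitivity(se_index, sp_index, se_ref, sp_ref, prevalence):
    """Sensitivity an index test appears to have when judged against an
    imperfect reference test.

    Assumes the errors of the two tests are conditionally independent given
    the true disease status (an assumption, not a finding of the editorial).
    """
    # Both tests positive in truly affected ears.
    both_pos_affected = se_index * se_ref * prevalence
    # Both tests positive in truly unaffected ears (two false positives).
    both_pos_unaffected = (1 - sp_index) * (1 - sp_ref) * (1 - prevalence)
    # All reference-positive ears, whether truly affected or not.
    ref_pos = se_ref * prevalence + (1 - sp_ref) * (1 - prevalence)
    if ref_pos == 0:
        raise ValueError("the reference test never reads positive")
    return (both_pos_affected + both_pos_unaffected) / ref_pos


# Hypothetical values: a tuning fork test with a true sensitivity of 0.90.
perfect_reference = apparent_sensitivity(0.90, 0.90, 1.0, 1.0, 0.30)
# Same tuning fork test, but the reference labels 10% of normal ears as
# conductive (for example, through "false" air-bone gaps).
flawed_reference = apparent_sensitivity(0.90, 0.90, 1.0, 0.90, 0.30)

print(f"perfect reference: {perfect_reference:.2f}")
print(f"flawed reference:  {flawed_reference:.2f}")
```

Under these assumed values, the apparent sensitivity falls below the true value as soon as the reference test produces false positives, even though the tuning fork test itself is unchanged.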

In organic hearing loss cases, the degree of hearing loss across speech frequencies can be reliably obtained with PTA. As such, the air conduction results give useful insight into the listening difficulties that affected individuals face in daily life. On the other hand, diagnosing the type of hearing loss with PTA can be problematic in some cases. In particular, the bone conduction thresholds at certain frequencies can be unreliable, leading to “questionable” air-bone gaps. In masking dilemma or overmasking cases, PTA clearly cannot establish the type of hearing loss because the true bone conduction thresholds cannot be determined (leaving “false” air-bone gaps on the audiogram).

Depending on the type and severity of cases, the type of transducer, and other uncertain factors, the bone conduction results can be doubtful at frequencies of 250, 2000, and 4000 Hz.[3,4,5,6] The majority of studies have used the Radioear B71 transducer for bone conduction testing, which has technical limitations. Among PTA frequencies, the vibrotactile sensitivity and harmonic distortion of this transducer are highest at 250 Hz, leading to “better” bone conduction thresholds (i.e., larger air-bone gaps than expected).[5,6] These technical issues would be more prominent when assessing patients with mixed or sensorineural hearing loss.

In addition, it is reasonably common to see a 2 kHz notch in bone conduction testing among patients with middle ear problems (in which the middle ear resonance is altered).[7,8] The presence of this notch would reduce or abolish the air-bone gaps and underestimate the “severity” of the conductive problems.

Furthermore, with the common B71 transducer, “false” air-bone gaps at 4 kHz can be seen in normal-hearing participants.[3,4] Several reasons have been proposed to explain this phenomenon, including acoustic radiation and calibration issues.[3,4] It therefore seems possible that, in some patients with conductive hearing loss, “false” air-bone gaps at 4 kHz occur together with genuine air-bone gaps at low frequencies. Consequently, if these “false” gaps are included when calculating the average air-bone gap, the conductive problem may be overestimated. In fact, the air-bone gap criteria employed by previous studies varied notably.[9,10,11] For example, Johnston compared the performance of a modified Rinne test with PTA at 500 Hz only.[9] Chole and Cook, on the other hand, used average air-bone gaps at 250, 500, 1000, and 2000 Hz to study the performance of the Rinne test.[10] In a study by Behn et al., the accuracy of the Weber and Rinne tests was determined by comparing their results with average air-bone gaps at 500, 1000, 2000, and 4000 Hz of PTA.[11] Collectively, the aforementioned PTA limitations and the different air-bone gap criteria used in previous studies would affect, to some extent, the estimated diagnostic accuracy of tuning fork tests.
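To illustrate how much the choice of criterion can matter, the following sketch applies the three frequency sets described above to a single audiogram. The thresholds are invented for illustration and are not patient data; the dictionary and function names are likewise hypothetical.

```python
from statistics import mean

# Hypothetical thresholds in dB HL, keyed by frequency in Hz.
# These values are illustrative only and do not come from any cited study.
air_conduction = {250: 40, 500: 40, 1000: 35, 2000: 30, 4000: 35}
bone_conduction = {250: 5, 500: 10, 1000: 10, 2000: 20, 4000: 15}

# Frequency sets as described in the text for references 9, 10, and 11.
criteria = {
    "500 Hz only [9]": (500,),
    "250, 500, 1000, 2000 Hz [10]": (250, 500, 1000, 2000),
    "500, 1000, 2000, 4000 Hz [11]": (500, 1000, 2000, 4000),
}


def average_air_bone_gap(air, bone, frequencies):
    """Mean of (air threshold - bone threshold) over the given frequencies."""
    return mean(air[f] - bone[f] for f in frequencies)


for label, frequencies in criteria.items():
    gap = average_air_bone_gap(air_conduction, bone_conduction, frequencies)
    print(f"{label}: {gap} dB")
```

For this invented audiogram, the same ear yields an air-bone gap of 30 dB, 25 dB, or 21.25 dB depending on the criterion, so an ear could fall on either side of a fixed cut-off purely because of the frequencies that were averaged.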

In conclusion, it is still pertinent to acknowledge PTA as the gold standard test of hearing, but with care. In particular, caution should be taken when comparing tuning fork test outcomes with air-bone gaps at frequencies of 250, 2000, and 4000 Hz. It is therefore imperative to identify and address the “audiometric errors” of PTA in order to reveal the true diagnostic accuracy of tuning fork tests. Further studies on the efficacy of a newer bone conduction transducer (e.g., B81) should be carried out, and it is hoped that they will bring significant improvements in this area in the future.

Financial support and sponsorship


Conflicts of interest

There are no conflicts of interest.


1. Kelly EA, Li B, Adams ME. Diagnostic accuracy of tuning fork tests for hearing loss: A systematic review. Otolaryngol Head Neck Surg. 2018;159:220–30.
2. Shuman AG, Li X, Halpin CF, Rauch SD, Telian SA. Tuning fork testing in sudden sensorineural hearing loss. JAMA Intern Med. 2013;173:706–7.
3. Studebaker GA. Intertest variability and the air-bone gap. J Speech Hear Disord. 1967;32:82–6.
4. Margolis RH, Eikelboom RH, Johnson C, Ginter SM, Swanepoel de W, Moore BC. False air-bone gaps at 4 kHz in listeners with normal hearing and sensorineural hearing loss. Int J Audiol. 2013;52:526–32.
5. Eichenauer A, Dillon H, Clinch B, Loi T. Effect of bone-conduction harmonic distortions on hearing thresholds. J Acoust Soc Am. 2014;136:EL96–102.
6. Fröhlich L, Plontke SK, Rahne T. Influence of transducer types on bone conduction hearing thresholds. PLoS One. 2018;13:e0195233.
7. Ahmad I, Pahor AL. Carhart's notch: A finding in otitis media with effusion. Int J Pediatr Otorhinolaryngol. 2002;64:165–70.
8. Yasan H. Predictive role of Carhart's notch in pre-operative assessment for middle-ear surgery. J Laryngol Otol. 2007;121:219–21.
9. Johnston DF. A new modification of the Rinne test. Clin Otolaryngol Allied Sci. 1992;17:322–6.
10. Chole RA, Cook GB. The Rinne test for conductive deafness. A critical reappraisal. Arch Otolaryngol Head Neck Surg. 1988;114:399–403.
11. Behn A, Westerberg BD, Zhang H, Riding KH, Ludemann JP, Kozak FK. Accuracy of the Weber and Rinne tuning fork tests in evaluation of children with otitis media with effusion. J Otolaryngol. 2007;36:197–202.
© 2021 Indian Journal of Otology | Published by Wolters Kluwer – Medknow