To the Editor:
Physical examination maneuvers are frequently performed in those without symptoms, in contrast with the representation in Franzblau et al.1 As an example of a memorable case, a worker who was a weight lifter with massive upper extremity muscles reported no symptoms, yet had a markedly positive supraspinatus test on a preplacement examination. His laboratory tests subsequently revealed a high-density lipoprotein cholesterol level of 16 mg/dL, likely consistent with anabolic steroid use, and thus he most likely had pre-existing rotator cuff tears.
On a population basis, we and others have taught examiners attending American Occupational Health Conferences and other educational venues for over 20 years to screen the proximal and distal upper extremities during surveillance examinations for safety-critical workers (eg, commercial drivers, firefighters, police officers). We have specifically mentioned the supraspinatus test and painful arc as tests to perform to assess and screen for both functional abilities and disease status. Carried out across the United States, this likely amounts to more than 4,000,000 examinations per year in asymptomatic workers.
In other situations, these maneuvers may be clinically performed despite a lack of glenohumeral pain. If a patient has pain elsewhere, for example, in the trapezius, neck, or upper arm, testing with rotator cuff maneuvers is naturally performed to help refine the diagnosis. This may be parallel to the frequent need to record an electrocardiogram in patients with chest pain that is thought to most likely be pleuritic or of unclear etiology. That the pain is thought to be pleuritic does not rule out the potential for a cardiac etiology in all cases; knowing the specificity (which requires assessment of test performance in those without cardiac disease) and at least intuiting, if not calculating, the positive predictive value is therefore essential.
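As a minimal sketch of the calculation alluded to above, the positive predictive value can be derived from sensitivity, specificity, and prevalence using Bayes' theorem. The function name and the input values below are illustrative assumptions only; they are not estimates from the study or from the electrocardiogram literature.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return P(disease | positive test) via Bayes' theorem.

    All three inputs are proportions between 0 and 1.
    """
    # Fraction of the whole population that is diseased AND tests positive.
    true_positive_rate = sensitivity * prevalence
    # Fraction of the whole population that is disease free AND tests positive.
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical values chosen only to show how prevalence drives the result.
for prevalence in (0.05, 0.50):
    ppv = positive_predictive_value(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f} -> PPV={ppv:.2f}")
```

With the same hypothetical test characteristics, the positive predictive value is much lower when the condition is uncommon, which is why specificity measured in those without disease matters for screening asymptomatic populations.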
Regardless of these issues, Franzblau et al1 meritoriously argue in support of a gold standard comparison. However, in the case of shoulder tendinitis, there is no agreed-upon gold standard. This is particularly true as shoulder pain is the target for treatments that include surgery in the absence of complete rotator cuff tears, tears are often asymptomatic, and degenerative findings in the supraspinatus become common with age.2,3 Thus, there is a need for a comparison that is the next best thing, arguably pain plus a supraspinatus test. Others have used only imaging. Similar problems affect the carpal tunnel syndrome (CTS) literature, where symptoms of CTS may be present without an abnormal nerve conduction study. Dale et al4 have published studies of this problem using similar methods. Many others have used similar methods to study the shoulder.5
In response to Doxey et al,6 the data in Table 2 for the left shoulder's external rotation weakness should be corrected to 235 total assessed, 21 with examiner one positive results and six with examiner two positive results. Franzblau et al1 are correct that the values provided in Table 2 are reproducibility values and not kappa statistics. Table 3 provides the percentage of workers with an abnormal examination maneuver by either examiner among those with glenohumeral pain plus a positive supraspinatus test result.
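The distinction between reproducibility (simple percent agreement) and Cohen's kappa can be sketched as follows. This is an illustration under stated assumptions: the function name and the cell counts are hypothetical and are not the study's data, since the joint counts for the two examiners are not given here.

```python
def agreement_and_kappa(both_pos, only_first_pos, only_second_pos, both_neg):
    """Return (percent agreement, Cohen's kappa) for two examiners.

    Arguments are the four cells of a 2x2 table of paired binary findings.
    """
    n = both_pos + only_first_pos + only_second_pos + both_neg
    # Observed agreement: both examiners gave the same result.
    observed = (both_pos + both_neg) / n
    # Marginal proportion of positive findings for each examiner.
    p_first = (both_pos + only_first_pos) / n
    p_second = (both_pos + only_second_pos) / n
    # Agreement expected by chance alone, given those marginals.
    expected = p_first * p_second + (1 - p_first) * (1 - p_second)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts (NOT from Table 2): positives are rare, so raw
# agreement is high while chance-corrected agreement is modest.
observed, kappa = agreement_and_kappa(5, 10, 5, 180)
print(f"agreement={observed:.3f}, kappa={kappa:.3f}")
```

When positive findings are uncommon, percent agreement is dominated by concordant negatives, so the two statistics can differ substantially and should not be read interchangeably.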
More factual details may assist with addressing questions raised by Franzblau et al,1 as it is not correct to interpret that the second examiner was “fully informed” of the first examiner's results while performing the examinations. This study had approximately 1 hour to enroll a participating worker from start to finish, including informed consent, a 266-item questionnaire (which used skip sequences), measurements (eg, blood pressure, heart rate, height, weight, wrist dimensions), a 483-item structured interview, two physical examinations, and one nerve conduction study. Thus, the examiner was required to expertly and fluidly perform that which was required while assuring the flow of workers through the enrollment processes. The primary indicator for a second examination of a body part was pain in that geographic region (eg, pain in one of the shoulder regions). Consequently, someone with shoulder pain was identified as having that pain on the structured interview, and the second examiner glanced at a computer screen, saw the need to perform a shoulder examination (the screen did not show the examination results), and then performed all shoulder maneuvers. While not technically blinded to the first examiner's results, as the recording sheet could have been consulted, these maneuvers were usually performed without knowledge of the first examiner's results due to time constraints. After recording those results, occasionally, one would then find a positive examination finding not previously assessed that would need repeating. Thus, the reproducibility would most typically be identified when the second examiner recorded the findings on the paper form. Regardless, these details would result in little or no change in the sensitivities, specificities, positive predictive values, and negative predictive values.
Examinations were standardized. We encouraged workers to report the results. It is of course possible that the second response was conditioned by the first. Yet, workers were engaged and interested in participating and in responding to all maneuvers accurately. The number of times workers spontaneously commented that it hurt the last time or vice versa was striking, although this is something likely universally experienced in medical training with teams of physicians repeatedly performing examinations; naturally the precise response for that specific maneuver was recorded. Thus, in using the above enrollment processes, we believe that having a different person perform the examination and attempting blinding would likely only modestly change the examinee's responses, if at all. It also would likely not appreciably reduce a reporting bias based on the first examination results, which would be a potential weakness regardless of which process was followed. Thus, we doubt this would result in material changes in the sensitivity, specificity, etc.
Determining specificity requires evaluating the results in a “disease free” population. In assessing the value of magnetic resonance imaging, multiple studies have assessed the test in those without the disease (eg, those without low back pain, shoulder pain, etc). This type of process to test those with and without disease has been used for nerve conduction studies, CT colonoscopy, mammography, and essentially all tests or screening procedures. This process has been used to examine multiple history and/or physical examination maneuvers.4,5 Thus, a case has to be established and the test assessed among both those meeting and those not meeting the case definition.
In this case, there was a set definition established and the test was assessed in that population. To not test against a population who do not meet the case definition precludes an assessment of specificity. In this case, the tests were assessed among those who did not meet the case definition. Naturally, some had a positive test despite not meeting the case definition. What this study has also somewhat uniquely accomplished is to test physical examination maneuvers that have been standardized in a broad population, rather than in typically smaller clinic-based populations.
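The dependence of specificity on testing those who do not meet the case definition can be illustrated with a short sketch. The function name and counts are hypothetical and serve only to show the arithmetic; they are not results from this study.

```python
def sensitivity_specificity(true_pos, false_neg, false_pos, true_neg):
    """Return (sensitivity, specificity) from 2x2 counts against a case definition.

    A value of None means the quantity cannot be estimated from the data.
    """
    cases = true_pos + false_neg        # workers meeting the case definition
    non_cases = false_pos + true_neg    # workers not meeting the case definition
    sensitivity = true_pos / cases if cases else None
    specificity = true_neg / non_cases if non_cases else None
    return sensitivity, specificity


# Hypothetical: only cases were examined, so specificity is undefined.
print(sensitivity_specificity(40, 10, 0, 0))
# Hypothetical: cases and non-cases were both examined.
print(sensitivity_specificity(40, 10, 30, 170))
```

In the first call no non-cases were examined, so the function returns None for specificity, mirroring the point that a test must be assessed among those not meeting the case definition.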
We naturally support that physical examinations have their place in clinical care, although the evidence base for the utility of screening populations of workers remains weak. While an examination is often represented as consisting of purely objective information, patients naturally supply and supplement the examination with subjective and historical information, such as more precise pain localization, pointing to the affected site, better describing radiation, etc. This cross contamination of the objective with the subjective in the course of a clinical physical examination may be some of the more beneficial information obtained during the course of an examination. The results of our study suggest most of the clinician's attention should be focused on the history.
1. Franzblau A, Gerr F, Werner RA. Reliability of common provocative tests for shoulder tendinitis by Doxey et al—Letter to the Editor. J Occup Environ Med
2. Sher JS, Uribe JW, Posada A, Murphy BJ, Zlatkin MB. Abnormal findings on magnetic resonance images of asymptomatic shoulders. J Bone Joint Surg
3. Reilly P, Macleod I, Macfarlane R, Windley J, Emery RJ. Dead men and radiologists don’t lie: a review of cadaveric and radiological studies of rotator cuff tear prevalence. Ann R Coll Surg Engl
4. Dale AM, Descatha A, Coomes J, Franzblau A, Evanoff B. Physical examination has a low yield in screening for carpal tunnel syndrome. Am J Ind Med
5. May S, Chance-Larsen K, Littlewood C, Lomas D, Saad M. Reliability of physical examination tests used in the assessment of patients with shoulder problems: a systematic review. Physiotherapy
6. Doxey R, Thiese MS, Hegmann KT. Reliability of common provocative tests for shoulder tendinitis. J Occup Environ Med