The coronavirus disease 2019 (COVID-19) pandemic is impacting the lives of nearly everyone around the world in ways that are difficult to comprehend. Clinicians caring for patients with suspected COVID-19 are forced to consider the manner in which we use various imaging tests to aid in providing the most appropriate, individualized care possible (1). Unfortunately, diagnostic modalities, including chest radiograph (CXR) (2), CT (3), and reverse-transcription polymerase chain reaction (RT-PCR) on first test (3), have been reported to suffer from poor sensitivity. As a result, serial testing has been recommended (3) when any one of these modalities is negative, which increases the exposure of staff and patients to COVID-19 in the hospital while also potentially delaying diagnosis in critically ill patients.
Point-of-care lung ultrasound (LUS) has been suggested as a useful diagnostic modality in these patients (4) as it limits COVID-19 exposure of ancillary staff, minimizes travel within the hospital for patients, can be performed at the bedside within minutes, and has been shown to be diagnostically superior to CXR in critically ill patients with other respiratory complaints (5). LUS patterns for detecting COVID-19 have been suggested (4,6) based on ultrasound (US) theory, case reports, and extrapolation from CT findings; however, diagnostic performance data in an observational analytical study are lacking (6). The objective of this study was to describe LUS findings in patients being evaluated for COVID-19 and retrospectively assess the diagnostic test characteristics of different LUS patterns.
MATERIALS AND METHODS
We performed a retrospective study of a convenience sample of patients in two large urban emergency departments (EDs) in Detroit, Michigan from March 13, 2020, to April 20, 2020. IRB approval was obtained as part of a larger COVID-19 registry at our institution. Patients with suspected COVID-19 who underwent a diagnostic LUS examination with images archived in the ED US database were eligible for inclusion; only patients with complete examinations (10 images, described below) were included. With the exception of LUS performed solely to assess for pneumothorax, our standard ED LUS protocol is based on a prior LUS in heart failure trial which uses a horizontal probe orientation to maximize the amount of visualized pleural line (7). All images were obtained using a curvilinear probe on a Zonare Z1 Pro ultrasound system (Mindray North America, Mahwah, NJ) with a LUS preset: 18 cm depth, clip length of 6 seconds, and multibeam former and tissue harmonics deactivated. Four zones are interrogated in each hemithorax: superior and inferior in both the anterior and lateral chest (Supplemental Fig. 1, Supplemental Digital Content 1, http://links.lww.com/CCX/A248; legend, Supplemental Digital Content 2, http://links.lww.com/CCX/A263). Our standard LUS protocol is to scan patients in the supine position with head-of-bed elevated 30–45°; however, actual position was not recorded in this convenience sample of patients. Assessment for pleural effusion was done by placing the probe in a vertical position (indicator to head) at the costal margin in the mid-axillary line such that both the lung and liver or spleen were visible. 
Based on prior reports of LUS findings suggestive of COVID-19 lung disease (4–6), LUS images were coded by a blinded US fellowship-trained observer for the presence of nonconfluent and confluent B-lines (based on the same methodology used in the B-lines lung ultrasound-guided emergency department management of acute heart failure [BLUSHED-AHF] study [7]), subpleural consolidations, and pleural effusions. Two lung zone patterns were also examined: symmetric bilateral B-lines (vs asymmetric, unilateral, or no B-lines) and nondependent bilateral pulmonary edema (NDBPE; bilateral B-lines with superior count ≥ inferior count and no pleural effusions). The NDBPE pattern was chosen based on the hypothesis that COVID-19 LUS findings may be similar to those seen in acute respiratory distress syndrome (ARDS). While multiple LUS findings were evaluated, no findings or patterns were a priori considered diagnostic for COVID-19 (i.e., this is a retrospective analysis of extant findings, not a prospective assessment of any specific pattern). Demographics, vital signs, test results, hospital course, and other clinical characteristics were recorded. Sonographers were not specifically blinded to the results of other diagnostic tests. Concurrent point-of-care echocardiography was performed on an insufficient number of patients to meaningfully inform the analysis, and thus, results of these examinations were not included. Test characteristics and receiver operating characteristic (ROC) area under the curve (AUC) for individual LUS patterns and CXR (pulmonary edema and/or infiltrate, as bilateral vs unilateral vs none) were compared to a reference standard of serial RT-PCR (3) in RStudio v1.2.5001 (RStudio, Boston, MA), using the pROC package (8). Logistic regression was used to model the joint utility of CXR and the LUS pattern with highest AUC. AUCs were compared by DeLong test and ideal cutoffs calculated by Youden J statistic.
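The NDBPE definition given above (bilateral B-lines, superior count ≥ inferior count, no pleural effusion) can be expressed as a simple rule. The following Python sketch is illustrative only and is not the coding tool used in this study: the names `LusExam` and `has_ndbpe` and the tuple-based zone labels are hypothetical, and "count" is assumed here to mean the number of B-line-positive zones rather than the number of individual B-lines.

```python
from dataclasses import dataclass

# Eight zones: two hemithoraces x (anterior, lateral) x (superior, inferior).
SIDES = ("left", "right")


@dataclass(frozen=True)
class LusExam:
    """Hypothetical coding of one examination (illustrative names only)."""
    b_line_zones: frozenset  # (side, region, level) tuples positive for B-lines
    pleural_effusion: bool


def has_ndbpe(exam: LusExam) -> bool:
    """Return True if the exam meets the NDBPE rule as interpreted here."""
    # Bilateral: at least one B-line-positive zone in each hemithorax.
    bilateral = all(
        any(zone[0] == side for zone in exam.b_line_zones) for side in SIDES
    )
    # Assumption: compare counts of positive superior vs inferior zones.
    superior = sum(1 for zone in exam.b_line_zones if zone[2] == "superior")
    inferior = sum(1 for zone in exam.b_line_zones if zone[2] == "inferior")
    return bilateral and superior >= inferior and not exam.pleural_effusion


# Example: bilateral, superior-predominant B-lines without effusion.
upper_bilateral = LusExam(
    frozenset({("left", "anterior", "superior"), ("right", "lateral", "superior")}),
    pleural_effusion=False,
)
print(has_ndbpe(upper_bilateral))
```

Under this interpretation, a basilar-predominant, unilateral, or effusion-positive examination would not be coded as NDBPE.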
In a post hoc exploratory analysis, we sought to derive the highest performing combination of potential diagnostic predictors (vital signs, laboratory tests, CXR, LUS) in a logistic model selected by examination of the Akaike information criterion, clinical plausibility, model parsimony, AUC, and the Hosmer-Lemeshow (HL) statistic. Physical examination findings were not considered in this step as they were not defined a priori, thereby precluding unbiased interpretation. Complete case analysis (CCA) can overoptimistically bias prediction models when data are suspected missing at random because missingness is not only a research reality but also a clinical one (9). Thus, multiple imputation (MI) by fully conditional specification (m = 10) was performed in SAS v9.4 (SAS Institute, Cary, NC) for eight patients without RT-PCR and three without CXR. MI modeling for the response variable was isolated from predictors in downstream analyses, performed in two bootstrapped stages (9). To help protect against model overfitting and bias from MI, logistic models were fit to stage 1, and model performance measures (HL, ROC, diagnostic characteristics) were calculated on the bootstrapped stage 2 using a “pool-last” approach (10). All analyses were compared to CCA in sensitivity analysis.
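The discrimination measures used throughout these analyses, AUC and the Youden J cutoff, can be illustrated with a short standard-library Python sketch. This is not the pROC implementation used in the study; the function names and the toy scores and labels below are hypothetical, and the DeLong comparison, bootstrapping, and imputation steps are deliberately omitted.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def roc_points(scores, labels):
    """(threshold, sensitivity, specificity) for each 'score >= threshold' rule."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def youden_cutoff(scores, labels):
    """Threshold maximizing J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)


# Toy data only (not study data): model scores and reference-standard labels.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 1, 1, 1]
print(auc(toy_scores, toy_labels))
print(youden_cutoff(toy_scores, toy_labels))
```

For a single binary finding such as NDBPE, the ROC curve has only one informative cutoff, so the AUC reduces to the average of sensitivity and specificity.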
RESULTS
Sixty-four patients underwent LUS as part of an evaluation for COVID-19. See Tables 1 and 2, respectively, for characteristics and outcomes. Fifty-six patients had RT-PCR testing for COVID-19, with positivity of 71% (95% CI, 60–83%). The median number of RT-PCR tests per patient was one in positive cases and two in negative cases. Nineteen of 20 patients with in-hospital mortality tested positive for COVID-19, while one died before testing completion.
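As a worked check on the reported positivity, a normal-approximation (Wald) interval can be computed as follows. Two assumptions are made here that are not stated in the text: that 71% of 56 tested patients corresponds to 40 positives, and that a Wald interval was the method used. The function name `wald_ci` is illustrative.

```python
import math


def wald_ci(successes, n, z=1.959964):
    """Proportion with a normal-approximation (Wald) 95% confidence interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Assumed counts: 40 of 56 tested patients positive (inferred, not reported).
p, lo, hi = wald_ci(40, 56)
print(f"{p:.0%} ({lo:.0%}-{hi:.0%})")
```

Under these assumptions, the result rounds to 71% (60–83%), which is consistent with the interval reported above.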
Diagnostic test performance for COVID-19 diagnosis is described for CXR, LUS patterns, and the two multipredictor models by ROC plots (Fig. 1). Bilateral infiltrate/edema on CXR was 74% sensitive (95% CI, 48–93%), 53% specific (95% CI, 32–75%), with AUC 0.66 (95% CI, 0.54–0.79). The strongest performing LUS finding was the NDBPE pattern (AUC, 0.73; 95% CI, 0.61–0.84; sensitivity = 69% [95% CI, 37–82%], specificity = 77% [95% CI, 50–92%]). Symmetric bilateral B-lines showed modest univariate diagnostic discrimination for COVID-19 (Fig. 1A), while subpleural consolidation, confluent, and nonconfluent B-line patterns (Fig. 1B–D) failed to reach statistical significance (95% CI of AUC crossed 0.50). Combined CXR and NDBPE LUS pattern (Fig. 1G) showed significantly stronger diagnostic prediction (AUC, 0.80; 95% CI, 0.68–0.90) than either CXR or NDBPE alone (p = 0.035 and 0.020, respectively).
In the exploratory analysis, the optimal diagnostic combination of clinical factors, CXR, and LUS patterns was NDBPE, fever (temperature ≥ 38°C), and hypoxia (room air pulse oximetry ≤ 94%). No other tested combination of CXR, LUS, or clinical factors added performance or parsimony beyond this model (AUC, 0.86; 95% CI, 0.76–0.94; sensitivity = 77% [58–93%]; specificity = 76% [53–94%] at the ideal cutoff). The NDBPE/fever/hypoxia model performance was nonsignificantly different compared to the CXR/NDBPE model (p = 0.17) and was superior to CXR alone (p = 0.003) and NDBPE alone (p < 0.001). By contrast, a model of CXR/fever/hypoxia (AUC, 0.78; 95% CI, 0.67–0.89; sensitivity = 64% [43–80%]; specificity = 77% [48–95%]) had superior overall discrimination compared to CXR alone (p = 0.042) but not to NDBPE alone (p = 0.579), or to the combination of NDBPE and CXR (p = 0.827). At the ideal cutoffs, the CXR/fever/hypoxia model had similar specificity but inferior sensitivity (p = 0.003) to the NDBPE/fever/hypoxia model and even to NDBPE alone (p = 0.015).
DISCUSSION
Our results suggest that an NDBPE pattern on LUS offers additive diagnostic value to portable CXR in ED patients with suspected COVID-19. The NDBPE pattern was similarly sensitive to CXR, CT, and first-test RT-PCR (3), had improved specificity compared to CXR, can be rapidly performed at point of care, and minimizes ancillary staff exposure and patient transport. RT-PCR requires serial testing at this time (e.g., up to five tests to detect one positive patient in our sample), so a LUS-based strategy with the test characteristics we observed could be highly valuable for risk stratification, cohorting of infected patients within the hospital, early guidance in management decisions, and resource flexibility under pandemic conditions of resource scarcity. The earlier diagnostic certainty that can be achieved using a LUS-based imaging protocol could also offer front-line physicians some relief from the cognitive and psychological stresses associated with providing medical care during a pandemic.
Multiple prior studies, as well as a meta-analysis, have reported that LUS is superior to CXR for diagnosing many lung pathologies in critical illness, including alveolar interstitial syndrome (AIS) in ARDS (5). For example, Lichtenstein et al (9) reported a sensitivity and specificity of 98% and 88% for LUS versus 60% and 100% for CXR in AIS. In our study, neither modality performed as well as this. There are several possible explanations for this. First, there is varied severity of pulmonary involvement with COVID-19, and some patients may have minimal (or no) lung findings early in the disease process, thereby reducing sensitivity. Additionally, RT-PCR is an imperfect gold standard, and as such, a patient’s result on this test may not accurately reflect disease status, thereby negatively impacting LUS test characteristics. This study took place during the local peak of the pandemic, and thus, any patient who presented during this time with respiratory symptoms was likely to have been considered a potential COVID-19 patient, which could negatively impact specificity.
In contrast to 11 recent publications on LUS in COVID-19 comprising case reports, letters to the editor, and expert opinion in mostly noncritically ill patients (6), our study offers analytical (albeit retrospective) observational evidence of LUS diagnostic performance in a cohort of patients with a range of severities and a relatively high mortality rate. Subpleural consolidations, confluent versus nonconfluent B-lines, and basilar-predominant changes had been suggested to be useful for COVID-19 diagnosis in such reports (4,6) but failed to reach statistically significant diagnostic discrimination here. Discrimination for the NDBPE pattern was strong, and when part of a simple clinical score including hypoxia and fever (Fig. 1), it outperformed CXR. While we present data on a larger cohort, our results should nonetheless be considered hypothesis-generating. Prospective, external validation of our approach is needed before it is incorporated into routine practice. Future studies should also consider the effect of diverse clinical populations (i.e., mixed COVID-19/non-COVID-19 groups and broad clinical severity) on the accuracy of our approach, as well as the potential additive value of point-of-care echocardiography.
Our study has several limitations. First, retrospective design and convenience sampling mean that these results should be interpreted as hypothesis-generating. Our use of MI in eight cases without a reference standard could have biased our findings, although inconsistent access to rapid COVID-19 RT-PCR is the current reality in many countries, including the United States. Difficulty in obtaining accurate and timely COVID-19 testing is a prime reason why a LUS-based strategy would be useful in the first place. CCA results were equal to or more favorable for LUS than results from MI. This is consistent with our overall MI modeling strategy, treating the uncertainties of MI modeling as additive to the uncertainty of missingness as a clinical reality (9), with an assumption that CCA was over-optimistic. As a gold standard, RT-PCR is imperfect, even when performed serially, and the calculated diagnostic accuracy may have been affected. We could not mandate or control for the effect of patient positioning because of the retrospective nature of this study. We chose to evaluate whether distribution of extravascular lung fluid in a nondependent pattern (superior lung zones having equal to or more fluid than inferior, gravity-dependent lung zones) would be predictive of COVID-19 lung disease because the majority of the patients seen in our ED early in the pandemic were ambulatory at arrival (even the critically ill). However, based on what is known about distribution of pulmonary edema in acute heart failure (AHF), lung water is able to change locations fairly rapidly as patient positioning changes (11). A follow-up study examining this phenomenon in COVID-19 patients would be prudent. Another potential limitation is that the extent to which LUS findings were purely acute versus acute-on-chronic is unknown. Lack of concurrently obtained echocardiography data likewise precludes understanding if left heart dysfunction contributed to findings on LUS.
Additionally, the high in-hospital mortality rate of patients in our cohort may represent spectrum bias, with providers having been more likely to perform LUS in sicker patients. The decision to perform a LUS may also have been the result of knowledge of the results of other diagnostic tests by the sonographer, such as CXR. Finally, the LUS patterns observed in our study may not be representative of those with milder disease.
The findings described in the present study demonstrate that LUS has the potential to add value to the care of patients with suspected COVID-19, but useful patterns were different from what has been suggested in nonanalytical publications. Since this is a hypothesis-generating study, no firm conclusions can be made as to why one imaging pattern may have outperformed another; however, analysis of the case-mix of patients may provide some clues. As an example, subpleural consolidation on LUS had been hypothesized (4) as a potentially useful finding in COVID-19 because of the utility of this pattern in identifying viral and bacterial pneumonia other than COVID-19; however, we did not find this to be the case. It is therefore perhaps notable that eight of the 12 patients who were confirmed COVID-19 negative had a discharge diagnosis of pneumonia due to pathogens other than COVID-19. While we cannot rule out that these were false negatives for COVID-19 pneumonia (i.e., after serial testing), four of those eight had concomitant bacteremia with a pulmonary pathogen, one had pneumocystis pneumonia, one had confirmed invasive pulmonary aspergillosis, and one had confirmed influenza A. These seven of eight with a highly plausible alternative pulmonary infection could also demonstrate pleural LUS findings, and therefore evaluation for subpleural consolidation may simply have failed to distinguish these alternative infectious lung diseases from COVID-19. Furthermore, COVID-19 presentations were particularly severe in our report, with marked mortality rates and high rates of comorbidities. Consequently, an ARDS-like pulmonary edema picture of COVID-19 may have predominated in the case-mix more so than the uncomplicated pneumonia expected in mild presentations of COVID-19, with the latter expected to be more consistent with LUS findings of consolidation compared to the former.
While we consider the severity of COVID-19 presentations a strength of our report given a paucity of LUS data for critically ill COVID-19 patients (6), it thus also must be considered a limitation. Just as the predominance of a “less-sick” COVID-19 population in previous LUS reports (6) causes a spectrum bias toward results more specific to mild COVID-19, our comparatively ill cohort (and any selection bias that may have led to it) likely introduces a spectrum bias toward LUS findings more characteristic of severe presentations (e.g., ICU-admitted patients, with high mortality) (12).
By contrast, the NDBPE pattern that we describe performed well. There is precedent for the lack of a base-apex gradient on LUS as one factor differentiating pneumogenic pulmonary edema (specifically ARDS) from cardiogenic pulmonary edema (13). This is consistent with the NDBPE pattern we observed here, possibly by highlighting again that more severe presentations of COVID-19 involve the manifestation of bilateral increased extravascular lung fluid that is not hydrostatic in nature (i.e., an ARDS-type picture). Notably, four of the five COVID-19 negative patients who were diagnosed with AHF or volume overload from decompensated renal disease lacked the NDBPE pattern. It is possible then that the NDBPE pattern helped to differentiate critically ill COVID-19 patients from those with cardiogenic pulmonary edema or volume overload from renal failure (14), although this too is simply a hypothesis and needs testing in a prospective study. The differentiation of pneumogenic pulmonary edema (e.g., ARDS, viral pneumonitis) from cardiogenic edema has long been a challenge on LUS (15), and echocardiographic evaluation of filling pressures has proved useful to this end in the past. Thus, concurrent ventricular filling pressures will also be needed to confirm the supposition that NDBPE on LUS can help rule in COVID-19 lung disease in part by screening out cardiogenic and renal pulmonary edema (15). Future research should test the hypotheses generated here with an explicit prospective design, inclusion of a broad spectrum of COVID-19 severity, multiple observations across the ED to ICU, and rigorous methods for diagnostic adjudication beyond the RT-PCR reference standard alone, which would likely include CT of the thorax if feasible.
REFERENCES
1. Zu Z, Jiang M, Zu P, et al. Coronavirus disease 2019 (COVID-19): A perspective from China. Radiology. 2020; 296:E15–E25
2. Ng M-Y, Lee EYP, Yang J, et al. Imaging profile of the COVID-19 infection: Radiologic findings and literature review. Radiol Cardiothorac Imaging. 2020; 2:1
3. He JL, Luo L, Luo ZD, et al. Diagnostic performance between CT and initial real-time RT-PCR for clinically suspected 2019 coronavirus disease (COVID-19) patients outside Wuhan, China. Respir Med. 2020; 168:105980
4. Soldati G, Smargiassi A, Inchingolo R, et al. Is there a role for lung ultrasound during the COVID-19 pandemic? J Ultrasound Med. 2020; 39:1459–1462
5. Winkler MH, Touw HR, van de Ven PM, et al. Diagnostic accuracy of chest radiograph, and when concomitantly studied lung ultrasound, in critically ill patients with respiratory symptoms: A systematic review and meta-analysis. Crit Care Med. 2018; 46:e707–e714
6. Smith MJ, Hayward SA, Innes SM, et al. Point-of-care lung ultrasound in patients with COVID-19 – a narrative review. Anaesthesia. 2020; 75:1096–1104
7. Russell FM, Ehrman RR, Ferre R, et al. Design and rationale of the B-lines lung ultrasound-guided emergency department management of acute heart failure (BLUSHED-AHF) pilot trial. Heart Lung. 2019; 48:186–192
8. Wood AM, Royston P, White IR. The estimation and use of predictions for the assessment of model performance using large samples with multiply imputed data. Biom J. 2015; 57:614–632
9. Lichtenstein D, Goldstein I, Mourgeon E, et al. Comparative diagnostic performances of auscultation, chest radiography, and lung ultrasonography in acute respiratory distress syndrome. Anesthesiology. 2004; 100:9–15
10. Robin X, Turck N, Hainard A, et al. pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011; 12:77
11. Frasure SE, Matilsky DK, Siadecki SD, et al. Impact of patient positioning on lung ultrasound findings in acute heart failure. Eur Heart J Acute Cardiovasc Care. 2015; 4:326–332
12. Willis BH. Spectrum bias—why clinicians need to be cautious when applying diagnostic test studies. Fam Pract. 2008; 25:390–396
13. Sofia S, Boccatonda A, Montanari M, et al. Thoracic ultrasound: A pictorial essay. J Ultrasound. 2020; 23:217–221
14. Martindale JL, Wakai A, Collins SP, et al. Diagnosing acute heart failure in the emergency department: A systematic review and meta-analysis. Acad Emerg Med. 2016; 23:223–242
15. Vignon P, Repessé X, Vieillard-Baron A, et al. Critical care ultrasonography in acute respiratory failure. Crit Care. 2016; 20:228