Diagnostic Accuracy of the Proton Pump Inhibitor Test in Gastroesophageal Reflux Disease and Noncardiac Chest Pain: A Systematic Review and Meta-analysis : Journal of Clinical Gastroenterology

Original Articles

Diagnostic Accuracy of the Proton Pump Inhibitor Test in Gastroesophageal Reflux Disease and Noncardiac Chest Pain

A Systematic Review and Meta-analysis

Ghoneim, Sara MD, MS*,†; Wang, Jiasheng MD*,†; El Hage Chehade, Nabil MD*,†; Ganocy, Stephen J. PhD†,‡; Chitsaz, Ehsan MD, MHSc†,§; Fass, Ronnie MD, MACG†,§,∥

Journal of Clinical Gastroenterology 57(4):p 380-388, April 2023. | DOI: 10.1097/MCG.0000000000001686

Abstract

Gastroesophageal reflux disease (GERD) is a chronic disorder frequently encountered by gastroenterologists and primary care providers.1 The prevalence of GERD is estimated to be 18% to 28% in North America, 9% to 26% in Europe, and 9% to 33% in the Middle East and East Asia.2

Currently, there is no single gold-standard test for GERD. While heartburn and regurgitation are typical symptoms, their presence is not very sensitive for the diagnosis.3 The diagnostic accuracy of upper endoscopy is also low, with an overall sensitivity of 50%, as up to 70% of GERD patients have normal endoscopic findings. Conversely, the sensitivity and specificity of reflux testing range from 79% to 96% and 86% to 100%, respectively, but results may be normal in up to 25% and 50% of patients with erosive reflux disease (ERD) or nonerosive reflux disease (NERD), respectively.4–7

Due to the limitations and invasiveness of the aforementioned tests, the diagnosis of GERD still relies on a response to an empiric trial of acid-suppressive medication. This approach, termed the “proton pump inhibitor (PPI) test,” constitutes a readily available tool for detecting GERD.8 However, data regarding the diagnostic performance of the PPI test have been conflicting, with reported sensitivities and specificities ranging from 27% to 89% and 35% to 73%, respectively.8–10 In contrast, in noncardiac chest pain (NCCP), a common atypical manifestation of GERD, the test was shown to be 80% sensitive and 74% specific.11,12

In the first systematic review and meta-analysis of PPI test performance, summary estimates of sensitivity and specificity were 78% and 54%, respectively. The authors concluded that in patients suspected of having GERD, this test did not confidently establish the diagnosis.9 However, their review suffered from several methodological shortcomings that included (1) the use of pivotal clinical trials designed to evaluate therapeutic efficacy rather than diagnostic test accuracy (DTA), (2) the inclusion of patients with functional esophageal disorders and uninvestigated dyspepsia, and (3) the use of “complete relief of heartburn” to define symptomatic response to the index test, which is an endpoint better suited for efficacy trials than for diagnostic ones.10 Since then, several studies have investigated the diagnostic accuracy of the PPI test; however, results were mixed, and none investigated the utility of the test in specific entities of GERD. Therefore, our aim was to conduct a comprehensive meta-analytic investigation of the diagnostic accuracy of the PPI test in GERD and NCCP and to explore the test performance in 2 distinct phenotypes of GERD: ERD and NERD.

METHODS

The protocol for this systematic review was developed and registered with PROSPERO (CRD42021256644).13 This systematic review and meta-analysis was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-analysis of Diagnostic Test Accuracy Studies: the PRISMA-DTA Statement.14 The study was exempt from ethical approval because the analysis involved only deidentified data, and all of the included studies had received local ethics approval.

Search Strategy

Electronic databases including Web of Science, PubMed/MEDLINE, and CENTRAL were searched from February 1, 1950, to February 1, 2021. The search strategy was developed using the following terms: heartburn, upper endoscopy, GERD, NCCP, PPI test, and esophageal pH monitoring. A manual search of the reference list cited in published trials was also performed to identify additional studies of interest. Full details of the search strategy are provided in Table S1 (Supplemental Digital Content 1, https://links.lww.com/JCG/A817).

Study Selection and Data Extraction

All published studies that met the search criteria were retrieved and reviewed using Covidence. Two reviewers (E.C., S.G.) independently screened the titles and abstracts to exclude irrelevant studies using a predefined data extraction form. Only data published in peer-reviewed journals were selected to minimize potential sources of bias and inaccuracy. The remaining studies were assessed by reading the full manuscript. Any disagreement was resolved by consensus.

The following information was extracted from eligible studies using a standardized form: study characteristics (title of the study, name of first author, author’s country, and publication year) and clinical characteristics (study design, sample size, participants’ age, PPI name, dose, duration, and indication).

Eligibility Criteria

Only DTA studies comparing the PPI test to an accepted reference standard test were included. Patients 18 years or older with a presumptive diagnosis of GERD on the basis of presenting symptoms and history who had undergone objective testing by esophageal pH monitoring and/or upper endoscopy were eligible. Adult patients with recurrent episodes of chest pain without documented cardiac abnormalities were included in the NCCP arm. ERD was defined as endoscopic evidence of reflux-related mucosal injury, whereas NERD was defined as abnormal esophageal acid exposure based on pH monitoring but without esophageal mucosal injury on endoscopy.15 A short course of PPIs administered for at least 5 days but no more than 4 weeks was the intervention of interest. For the purpose of this review, a PPI test was considered positive if the symptom score improved by >50% from baseline. Only randomized, double-blind, placebo-controlled, cross-over, or open-label clinical trials were included. Therapeutic efficacy trials, studies with insufficient data, pediatric studies, duplicates, abstracts, studies with extraesophageal manifestations of GERD, and those with no acceptable reference standard were excluded. There was no restriction in terms of language or location of the study.
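The positivity rule and the duration window described above can be written as a small check. This is only an illustrative sketch: the function names are hypothetical, and it assumes a symptom score in which higher values mean worse symptoms and the baseline score is greater than zero.

```python
def percent_improvement(baseline_score: float, post_score: float) -> float:
    """Percentage reduction in symptom score relative to baseline."""
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline_score - post_score) / baseline_score


def ppi_test_positive(baseline_score: float, post_score: float,
                      threshold_pct: float = 50.0) -> bool:
    """Positive test: symptom score improved by MORE than the threshold."""
    return percent_improvement(baseline_score, post_score) > threshold_pct


def eligible_duration(days: int) -> bool:
    """PPI course of at least 5 days and no more than 4 weeks (28 days)."""
    return 5 <= days <= 28
```

Under these assumptions, a drop from a score of 10 to 4 (60% improvement) counts as a positive test, whereas a drop from 10 to 5 (exactly 50%) does not, because the criterion is a strict inequality.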

Quality Assessment

All diagnostic accuracy trials were evaluated using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool, which is the current recommendation by the Cochrane Collaboration for assessment of the risk of bias.16

Data Synthesis and Analysis

A 2×2 contingency table was constructed to extract the true positive, false positive, false negative, and true negative counts for each individual study. Univariate analysis with a random-effects model (DerSimonian-Laird method) was used to calculate the sensitivity, specificity, positive and negative likelihood ratios (LRs), and diagnostic odds ratio (DOR). Confidence intervals (95% CIs) were calculated using the Clopper-Pearson method. Interstudy heterogeneity was assessed by the Paule-Mandel τ2, I2 index, and Cochran Q test. An I2 index between 0% and 30% indicates low heterogeneity, values between 31% and 60% indicate moderate heterogeneity, values between 61% and 75% indicate substantial heterogeneity, and values between 76% and 100% indicate considerable heterogeneity. We performed subgroup analyses by study design and GERD phenotype. We further performed 2 post hoc sensitivity analyses by excluding (1) studies where GERD was diagnosed by only one reference standard test; and (2) studies where PPIs were administered once daily. The summary receiver operating characteristics (SROC) curve was calculated using the bivariate model. The area under the receiver operating characteristic curve (AUC) was calculated; a value of 0.5 or less is no better than chance, whereas a value >0.80 is considered to indicate good discriminative ability. Publication bias was investigated visually by funnel plots and the Egger test. All tests were 2 sided, and P<0.05 was considered statistically significant. All analyses were performed in R statistical software, version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria).
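For readers less familiar with these measures, the per-study quantities derived from a 2×2 table and an exact (Clopper-Pearson) confidence interval can be sketched as follows. The analyses in this review were run in R; the Python below is an independent illustration, the function names are hypothetical, the counts are invented, and the interval is found by simple bisection on the binomial distribution rather than by any particular library routine.

```python
from math import comb


def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Diagnostic accuracy measures from one 2x2 table (assumes all cells > 0)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "dor": (tp * tn) / (fp * fn),
    }


def _binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, target: float, increasing: bool) -> float:
    """Solve f(p) = target on [0, 1] for a monotone function f."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact two-sided confidence interval for a binomial proportion k/n."""
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - _binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if k == n else _bisect(
        lambda p: _binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


# Hypothetical study: 40 TP, 20 FP, 10 FN, 30 TN (not taken from the review).
m = accuracy_measures(tp=40, fp=20, fn=10, tn=30)
sens_ci = clopper_pearson(40, 50)  # exact 95% CI for sensitivity = 40/50
```

In this invented example the sensitivity is 0.80, the specificity 0.60, the positive LR 2.0, the negative LR about 0.33, and the DOR 6.0.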

RESULTS

Study Selection and Characteristics

A total of 3874 studies were identified through electronic and manual searching. After the removal of 875 duplicates, 2999 articles remained. After title and abstract review, 2968 studies were excluded. Thirty-one full texts were screened for eligibility, and 12 studies were excluded for the following reasons: unrelated to diagnostic tests for GERD (n=6), different or no reference standard was included (n=5). Altogether, a total of 19 studies (GERD=11, NCCP=8) meeting the predetermined inclusion criteria were identified (Fig. 1).17–35 The aim of all studies was to diagnose GERD using a PPI as the index test. All trials were prospective in design. Characteristics of individual studies are depicted in Table 1. A narrative assessment of all of the included studies is available in Tables S2 and S3 (Supplemental Digital Content 1, https://links.lww.com/JCG/A817). Of the 11 GERD studies (n=1373), 6 were open-label (n=951), and 5 were randomized double-blind, placebo-controlled clinical trials: 3 parallel (n=316) and 2 cross-over (n=106) in design. Patients underwent both endoscopy and esophageal pH monitoring in 10 studies. Only 1 study relied exclusively on esophageal pH monitoring to establish the diagnosis of GERD. A total of 115 (8.4%) patients were excluded from all trials. The average age of participants was 50±9.4 years.

FIGURE 1: Study flow diagram. GERD indicates gastroesophageal reflux disease; NCCP, noncardiac chest pain.

TABLE 1 - Characteristics of the Included Studies

| References | Country | No. Centers | Date of Study Initiation | Date of Study End | Sample Size for Analysis | No. Patients Excluded From Analysis | Average Age (y) | Gender Distribution (Male/Female) | Reference Standard | PPI Dose | Indication | Duration |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Randomized controlled trials | | | | | | | | | | | | |
| Schenk et al17 | The Netherlands | Multicenter | July 1993 | July 1994 | 85 | 13 | Omeprazole: 49; Placebo: 49 | 33/52 | pH-metry, endoscopy | Omeprazole 40 mg daily | GERD | 2 wk |
| Johnsson et al18 | Denmark, Norway, Sweden | Multicenter | NA | NA | 159 | 27 | NA | NA | pH-metry, endoscopy | Omeprazole 20 mg bid | GERD | 2 wk |
| Fass et al19 | United States | Single center | January 1996 | December 1996 | 39 | 0 | GERD: 58.2; Non-GERD: 61.6 | 38/1 | pH-metry, endoscopy | Omeprazole 40 mg am, 20 mg before dinner | NCCP | 1 wk |
| Fass et al20 | United States | Single center | January 1, 1996 | December 31, 1996 | 42 | 1 | GERD: 55; Non-GERD: 56.4 | 32/10 | pH-metry, endoscopy | Omeprazole 40 mg am, 20 mg before dinner | GERD | 1 wk |
| Juul-Hansen et al21 | Norway | Single center | NA | NA | 64 | 0 | 54 | 20/44 | pH-metry, endoscopy | Lansoprazole 60 mg daily | GERD | 5 d |
| Pandak et al22 | United States | Single center | May 1997 | August 1999 | 37 | 5 | NA | NA | pH-metry, endoscopy | Omeprazole 40 mg bid | NCCP | 2 wk |
| Xia et al23 | China | Single center | November 1998 | February 2000 | 68 | 2 | Lansoprazole: 59.6; Placebo: 56.6 | 26/42 | pH-metry | Lansoprazole 30 mg daily | NCCP | 4 wk |
| Bautista et al24 | United States | Single center | NA | NA | 40 | 0 | GERD: 52.1; Non-GERD: 56.3 | 31/9 | pH-metry, endoscopy | Lansoprazole 60 mg am, 30 mg at bedtime | NCCP | 1 wk |
| Dickman et al25 | United States | Single center | NA | NA | 35 | 0 | GERD: 55; Non-GERD: 55 | 23/12 | pH-metry, endoscopy | Rabeprazole 20 mg bid | NCCP | 1 wk |
| des Varannes et al26 | France | Multicenter | NA | NA | 72 | 11 | Rabeprazole: 49.1; Placebo: 47.1 | 29/43 | pH-metry, endoscopy | Rabeprazole 20 mg bid | GERD | 1 wk |
| Open-label trials | | | | | | | | | | | | |
| Bate et al27 | UK | Multicenter | NA | NA | 69 | 21 | 47.4 | 38/31 | pH-metry, endoscopy | Omeprazole 40 mg daily | GERD | 1 wk |
| Fass et al28 | United States | Single center | NA | NA | 35 | 1 | 55 | 33/2 | pH-metry, endoscopy | Omeprazole 40 mg am, 20 mg before dinner | GERD | 1 wk |
| Huamán et al29 | Spain | Single center | October 2011 | September 2012 | 30 | 0 | GERD: 50; Non-GERD: 45 | 17/13 | pH-metry, endoscopy | Pantoprazole 40 mg bid | NCCP | 4 wk |
| Remes-Troche et al30 | Mexico | Single center | NA | NA | 64 | 1 | 39.3 | 17/47 | pH-metry, endoscopy | Rabeprazole 20 mg bid | GERD | 1 wk |
| Aanen et al31 | The Netherlands | Single center | 2003 | 2005 | 74 | 16 | NA | NA | pH-metry | Esomeprazole 40 mg daily | GERD | 2 wk |
| Zheng et al32 | China | Single center | April 2007 | October 2007 | 27 | 0 | GERD: 57; Non-GERD: 57 | 8/19 | pH-metry, endoscopy | Esomeprazole 20 mg bid | NCCP | 2 wk |
| Kim et al33 | Korea | Single center | January 2005 | December 2006 | 42 | 0 | GERD: 53.8; Non-GERD: 53.9 | 24/18 | pH-metry, endoscopy | Rabeprazole 20 mg bid | NCCP | 2 wk |
| Cho et al34 | Korea | Single center | NA | NA | 73 | 0 | 47 | 40/37 | pH-metry, endoscopy | Lansoprazole 30 mg bid | GERD | 2 wk |
| Zhou et al35 | China | Single center | 2011 | 2012 | 636 | 24 | 49.3 | 371/265 | pH-metry, endoscopy | Esomeprazole 20 mg bid | GERD | 2 wk |

bid indicates twice daily; GERD, gastroesophageal reflux disease; NA, not available; NCCP, noncardiac chest pain; PPI, proton pump inhibitor.

Among the 8 NCCP studies (n=318), 3 were open-label (n=99), 5 were randomized double-blind, placebo-controlled clinical trials: 4 cross-over (n=151), and 1 parallel (n=68) in design. Only 1 study relied exclusively on esophageal pH monitoring for diagnosis of GERD. A total of 7 patients (2.1%) were excluded from all trials. The average age of participants was 56.2±8.2 years.
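The study counts and sample sizes quoted for the GERD and NCCP groups can be re-derived from Table 1. The short sketch below transcribes only the reference number, indication, design, analysed sample size, and number of excluded patients from that table; the helper name `tally` is hypothetical and the code is offered as a consistency check, not as part of the original analysis.

```python
# (reference no., indication, design, analysed, excluded) transcribed from Table 1
TABLE_1 = [
    (17, "GERD", "RCT", 85, 13), (18, "GERD", "RCT", 159, 27),
    (19, "NCCP", "RCT", 39, 0), (20, "GERD", "RCT", 42, 1),
    (21, "GERD", "RCT", 64, 0), (22, "NCCP", "RCT", 37, 5),
    (23, "NCCP", "RCT", 68, 2), (24, "NCCP", "RCT", 40, 0),
    (25, "NCCP", "RCT", 35, 0), (26, "GERD", "RCT", 72, 11),
    (27, "GERD", "open-label", 69, 21), (28, "GERD", "open-label", 35, 1),
    (29, "NCCP", "open-label", 30, 0), (30, "GERD", "open-label", 64, 1),
    (31, "GERD", "open-label", 74, 16), (32, "NCCP", "open-label", 27, 0),
    (33, "NCCP", "open-label", 42, 0), (34, "GERD", "open-label", 73, 0),
    (35, "GERD", "open-label", 636, 24),
]


def tally(indication, design=None):
    """Count studies and sum analysed/excluded patients for one subgroup."""
    rows = [r for r in TABLE_1
            if r[1] == indication and (design is None or r[2] == design)]
    return {"studies": len(rows),
            "analysed": sum(r[3] for r in rows),
            "excluded": sum(r[4] for r in rows)}


gerd_all = tally("GERD")
nccp_all = tally("NCCP")
```

With the table values as transcribed, the GERD group comes to 11 studies with 1373 analysed and 115 excluded patients, and the NCCP group to 8 studies with 318 analysed and 7 excluded patients, matching the figures in the text.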

Of the 19 unique full-text articles that met the inclusion criteria, only GERD studies provided relevant data on the performance of the PPI test in patients with ERD or NERD. A total of 127 ERD patients in 3 studies and 172 NERD patients in 4 studies were included in the final analysis.

Quality Assessment

Results of the risk of bias and applicability concern assessment of individual studies using the QUADAS-2 tool are summarized in Figures S1 and S2 (Supplemental Digital Content 1, https://links.lww.com/JCG/A817) and Tables S4 and S5 (Supplemental Digital Content 1, https://links.lww.com/JCG/A817). Most of the studies were free of partial verification bias, and all used an appropriate reference standard test. Information regarding blinding when reading the index and reference tests was provided for 53% of the studies. The risk of bias was low in the patient selection domain for 8 GERD studies and for all NCCP studies. Given that the main aim of the studies was to identify the DTA of PPIs, applicability concerns were low for all included studies.

Diagnostic Accuracy of the PPI Test in GERD

Pooled estimates of sensitivity and specificity were 79% (95% CI, 72%-84%) and 45% (95% CI, 40%-49%), respectively (Fig. 2). Individual sensitivities ranged from 63% to 100% and specificities from 19% to 64%. Heterogeneity was moderate but not statistically significant for the test’s sensitivity (I2=59%; P=0.14), whereas low-level heterogeneity was observed for specificity (I2=0%; P=0.37) across all studies (Fig. 2). The calculated AUC was 0.62. The SROC is reported in Figure S3 (Supplemental Digital Content 1, https://links.lww.com/JCG/A817). Overall DOR, positive LR, and negative LR were 2.30 (95% CI, 1.78-3.0; I2=0%), 1.36 (95% CI, 1.15-1.60), and 0.54 (95% CI, 0.4-0.74), respectively (Fig. S4, Supplemental Digital Content 1, https://links.lww.com/JCG/A817).
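Pooled proportions and heterogeneity statistics of this kind come from a random-effects model. The sketch below shows one common way to do this with the DerSimonian-Laird estimator. It makes several assumptions that the article does not confirm: proportions are pooled on the logit scale, a 0.5 continuity correction is applied to studies with a zero cell, and τ2 is the DerSimonian-Laird moment estimate rather than the Paule-Mandel estimate named in the Methods. All function names and the example counts are hypothetical.

```python
from math import exp, log


def logit_and_variance(events: int, total: int):
    """Logit of a proportion and its approximate variance."""
    a, b = events, total - events
    if a == 0 or b == 0:          # continuity correction (assumption)
        a, b = a + 0.5, b + 0.5
    return log(a / b), 1 / a + 1 / b


def dersimonian_laird(studies):
    """Pool proportions given as (events, total) pairs.

    Returns the pooled proportion, tau^2, Cochran Q, and I^2 (in percent).
    """
    y, v = zip(*(logit_and_variance(e, n) for e, n in studies))
    w = [1 / vi for vi in v]                        # fixed-effect weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]            # random-effects weights
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": 1 / (1 + exp(-y_re)), "tau2": tau2, "q": q, "i2": i2}


def heterogeneity_label(i2: float) -> str:
    """I^2 categories as defined in the Methods of this review."""
    if i2 <= 30:
        return "low"
    if i2 <= 60:
        return "moderate"
    if i2 <= 75:
        return "substantial"
    return "considerable"


# Hypothetical (true positives, patients with disease) per study; not review data.
example = [(30, 40), (45, 60), (15, 20)]
result = dersimonian_laird(example)
```

Because the three invented studies share the same observed proportion, Q and τ2 are both zero and the pooled estimate equals that common value; real data would generally give Q above zero and unequal random-effects weights.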

FIGURE 2: Forest plot illustrating the pooled (A) sensitivity and (B) specificity of the proton pump inhibitor test for gastroesophageal reflux disease. Boxes represent point estimates. Whiskers represent 95% CIs. CI indicates confidence interval; RCT, randomized clinical trial.

Subgroup analysis of the 5 double-blind randomized clinical trials (n=422) showed pooled estimates of sensitivity and specificity to be 76% (95% CI, 70%-80%; I2=0) and 48% (95% CI, 37%-59%, I2=0), respectively. Overall DOR, positive LR, negative LR were 2.63 (95% CI, 1.41-4.90), 1.42 (95% CI, 1.08-1.86), and 0.54 (95% CI, 0.38-0.78), respectively.

There was substantial heterogeneity for the test’s sensitivity among open-label studies, though this was not statistically significant (I2=75%; P=0.10) (Fig. 2). Estimates of sensitivity and specificity in randomized and open-label studies remained similar (76% and 48% vs. 81% and 44%, respectively).

Diagnostic Accuracy of the PPI Test in NCCP

Pooled estimates of sensitivity and specificity were 79% (95% CI, 69%-86%) and 79% (95% CI, 69%-86%), respectively (Fig. 3). Overall DOR, positive LR, and negative LR were 17.27 (95% CI, 8.84-33.71), 3.91 (95% CI, 2.60-5.89), and 0.26 (95% CI, 0.17-0.39), respectively. Individual sensitivities and specificities ranged from 55% to 95% and 61% to 91%, respectively. The SROC curve was positioned near the desirable upper-left corner; the AUC was 0.85, indicating that the level of overall accuracy was high (Fig. S5, Supplemental Digital Content 1, https://links.lww.com/JCG/A817). Heterogeneity was low across all diagnostic accuracy measures (Fig. 3, Fig. S6, Supplemental Digital Content 1, https://links.lww.com/JCG/A817).
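One way to read the pooled likelihood ratios is to convert a pretest probability into a post-test probability through odds. The sketch below uses the pooled LRs reported in this review for GERD and NCCP; the pretest probability of 50% is purely hypothetical and chosen only to make the arithmetic easy to follow, and the function name is an assumption of this example.

```python
def post_test_probability(pretest: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post-test probability via odds."""
    if not 0 < pretest < 1:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    pre_odds = pretest / (1 - pretest)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Pooled likelihood ratios reported in this review.
LR = {"GERD": {"pos": 1.36, "neg": 0.54},
      "NCCP": {"pos": 3.91, "neg": 0.26}}

PRETEST = 0.50  # hypothetical pretest probability, for illustration only

for group, lr in LR.items():
    after_pos = post_test_probability(PRETEST, lr["pos"])
    after_neg = post_test_probability(PRETEST, lr["neg"])
    print(f"{group}: positive test -> {after_pos:.2f}, "
          f"negative test -> {after_neg:.2f}")
```

At this assumed pretest probability, a positive PPI test would move the probability of GERD to roughly 0.58 in patients with typical symptoms and to roughly 0.80 in NCCP, while a negative test would lower it to about 0.35 and 0.21, respectively.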

FIGURE 3: Forest plot illustrating the pooled sensitivity (A) and specificity (B) of the proton pump inhibitor test for noncardiac chest pain. CI indicates confidence interval; RCT, randomized clinical trial.

Subgroup analysis of the 5 double-blind randomized clinical trials (n=219) revealed pooled estimates of sensitivity and specificity to be 83% (95% CI, 74%-90%; I2=0) and 80% (95% CI, 66%-89%; I2=45%; P=0.09), respectively (Fig. 3). Overall DOR, positive LR, and negative LR were 22.21 (95% CI, 9.61-51.31; I2=0), 4.30 (95% CI, 2.40-7.70), and 0.19 (95% CI, 0.11-0.35), respectively. No between-study heterogeneity was observed in open-label studies. Estimates of sensitivity and specificity in randomized and open-label studies were similar (83% and 80% vs. 70% and 77%, respectively).

Diagnostic Accuracy of the PPI Test in ERD and NERD

Pooled sensitivity, specificity, and DOR of the PPI test in ERD were 76% (95% CI, 66%-84%; I2=0%), 30% (95% CI, 8%-67%; I2=66%; P=0.78), and 1.41 (95% CI, 0.29-6.77), respectively (Fig. 4, Fig. S7, Supplemental Digital Content 1, https://links.lww.com/JCG/A817). In NERD, pooled sensitivity, specificity and DOR were 79% (95% CI, 70%-86%; I2=0%), 50% (95% CI, 39%-61%; I2=0%), and 3.82 (95% CI, 1.92-7.61), respectively (Fig. 5, Fig. S8, Supplemental Digital Content 1, https://links.lww.com/JCG/A817).

FIGURE 4: Forest plot illustrating the pooled sensitivity (A) and specificity (B) of the proton pump inhibitor test for erosive reflux disease. CI indicates confidence interval.

FIGURE 5: Forest plot illustrating the pooled sensitivity (A) and specificity (B) of the proton pump inhibitor test for nonerosive reflux disease. CI indicates confidence interval.

Sensitivity Analyses

Two sensitivity analyses were performed (Table S6, Supplemental Digital Content 1, https://links.lww.com/JCG/A817, Figs. S9–S11, Supplemental Digital Content 1, https://links.lww.com/JCG/A817). On restricting analysis to studies where PPIs were administered twice daily, pooled sensitivity and specificity estimates were 78% (95% CI, 70%-85%) and 44% (95% CI, 39%-49%), respectively in GERD studies.18,20,26,28,30,34,35 In NCCP, pooled sensitivity and specificity estimates were 77% (95% CI, 68%-85%; I2=26%) and 81% (95% CI, 71%-88%; I2=20%), respectively.19,22,24,25,29,32,33

On further analysis, restricting only to studies where GERD was diagnosed by endoscopy and esophageal pH monitoring, only one study was excluded.31 In the remaining 10 studies, pooled sensitivity and specificity estimates were 75% (95% CI, 69%-80%; I2=0%) and 47% (95% CI, 42%-52%; I2=0%), respectively.17,18,20,21,26–28,30,34,35 In NCCP, only one study was excluded and pooled sensitivity and specificity were 77% (95% CI, 68%-85%; I2=26%) and 81% (95% CI, 71%-88%; I2=20%), respectively.23

Assessment of Publication Bias

Publication bias for the PPI test in GERD and NCCP was explored using Egger’s regression test and visually by funnel plots, showing no small study effect for accuracy of studies (P=0.61 and 0.97, respectively) (Fig. S12, Supplemental Digital Content 1, https://links.lww.com/JCG/A817).

DISCUSSION

This comprehensive meta-analysis of 19 trials reports the diagnostic accuracy of the PPI test in 1691 patients with GERD and NCCP. Our data show the PPI test is more sensitive (78%) than specific (48%) in GERD. The PPI test had overall better discriminative ability in NCCP, with pooled sensitivity and specificity of 79%. We observed no statistically significant between-study heterogeneity by subgroup and sensitivity analyses.

The present study further expands on findings from the meta-analysis by Numans et al.9 We used restrictive inclusion and exclusion criteria with special attention to confounding factors and obtained more robust estimates. This explains the lower specificity of the PPI test reported in our study. In the previous meta-analysis, 5 of the 15 (33%) trials were not combined in the pooled estimates. Four trials (40%) were inherently designed to measure efficacy outcomes, which is not an appropriate study design for estimating the diagnostic accuracy of a test. For this reason, we systematically reviewed primary diagnostic accuracy literature to synthesize our results. This methodological approach favored internal validity over statistical power. A less rigorous endpoint for a symptomatic response was also chosen to optimize the sensitivity of the PPI test but at the cost of additional false positives and reduced specificity. Despite these trade-offs, our review provides a more accurate interpretation of available evidence on the diagnostic characteristics of the PPI test in typical GERD.36

Previous studies, including a meta-analysis of 7 studies by Cremonini et al12, showed the PPI test to be 83% sensitive and 75% specific in NCCP when partial response was the endpoint of interest. However, differences in treatment duration (1 d vs. 6 wk) and preferential use of one reference test over the other for diagnosis may have exaggerated their results. Unlike their meta-analysis, we used standardized criteria so that 7 of the 8 studies included relied on both reference tests to diagnose GERD. Moreover, we adhered to the optimal cutoff and duration for assessing symptomatic response to PPIs and thus obtained more robust assessments of the test’s performance.37,38 Another meta-analysis in 2005 found the pooled sensitivity and specificity of the PPI test to be 80% and 74%, respectively, but the small sample size limited the generalizability of its results.38 Indeed, by extending the literature search up to 2021, we included a larger number of patients and more relevant published reports, thereby improving the precision of our estimates.

This systematic review and meta-analysis is, to the best of our knowledge, the first to evaluate the diagnostic accuracy of the PPI test in patients with ERD and NERD, diagnosed by upper endoscopy and esophageal pH monitoring. Although we had a limited number of DTA studies with data on ERD and NERD, the PPI test showed comparable accuracy in both populations. Weijenborg et al39 reported the pooled estimate of partial symptom response rate after 4 weeks of PPI therapy to be 75% and 85% in those with ERD and NERD, respectively. They argued that when NERD was well-defined, the response rate to PPI therapy was comparable to ERD. Our study results corroborate their findings and highlight the critical role of functional testing for accurate diagnosis of NERD. However, our analysis is limited by the relatively small sample size. Thus, it is imperative for future studies to include larger cohorts stratified by distinct phenotypes of GERD to further validate our findings.

The strengths of this meta-analysis include its comprehensive and up-to-date literature review, well-defined inclusion and exclusion criteria, critical appraisal of evidence, and the application of standardized criteria to define symptomatic response based on a prespecified meta-analysis protocol. Our search strategy identified 9 unique articles not included in prior meta-analyses.25,26,29–35 We minimized spectrum bias by including all forms of GERD. Almost all patients with GERD and NCCP underwent confirmatory testing with both upper endoscopy and esophageal pH monitoring. There was also an even distribution of each gender in both groups, and trials were selected from a diverse range of countries and languages.

There are several limitations to our meta-analysis. First, a variety of PPIs with different dosing regimens were used to perform the test. However, there are currently no recommended guidelines for the PPI test. Second, selection bias cannot be excluded. Some trials were limited to only ERD or NERD patients, and most of the studies were small in size. In contrast, other studies failed to differentiate between the 2 phenotypes. It is also important to note that ERD was defined broadly by endoscopic evidence of reflux-related mucosal injury, and this reflects the wide range of classification systems used in PPI test studies designed before the inception of the Lyon consensus.3 However, to date, there is a paucity of data on the diagnostic accuracy of PPIs in patients with NERD and ERD defined according to Lyon criteria, and this is a topic of interest for future research. Third, although 18 of 19 studies defined the primary outcome as improvement in symptomatic response by >50%, one clinical trial in GERD chose 75% as the endpoint of interest.21 We feel that this outcome is still a partial symptom response, and it did not have a major influence on our findings. Therefore, we do not believe that the results would have been significantly impacted by this bias. Fourth, residual confounders such as dyspepsia cannot be entirely excluded. Further, the absence of manometric data may temper the strength of our findings. For instance, it has been demonstrated that impaired motility observed in NCCP patients may play a relevant role in delaying reflux clearing, hence increasing the time of contact between refluxate and esophageal mucosa and consequently affecting the response to PPI therapy. In contrast, hypertensive esophagus, on high-resolution manometry, is strongly associated with chest pain.40 Moreover, misclassification of GERD may have occurred, as 2 of the 19 studies relied solely on esophageal pH monitoring for diagnosis. However, according to our sensitivity analyses, the performance of the PPI test remained the same even after we removed these 2 studies.23,31 Despite these limitations, our results demonstrate the value of the PPI test in detecting GERD in patients with NCCP-related reflux disease.

The results of this systematic review and meta-analysis suggest that the PPI test is sensitive but with suboptimal specificity for detecting GERD in patients with typical symptoms. The PPI test demonstrated moderate-to-high accuracy in diagnosing GERD-related NCCP. While the PPI test performed slightly better in NERD, the overall diagnostic accuracy was comparable in both phenotypes.

REFERENCES

1. Katz PO, Gerson LB, Vela MF. Guidelines for the diagnosis and management of gastroesophageal reflux disease. Am J Gastroenterol. 2013;108:308–328; quiz 329.
2. Yamasaki T, Hemond C, Eisa M, et al. The changing epidemiology of gastroesophageal reflux disease: are patients getting younger? J Neurogastroenterol Motil. 2018;24:559–569.
3. Gyawali CP, Kahrilas PJ, Savarino E, et al. Modern diagnosis of GERD: the Lyon Consensus. Gut. 2018;67:1351–1362.
4. Dent J, Brun J, Fendrick AM, et al. An evidence-based appraisal of reflux disease management The Geneva Workshop Report. Gut. 1999;44(suppl 2):S1–S16.
5. Joseph S, Hirano I, Fass R. Gastroesophageal reflux disease: diagnosis. GERD/Dyspepsia: Hot Topics. Philadelphia, PA: Hanley & Belfus; 2004:41–54.
6. Euler A, Byrne W. Twenty-four-hour esophageal intraluminal pH probe testing: a comparative analysis. Gastroenterology. 2004;80:957–961.
7. Pace F, Pace M. The proton pump inhibitor test and the diagnosis of gastroesophageal reflux disease. Expert Rev Gastroenterol Hepatol. 2010;4:423–427.
8. Gyawali CP, Fass R. Management of gastroesophageal reflux disease. Gastroenterology. 2018;154:302–318.
9. Numans ME, Lau J, de Wit NJ, et al. Short term treatment with proton-pump inhibitors as a test for gastroesophageal reflux disease. Ann Intern Med. 2004;140:518–527.
10. Johnsson F, Hatlebakk JG, Klintenberg AC, et al. One-week esomeprazole treatment: an effective confirmatory test in patients with suspected gastroesophageal reflux disease. Scand J Gastroenterol. 2003;38:354–359.
11. Richter JE. Chest pain and gastroesophageal reflux disease. J Clin Gastroenterol. 2000;30:S39–S41.
12. Cremonini F, Wise J, Moayyedi P, et al. Diagnostic and therapeutic use of proton pump inhibitors in non-cardiac chest pain: a metaanalysis. Am J Gastroenterol. 2005;100:1226–1232.
13. Ghoneim S, Wang J, El Hage Chehade N, et al. Diagnostic accuracy of proton pump inhibitor test for detection of gastroesophageal reflux disease: a systematic review and meta-analysis. 2021. Available at: www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=256644. Accessed January 7, 2021.
14. McInnes MF, Moher D, Thombs BD, et al. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: the PRISMA-DTA statement. JAMA. 2018;319:388–396.
15. Patel D, Fass R, Vaezi M. Untangling non-erosive reflux disease from functional heartburn. Clin Gastroenterol Hepatol. 2020;19:1314–1326.
16. Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155:529–536.
17. Schenk BE, Kuipers EJ, Klinkenberg-Knol EC, et al. Omeprazole as a diagnostic tool in gastroesophageal reflux disease. Am J Gastroenterol. 1997;92:1997–2000.
18. Johnsson F, Weywadt L, Solhaug JH, et al. One-week omeprazole treatment in the diagnosis of gastro-esophageal reflux disease. Scand J Gastroenterol. 1998;33:15–20.
19. Fass R, Fennerty MB, Ofman JJ, et al. The clinical and economic value of a short course of omeprazole in patients with noncardiac chest pain. Gastroenterology. 1998;115:42–49.
20. Fass R, Ofman JJ, Gralnek IM, et al. Clinical and economic assessment of the omeprazole test in patients with symptoms suggestive of gastroesophageal reflux disease. Arch Intern Med. 1999;159:2161–2168.
21. Juul-Hansen P, Rydning A, Jacobsen CD, et al. High-dose proton-pump inhibitors as a diagnostic test of gastro-esophageal reflux disease in endoscopic-negative patients. Scand J Gastroenterol. 2001;36:806–810.
22. Pandak WM, Arezo S, Everett S, et al. Short course of omeprazole: a better first diagnostic approach to noncardiac chest pain than endoscopy, manometry, or 24-hour esophageal pH monitoring. J Clin Gastroenterol. 2002;35:307–314.
23. Xia HH, Lai KC, Lam SK, et al. Symptomatic response to lansoprazole predicts abnormal acid reflux in endoscopy-negative patients with non-cardiac chest pain. Aliment Pharmacol Ther. 2003;17:369–377.
24. Bautista J, Fullerton H, Briseno M, et al. The effect of an empirical trial of high-dose lansoprazole on symptom response of patients with non-cardiac chest pain--a randomized, double-blind, placebo-controlled, crossover trial. Aliment Pharmacol Ther. 2004;19:1123–1130.
25. Dickman R, Emmons S, Cui H, et al. The effect of a therapeutic trial of high-dose rabeprazole on symptom response of patients with non-cardiac chest pain: a randomized, double-blind, placebo-controlled, crossover trial. Aliment Pharmacol Ther. 2005;22:547–555.
26. des Varannes SB, Sacher-Huvelin S, Vavasseur F, et al. Rabeprazole test for the diagnosis of gastro-esophageal reflux disease: results of a study in a primary care setting. World J Gastroenterol. 2006;12:2569–2573.
27. Bate CM, Riley SA, Chapman RW, et al. Evaluation of omeprazole as a cost-effective diagnostic test for gastro-esophageal reflux disease. Aliment Pharmacol Ther. 1999;13:59–66.
28. Fass R, Ofman JJ, Sampliner RE, et al. The omeprazole test is as sensitive as 24-h esophageal pH monitoring in diagnosing gastro-esophageal reflux disease in symptomatic patients with erosive oesophagitis. Aliment Pharmacol Ther. 2000;14:389–396.
29. Huamán JW, Aliaga V, Domenech G, et al. Cuál es la utilidad del test de inhibidores de la bomba de protones en el dolor torácico no cardíaco? [What is the utility of proton pump inhibitor testing in non-cardiac chest pain?] [in Spanish]. Gastroenterol Hepatol. 2014;37:452–461.
30. Remes-Troche JM, Carmona-Sánchez R, Soto Pérez JC, et al. Utilidad del rabeprazol como prueba diagnóstica en la enfermedad por reflujo gastroesofágico no erosiva [Utility of rabeprazole as a diagnostic test in non-erosive gastroesophageal reflux disease]. Rev Gastroenterol Mex. 2005;70:276–283.
31. Aanen MC, Weusten BL, Numans ME, et al. Diagnostic value of the proton pump inhibitor test for gastro-esophageal reflux disease in primary care. Aliment Pharmacol Ther. 2006;24:1377–1384.
32. Zheng J, Du ZM, Chen MH, et al. Diagnosis of gastroesophageal reflux disease-related noncardiac chest pain [in Chinese]. Zhonghua Yi Xue Za Zhi. 2008;88:1390–1393.
33. Kim JH, Sung IK, Sinn DH, et al. Comparison of one-week and two-week empirical trial with a high-dose rabeprazole in non-cardiac chest pain patients. J Gastroenterol Hepatol. 2009;24:1504–1509.
34. Cho YK, Choi MG, Lim CH, et al. Diagnostic value of the PPI test for detection of GERD in Korean patients and factors associated with PPI responsiveness. Scand J Gastroenterol. 2010;45:533–539.
35. Zhou LY, Wang Y, Lu JJ, et al. Accuracy of diagnosing gastroesophageal reflux disease by GerdQ, esophageal impedance monitoring and histology. J Dig Dis. 2014;15:230–238.
36. Dent J, Vakil N, Jones R, et al. Accuracy of the diagnosis of GORD by questionnaire, physicians and a trial of proton pump inhibitor treatment: the Diamond Study. Gut. 2010;59:714–721.
37. de Leone A, Tonini M, Dominici P, et al. The proton pump inhibitor test for gastroesophageal reflux disease: optimal cut-off value and duration. Dig Liver Dis. 2010;42:785–790.
38. Wang WH, Huang JQ, Zheng GF, et al. Is proton pump inhibitor testing an effective approach to diagnose gastroesophageal reflux disease in patients with noncardiac chest pain?: a meta-analysis. Arch Intern Med. 2005;165:1222–1228.
39. Weijenborg PW, Cremonini F, Smout AJ, et al. PPI therapy is equally effective in well-defined non-erosive reflux disease and in reflux esophagitis: a meta-analysis. J Neurogastroenterol Motil. 2012;24:747–757; e350.
40. Ribolsi M, Balestrieri P, Biasutto D, et al. Role of mixed reflux and hypomotility with delayed reflux clearance in patients with non-cardiac chest pain. J Neurogastroenterol Motil. 2016;22:606–612.
Keywords:

gastroesophageal reflux disease; noncardiac chest pain; endoscopy; heartburn

Copyright © 2022 Wolters Kluwer Health, Inc. All rights reserved.