Calman, Lynn PhD*; Beaver, Kinta PhD†; Hind, Daniel PhD‡; Lorigan, Paul MB§; Roberts, Chris PhD∥; Lloyd-Jones, Myfanwy DPhil¶
Lung cancer is the most commonly diagnosed cancer worldwide, with incidence rates continuing to increase in developing countries.1,2 Although survival in the first year has improved (25% in 2001),3 5-year survival remains poor. The reported purpose of follow-up is to manage toxicity of treatment, detect disease recurrence, and instigate further treatment/supportive care without delay.4 Follow-up has become increasingly relevant with the availability of second- and third-line treatments, which have resulted in improvements in quality of life (QOL) and survival. Patients with lung cancer have high levels of symptom burden,5 with uncontrolled symptoms often presenting as clusters.6 Guidelines on follow-up vary internationally (Table 1).
Concerns have been expressed about the effectiveness of conventional physician-led hospital-based follow-up, suggesting that it is ineffective at detecting recurrence,14 that earlier detection of recurrence does not improve life expectancy,15 and that patients' emotional distress is inadequately identified and addressed.16 Alternative strategies include follow-up by primary care physicians/general practitioners (GPs), nurses, the telephone, and open-access clinics where patients are able to drop in on demand.14,17–19
There is a paucity of evidence for different follow-up strategies for patients with lung cancer and their cost-effectiveness. An earlier systematic review was undertaken to inform clinical guidelines for the follow-up of non-small cell lung cancer (NSCLC), but this review did not meta-analyze survival data or include QOL or patient satisfaction data.8 For this reason, we undertook a systematic review and meta-analysis of follow-up strategies for both small cell lung cancer (SCLC) and NSCLC.
A review protocol is available from the corresponding author.
We aimed to identify all literature related to follow-up of patients with lung cancer. Searches were conducted in July and August 2008. Initially no language, study/publication, or date restrictions were applied. Relevant articles were identified and retrieved from Ovid Medline, Embase, PsycINFO, CINAHL, and British Nursing Index. All databases held by the Cochrane Library Issue 3, 2008, including Cochrane Database of Systematic Reviews, Database of Abstracts of Reviews of Effects, Cochrane Central Register of Controlled Trials, Cochrane Database of Methodology Reviews, Health Technology Assessment Database, and National Health Service Economic Evaluation Database were searched. Electronic searches were supplemented by reviewing reference lists for identified studies. Work in progress was searched for using Current Controlled Trials, National Institutes of Health clinical trials databases, and the National Research Register.
Only English language articles were included in the analysis, together with one German article that contained sufficient detail in the abstract and survival curves for inclusion. The full search strategy is shown in Figure 1.
Inclusion and Exclusion Criteria
The population of interest for this review was patients older than 18 years (with no upper age limit) who had been treated for primary lung carcinoma and were in follow-up. All lung cancer histological types, SCLC and NSCLC, were included in the review. All treatment options, surgery, radiotherapy, and chemotherapy (or, as is often the case, a combination of these) were included. All stages of lung cancer were included in the review.
In line with previous reviews and published guidance, lung cancer follow-up is defined as care after treatment, which is planned and multifaceted. For inclusion in the review, the primary focus of the study had to be multifaceted follow-up interventions. Multifaceted programs were defined as those that included multiple types of assessment, not just individual types of imaging or tests, and ideally included symptom management, education, health promotion (including smoking cessation if appropriate), and psychosocial support. Control groups were defined within individual studies, and no “standard” follow-up plan was common across studies, reflecting the variability in follow-up strategies in clinical practice. Variability in programs could be defined by intensity of follow-up; we compared more intensive with less intensive follow-up.
Primary outcomes included overall survival and asymptomatic survival. Secondary outcomes were time to detection of recurrence, and change in self-reported QOL from baseline (start of follow-up) to end of follow-up, or death. Generic, disease-specific, and tumor site-specific QOL measures were included if they measured QOL in a standardized format and produced aggregate summary scores. Both interview and self-completed formats for either generic- or disease (or tumor)-specific measures were included. Changes in symptom experience/burden and patient satisfaction were also included as outcomes if measured by a validated instrument. Only studies that reported at least one of the primary outcomes were included. Because of limited reported data, cost was not included as a formal outcome measure.
Although randomized controlled trials (RCTs) are widely regarded as the most appropriate design to evaluate efficacy of an intervention, a scoping review identified a paucity of studies using this design. Therefore, we included RCTs, quasiexperimental, and observational (case-control or cohort) studies. This approach may potentially result in less robust studies being used for the development of evidence but provides the best evidence currently available in this underresearched area.
Data Extraction and Quality Assessment
All abstracts were read, and studies meeting the inclusion criteria were identified. Data extraction and quality assessment were carried out by two reviewers (L.C. and K.B.). Discrepancies were discussed to reach consensus. A standardized form was developed to extract data from each study, including design, inclusion criteria, baseline characteristics of intervention and control groups, key aspects of study quality, and results. Details of the intervention in treatment and control (standard practice) groups were extracted, including frequency of visits, investigations ordered, and examinations conducted. Data on outcome measures were tabulated to facilitate quantitative data synthesis. Data on quality included evaluation of generalizability, reliability, validity, definition validity, theoretical basis, and clinical versus statistical significance.
If more than one study had similar populations, interventions, and outcomes, the relevant data were statistically synthesized. Time to event data were synthesized using hazard ratios (HRs) and 95% confidence intervals (CIs).
Because summary statistics were not available in any of the articles to allow direct calculation of the HR and CIs, HRs were estimated using two of the methods presented by Tierney et al.20 When the p value was reported, the HR and variance were estimated from the p value (log rank, Mantel-Haenszel, or Cox regression) and the number of events in each arm, via the observed minus expected (O − E) statistic. This was possible for 10 calculations. An Excel spreadsheet developed by the MRC Clinical Trials Unit was used to calculate the above statistics.21
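As a rough illustration of the p value approach described above, the following Python sketch estimates an HR from a two-sided p value and the number of events in each arm. It is not the MRC spreadsheet. The variance approximation V = (events in research arm × events in control arm) / total events is one of the approximations described in the literature on this method and is assumed here; the function name, parameter names, and the example numbers are hypothetical and are not taken from any included study.

```python
import math
from statistics import NormalDist


def hr_from_p_value(p_two_sided, events_research, events_control,
                    favours_research=True):
    """Approximate the HR and 95% CI from a reported p value and event counts.

    Assumption: V is approximated as Or * Oc / (Or + Oc), and the direction
    of effect must be supplied by the reader from the study report.
    """
    # Approximate variance of the log rank O - E statistic.
    variance = (events_research * events_control
                / (events_research + events_control))
    # z score corresponding to the two-sided p value.
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    # O - E is negative when there are fewer events than expected
    # in the research (more intensive) arm.
    o_minus_e = math.sqrt(variance) * z
    if favours_research:
        o_minus_e = -o_minus_e
    log_hr = o_minus_e / variance
    se = 1 / math.sqrt(variance)
    z95 = NormalDist().inv_cdf(0.975)
    return {
        "o_minus_e": o_minus_e,
        "variance": variance,
        "hr": math.exp(log_hr),
        "ci_lower": math.exp(log_hr - z95 * se),
        "ci_upper": math.exp(log_hr + z95 * se),
    }


# Hypothetical example: p = 0.05 with 50 events in each arm.
result = hr_from_p_value(0.05, 50, 50)
print(result)
```

The O − E and variance values returned here are the quantities that would then be carried forward into a pooled analysis.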
When the p value was not reported, the HR was calculated using survival curve data extracted from study reports.20 The survival curve from a PDF file of the published article was copied and imported into ImageJ software.22 The curve was divided into time periods as suggested by Parmar et al.23; no more than 20% of total events were included in each time period, and time periods differed between studies depending on event rate. An Excel spreadsheet was used to calculate the percentages in the control and treatment arms who were event free and the numbers at risk in each arm. From these curve data,21 the variance, observed minus expected (O − E) values, and HR were calculated.
Minimum and maximum follow-up were estimated and entered into the Excel file to estimate the HR accurately. These data were either identified directly from data reported or estimated from the survival curve using methods suggested by Tierney et al.20 This allowed censoring in the trial to be accounted for; otherwise, estimates of the HR would be based on too many patients.20 Tables 2 and 3 indicate which methods were used to calculate the HR for each study outcome.
These data were then entered into Cochrane Collaboration Review Manager 4.2 software.32 The Peto odds ratio (HR) was chosen as the effect measure, and the exp[(O − E)/Var] statistical method was applied.33
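The pooling step can be sketched as follows, under the usual fixed effect formulation in which the pooled log HR is the sum of the O − E values divided by the sum of the variances. This is a minimal Python illustration, not the Review Manager implementation; the function name and the input values are hypothetical.

```python
import math


def pool_peto(studies, z_crit=1.96):
    """Pool (O - E, variance) pairs into a fixed effect HR with a 95% CI.

    studies: list of (o_minus_e, variance) tuples, one per study.
    """
    total_o_minus_e = sum(o_minus_e for o_minus_e, _ in studies)
    total_variance = sum(variance for _, variance in studies)
    # Pooled log HR = sum(O - E) / sum(V); its standard error is 1 / sqrt(sum(V)).
    log_hr = total_o_minus_e / total_variance
    se = 1 / math.sqrt(total_variance)
    return (math.exp(log_hr),
            math.exp(log_hr - z_crit * se),
            math.exp(log_hr + z_crit * se))


# Hypothetical inputs for two studies: (O - E, variance).
hr, lower, upper = pool_peto([(-5.0, 20.0), (-3.0, 30.0)])
print(f"HR {hr:.2f} ({lower:.2f}-{upper:.2f})")
```

Because each study contributes in proportion to its variance, studies with more events carry more weight in the pooled estimate.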
A fixed effect model was used for the primary analysis; this is an assumption when using the variation of the Peto odds ratio for time to event data.33 Heterogeneity was investigated by examining the χ2 and I2 statistics.34
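For readers unfamiliar with these heterogeneity statistics, the sketch below computes the chi-square heterogeneity statistic (Q) and I2 from per-study O − E and variance values, using the standard definition I2 = 100% × (Q − df)/Q, truncated at zero. It is a minimal Python illustration with hypothetical function names and inputs, not the software output used in the review.

```python
def heterogeneity(studies):
    """Return (Q, degrees of freedom, I2 in percent).

    studies: list of (o_minus_e, variance) tuples, one per study.
    """
    total_o_minus_e = sum(o_minus_e for o_minus_e, _ in studies)
    total_variance = sum(variance for _, variance in studies)
    # Q = sum((O - E)^2 / V) - (sum(O - E))^2 / sum(V)
    q = (sum(o_minus_e ** 2 / variance for o_minus_e, variance in studies)
         - total_o_minus_e ** 2 / total_variance)
    df = len(studies) - 1
    # I2 is the share of variation attributable to heterogeneity, not chance.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Hypothetical inputs: two studies with effects in opposite directions.
q, df, i2 = heterogeneity([(-10.0, 20.0), (5.0, 20.0)])
print(f"Q = {q:.3f}, df = {df}, I2 = {i2:.1f}%")
```

An I2 of 0% arises whenever Q does not exceed its degrees of freedom, which is why small sets of studies with similar estimates often report no measurable heterogeneity.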
It was stated prospectively that subgroup analysis would take place on the basis of treatment intent. Treatment with curative intent and prognosis of early-stage NSCLC populations are sufficiently different to not support a pooled analysis with patients with advanced disease and palliative treatment. Curative intent treatment included surgery alone or multimodality treatment—a combination of chemotherapy, radiotherapy, and surgery with curative intent. Palliative treatment included chemotherapy and radiotherapy.
The reporting of this meta-analysis is in accordance with the PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions.35
Role of the Funding Source
No funders were involved in study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.
Electronic and hand searches identified 20,627 citations, resulting from the high sensitivity of the search. Follow-up search terms lack specificity, and a large number of citations were easily excluded when screening the title. An initial screen of the electronic search resulted in the selection of 203 abstracts for further analysis. Of these, 141 were excluded and 62 were identified for a full text review. Forty-six of the 62 were excluded because they did not meet the entry criteria. Sixteen studies had outcomes relevant to the study and were reassessed to identify the primary outcome and whether sufficient data were available for inclusion in the statistical analysis; nine were finally included in the review. Reasons for exclusions36–40 and trial flow are presented in Figure 2.
Four studies were based in North America,14,24,28,31 three in Europe,25,29,30 one in South America,27 and one in Japan.26 Five comparative cohort studies examined survival after follow-up regimes of varying intensity.24,28,26,27,29 Two single cohort studies of follow-up programs compared survival after symptomatic or asymptomatic recurrence.31,30 One RCT compared survival between nurse-led and conventional follow-up care.25 One cohort study compared survival between follow-up by GPs and conventional specialist-led hospital care31 (Tables 2 and 3).
Details of the follow-up programs implemented are given in Tables 2 and 3. It is important to note that in some studies the more intensive arm resembled standard or routine practice when compared with international guidelines, but these programs were compared with less intensive programs (e.g., regular versus no standardized follow-up).29 Higher intensity programs had more frequent imaging carried out at set time points. The exceptions to this were the studies by Moore et al.,25 where intensive follow-up included clinic visits and contact with patients but not increased imaging or extra clinical procedures, and Gilbert et al.14 where standard follow-up was compared with detection by the family practitioner.
Characteristics of Participants
A total of 1669 patients were included in the studies. Study sizes ranged from 75 to 358. In eight studies that reported gender, 67% of participants were men (n = 1487). An additional study (n = 182) did not provide numbers but reported that most participants were men.28 The overall median age range was 58 to 68 years.
The studies of follow-up care after potentially curative resection (NSCLC) included patients with stages I to III disease, reflecting the stage of disease deemed appropriate for curative intent treatment. The study investigating follow-up after palliative treatment included limited and extensive stage patients with SCLC. One study (Moore et al.25) included patients with a mix of stages and treatments but the minority of included patients (23%) had stages I to IIIa disease.
Quality of the Included Studies
Seven studies had retrospective database cohort designs.14,24,26–29,31 These retrospective studies are particularly prone to bias as they rely on data collected for another purpose.41 The prospective cohort study was relatively robust, reporting no nonresponses and adherence to the intervention of 92% for chest x-rays, 83% for computed tomography (CT) scans, and 93% for fiberoptic bronchoscopy.30 The RCT failed to report how the randomization sequence was generated or whether treatment allocation was concealed.25
Overall Survival—Intensive versus Nonintensive Follow-Up
Six studies examined survival in patients with lung cancer, comparing more intensive versus less intensive follow-up programs (Figure 3). In the curative intent subgroup, the general trend for improvement in survival favored more intensive follow-up: HR: 0.83 (0.66–1.05), but this was not statistically significant (p = 0.13). Between-study heterogeneity was low.
Two further studies reported this data: the RCT by Moore et al.25 with a mix of patients including all stages and treatment regimes and a nurse-led follow-up plan; and the retrospective design presented by Sugiyama et al.,26 including patients with SCLC who had a response to first-line chemotherapy. These two studies were sufficiently different, in terms of the sample and design, that pooling the findings was not useful; heterogeneity was high (I2 = 65%) when they were pooled.
Only one article30 reported a Cox model adjusted for treatment. Some factors were found to predict survival regardless of treatment (curative, palliative, or best supportive care). Absence of symptoms, female sex, performance status of 2 or less, and age ≤61 years were found to be favorable prognostic factors.
Survival—Asymptomatic versus Symptomatic Recurrence
High rates of relapse (between 21% and 71%) were reported even when curative treatment was intended, and patients were as likely to have a relapse detected with an unscheduled consultation as with routine appointments.14,24,30,27
In the curative intent subgroup, all the studies found that asymptomatic recurrence was associated with a significantly longer survival time: HR: 0.61 (0.50–0.74) (p < 0.01), with a low level of heterogeneity (Figure 4). No included palliative treatment studies presented this data. The study with a population of patients with SCLC, who responded to first-line chemotherapy, reported a recurrence rate of 91%.26
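As a consistency check on a pooled estimate reported in this form, the standard error and an approximate p value can be recovered from the HR and its 95% CI on the log scale. The Python sketch below applies this to the values quoted above; the function name is hypothetical, and because the published figures are rounded, the result is only approximate.

```python
import math
from statistics import NormalDist


def p_from_hr_ci(hr, lower, upper, level=0.95):
    """Recover the standard error of log HR, the z score, and a two-sided p value."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # The CI is symmetric on the log scale, so its width gives the standard error.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p


# Values reported for asymptomatic versus symptomatic recurrence.
se, z, p = p_from_hr_ci(0.61, 0.50, 0.74)
print(f"SE = {se:.3f}, z = {z:.2f}, p < 0.01: {p < 0.01}")
```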
Assessment of Secondary Outcome Measures
Time to Detection of Recurrence—Intensive versus Nonintensive Follow-Up
Three studies24,26,27 reported time to detection of recurrence with more intensive versus less intensive follow-up. Data were pooled for two studies,24,27 which reported time to disease progression after curative treatment. Intensive versus less intensive follow-up had no impact on time to detection of recurrence (HR: 0.85, CI: 0.50–1.42, p = 0.52) with low heterogeneity (I2 = 0%).
Sugiyama et al.,26 reporting on patients with SCLC after chemotherapy, found no statistically significant difference between the more intensive and the less intensive follow-up program (HR: 0.98, CI: 0.65–1.48). Asymptomatic recurrence was detected more frequently in the intensive arm. The included studies indicate little or no difference in how patients were treated regardless of whether they had asymptomatic or symptomatic recurrence.
Quality of Life
Only one study reported QOL outcomes.25 QOL was assessed at baseline (end of treatment) and at monthly intervals using the European Organisation for Research and Treatment of Cancer core questionnaire and lung cancer specific module.42 This questionnaire has five functional scales (physical, role, emotional, cognitive, and social), three symptom scales (fatigue, nausea and vomiting, and pain), and a global health status/QOL scale. At 3 months, patients on the nurse-led arm of the trial rated their dyspnoea as less severe (p = 0.03) than patients on the conventional follow-up arm. No other significant differences were found at this time point or at 6 months. At 12 months, compared with the conventional care group, patients in the nurse-led arm had improved scores for emotional functioning (p = 0.03) and less peripheral neuropathy (p = 0.05).
There is little robust evidence on which to base the follow-up of patients with lung cancer. Despite heterogeneity of patients and follow-up programs, it was felt useful to synthesize evidence using meta-analysis techniques to identify trends in published data to guide clinical practice. There is a trend for more intensive follow-up to improve survival of patients with lung cancer, although the data do not reach statistical significance. This trend is less convincing in the palliative treatment subgroup than the curative intent subgroup. This may not count as sufficient evidence to change practice or justify the potential increased cost of intensive follow-up in routine care.
Three included studies reported an economic evaluation of more intensive follow-up.14,25,27 Although cost-effectiveness was not originally considered in the review as an outcome measure, the findings from the economic evaluations warrant discussion given the current economic climate. Standard reimbursements (or National Health Service costs in the United Kingdom) for consultations and the associated costs of any tests or investigations were used to estimate cost; treatment costs were excluded. Two retrospective studies14,27 estimated that there would be a reduction in cost of approximately 70% with patient-led symptom-oriented follow-up versus more intensive prescribed follow-up. The RCT by Moore et al.25 did not identify a significant increase in costs for the nurse-led arm, but the cost of the intervention was not included, and they note that sample size may not have been sufficient to detect a difference.
A previous review of 10 published articles retrospectively estimated the cost of follow-up (not including retreatment) after potentially curative treatment.43 Regimes using bronchoscopy and CT scanning were at the higher end of the cost spectrum. When the costs of retreatment (further surgical intervention) were included,44 the cost per life year gained increased to $56,000 compared with the acceptable threshold set in the United Kingdom (£20,000 to £30,000)45 and the United States (reported in 2002 as $50,000 per life year gained).44 There are no contemporary published economic evaluations that take into account recent changes in imaging technology, such as positron emission tomography (PET) scanning, and targeted treatment, and their subsequent impact on resource utilization and QOL at follow-up.
Prognosis is poor for the majority of patients with lung cancer with recurrence or disease progression, and small improvements in survival may be clinically meaningful, even if not statistically significant. Virgo et al.28 report that, although the difference in survival between intensive and nonintensive follow-up in NSCLC was not statistically significant, patients in the intensive group had a survival advantage of 0.53 years. Large prospective studies are required to be sensitive to these small differences.
It is important to know whether follow-up regimes can identify patients who are asymptomatic at recurrence, assuming that by the time patients have symptoms, the recurrence is further advanced and, therefore, less amenable to treatment. There is a trend, but no clear evidence, that intensive follow-up reduces time to detection of recurrence.
Identification of asymptomatic recurrence seemed beneficial to patients treated with curative intent. Although this may be due in part to the opportunity for early intervention to treat recurrence, lead time bias may be a factor in the (apparently) extended survival of asymptomatic patients. This improved survival was not reflected in statistically significant differences between follow-up regimes. Studies had small sample sizes, and data were derived from retrospective database studies. The included studies are likely to be underpowered to show difference in survival. Intensive programs may not be sensitive enough to identify sufficient patients whose recurrence is asymptomatic to make a difference to overall survival. This may be due to the natural history of lung cancer as there is only a short period in which recurrence is asymptomatic, in contrast to other types of cancer such as cervical cancer where the lead time between asymptomatic disease and symptoms can be significant.
The absence of any statistical difference in time to recurrence when comparing more intensive and less intensive follow-up may be explained by deviation from scheduled follow-up appointments. In the study by Benamore et al.,24 two thirds of patients with relapse (n = 44) sought medical attention before a scheduled appointment, thus diluting the effect of intensive follow-up. This makes it difficult to make recommendations about the timing of follow-up appointments. It may be more important to give patients information about what symptoms would necessitate an earlier consultation. Sugiyama et al.26 report that although there was no statistical difference in time to recurrence with intensive follow-up, the median time to detection of initial occurrence was 5 months in the intensive arm and 7 months in the nonintensive arm. This may have clinical importance, particularly as asymptomatic recurrences were detected more frequently in the intensive arm (p = 0.03).
Whether patients had asymptomatic or symptomatic recurrence did not influence what treatment they were subsequently offered, reflecting other published research.46,47 Regardless of what therapy was instituted after recurrence, survival was poor. Nevertheless, this may change with new targeted agents.
Caution should be used when meta-analyzing observational studies as confounding and selection bias can cast doubt on findings: sources of bias and heterogeneity should be carefully examined and, although the statistical combination of observational studies should not be abandoned altogether, it should not be a prominent component of review.48 We have described issues of heterogeneity carefully in the limitations section to allow readers to make a judgment about the pooling of results. Observational studies are more prone to bias than experimental studies, leading to systematic differences in outcomes that are not due to intervention effects. As a result, observational studies tend to report larger treatment effects than RCTs.49 Only one included article attempted to address confounding variables such as age, treatment, performance status, and disease stage. Survival may be influenced by these additional factors.
Based on these findings, a case may be made for identifying asymptomatic recurrence after curative treatment. Treatment outcomes in patients with advanced disease are poor, and earlier detection of recurrence or progression may make little difference. Nevertheless, the potential of second- and third-line palliative and targeted treatments to improve survival may not have been captured in these included studies. For this reason, identifying patients with asymptomatic recurrence or progression after palliative treatment may be increasingly important.
It is important not to abandon structured follow-up programs on the basis of this survival analysis. Follow-up serves other important functions. There are few reported data on the impact of follow-up on symptom control, symptom avoidance, or QOL. Prediagnosis QOL is an important prognostic factor for patients with lung cancer,50,51 and early evidence suggests that there may be potential in investigating whether improved QOL after treatment improves survival.52 By primarily focusing on survival, the literature may be missing the valuable contribution that follow-up makes for the support of patients and their families. With the increasing importance of cancer survivorship outcomes, it may be important to include these in any future evaluation of aftercare.53
The preferences of patients with lung cancer for follow-up have recently been explored.54 Patients viewed follow-up favorably, citing reassurance as a positive outcome of the follow-up visit. Patients reported that they preferred to be seen by medical staff based in a hospital clinic, with nurse-led care as an acceptable option; primary care physicians (GPs) and telephone follow-up were viewed less favorably.54
Implications for Future Research
This meta-analysis highlights that there is scope for further research in lung cancer follow-up, particularly in SCLC. There is scant evidence for follow-up of patients after different treatment regimes.10 Robust RCTs are necessary to develop convincing evidence for follow-up interventions. It is unlikely that one follow-up program will be suitable for all patients and investigating alternative programs such as nurse-led, intensive, and primary care options would allow for patient and clinician choice. One ongoing trial with postoperative patients with NSCLC in France is comparing minimal follow-up (chest x-ray and clinical examination) with a more intensive program.55
It is important that future research includes well-defined patient groups, given the distinct differences in prognosis and focus of follow-up interventions. Patients treated with curative intent could gain a survival benefit from early diagnosis of recurrence, if more effective treatments are available in the future. Patients with advanced disease could benefit from early commencement of supportive care with the potential for improving QOL and possibly survival.52 Identifying and targeting subgroups who might benefit from intensive or other models of follow-up is of primary importance. For example, there is evidence that fluorodeoxyglucose-positron emission tomography CT can identify progression amenable to potentially radical treatment in a small proportion of patients (3%) and could be targeted at asymptomatic patients.56 A further economic evaluation using a decision analytic Markov model highlighted PET-CT as cost-effective (particularly for asymptomatic patients) in the follow-up of patients with NSCLC.57
Identifying an appropriate primary end point for studies of follow-up interventions for patients with lung cancer is imperative. It is difficult to measure, without robust studies, a statistically significant improvement in survival that can be attributed to follow-up programs. Nevertheless, for some patients, any improvement in survival, even if it is measured in weeks, is the most important clinical outcome. Follow-up should be multifaceted and encompass QOL/survivorship issues, including rehabilitation, as well as a focus on survival. Patient-orientated outcomes defined by patients, such as QOL and satisfaction, may be relevant primary outcomes of studies.
If the aim of follow-up programs is to identify early recurrence, it would be important to investigate the impact that this might have on patients' psychological well-being as patients look to follow-up appointments for hope, reassurance, and “good news.”54 More work is needed to give patients an evidence-based message about what follow-up achieves for them to make choices about their aftercare. The balance between psychosocial support and the survival advantage that more intensive, investigation-driven, follow-up may bring needs to be considered.
This meta-analysis of predominantly observational studies has to be interpreted with caution; results were pooled only where there was sufficient clinical homogeneity. Nevertheless, this may still be considered inappropriate for evaluating clinical effectiveness.49
An important limitation of this meta-analysis was the heterogeneity of the studies. There was heterogeneity in study design and in follow-up interventions, as described previously, and in populations and in the calculation of survival. The study samples were heterogeneous: for example, in the curative intent group, the addition of the study by Benamore et al.24 increased heterogeneity by including patients with locally advanced disease stage IIIA and stage IIIB who will have a poorer prognosis. Another source of heterogeneity was the date from which survival analysis was calculated. This was not reported in two retrospective studies, but, when reported, it was taken from the first encounter with health care after the completion of treatment. Because of the variability in follow-up regimes, this date differed from 1 week to 3 months posttreatment (Table 2).
Nevertheless, where data were pooled, heterogeneity in the analysis (I2) was 0% (Figures 3 and 4), suggesting that the effect size remained relatively constant despite the underlying clinical and methodological heterogeneity.
The interpretation of findings is complicated by issues of internal and external validity. Particular threats to the internal validity of the studies included in this analysis are selection bias through lack of randomization and lead time bias. Only English language studies were included. The role of confounding variables is addressed in one study only; survival may be influenced by confounding factors, but few data are reported, which would allow exploration of this hypothesis.
Conclusions and Recommendations
Current international clinical guidelines for NSCLC and SCLC are informed by expert opinion; limited empirical evidence is available (particularly for patients after palliative treatment) to inform clinical management of patients. Guidelines recommend regular follow-up. This review has established no evidence that would contradict this, and guidelines should be adhered to, so that regular contact with patients and carers can be maintained and further treatment and symptom management instigated if necessary.
Any future research should include patient- and carer-centered outcomes such as QOL and satisfaction to investigate the impact of follow-up on living with lung cancer and psychosocial well-being. Further research is warranted to understand patients' views of more intensive or invasive follow-up programs or those using sophisticated scanning techniques such as fluorodeoxyglucose-positron emission tomography CT. False positives or early, asymptomatic, diagnosis of recurrence when this does not improve survival may impact on QOL.
Alternative, innovative, approaches to follow-up should be developed to cater for different disease subgroups and patient choice. These programs should be evaluated through robustly designed RCTs.
Supported by a Medical Research Council (MRC) Special Training Fellowship in Health Services Research (G0601695) (awarded to L.C.).
Contributors: L.C. and M.L.-J. participated in the conception and design of the study, interpretation, and analysis of data. K.B. participated in data extraction and interpretation of data. D.H. participated in the design of the study and analysis of data. C.R. and P.L. participated in interpretation of data. All authors participated in the writing of the report and approved the final version.