Labby, Zacariah E. PhD*; Armato, Samuel G. III PhD*; Dignam, James J. PhD†; Straus, Christopher MD*; Kindler, Hedy L. MD‡; Nowak, Anna K. MD, PhD§‖
For matters involving tumor response, there is only one metric that can be used to ascertain the truth: tumor burden. If tumor composition is assumed to be consistent over time, then changes in tumor volume will directly correspond to changes in the number of tumor cells. Some molecular imaging methods are moving toward proliferative cellular quantification.1–3 However, until these methods become widespread, computed tomography (CT) imaging (with the possibility of volumetric quantification) will remain the best tool to assess the tumor burden for patients with malignant pleural mesothelioma (MPM).
Advances in medical imaging and image processing methodology allow for response assessment metrics that (1) use full three-dimensional volume measurements4–6 and (2) track continuous, rather than discretized, measurements over time.7,8 Disease volumes are a logical choice for tumor burden assessment of diseases such as mesothelioma, where the disease morphology is not compatible with the spherical geometry assumptions implicit in the Response Evaluation Criteria In Solid Tumors (RECIST) response assessment technique.9–11 The segmentation and volumetric quantification of mesothelioma with any degree of automation is a challenging task. The morphology of the disease is widely variable, and its radiographic density is comparable to that of neighboring tissues.12 Although volume measurements of MPM have been shown to exhibit lower interobserver variability than linear-thickness measurements made according to the modified RECIST protocol,13,14 the computational and manual challenges of the disease volume segmentation task are problematic.
Pleural disease volume was previously shown to be a significant predictor of MPM patient survival,3,15,16 but changing tumor burden affects more than just the volume of tumor. The hemithoracic space is fairly fixed so that when disease volume increases, aerated lung volume should be expected to decrease correspondingly. This physiologic correlation implies that changes in lung volume may have prognostic value for patients with MPM. Lung volume has been investigated to monitor response to surgical MPM tumor debulking17; changes in lung volume may also be a useful tool to assess tumor response for patients receiving chemotherapy so that instead of classifying response from declining tumor volume, response would be classified from increasing lung volume.
Both the linear measurements based on modified RECIST13 and lung volumes have certain advantages over disease volumes for response assessment. Disease volumes require substantial manual intervention. Linear-thickness measurements are almost entirely manual (though some automation techniques have been suggested)18 but require much less time than disease volume segmentation. Lung volume segmentation, on the other hand, is entirely automated. The purpose of this study was to compare the prognostic performance of changing lung volumes and linear-thickness measurements (treated continuously) with changing disease volumes in survival models for patients with MPM.
PATIENTS AND METHODS
Imaging and clinical data from 61 patients were obtained from a prospective study involving FDG-positron emission tomography and CT imaging of MPM.3 All patients were older than 18 years with histologically or cytologically confirmed MPM and had not received previous chemotherapy or definitive radiotherapy. Patient accrual occurred from late 2003 to 2010, and the original study was approved by the local institutional Human Research Ethics Committee at Sir Charles Gairdner Hospital (Nedlands, Australia) with patients providing written informed consent. The retrospective analysis of the Health Insurance Portability and Accountability Act-compliant data was approved by both the originating institution’s Human Research Ethics Committee and the Institutional Review Board at The University of Chicago, where the analysis was performed. Because the original study did not mandate a specific treatment protocol, patients were treated as clinically indicated. Initially, combination chemotherapy consisted of cisplatin and gemcitabine and later, when it became available at the original study institution, cisplatin and pemetrexed. For inclusion in this study, patients were required to have available modified RECIST tumor thickness measurements at baseline (before beginning the chemotherapy) and one or more follow-up scans during chemotherapy. As lung volume analysis was limited to patients with one nondiseased lung to serve as a control, the patients were also required to have unilateral disease. Finally, all patients were required to have a complete thoracic CT scan for all scan dates (and not simply scanned films) for automated lung segmentation. The summary description of the patient cohort is given in Table 1.
Patients were imaged using helical CT up to 1 month before the first cycle of chemotherapy and throughout their treatment regimen (typically after the first cycle, then every two cycles thereafter). CT staging was performed according to the Union for International Cancer Control TNM staging system (2002). CT scans were staged by a thoracic radiologist or medical oncologist experienced in mesothelioma imaging, and tumor measurements were made clinically according to the modified RECIST protocol on baseline and all follow-up scans.13 Pathologic staging was not performed. The clinical measurement protocol dictated that all imaging examinations from an individual patient be measured by the same clinician in an attempt to minimize variability.
A total of 216 CT scans were used in this study, with a median of four scans per patient. Eight patients had only a baseline scan with one follow-up scan, while 19 patients had three scans in total, 27 patients had four scans in total, and seven patients had five scans in total. The median interval between scans was 48 days. Of the 216 scans, 150 scans had been performed on General Electric scanners (HiSpeed CT/i, n = 81; LightSpeed Pro 16, n = 1; or LightSpeed VCT, n = 68; General Electric Co., Waukesha, WI), and 66 scans had been performed on Philips Brilliance 64-slice scanners (Philips Medical Systems, Cleveland, OH). At least 101 of the scans were performed with iodinated contrast media.
Only one reconstructed series was required for lung and disease segmentation for each CT scan date, and this series was selected for each patient with consideration for reconstruction kernel and slice thickness. Preference was given to thinner slice thicknesses and “standard” reconstruction kernels, but if for a given patient, there was a scan date with only “lung” kernel reconstructions, then matched kernel and slice thickness reconstructions were used for the other scan dates. Having this type of consistency across the scan dates for a given patient was considered important for segmentation of volumetric disease, because different amounts of disease might be segmented on different reconstructions due to, for instance, partial volume effects. Although linear-thickness measurements were consistently acquired using 5-mm reconstructions, multiple reconstructed slice thicknesses existed for each CT scan. For the series used in the lung and disease segmentation components of this study, slice thicknesses were 0.63 mm (n = 4), 1 mm (n = 14), 1.25 mm (n = 28), 2.5 mm (n = 75), or 5 mm (n = 95). In-plane voxel dimensions ranged from 0.54 to 0.86 mm, and all reconstructed axial images had an in-plane matrix size of 512 by 512 pixels. The kVp setting for the scans was predominantly 120 kVp (n = 212), with 100 kVp (n = 1) and 140 kVp (n = 3) also used. Reconstruction kernels fell into two broad categories, with “Lung” kernels (including the Philips “L” and General Electric “Lung” kernels) used for 136 scans and “Standard” kernels (including Philips “B” and General Electric “chest,” “soft,” and “standard” kernels) used for the remaining 80 scans.
Lung and Disease Volume Quantification
Lung region segmentation was performed using a segmentation algorithm described previously by Sensakovic et al.19 The lung segmentation method is fully automated and utilizes gray-level, morphological, and texture features to segment the aerated lung regions. The lung segmentation method has proven successful in other studies for patients with MPM.17,20 The resulting segmentations were all reviewed for accuracy and modified when necessary by an observer (Z.E.L.) trained in thoracic anatomy. In-house software (Abras) was used for this task, and the duration of any necessary intervention was tracked.
The pleural disease was segmented in each scan using a semiautomated method described previously.21 Because of the considerable overlap in Hounsfield Unit values between mesothelioma tumor and pleural effusion,12 the semiautomated disease volume segmentation method produces contours of pleural disease and does not readily separate tumor from effusion. Therefore, the end goal of the disease segmentation technique used in this study was reliable volumes of pleural disease and not necessarily volumes of only mesothelioma tumor. To calculate lung volume and pleural disease volume for each patient scan, a pixel-counting technique was used.22
As an independent validation of the lung segmentation method, lung segmentations were performed on a separate set of 44 CT scans from 22 patients with MPM (one baseline and one follow-up scan per patient). Automated lung segmentation was performed for each patient, and an attending radiologist (who was blinded to the computer results) contoured the aerated lung on three axial sections for the diseased (ipsilateral) hemithorax and healthy (contralateral) hemithorax (patients had unilateral disease). The area enclosed by both sets of contours was calculated, and the section-by-section areas were compared using Pearson’s correlation coefficient and Bland–Altman analysis.23
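As a rough illustration of the agreement analysis described above, the following Python sketch computes Pearson's correlation coefficient and the Bland–Altman bias with 95% limits of agreement for paired per-section areas. The function names and the sample areas are hypothetical and are not taken from the study; the limits assume the conventional form of bias ± 1.96 standard deviations of the paired differences.

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(automated, manual):
    """Return (bias, lower limit, upper limit) for automated minus manual."""
    diffs = [a - m for a, m in zip(automated, manual)]
    n = len(diffs)
    bias = sum(diffs) / n
    # Sample standard deviation of the paired differences.
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical per-section lung areas in cm^2 (illustrative values only).
automated_areas = [101.0, 55.2, 150.3, 80.1]
manual_areas = [100.0, 54.0, 149.0, 81.0]

r = pearson_r(automated_areas, manual_areas)
bias, lower, upper = bland_altman(automated_areas, manual_areas)
```

A positive bias in this convention means the automated areas are larger on average than the manual areas.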
Lung volumes were used as a response assessment metric by normalizing the ipsilateral lung volume by the contralateral lung volume for each patient scan. Although it is customary for CT scans to be acquired during patient breath-hold at full inspiration, it is possible that differences in patient respiratory phase between scan time points still exist. In patients with unilateral disease, the healthy (contralateral) lung can be used to normalize the volume of aerated lung in the diseased (ipsilateral) hemithorax, thereby controlling for any potential differences in inspiration. This normalized volume Vnorm was calculated as follows:
$$V_{\mathrm{norm}} = \frac{V_{\mathrm{ipsilateral}}}{V_{\mathrm{contralateral}}} \tag{1}$$
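The volume calculation and normalization can be sketched in a few lines of Python. This is a minimal illustration only: it assumes that the pixel-counting technique amounts to multiplying the number of segmented voxels by the voxel dimensions, and the function names and example voxel counts are hypothetical.

```python
def volume_ml(n_voxels, dx_mm, dy_mm, dz_mm):
    """Volume in ml from a count of segmented voxels (assumed voxel-counting form)."""
    # 1 ml equals 1000 mm^3.
    return n_voxels * dx_mm * dy_mm * dz_mm / 1000.0


def normalized_lung_volume(v_ipsilateral_ml, v_contralateral_ml):
    """Ipsilateral lung volume divided by contralateral lung volume."""
    if v_contralateral_ml <= 0:
        raise ValueError("contralateral lung volume must be positive")
    return v_ipsilateral_ml / v_contralateral_ml


# Hypothetical scan: 0.7 mm in-plane voxels and 2.5 mm slice thickness.
v_ipsi = volume_ml(800_000, 0.7, 0.7, 2.5)
v_contra = volume_ml(2_000_000, 0.7, 0.7, 2.5)
v_norm = normalized_lung_volume(v_ipsi, v_contra)
```

Because both lungs are measured on the same scan, a shallower or deeper breath changes the numerator and the denominator together, which is the motivation for the ratio.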
The different tumor response assessment methods in this study (linear-thickness measurements, disease volumes, and normalized lung volumes) were compared using rank correlation statistics. An R² value is reported for the fit between changes in linear thickness from baseline and changes in disease volume from baseline for a spherical geometry model (the geometry implicit in the derivation of the RECIST classification criteria). For a sphere with diameter d that changes by an amount Δd, the relative volume change is given by the following equation:
$$\frac{\Delta V}{V} = \left(1 + \frac{\Delta d}{d}\right)^{3} - 1 \tag{2}$$
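As a worked example of the spherical relationship, the short Python sketch below converts a relative diameter change into the relative volume change of a sphere. The function name is hypothetical. The two inputs are the familiar RECIST thresholds of a 30% decrease and a 20% increase in diameter.

```python
def sphere_relative_volume_change(relative_diameter_change):
    """Relative volume change of a sphere whose diameter changes by the given fraction."""
    return (1.0 + relative_diameter_change) ** 3 - 1.0


# A 30% decrease in diameter corresponds to roughly a 65.7% decrease in volume.
partial_response = sphere_relative_volume_change(-0.30)

# A 20% increase in diameter corresponds to roughly a 72.8% increase in volume.
progressive_disease = sphere_relative_volume_change(0.20)
```

Under this model, volume changes are much larger than the diameter changes that produce them, which is why a spherical fit predicts a strongly curved relationship between the two measurements.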
To compare the prognostic performance of the different response assessment methods, the univariate significance of all three metrics was assessed using Cox proportional hazards (PH) models with time-varying covariates.24–26 Furthermore, survival models were built using each response assessment method, and the clinical covariates from the final multivariate prognostic model obtained by Labby et al.21: Eastern Cooperative Oncology Group performance status discretized as level 0 versus levels 1 or 2, disease histology discretized as epithelioid versus other, and presence of dyspnea. Survival was defined as the duration from baseline imaging to either patient death or censoring (some patients in the cohort remain living).
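Cox models with time-varying covariates are commonly fit to data in a counting-process layout, with one row per interval over which the covariate is held constant. The Python sketch below shows one way such rows could be built from a patient's scan dates. It is an assumption about data preparation, not a description of the study's actual code, and the function name and example values are hypothetical.

```python
def to_counting_process(scan_times, values, end_time, died):
    """Build (start, stop, value, event) rows for one patient.

    scan_times: scan times in years from baseline, in increasing order.
    values:     covariate value measured at each scan.
    end_time:   time of death or censoring in years from baseline.
    died:       True if end_time is a death, False if censored.
    """
    if not scan_times or len(scan_times) != len(values):
        raise ValueError("scan_times and values must be non-empty and equal in length")
    rows = []
    for i, (start, value) in enumerate(zip(scan_times, values)):
        if start >= end_time:
            break
        # The final interval runs from the last usable scan to death or censoring.
        is_last = i == len(scan_times) - 1 or scan_times[i + 1] >= end_time
        stop = end_time if is_last else scan_times[i + 1]
        # The event flag is set only on the interval that ends in death.
        rows.append((start, stop, value, bool(died and is_last)))
        if is_last:
            break
    return rows


# Hypothetical patient: three scans, death at 1.2 years after baseline.
rows = to_counting_process([0.0, 0.1, 0.25], [0.0, -0.5, -0.8], 1.2, True)
```

Each covariate value is carried forward until the next scan, so the model only ever uses measurements that were available at the time of risk.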
All three response assessment methods were allowed to change over time and were modeled using scaled logarithmic transforms of relative changes from baseline, known as the specific growth rate (SGR).8 The definition of the SGR metric is as follows:
$$\mathrm{SGR}(t) = \frac{\ln\left[m(t)/m(t_0)\right]}{t - t_0} \tag{3}$$
where m(t) denotes the measurement (linear thickness, disease volume, or normalized lung volume) at an arbitrary time point and t0 indicates the time of baseline scanning (times in this study were all modeled as fractional years). The clinical covariates mentioned earlier were included along with (1) linear measurement SGR, (2) disease volume SGR, or (3) normalized lung volume SGR in multivariate survival models.
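A minimal Python sketch of the SGR calculation follows. It assumes that the SGR is the natural logarithm of the ratio of a measurement to its baseline value, divided by the elapsed time in fractional years; the function name and the example numbers are hypothetical.

```python
import math


def specific_growth_rate(m_t, m_baseline, t_years, t0_years=0.0):
    """Assumed SGR form: ln(m(t) / m(t0)) / (t - t0), in units of 1/year."""
    if m_t <= 0 or m_baseline <= 0:
        raise ValueError("measurements must be positive")
    elapsed = t_years - t0_years
    if elapsed <= 0:
        raise ValueError("follow-up time must be later than baseline")
    return math.log(m_t / m_baseline) / elapsed


# Hypothetical disease volume falling from 1300 ml to 1100 ml over 48 days.
sgr_disease = specific_growth_rate(1100.0, 1300.0, 48 / 365.25)

# Hypothetical normalized lung volume rising from 0.40 to 0.44 over the same interval.
sgr_lung = specific_growth_rate(0.44, 0.40, 48 / 365.25)
```

The sign carries the direction of change: a shrinking measurement gives a negative SGR and a growing one gives a positive SGR, regardless of the units of the underlying measurement.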
The performance of the survival models was assessed using Heagerty’s Cτ,27 which is derived from receiver operating characteristic analysis. Cτ is especially useful for survival models with time-varying covariates and is scaled from 0 to 1; Cτ = 0.5 would indicate no prognostic ability, and Cτ = 1.0 would indicate perfect prognostic ability. For this study, values of Cτ are reported both from training and testing on the same data set and from leave-one-out cross-validation for the different models. In addition, repeated random subsampling of the patient cohort was used to assess the difference in predictive ability between the three multivariate survival models for the different response assessment methods. In each of 1000 subsample iterations, each model was trained on two-thirds of the patient cohort and tested on the remaining one-third of the patient cohort. The training set was chosen randomly without replacement at each iteration, and the testing set was considered to be the remaining patients who had not been selected for the training set at that iteration. Each model (using the linear-thickness SGR, disease volume SGR, or normalized lung volume SGR assessment metric) was trained on the training cohort and then tested on the testing cohort. Therefore, for each subsample iteration, model performance statistics were tracked in a paired manner, and differences between models were assessed using the histogram of paired differences between testing cohort performance values. Models were considered significantly different if the 95% central confidence interval (CI) of subsample paired differences did not include a difference of zero. All analyses were performed using the academic edition of Revolution R Enterprise (version 4.3, based on R version 2.12; Revolution Analytics, Palo Alto, CA).28
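The repeated random subsampling comparison described above can be outlined in Python as follows. This is a sketch under stated assumptions rather than the study's code (the analysis itself was performed in R): the scoring functions `score_a` and `score_b` are hypothetical placeholders standing in for "fit a survival model on the training patients and return its Cτ on the testing patients", and the central interval uses a simple nearest-rank percentile.

```python
import random


def paired_subsample_differences(patients, score_a, score_b,
                                 n_iter=1000, train_frac=2 / 3, seed=0):
    """Paired differences in test performance over random train/test splits."""
    rng = random.Random(seed)
    n_train = round(len(patients) * train_frac)
    diffs = []
    for _ in range(n_iter):
        shuffled = list(patients)
        rng.shuffle(shuffled)  # sampling without replacement
        train, test = shuffled[:n_train], shuffled[n_train:]
        # Both models see the same split, so the difference is paired.
        diffs.append(score_a(train, test) - score_b(train, test))
    return diffs


def central_interval(values, level=0.95):
    """Nearest-rank central interval of a list of values."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[round(tail * last)], ordered[round((1.0 - tail) * last)]


def differs_significantly(diffs, level=0.95):
    """True if the central interval of paired differences excludes zero."""
    low, high = central_interval(diffs, level)
    return not (low <= 0.0 <= high)
```

Scoring both models on the same split at each iteration removes the split-to-split variability that the two models share, so the histogram of differences reflects only how the models differ from each other.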
Patients and Overall Survival
Median survival from pretreatment baseline imaging was 12.7 months (95% CI, 10.2–15.3 months; range, 1.7–60 months). Of the 61 patients, there were 58 recorded deaths; the remaining three patients were censored after a median duration of 34 months. Across all patients, the mean pleural disease volume at baseline was 1312 ± 853 ml (range, 225–4449 ml). At the time of the first follow-up scan, the mean disease volume had reduced to 1232 ml, with geometric mean change from baseline of −11%. By the end of treatment, the geometric mean change in disease volume from baseline was −17%.
Across all patients, the mean baseline ipsilateral lung volume was 1021 ± 574 ml, and the mean baseline contralateral lung volume was 2648 ± 639 ml. The mean normalized ipsilateral lung volume at baseline was 0.399 (range, 0.058–1.262). By the first follow-up scan, the normalized ipsilateral lung volume had increased to 0.420, with a 5% increase in geometric mean from baseline. By the end of treatment, the normalized ipsilateral lung volume had increased a geometric mean of 8% from baseline. Over the course of the entire treatment, the distinction between normalized ipsilateral lung volume increase and decrease was significantly associated with patient survival. Figure 1 shows the Kaplan–Meier survival curves for the two patient groups (log-rank p = 0.0003).
The extent of manual intervention necessary in the otherwise fully automated lung segmentation was minimal. For cases that required any intervention whatsoever (21% of all scans), the duration of manual intervention averaged approximately 1 minute. Only 1.9% of cases required 5 minutes or more of manual intervention. The predominant cause for manual editing of lung segmentations was erroneous inclusion of segmented bowel gas.
From the lung segmentation validation study (which did not allow manual intervention), there was very high agreement for area measurements of per-section lung segmentations between the manual approach and the automated method for the 132 axial sections evaluated. Pearson’s correlation coefficient was calculated as 0.973 (p < 0.0001). Using Bland–Altman analysis (Fig. 2), the mean bias indicated that automated lung area measurements were on average 1.17 cm² larger than manual measurements (or 1.1% larger given that the average section lung area was 102.03 cm²). The 95% limits of agreement in the difference between manual measurements and automated measurements were −19.52 to 17.19 cm², relatively small given the correlation and average measurement magnitude.
Linear and Volumetric Measurement Correlations
A plot comparing the relative change from baseline of linear-thickness measurements and disease volumes for the 61 patients in this study is shown in Figure 3. Each of the 155 points on the plot represents a single paired change from baseline (i.e., if a patient has a baseline CT scan and three follow-up scans, there will be three data points comparing linear measurement change from baseline with volume measurement change from baseline for that patient). For these data, Spearman’s rank correlation coefficient was estimated to be ρthickness = 0.676 and Pearson’s linear correlation coefficient was estimated to be rthickness = 0.665. Both correlations are positive, indicating that growth in disease linear thickness corresponds to growth in disease volume.
The relationship expected from a spherical geometric model (Eq. 2) is indicated in the plot with a dashed line. The quality of fit of the spherical model to the data is R² = 0.35. Visual inspection of the plot indicates that the data do not reliably fall along the dashed line and instead seem nearly linear in some locations. Although there was no theoretical reason to believe that mesothelioma would follow a spherical geometry (indeed, the shortcomings of the spherical model for this disease have already been investigated),11,29 Figure 3 provides the first empirical evidence for the inappropriateness of the spherical assumption implicit in the standard RECIST discretized response classification criteria for MPM.
A plot comparing the relative change from baseline of normalized ipsilateral lung volumes and disease volumes for the 61 patients in this study is shown in Figure 4. Again, each of the 155 points on the plot represents a single paired change from baseline. The nonparametric Spearman’s rank correlation coefficient was estimated to be ρlung = −0.687. The linear trend correlation from Pearson’s correlation coefficient was estimated to be rlung = −0.494. The correlation coefficients are negative, indicating that for an increase in normalized lung volume, the disease volume decreases. Trajectories of the two measurement techniques for one particular patient are shown in Figure 5.
All three response assessment methods were significantly associated with patient survival in univariate Cox PH survival models. Increases in continuous time-varying linear-thickness SGR measurements were associated with poor patient prognosis (hazard ratio [HR] = 1.53; p < 0.0001), as were increases in disease volume SGR (HR = 1.32; p = 0.0003) and decreases in normalized ipsilateral lung volume SGR (HR = 0.76; p = 0.003).
In multivariate Cox PH survival models including disease histology, dyspnea, and Eastern Cooperative Oncology Group performance status, all three response assessment methods remained significantly associated with patient survival. The model coefficients for the linear-thickness model, disease volume model, and normalized lung volume model are shown in Table 2. The HR estimates for the clinical covariates vary among the three multivariate survival models, but the variability is small compared with the 95% CIs given in Table 2.
Model performance was quantified using the Cτ statistic. The performance of the full multivariate model trained and tested on the same patient cohort was 0.692, 0.680, and 0.670 for the models using linear-thickness measurements, disease volume measurements, and normalized lung volume measurements, respectively, along with the same clinical covariates. In the leave-one-out cross-validation, these scores were reduced slightly to 0.657, 0.625, and 0.630, respectively. Finally, the mean random subsample performance values for the three models were 0.659, 0.638, and 0.628, respectively. These values are summarized in Table 3.
Paired differences in subsample testing cohort performance values between survival models incorporating the different response assessment methods were used to compare the utility of the different response metrics. The mean difference in paired Cτ performance values between the linear-thickness model and the disease volume model was 0.022, with a 95% CI of −0.077 to 0.123, and was therefore not significant (bootstrap p = 0.30). The mean difference in paired Cτ performance values between the normalized ipsilateral lung volume model and the disease volume model was −0.009, with a 95% CI of −0.087 to 0.077, and was therefore not significant (bootstrap p = 0.65). The performance of the linear-thickness model was on average 3.4% higher than the performance of the disease volume model; however, considerable overlap existed in the performance of the two models (the disease volume model outperformed the linear-thickness model in 30% of the random subsample iterations). The performance of the normalized ipsilateral lung volume model was on average 1.4% lower than the performance of the disease volume model, and even more overlap existed between the lung and disease volume models than between the linear-thickness and disease volume models.
In a previous study,21 it was shown for the first time that continuous and time-varying image-based measurements of pleural disease volume were significantly associated with patient survival in mesothelioma. This study extends the previous investigation to three tumor response assessment methods: linear-thickness measurements acquired using the modified RECIST protocol, semiautomated segmentations of pleural disease volume, and automated segmentations of normalized ipsilateral lung volume. These three response assessment methods are all significantly associated with patient survival, and there are no significant differences between models that incorporate the different response metrics. Practical differences, however, exist among the three measurement techniques and the resulting models.
Until recently, measurements of complete pleural disease volume for patients with MPM were time prohibitive, and linear-thickness measurements remain the clinical standard for response assessment. In the past few years, several software algorithms for the segmentation of mesothelioma on CT scans have been published,14,16,19 and researchers are now able to explore true disease volume as a response assessment method. The novel response assessment metric in this study is lung volume; lung segmentation is a comparatively easier computational task than pleural disease segmentation, and there is a reason to expect lung volumes to be generally correlated anatomically to disease volumes for patients with MPM. Although some gross anatomic changes to the affected hemithorax are possible in mesothelioma, a decrease in disease volume should result in a corresponding increase in the ipsilateral lung volume. Normalizing the ipsilateral lung volume by the contralateral lung volume corrects for differences in respiratory phase between a patient’s CT scans, and changes in normalized lung volume form a useful response assessment metric.
The correlations among the three response assessment metrics reported in this study were in line with expectations. One would expect changes in linear thickness to be correlated with changes in disease volume, as was shown; however, the spherical geometric relationship between tumor thickness and tumor volume implicit in the RECIST protocol does not hold in mesothelioma, as evidenced in Figure 3. The correlation between normalized ipsilateral lung volumes and disease volumes was also as expected, as decreases in disease volume were met by increases in normalized ipsilateral lung volume. An example of this correlation is shown in Figure 5, where changes in normalized ipsilateral lung volume and changes in disease volume are seen to closely mirror one another. Because of the high correlation among the three metrics, using more than one response assessment metric in the same Cox PH model results in at least one of the metrics becoming a nonsignificant covariate (usually with a p value > 0.20). Therefore, no more than one response assessment method at a time can be an independent significant covariate for patient prognosis.
The fact that the survival model with linear-thickness measurements outperformed (although not at a significant level) the disease volume survival model was unexpected. Disease volumes are logically better able to capture changes in overall tumor bulk, but perhaps changes in tumor thickness are physiologically more predictive of eventual patient survival than overall volumetric changes. The two response assessment methods provide different information, and although it was previously assumed that disease volumes should be the ultimate goal of any response assessment technique, it is possible that the specific type of morphological change quantified by tumor thickness measurements is more representative of patient benefit. Another possibility is that human observers are able to place their baseline tumor thickness measurements in locations that are in some sense more relevant for response assessment; volume measurements capture changes over the total extent of disease, while tumor thickness measurements only capture change in the discrete (up to six, by modified RECIST) locations at which baseline measurements were placed. Manual linear-thickness measurements are often placed in areas of distinct tumor presence, whereas the disease volume measurements may incorporate pleural fluid in some patients. It may be possible to improve the performance of the survival model using disease volume measurements if pleural fluid could be more reliably excluded.
Also interesting is the nearly identical performance of the survival models using disease volumes and normalized ipsilateral lung volumes. The similar performance of the two models reinforces the expectation that changes in (normalized) lung volume and disease volume should convey roughly equivalent information because of the physiological correlation between the two structures. Paired Cτ values from the random subsample testing cohorts were highly correlated (r = 0.77) between the survival models using disease volume and normalized ipsilateral lung volume.
Each response assessment method has advantages and disadvantages. It was shown by Frauenfelder et al.14 that the interobserver variability is substantially lower for disease volume measurements than for linear-thickness measurements, a fact that could become an important consideration if disease volumes were to be used clinically to assess tumor response. However, linear measurements require less manual time than semiautomated disease volume measurements, and existing techniques could potentially be used to partially automate the linear-measurement process and thereby reduce time and variability.30,31 Lung volume measurement is an automated process, and the only manual intervention used in this study was the correction of obvious segmentation errors from contrast artifacts and bowel gas. It is therefore reasonable to believe that lung volume measurements would have almost no interobserver variability. However, the utility of lung volume measurements for tumor response assessment is limited to patients with unilateral disease and those patients who do not have frequent changes in pleural fluid volume (such as with indwelling pleural catheters). Although unilateral disease is most common, this stipulation necessarily precludes lung volume–based response assessment for a small number of patients.
Although talc pleurodesis causes the fusion of the pleural space, there is no evidence to suggest that the procedure would affect the image-based lung volume measurement process. Furthermore, among the patients who underwent talc pleurodesis, an average of 157 days elapsed between the procedure and study entry. One patient underwent talc pleurodesis while on study, although a span of 56 days elapsed between talc pleurodesis and the next CT scan. Although talc pleurodesis induces local inflammation, this effect will likely have more of an impact on positron emission tomography–based measurements of metabolic activity than on CT disease burden or lung volume measurements.
An inherent limitation of this study is the relatively small number of patients evaluated. The survival models compared in this study form the starting point for a validation in independent patient cohorts and should not be taken as definitive response models. Although all the survival models in this study had statistically significant prognostic discrimination, absolute performance scores of approximately 0.65 are by no means perfect. Although the survival model from the linear-thickness measurements outperformed the other two models on average, there is no statistical basis to conclude that any one model is better than another. It should be further cautioned that the survival models in this study may not be applicable to patients who receive biologically different treatments than the cytotoxic therapy used for the patient cohort in this study.
In summary, this study compared survival models using three different tumor response assessment methods for patients with MPM undergoing chemotherapeutic treatment. Models were fit using clinical covariates identified in a previous study and linear-thickness measurements, pleural disease volume measurements, or normalized ipsilateral lung volume measurements. As a novel tumor response assessment technique, lung volumes exhibited the expected correlation with disease volumes. All three response assessment methods were significantly associated with patient survival. The model using linear-thickness measurements performed, on average, better than the other two models, though the differences were not significant.
The authors would like to acknowledge Philip Caligiuri, MD, for providing the manual lung contours used in the validation of the automated lung segmentation algorithm.
1. Francis RJ, Byrne MJ, van der Schaaf AA, et al. Early prediction of response to chemotherapy and survival in malignant pleural mesothelioma using a novel semiautomated 3-dimensional volume-based analysis of serial 18F-FDG PET scans. J Nucl Med. 2007;48:1449–1458
2. Lee HY, Hyun SH, Lee KS, et al. Volume-based parameter of 18F-FDG PET/CT in malignant pleural mesothelioma: prediction of therapeutic response and prognostic implications. Ann Surg Oncol. 2010;17:2787–2794
3. Nowak AK, Francis RJ, Phillips MJ, et al. A novel prognostic model for malignant mesothelioma incorporating quantitative FDG-PET imaging with clinical parameters. Clin Cancer Res. 2010;16:2409–2417
4. Jaffe CC. Measures of response: RECIST, WHO, and new alternatives. J Clin Oncol. 2006;24:3245–3251
5. Prasad SR, Jhaveri KS, Saini S, Hahn PF, Halpern EF, Sumner JE. CT tumor measurement for therapeutic response assessment: comparison of unidimensional, bidimensional, and volumetric techniques initial observations. Radiology. 2002;225:416–419
6. Boone JM. Radiological interpretation 2020: toward quantitative image assessment. Med Phys. 2007;34:4173–4179
7. Michaelis LC, Ratain MJ. Measuring response in a post-RECIST world: from black and white to shades of grey. Nat Rev Cancer. 2006;6:409–414
8. Mehrara E, Forssell-Aronsson E, Bernhardt P. Objective assessment of tumour response to therapy based on tumour growth kinetics. Br J Cancer. 2011;105:682–686
9. Therasse P, Arbuck SG, Eisenhauer EA, et al. New guidelines to evaluate the response to treatment in solid tumors. European Organization for Research and Treatment of Cancer, National Cancer Institute of the United States, National Cancer Institute of Canada. J Natl Cancer Inst. 2000;92:205–216
10. Therasse P, Eisenhauer EA, Verweij J. RECIST revisited: a review of validation studies on tumour assessment. Eur J Cancer. 2006;42:1031–1039
11. Oxnard GR, Armato SG 3rd, Kindler HL. Modeling of mesothelioma growth demonstrates weaknesses of current response criteria. Lung Cancer. 2006;52:141–148
12. Corson N, Sensakovic WF, Straus C, Starkey A, Armato SG 3rd. Characterization of mesothelioma and tissues present in contrast-enhanced thoracic CT scans. Med Phys. 2011;38:942–947
13. Byrne MJ, Nowak AK. Modified RECIST criteria for assessment of response in malignant pleural mesothelioma. Ann Oncol. 2004;15:257–260
14. Frauenfelder T, Tutic M, Weder W, et al. Volumetry: an alternative to assess therapy response for malignant pleural mesothelioma? Eur Respir J. 2011;38:162–168
15. Pass HI, Temeck BK, Kranda K, Steinberg SM, Feuerstein IR. Preoperative tumor volume is associated with outcome in malignant pleural mesothelioma. J Thorac Cardiovasc Surg. 1998;115:310–317; discussion 317
16. Liu F, Zhao B, Krug LM, et al. Assessment of therapy responses and prediction of survival in malignant pleural mesothelioma through computer-aided volumetric measurement on computed tomography scans. J Thorac Oncol. 2010;5:879–884
17. Sensakovic WF, Armato SG 3rd, Starkey A, Kindler HL, Vigneswaran WT. Quantitative measurement of lung reexpansion in malignant pleural mesothelioma patients undergoing pleurectomy/decortication. Acad Radiol. 2011;18:294–298
18. Armato SG 3rd, Oxnard GR, Kocherginsky M, Vogelzang NJ, Kindler HL, MacMahon H. Evaluation of semiautomated measurements of mesothelioma tumor thickness on CT scans. Acad Radiol. 2005;12:1301–1309
19. Sensakovic WF, Armato SG 3rd, Straus C, et al. Computerized segmentation and measurement of malignant pleural mesothelioma. Med Phys. 2011;38:238–244
20. Armato SG 3rd, Sensakovic WF. Automated lung segmentation for thoracic CT: impact on computer-aided diagnosis. Acad Radiol. 2004;11:1011–1021
21. Labby ZE, Nowak AK, Dignam JJ, Straus C, Kindler HL, Armato SG 3rd. Disease volumes as a marker for patient response in malignant pleural mesothelioma. Ann Oncol. November 8, 2012 [Epub ahead of print]
22. Sensakovic WF, Starkey A, Roberts RY, Armato SG 3rd. Discrete-space versus continuous-space lesion boundary and area definitions. Med Phys. 2008;35:4070–4078
23. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307–310
24. Klein JP, Moeschberger ML. Survival Analysis: Techniques for Censored and Truncated Data. 2nd ed. New York, NY: Springer; 2010
25. Cox DR. Regression models and life tables. J R Stat Soc B. 1972;34:187–220
26. Zhou M. Understanding the Cox regression models with time-change covariates. Am Stat. 2001;55:153–155
27. Heagerty PJ, Zheng Y. Survival model predictive accuracy and ROC curves. Biometrics. 2005;61:92–105
28. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2011
29. Labby ZE, Armato SG 3rd, Kindler HL, Dignam JJ, Hasani A, Nowak AK. Optimization of response classification criteria for patients with malignant pleural mesothelioma. J Thorac Oncol. 2012;7:1728–1734
30. Armato SG 3rd, Oxnard GR, MacMahon H, et al. Measurement of mesothelioma on thoracic CT scans: a comparison of manual and computer-assisted techniques. Med Phys. 2004;31:1105–1115
31. Armato SG 3rd, Ogarek JL, Starkey A, et al. Variability in mesothelioma tumor response classification. AJR Am J Roentgenol. 2006;186:1000–1006
Key Words: Chest CT; Malignant pleural mesothelioma; Therapy response assessment