Computed tomography (CT) image quality is essential for high-quality diagnostics. Soon after the first commercial CT scanners had appeared, the need for standardized quality assurance (QA) tests arose.1 The main goal of the QA measurements is to ensure that CT image quality and dose are in agreement with specifications and international recommendations. Subjective evaluations with expert readers supplement these objective criteria, but due to the time-consuming nature of these evaluations, they are mostly used in optimization of image quality for one specific examination protocol.2–4
Diagnostic image quality cannot be fully assessed without knowledge of the anatomical area of interest and the pathology to be searched for. For this reason, many different anatomical phantoms have been developed, such as cardiac, liver, lung, and thorax phantoms, among others.5 These are more or less anthropomorphic, with differences in texture, density, size, and complexity. Reading conditions, such as ambient lighting or the display window settings used, also affect the reader's performance, and thus diagnostic quality.
However, there are basic criteria which must be fulfilled by all CT scanners. Therefore, some general purpose image quality QA phantoms and test methods have been developed. These are meant to be used for daily, weekly, and longer term QA tests. One of the most widely used such phantoms is the Catphan 600 (Phantom Laboratories, New York), which has a modular structure.1 Each module was designed for specific measurements. Performing well with this phantom is required, but not sufficient per se for accurate diagnostics.
Dose reduction and image quality are currently at the center of research. This research typically targets a few selected kernels or a few selected parameters, and optimizes these for dedicated diagnostic purposes.6–10 Iterative reconstructions (IRs) have started to gain wider acceptance in clinical practice in recent years, with a research focus on specific applications with respect to increased diagnostic accuracy and/or dose reduction.11,12 In contrast, this article focuses on the basic criteria and evaluates them in a large number of combinations of scan and reconstruction parameters. This exhaustive approach also allows discovering rare effects which are otherwise easily overlooked.
MATERIALS AND METHODS
The Catphan 600 phantom is a widely used general purpose phantom for CT image quality evaluation.1 It is a modular phantom where individual modules are used for specific tests. In our work, the CTP404 module was used for linearity tests to measure mean CT numbers, and the CTP515 module for contrast-to-noise measurements. The 2 modules and the performed measurements are depicted in Figures 1 and 2, respectively.
In the experiment, the same phantom was scanned 5 times with the exact same scanning parameters except for 2 parameters: peak tube voltage [kilovolt (peak), kV(p)] and effective tube current-time product (mAs). Effective mAs was selected to produce the same CTDIvol dose level (10.0 mGy) for all of the measurements. The peak voltages used were 70, 80, 100, 120, and 140 kV(p) with effective tube current-time products of 280, 613, 292, 178, and 122 mAs, respectively. The common parameters are presented in Table 1. The applied dose was close to the level normally used for abdominal CT with IR. The potential for dose reduction using IR algorithms was not evaluated in this study.
CT Scanner and Reconstruction Kernels
All scans were performed on a Siemens Somatom Definition Flash dual-source multi-slice CT scanner (Siemens AG, Forchheim, Germany; http://www.healthcare.siemens.com/computed-tomography/dual-source-ct/somatom-definition-flash). The scanner provides both application specific and general purpose reconstruction kernels. A summary of the kernels is presented in Table 2. In this article, we will refer to filtered back projection (FBP) and the related IRs as a kernel family or reconstruction family. Some FBP kernels have no iterative counterpart (Table 2), and some of them (B22f, B23f) are not present at 70 kV(p).
In this study, the general body kernels were investigated both with FBP and IR where IR was available. The IR is called Sinogram Affirmed Iterative Reconstruction (SAFIRE), and it operates in the projection domain in addition to the image domain to reduce noise and artifacts. The parameter choice of SAFIRE is its level or strength, which can be varied from 1 to 5.
Minimizing External Effects
All of the measurements were performed in the exact same patient-table position without modifying anything in the setup except the tube voltage, which yielded one scan for every tube voltage. Therefore, at a given tube voltage, every reconstruction used the exact same raw data. This approach was intended to eliminate both positioning and inter-phantom differences.
One of the most important image quality features is CT number linearity. Computed tomography images are a graphical representation of the linear x-ray attenuation coefficient (μ) of an object. Computed tomography numbers are measured in Hounsfield units (HU), where HU for water is 0, and HU for air is −1000. However, the relation between CT numbers and μ is not unambiguous.
A scanner can map the same physical object into slightly different CT numbers depending for instance on the spectrum of the x-ray tube, reconstruction kernel, or correction algorithms such as a dedicated beam hardening correction. Computed tomography numbers are sometimes directly used in diagnostics; therefore, it is of utmost importance that these values are accurate. If 2 kernels give different mean CT numbers for the same area, then any comparative study between them should take this fact into account.
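The HU scale described above can be sketched with the standard two-point scaling, in which water maps to 0 HU and air to −1000 HU. The function name and the attenuation values below are illustrative assumptions and are not taken from the study; in practice, the effective μ of water depends on the x-ray spectrum, which is one reason the mapping is not unique.

```python
def mu_to_hu(mu, mu_water, mu_air=0.0):
    """Map a linear attenuation coefficient to a CT number in HU.

    Two-point scaling: water maps to 0 HU and air to -1000 HU.
    """
    return 1000.0 * (mu - mu_water) / (mu_water - mu_air)


# Illustrative attenuation values only (1/cm); not measured in the study.
MU_WATER = 0.19
MU_AIR = 0.0002

print(mu_to_hu(MU_WATER, MU_WATER, MU_AIR))  # water reference point
print(mu_to_hu(MU_AIR, MU_WATER, MU_AIR))    # air reference point
```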
The CTP404 module (Fig. 1) of the Catphan 600 phantom contains 6 cylindrical inserts filled with solid materials and 1 with air as reference materials, and an optional water insert which was not used in this study. The reference materials [acrylic, polystyrene, low density polyethylene (LDPE), polymethylpentene (PMP), Delrin (DuPont's registered trademark), and Teflon (DuPont's registered trademark)] were selected to ensure that, for the most important density regions, the CT scanner produces the expected image. The reference objects are cylinders with 10-mm diameter. To avoid edge effects radially, a centered circular region of interest (ROI) with 5-mm diameter was used for the measurements. Along the axial direction, measurements from 7 slices (2 mm each) were used to reduce the effect of statistical fluctuations. The ROI of the measurements consisted of 399 voxels. Nominal values are reported in Table 3 for the phantom materials and for some similar typical tissues, based on the phantom reference manual13 and Holmes et al.14
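The ROI measurement described above can be sketched as follows: a centered circular mask smaller than the insert is applied to every slice, and the voxels are pooled before averaging. This is a minimal sketch on synthetic data, not the authors' software. The pixel spacing, HU values, and noise level are assumptions chosen only to make the example run.

```python
import numpy as np


def circular_roi_mask(shape, center_rc, diameter_mm, pixel_mm):
    """Boolean mask of pixels whose centers lie inside a circular ROI."""
    rows, cols = np.indices(shape)
    radius_px = 0.5 * diameter_mm / pixel_mm
    dist2 = (rows - center_rc[0]) ** 2 + (cols - center_rc[1]) ** 2
    return dist2 <= radius_px ** 2


def mean_ct_number(volume, center_rc, diameter_mm, pixel_mm):
    """Mean HU inside the same circular ROI, pooled over all slices.

    volume: array of shape (n_slices, n_rows, n_cols) in HU.
    Returns the mean and the number of voxels that entered it.
    """
    mask = circular_roi_mask(volume.shape[1:], center_rc, diameter_mm, pixel_mm)
    roi_values = volume[:, mask]  # shape: (n_slices, n_roi_pixels)
    return float(roi_values.mean()), int(roi_values.size)


# Synthetic data: 7 slices with a 10-mm insert on a uniform background.
rng = np.random.default_rng(0)
pixel_mm = 0.5  # assumed pixel spacing
volume = np.full((7, 64, 64), 90.0)                 # assumed background HU
insert = circular_roi_mask((64, 64), (32, 32), 10.0, pixel_mm)
volume[:, insert] = 340.0                           # assumed insert HU
volume += rng.normal(0.0, 5.0, size=volume.shape)   # assumed noise (HU)

# Measure with a 5-mm ROI so that the insert edge is excluded.
mean_hu, n_voxels = mean_ct_number(volume, (32, 32), 5.0, pixel_mm)
print(f"mean = {mean_hu:.1f} HU over {n_voxels} voxels")
```

Pooling 7 slices increases the voxel count sevenfold, which is what reduces the statistical fluctuation of the reported mean.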
Low contrast detectability is an important CT image quality descriptor. The CTP515 module supports both psychophysical tests and numerical comparisons. The numerical evaluation requires the calculation of contrast-to-noise ratio (CNR). Many different definitions exist and any of them can be used as long as it is used for relative comparisons. Contrast-to-noise ratio is calculated as follows15:
Here $S_A$ and $S_B$ are the mean values of the signals in the 2 ROIs, and $\sigma_A^2$ and $\sigma_B^2$ are the variances of these signals, respectively.
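For orientation, two commonly used CNR definitions built only from the two ROI means and the two ROI variances are sketched below. Which of them, if either, corresponds to the expression taken from reference 15 is an assumption. The second form is twice the square of the first, so both rank reconstructions identically in relative comparisons.

```latex
\[
\mathrm{CNR} = \frac{\lvert S_A - S_B \rvert}{\sqrt{\sigma_A^2 + \sigma_B^2}}
\qquad\text{or}\qquad
\mathrm{CNR} = \frac{2\,(S_A - S_B)^2}{\sigma_A^2 + \sigma_B^2}
\]
```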
The CNR value for each reconstruction was calculated from the same ROI pixels using the low contrast detectability cylinder with the largest diameter (15 mm) with 1% (10 HU) density as the signal, and a corresponding region of the same size just outside as the background. The central 10 mm area in 7 slices (2 mm each) was used for the measurement, which comprised 1575 voxels. The setup is depicted in Figure 2.
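A CNR measurement of this kind can be sketched on synthetic data as below. The definition used (absolute mean difference divided by the root of the summed variances) is one common choice and is assumed here for illustration; it is not confirmed to be the exact expression used in the study. The background mean and the noise level are likewise assumptions; only the 10 HU contrast and the 1575-voxel ROI size come from the text.

```python
import numpy as np


def cnr(signal_roi, background_roi):
    """CNR from two ROIs (assumed definition, for illustration).

    |mean difference| divided by the square root of the summed variances.
    """
    s_a, s_b = np.mean(signal_roi), np.mean(background_roi)
    var_a = np.var(signal_roi, ddof=1)
    var_b = np.var(background_roi, ddof=1)
    return float(abs(s_a - s_b) / np.sqrt(var_a + var_b))


rng = np.random.default_rng(1)
n_voxels = 1575   # ROI size reported in the text
noise_sd = 8.0    # assumed noise level (HU)
background = rng.normal(50.0, noise_sd, n_voxels)  # assumed background mean
signal = rng.normal(60.0, noise_sd, n_voxels)      # 10 HU above background

print(f"CNR = {cnr(signal, background):.2f}")
```

Because both ROIs are taken from the same slices of the same reconstruction, a change in the value between two reconstructions reflects the reconstruction and not the ROI placement.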
Each reconstruction provided 8 data points: 7 mean CT numbers for different materials, and 1 CNR. The materials had a nominal CT density to which the measurements should be compared during a QA test. Deviation from the nominal values should be in a certain range.13 The available combinations of tube voltages and kernels (FBP and SAFIRE) yielded 313 reconstructions, which resulted in 2504 data points for CT numbers and CNR.
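The data point count above follows directly from the numbers in the text, as this short check shows. The constant names are illustrative.

```python
N_RECONSTRUCTIONS = 313  # tube voltage and kernel combinations (from the text)
N_MATERIALS = 7          # mean CT numbers per reconstruction
N_CNR = 1                # CNR values per reconstruction

per_reconstruction = N_MATERIALS + N_CNR
total = N_RECONSTRUCTIONS * per_reconstruction
print(per_reconstruction, total)  # 8 2504
```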
Figures 3 to 9 visualize the measured mean CT numbers in the ROIs. The coloring in these figures is intended to visualize the distribution of the negative (blue), close to zero (gray), and positive (red) relative differences. It is clear from these figures that there are kernels with similar properties. The kernels 22, 31, 35, 36, 41, 45, 46, and 50 are all within the ±3 HU range of the reference kernel, for which kernel 36 was chosen. This group can be extended with kernels 60, 70, and 75, if the range is increased to ±4 HU, and air measurements are excluded.
Also, a second, smaller, weaker group of kernels can be identified. The core of this group consists of kernels 26 and 30, for which the reported mean values differ by less than 1 HU. Kernel 40 shows similar properties for medium attenuations (±2.5 HU), but for Teflon and air the differences are larger, up to 8.1 HU. Summarizing these results, the 2 groups consist of these reconstructions, where the parentheses show the kernels added under the relaxed conditions for group 1:
- 22, 31, 35, 36, 41, 45, 46, 50, (60, 70, 75)
- 26, 30, 40
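The grouping rule can be sketched as a tolerance test against a reference kernel, with an optional list of excluded materials for the relaxed condition. The mean CT numbers below are hypothetical placeholders, not the measured values of the study, and the function names are illustrative.

```python
def within_tolerance(kernel_means, reference_means, tol_hu, exclude=()):
    """True if every non-excluded material is within tol_hu of the reference."""
    return all(
        abs(kernel_means[m] - reference_means[m]) <= tol_hu
        for m in reference_means
        if m not in exclude
    )


def group_kernels(means_by_kernel, reference, tol_hu, exclude=()):
    """Kernels whose mean CT numbers stay within tol_hu of the reference."""
    ref = means_by_kernel[reference]
    return sorted(
        k for k, means in means_by_kernel.items()
        if within_tolerance(means, ref, tol_hu, exclude)
    )


# Hypothetical mean CT numbers (HU); NOT the measured values from the study.
means_by_kernel = {
    "36": {"air": -1000.0, "acrylic": 120.0, "teflon": 940.0},
    "41": {"air": -998.5, "acrylic": 121.0, "teflon": 942.0},
    "60": {"air": -990.0, "acrylic": 123.5, "teflon": 943.5},
    "26": {"air": -1012.0, "acrylic": 128.0, "teflon": 960.0},
}

strict = group_kernels(means_by_kernel, "36", tol_hu=3.0)
relaxed = group_kernels(means_by_kernel, "36", tol_hu=4.0, exclude=("air",))
print(strict)   # ['36', '41']
print(relaxed)  # ['36', '41', '60']
```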
The CT number uniformity tests allow a ±4 HU range for water or water equivalent material only.2 This kind of uniformity is not required for air and dense materials where noise and artifacts might play significant roles. The previously mentioned 2 groups yield lower inhomogeneities for medium dense materials than the requirement for homogeneity. However, where a material (air, Teflon) has to be excluded for a kernel to fit a group, the kernel's connection to that group is weaker, even though such materials are not tested in standard homogeneity tests.
The rest of the kernels (10, 20, 23, and 80) sometimes show similarities to other kernels but not strong enough to associate them with one of the groups. The unique behavior of kernel 23 can be explained by the fact that it applies a dedicated iodine beam hardening correction.
Low Contrast Detectability
One of the important measures in low contrast detectability is CNR. In general, IRs reduce the noise level and improve CNR. This does not, however, necessarily improve the diagnostic image quality, and often a medium noise suppression is preferred.16
Three relations were examined as follows:
- correlation between CNR and SAFIRE level for a given kernel,
- CNR as function of SAFIRE level at given tube voltages,
- relative CNR improvement as function of SAFIRE level at given tube voltages.
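The third relation, relative CNR improvement as a function of SAFIRE level, can be sketched as a percent change against the FBP value of the same kernel and tube voltage. The CNR values are hypothetical, and treating FBP as level 0 is a convention assumed here for illustration.

```python
def relative_cnr_improvement(cnr_by_level):
    """Percent CNR change of each SAFIRE level relative to FBP (level 0)."""
    baseline = cnr_by_level[0]
    return {
        level: 100.0 * (value - baseline) / baseline
        for level, value in cnr_by_level.items()
        if level != 0
    }


# Hypothetical CNR values; level 0 stands for FBP, 1 to 5 for SAFIRE strengths.
cnr_by_level = {0: 1.00, 1: 0.95, 2: 1.10, 3: 1.25, 4: 1.45, 5: 1.70}

improvement = relative_cnr_improvement(cnr_by_level)
for level, pct in improvement.items():
    print(f"SAFIRE {level}: {pct:+.1f}%")

# Flag levels whose CNR falls below the FBP baseline.
below_fbp = [level for level, pct in improvement.items() if pct < 0]
print(below_fbp)  # [1]
```

Normalizing to the FBP value makes kernels with very different absolute CNR comparable on one axis.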
First, sharper kernels benefited relatively more from IR, as demonstrated for 120 kV(p) in Figure 10. Other tube voltages produced similar results. Second, 2 of the 3 cardiac kernels (26, 36) show lower CNR for low level IR (SAFIRE 1) than for FBP. This was found for all tube voltages. Figure 11 depicts kernel 26 with this unexpected result, and Figure 12 shows the general case. Figure 13 shows all kernels with IR at 120 kV(p). Third, the same 2 kernels (26, 36) produced lower CNR values with IR than their slightly sharper general body kernel versions. That is, with FBP, kernel 26 is smoother than kernel 30, and kernel 36 is smoother than kernels 40 and 41. This order reverses at SAFIRE 3: kernel 30 becomes smoother than kernel 26, and kernels 40 and 41 become smoother than kernel 36 (Fig. 11). [CNR figures of the remaining kernels and peak tube voltages are available online as Supplemental Digital Content (see Figures, Supplemental Digital Content 1, http://links.lww.com/RCT/A56).]
Reported mean CT numbers might be affected by the size and the position of the ROIs. One particular example is beam-hardening artifact, which is a low-frequency artifact, and thus its appearance is affected by the low-frequency part of the modulation transfer function. Beam hardening is also sensitive to the x-ray spectrum, peak tube voltages, and patient size, among other factors. Therefore, linearity check alone cannot claim equivalence of 2 kernels. However, it is enough to claim that 2 kernels produce different results with a ROI size used in QA tests. The relatively small phantom size (20-cm diameter) and the fact that the results show no clear pattern when tube potential changes, imply that beam hardening is not the reason for the presented results. Images of some selected slices are presented in the Supplemental Digital Content (see Figures, Supplemental Digital Content 1, http://links.lww.com/RCT/A56).
There are 3 cases where similar behavior can be assumed for the kernels, see the kernel overview in Table 2. The cardiac kernels (26, 36, and 46) are different, not only in sharpness, but also kernel 26 yields different CT numbers from the 36's and 46's results. According to the application guide, kernels 30 and 31 (medium smooth and medium smooth+ kernels) should have the same visual sharpness, although with a slightly different noise structure. Therefore, it could be assumed, incorrectly, that they also produce similar mean CT numbers. The exact same pattern repeats with kernels 40 and 41 (medium and medium+ kernels).
These differences among the kernels should be taken into account during protocol optimization because they might affect the HU values and potentially the diagnostics. The assumption that kernels with similar purpose (eg, cardiac, same sharpness but different noise structure) yield similar mean CT numbers can be misleading.
In this study, no effect on the mean HU numbers was seen for SAFIRE compared to the corresponding FBP kernels. This is in accordance with other studies reporting for some selected kernels that they are not significantly affected by SAFIRE.17
CNR and IR Level
It is important to note that CNR depends on dose level, patient size, and tube voltage, among other factors, and therefore, the findings in this article might not be universal. Sharper kernels produce higher noise levels, and the main advantage of IRs is the reduction of the noise level. This implies that the sharper kernels benefit more from IRs. However, the CNR decrease for the relatively smooth kernels 26 and 36 at SAFIRE 1 is unexpected. The CNR decrease could originate from a change in signal or from a change in noise level. For both kernels, the cause of the lower CNR was consistently the higher noise level at SAFIRE 1. For these 2 kernels, the changes in CNR are decomposed into a signal and a noise part in the Supplemental Digital Content (see Figures, Supplemental Digital Content 1, http://links.lww.com/RCT/A56).
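One way to separate a CNR change into a signal part and a noise part is sketched below. It assumes that CNR is a ratio of contrast to noise, so that the logarithm of the CNR ratio splits into two additive terms. The decomposition actually used in the supplement is not described here, so this is only a possible approach, and all numbers are hypothetical.

```python
import math


def decompose_cnr_change(contrast_0, noise_0, contrast_1, noise_1):
    """Split the log-ratio of two CNR values into signal and noise parts.

    Assumes CNR = contrast / noise, so that
    log(CNR_1 / CNR_0) = log(contrast_1 / contrast_0) - log(noise_1 / noise_0).
    """
    signal_part = math.log(contrast_1 / contrast_0)
    noise_part = -math.log(noise_1 / noise_0)
    return signal_part, noise_part, signal_part + noise_part


# Hypothetical values (HU): unchanged contrast, slightly higher noise.
signal_part, noise_part, total = decompose_cnr_change(
    contrast_0=10.0, noise_0=8.0, contrast_1=10.0, noise_1=8.4
)
print(f"signal: {signal_part:+.3f}, noise: {noise_part:+.3f}, total: {total:+.3f}")
```

A negative total with a signal part near zero indicates that the CNR loss is driven by noise, which is the pattern reported for kernels 26 and 36 at SAFIRE 1.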
If one kernel is smoother and has higher CNR than another one (eg, B26f and B36f), then this order is expected to remain unchanged even if IR is applied (eg, I26 and I36 at SAFIRE 5). The mentioned break in the CNR curves invalidates this assumption for kernels 26 and 36. However, for the rest of the kernels, the assumption remains true. Contrast-to-noise ratio is only one of the many aspects of image quality. Despite the lower CNR results, these kernels might remain favorable for specific applications due to, for example, their different noise structure, but caution is recommended.
Unexpected results were found both for mean CT numbers and for low contrast detectability. Although kernels are manufacturer-specific, the conclusion is general: even with widely used kernels, differences can easily be overlooked. Therefore, any protocol optimization effort should devote extra attention to this detail.
This study shows that for kernels normally used for soft tissue, the HU values will be minimally shifted for tissue densities close to zero. The HU shifts were, however, observed for tissue densities in the higher and lower part of the HU scale. The results show that it is important that radiologists use absolute HU values with care for diagnostic purposes.
It is worth mentioning that the unexpected results are related to arguably the most frequently used medium soft and medium kernels (26, 36, 30, 31, 40, and 41). Future work should settle the question of whether modulation transfer function and noise power spectrum measurements provide similar results. Dose level, patient or phantom size, material-specific modulation transfer function, and kernel-specific correction algorithms make the optimization task even more complex.
1. Goodenough DJ. Development of a phantom for evaluation and assurance of image quality in CT scanning. Opt Eng
2. International Electrotechnical Commission (IEC). IEC 61223-3-5: Evaluation and routine testing in medical imaging departments—Part 3–5: Acceptance tests—Imaging performance of computed tomography X-ray equipment. 1.0. 2004:65.
3. International Electrotechnical Commission (IEC). IEC 61223-2-6: Evaluation and routine testing in medical imaging departments. Part 2-6: Constancy tests – Imaging performance of computed tomography X-ray equipment. 2nd ed. Geneva: IEC; 2006:1–67.
4. Hiles PA, Mackenzie A, Scally A, et al. Recommended Standards for the Routine Performance Testing of Diagnostic X-ray Imaging Systems: Report 91. 2nd ed. York, UK: Institute of Physics and Engineering in Medicine; 2005.
5. DeWerd LA, Kissick M. The Phantoms of Medical and Health Physics: Devices for Research and Development. Biological and Medical Physics. New York: Springer; 2014.
6. Fernandez A, Greffier J, Langard E, et al. Database to CT scan to reduce doses with iterative reconstructions (SAFIRE). Phys Med
7. Sieren JP, Hoffman EA, Fuld MK, et al. Sinogram Affirmed Iterative Reconstruction (SAFIRE) versus weighted filtered back projection (WFBP) effects on quantitative measure in the COPDGene 2 test object. Med Phys
8. Solomon J, Samei E. Quantum noise properties of CT images with anatomical textured backgrounds across reconstruction algorithms: FBP and SAFIRE. Med Phys
9. Straten M, van Mendrik A, Schaap M, et al. Are iterative reconstruction techniques better than filtered backprojection? Quantitative evaluation on a CT phantom. In: Proceedings, Radiological Society of North America, 97th Scientific Assembly and Annual Meeting. Chicago, IL: Radiological Society of North America; 2011.
10. Jo JK, Cheol KD, Jong-Woong L, et al. Measurement of image quality in CT images reconstructed with different kernels. J Korean Phys Soc
11. Korn A, Fenchel M, Bender B, et al. Iterative reconstruction in head CT: image quality of routine and low-dose protocols in comparison with standard filtered back-projection. AJNR Am J Neuroradiol
12. Löve A, Olsson ML, Siemund R, et al. Six iterative reconstruction algorithms in brain CT: a phantom study on image quality at different radiation dose levels. Br J Radiol
13. Goodenough DJ. Catphan 500 and 600 Manual. Greenwich, NY: The Phantom Laboratory; 2014.
14. Holmes EJ, Forest-Hay AC, Misra RR. Interpretation of Emergency Head CT: A Practical Handbook. Cambridge: Cambridge University Press; 2008.
15. Thitaikumar A, Krouskop TA, Ophir J. Signal-to-noise ratio, contrast-to-noise ratio and their trade-offs with resolution in axial-shear strain elastography. Phys Med Biol
16. Hardie AD, Nelson RM, Egbert R, et al. What is the preferred strength setting of the sinogram-affirmed iterative reconstruction algorithm in abdominal CT imaging? Radiol Phys Technol
17. Ghetti C, Palleri F, Serreli G, et al. Physical characterization of a new CT iterative reconstruction method operating in sinogram space. J Appl Clin Med Phys