Reproducibility of sequential ambulatory blood pressure and pulse wave velocity measurements in normotensive and hypertensive individuals

Objective: Errors in blood pressure (BP) measurement account for a large proportion of misclassified hypertension diagnoses. Ambulatory blood pressure monitoring (ABPM) is often considered to be the gold standard for measurement of BP, but uncertainty remains regarding the degree of measurement error. The aim of this study was to determine reproducibility of sequential ABPM in a population of normotensive and well controlled hypertensive individuals. Methods: Individual participant data from three randomized controlled trials, which had recorded ABPM and carotid-femoral pulse wave velocity (PWV) at least twice were combined (n = 501). We calculated within-individual variability of daytime and night-time BP and compared the variability between normotensive (n = 324) and hypertensive (n = 177) individuals. As a secondary analysis, variability of PWV measurements was also calculated, and multivariable linear regression was used to assess characteristics associated with blood pressure variability (BPV). Results: Within-individual coefficient of variation (CoV) for systolic BP was 5.4% (day) and 7.0% (night). Equivalent values for diastolic BP were 6.1% and 8.4%, respectively. No statistically significant difference in CoV was demonstrated between measurements for normotensive and hypertensive individuals. Within-individual CoV for PWV exceeded that of BP measurements (10.7%). BPV was associated with mean pressures, and BMI for night-time measurements. PWV was not independently associated with BPV. Conclusion: The variability of single ABPM measurements will still yield considerable uncertainty regarding true average pressures, potentially resulting in misclassification of hypertensive status and incorrect treatment regimes. Repeated ABPM may be necessary to refine antihypertensive therapy.


INTRODUCTION
Ambulatory blood pressure measurement (ABPM) has been shown to be the most cost-effective option to confirm a diagnosis of hypertension [1]. The reproducibility of average blood pressure (BP) taken by 24-h ABPM has been previously shown to be superior to the reproducibility of clinic BP [2][3][4]. However, the majority of studies examining ABPM reproducibility have been performed with time intervals of 12 weeks or less, or in individuals with long-term hypertension or a history of cardiovascular disease.
There are fewer studies comparing techniques for long-term monitoring, and clinic BP remains a first-line tool despite well known risks from white-coat or masked hypertension [5][6][7]. Long-term monitoring using ABPM could facilitate improved BP control, but the degree of variability between sequential ABPM in normotensive and stable hypertensive individuals otherwise free from overt cardiovascular disease is poorly characterized.
A retrospective analysis of individual patient data (IPD) from randomized controlled trials (RCTs) provides an opportunity to investigate ABPM measurement variability. Here, we present analyses of measurement variability from three studies, which investigated possible benefits of dietary modification on ambulatory BP and arterial stiffness in individuals who were normotensive or had well controlled hypertension [8][9][10]. The concurrent measurement of pulse wave velocity (PWV) in these individuals presented an ideal opportunity to directly compare the reproducibility of PWV against that of ABPM, as superior reproducibility may support the alternate use of PWV as a long-term monitoring tool for cardiovascular health.
Even in a healthy population, reproducibility of BP measurements will be affected by blood pressure variability (BPV). Some degree of BPV is a normative property, but a high variability has been shown to be associated with an increased risk of cardiovascular outcomes, independent of the mean systolic pressure [11][12][13]. Determinants of increased BPV may include general cardiovascular risk factors such as increasing age, arterial stiffness and adverse lipid profiles [14,15] amongst others. Elucidation of factors associated with BPV in this cohort may provide clues to modifiable risk factors for high BPV in cohorts at a higher risk of cardiovascular morbidities.
Therefore, the primary aim of the present study was to calculate reproducibility associated with sequential ABPM in this relatively healthy population of normotensive and well controlled hypertensive individuals. Secondary aims were to estimate BPV and its potential determinants and to compare the reproducibility of arterial stiffness to that of ABPM for evaluation of its use as a surrogate technique for long-term measurement of vascular health.

METHODS
Individuals and inclusion criteria
We analysed IPD from three RCTs investigating the impact of dietary modifications on cardiovascular outcomes. Firstly, the Fruit & Veg study (ISRCTN50011192) tested whether a potassium-rich diet was beneficial for treatment-naive prehypertensive individuals (n = 48) [8]. Secondly, the MARINA study (ISRCTN666664610, n = 312) examined if increasing intake of long-chain n-3 polyunsaturated fatty acids favourably affected endothelial function and arterial stiffness [9]. Finally, the CRESSIDA study (ISRCTN9282106, n = 162) considered how following UK dietary guidelines, instead of a traditional British diet, might affect vascular function [10]. All study and trial procedures were performed at Guy's and St. Thomas' NHS Foundation Trust. Each study was approved by a local research ethics committee.
Data were eligible for inclusion and analysis if they fulfilled the following criteria: individuals must have had at least two ABPM and two PWV measurements; the individual study arm must not have shown a significant change in BP measurements from baseline (statistical method detailed below); and there must have been no change in antihypertensive medications during the study. The second criterion ensured that any discrepancy between repeat measurements was secondary to measurement technique and physiological variability, rather than changes in an individual's true average BP (defined as a hypothetical estimate without measurement error and physiological variation [16]) resulting from dietary or other interventions. From the 522 available individual cases, 501 were retained in this analysis, as summarized in Fig. 1.

Measurements
ABPM measurements were performed with the A&D TM-2430 device (ScanMed, Moreton-in-Marsh, Gloucestershire, UK) in all studies. CRESSIDA and the Fruit & Veg study took five measurements of ABPM, whilst MARINA recorded three. The first baseline measurement for CRESSIDA was followed by a second baseline measurement approximately 3 weeks later, and then further measurements at 4, 8 and 12 weeks after the second baseline measurement. In Fruit & Veg, the first two measurements were approximately 6 weeks apart, and then subsequent measurements were every 11 weeks. MARINA measured ABPM at baseline, then 6 months and 12 months later. A full schedule of events can be found in Table S1, http://links.lww.com/HJH/C73 (in Supplemental Digital Content). ABPM devices were programmed to take measurements every 30 min from 0700 to 2200 h and hourly between 2200 and 0700 h, but daytime and night-time periods were defined by each participant according to a sleep diary. PWV was measured by applanation tonometry of the carotid and femoral arteries using the SphygmoCor device (Atcor Medical, Sydney, Australia) after at least 15 min of rest. Further details of study procedures and study outcomes can be found in the published articles [8][9][10].

Nocturnal dipping
Nocturnal dip category was estimated for each ABPM session, firstly according to a simple dichotomous outcome of dipper (night-time SBP fall ≥ 10% of daytime SBP) or nondipper (night-time SBP fall < 10% of daytime SBP). Dipping status was then further defined according to the four classic dipping patterns: dipping (nocturnal SBP fall >10% of daytime SBP), reduced dipping (nocturnal BP fall 1-10% of daytime SBP), reverse dipping (increase in nocturnal SBP) and extreme dipping (nocturnal SBP fall >20% of daytime SBP) [17].
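The two classification schemes described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the function names are hypothetical, and the handling of boundary values (a fall of exactly 10% or 20%, or a fall between 0 and 1%) is an assumption, as the definitions above do not specify it.

```python
def nocturnal_fall_pct(day_sbp, night_sbp):
    """Night-time SBP fall as a percentage of daytime SBP (positive = lower at night)."""
    return 100.0 * (day_sbp - night_sbp) / day_sbp


def dipping_binary(day_sbp, night_sbp):
    """Dichotomous outcome: dipper if the nocturnal fall is at least 10%."""
    fall = nocturnal_fall_pct(day_sbp, night_sbp)
    return "dipper" if fall >= 10.0 else "nondipper"


def dipping_four(day_sbp, night_sbp):
    """Four-category pattern. Boundary handling is an assumption of this sketch:
    exactly 20% counts as dipping, exactly 10% as dipping, and any fall above
    0% but below 10% as reduced dipping."""
    fall = nocturnal_fall_pct(day_sbp, night_sbp)
    if fall > 20.0:
        return "extreme"
    if fall >= 10.0:
        return "dipper"
    if fall > 0.0:
        return "reduced"
    return "reverse"  # no fall, or night-time SBP higher than daytime SBP


# Illustrative values only (mmHg), not study data
print(dipping_binary(130, 110), dipping_four(130, 110))
print(dipping_binary(130, 120), dipping_four(130, 120))
```

Applying either function to each ABPM session of an individual gives the sequence of categories whose agreement is then assessed.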

Data analysis
To verify whether individual arms of studies were eligible to be included in this analysis, repeated-measures analysis of variance (ANOVA) was used to assess if there were significant differences in BP across each study timeframe. If ANOVA demonstrated significance less than 0.05, posthoc pairwise comparisons were performed using the Bonferroni method between the initial and last ABPM measurement. Study arms were eligible to be included if the overall ANOVA significance was more than 0.05, or if the P value was less than 0.05 but with no significant difference between the first and last measurement (see Table S2, http://links.lww.com/HJH/C73, Supplemental Digital Content, which details the sequential BPs for each study arm, and the significance of any differences). As such, all arms of the studies were considered eligible to be included in this analysis. Correlations were tested with Pearson's correlation coefficient unless stated otherwise. Individual characteristics at baseline and study endpoint were compared with paired t-tests. Logarithmic transformation was used to calculate the within-individual coefficient of variation (CoV) and corresponding 95% confidence interval (CI) as described by Bland and Altman [18,19] for daytime, night-time and 24-h BP and heart rate (HR), and PWV. CoV was compared between normotensive and hypertensive individuals. Normotension was defined as baseline daytime SBP (SBP day ) less than 135 mmHg, whereas hypertension was defined as baseline SBP day at least 135 mmHg.
Each individual had two to five measurements of SBP day , night-time SBP (SBP night ), daytime DBP (DBP day ) and night-time DBP (DBP night ). The mean and standard deviation (SD) of these measurements were calculated for each individual. This intra-individual SD was used as an estimate of BPV for each individual. Multivariable linear regression models were used to analyse associations between patient characteristics and BPV, using an enter method.
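As an illustration of the log-transformation approach to within-individual CoV, the sketch below pools the within-individual variance of the natural-log measurements across individuals and back-transforms the resulting SD. It reflects one common reading of the log method (CoV = exp(s_w) - 1) and is an assumption about the exact formula; the function name and input layout are hypothetical, and the confidence interval is not computed.

```python
import math


def within_individual_cov(individuals):
    """Within-individual CoV from repeated measurements.

    `individuals` is a list of lists; each inner list holds the repeated
    measurements (e.g. daytime SBP in mmHg) for one person.
    """
    ss_within = 0.0  # pooled sum of squared deviations on the log scale
    df_within = 0    # pooled degrees of freedom
    for values in individuals:
        if len(values) < 2:
            continue  # a single measurement carries no within-individual information
        logs = [math.log(v) for v in values]
        mean_log = sum(logs) / len(logs)
        ss_within += sum((x - mean_log) ** 2 for x in logs)
        df_within += len(logs) - 1
    if df_within == 0:
        raise ValueError("need at least one individual with repeated measurements")
    s_w = math.sqrt(ss_within / df_within)  # within-individual SD on the log scale
    return math.exp(s_w) - 1.0              # back-transform to a proportion


# Illustrative values only, not study data
example = [[128, 135, 131], [142, 150], [118, 121, 125, 119]]
print(f"CoV = {100 * within_individual_cov(example):.1f}%")
```

Because the calculation is performed on the log scale, the estimate is unchanged if every measurement is multiplied by a constant, which is what makes CoV comparable between groups with different mean pressures.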
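The intra-individual SD used as the BPV estimate can be sketched with the Python standard library. The use of the sample SD (n - 1 denominator) is an assumption of this sketch, and the participant identifiers and values are hypothetical.

```python
from statistics import mean, stdev


def individual_bpv(readings):
    """Return (mean, SD) of one individual's repeated session averages.

    Uses the sample SD (n - 1 denominator), which is an assumption here.
    """
    if len(readings) < 2:
        raise ValueError("at least two measurements are required")
    return mean(readings), stdev(readings)


# Hypothetical daytime SBP session averages (mmHg) keyed by participant ID
sbp_day = {"P001": [128, 132, 130], "P002": [141, 150, 138, 146, 144]}
for pid, readings in sbp_day.items():
    m, sd = individual_bpv(readings)
    print(pid, round(m, 1), round(sd, 1))
```

The resulting per-individual SD is the dependent variable in the regression models, with the per-individual mean entered as one of the covariates.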
Effect of regression to the mean (or adaptation to the ABPM device) was analysed using repeated-measures ANOVA in a subset of 199 individuals who had five ABPM measurements. Fleiss' Kappa was calculated to determine agreement above chance in dipping categories.
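For readers unfamiliar with Fleiss' Kappa, a minimal sketch of the statistic is given below, treating each ABPM session as a "rater" that assigns an individual to a dipping category. The function name and the count-table layout are assumptions of this sketch, and no P value is calculated.

```python
def fleiss_kappa(counts):
    """Fleiss' Kappa from a table of category counts.

    counts[i][j] is the number of sessions in which individual i was
    assigned to category j; every row must sum to the same number of
    sessions (at least two).
    """
    n_subjects = len(counts)
    n_ratings = sum(counts[0])
    if n_ratings < 2 or any(sum(row) != n_ratings for row in counts):
        raise ValueError("every individual needs the same number (>= 2) of sessions")
    n_categories = len(counts[0])

    # Overall proportion of assignments falling in each category
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_ratings)
        for j in range(n_categories)
    ]
    # Observed agreement for each individual
    p_i = [
        (sum(c * c for c in row) - n_ratings) / (n_ratings * (n_ratings - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects   # mean observed agreement
    p_e = sum(p * p for p in p_j)   # agreement expected by chance
    return (p_bar - p_e) / (1.0 - p_e)


# Columns: [dipper, nondipper]; five sessions per individual (illustrative only)
table = [[5, 0], [3, 2], [4, 1], [1, 4], [2, 3]]
print(round(fleiss_kappa(table), 3))
```

A value of 1 indicates that every individual kept the same category across all sessions, whereas values near 0 indicate agreement no better than chance.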
Statistical tests were performed in SPSS version 25 (IBM, Chicago, USA), and significance defined as a P value less than 0.05. One author (L.K.) had access to all the data and takes responsibility for its integrity and the data analysis.

RESULTS
Baseline characteristics
Participant characteristics at baseline are summarized in Table 1. Baseline SBP day and SBP night were significantly correlated with age (r = 0.15, P = 0.001 and r = 0.14, P = 0.001 respectively), but DBP day and DBP night were not (r = 0.02, P = 0.70 and r = 0.85, P = 0.06). Baseline BMI was significantly correlated with all baseline pressure measurements (r = 0.26, P < 0.001 for SBP day , r = 0.29, P < 0.001 for SBP night , r = 0.16, P < 0.001 for DBP day and r = 0.23, P < 0.001 for DBP night ).

Associations between blood pressure variability and mean pressures
Significant associations were observed between mean ambulatory BP values and the variability of those measurements (Fig. 2). For SBP, both day and night measurements demonstrated a significant association between mean values and the SD of those measurements (SBP day r = 0.21, P < 0.001; SBP night r = 0.27, P < 0.001) (Fig. 2a and c). When the relationship was investigated using the ratio of variability and mean SBP (individual CoV), the relationship was no longer significant for day measurements (r = 0.03, P = 0.449) and was weaker for night measurements (r = 0.15, P = 0.001; Fig. 2b and d).
For ambulatory DBP measurements, a significant positive relationship was observed between mean values and the SD of those measurements (DBP day r = 0.17, P < 0.001 and DBP night r = 0.29, P < 0.001; Fig. 2e and g). Conducting the analyses with CoV removed the significant association for both DBP day and DBP night (r = 0.02, P = 0.696 and r = 0.10, P = 0.496, respectively; Fig. 2f and h).

Within-individual coefficient of variation for repeated ambulatory blood pressure measurements
Measures of within-individual CoV for each study and for the entire cohort are summarized in Table 2. Qualitatively, the CoV for each BP measurement is similar across the three studies. In the entire cohort, the CoV for daytime measurements is significantly lower than that for night-time measurements: 5.4% (95% CI 5.2-5.6) for SBP day compared with 7.0% (95% CI 6.7-7.3) for SBP night , and 6.1% (95% CI 5.9-6.4) for DBP day compared with 8.4% (95% CI 8.0-8.7) for DBP night . CoV is significantly lower for 24-h ABPM measurements: 4.8% (95% CI 4.6-5.0) for SBP, and 5.3% (95% CI 5.1-5.5) for DBP.
Reproducibility of ambulatory measurements was compared between individuals defined as normotensive at their baseline visit and those defined as hypertensive (Table 2 and Fig. 3). The mean baseline SBP for the normotensive group was 122 ± 8 mmHg compared with 144 ± 9 mmHg for the hypertensive group. When considering all normotensive versus all hypertensive individuals, there was no clear evidence of any difference in the reproducibility of SBP day , SBP night , DBP day or DBP night . However, both the CRESSIDA and Fruit & Veg studies showed significantly less variability in hypertensive individuals than normotensive individuals for measurements of DBP day : 4.4% (95% CI 3.9-4.9) in hypertensive individuals compared with 5.3% (95% CI 5.0-5.7) in normotensive individuals in CRESSIDA, and 5.5% (95% CI 4.8-6.2) in hypertensive individuals compared with 7.7% (95% CI 6.4-9.1) in normotensive individuals in Fruit & Veg.

Association of individual risk factors to individual blood pressure variability
Average estimates of individual BPV, as assessed by the SD, were as follows: SBP day SD 6.1 ± 3.3 mmHg, SBP night SD 6.6 ± 3.8 mmHg, DBP day SD 4.1 ± 2.3 mmHg and DBP night SD 4.4 ± 2.5 mmHg. BPV was not correlated with age for SBP day SD, SBP night SD, DBP day SD or DBP night SD (all P > 0.05). BMI at baseline was significantly correlated with SBP day SD (r = 0.09, P = 0.04), SBP night SD (r = 0.20, P < 0.001) and DBP night SD (r = 0.19, P < 0.001), and had a borderline significant correlation with DBP day SD (r = 0.09, P = 0.054). Table 3 summarizes multivariable linear regression investigating the associations of BPV with individual demographics and mean BP. No significant associations were demonstrated for age, sex or PWV with SD for SBP day , SBP night , DBP day or DBP night . SBP day SD was independently associated with nonwhite ethnicity, use of antihypertensive medication and mean SBP day . SBP night SD was independently associated with baseline BMI and mean SBP night . DBP day SD was only associated with mean DBP day . DBP night SD was independently associated with baseline BMI and mean DBP night . Further analyses were performed examining the effect of mean sleep duration on night-time variability, in a subset of 207 individuals in whom these data were available (the CRESSIDA and Fruit & Veg participants). Mean sleep duration was not independently associated with SBP night SD (P = 0.482) or DBP night SD (P = 0.160), as shown in Supplemental Digital Content, Table S3, http://links.lww.com/HJH/C73, which details the full linear regression models.

Adaptation to the ambulatory blood pressure monitoring device
Adaptation to the ABPM device was tested in a subset of 199 individuals who had the full five measures of each BP. Repeated-measures ANOVA shows no evidence of adaptation to the device in terms of SBP day , SBP night or DBP day (all P > 0.05). However, DBP night changed significantly over the course of sequential measurements, being at its lowest on the baseline visit, highest on second assessment, then decreasing sequentially (P = 0.001).
The majority of normotensive and hypertensive individuals were classified as normal dippers on the binary classification at study baseline. Dippers accounted for 240 (74%) of normotensive individuals compared with 145 (82%) of hypertensive individuals. When considering dipping status over all available measurements, 1% of normotensive individuals were nondippers throughout, 45% were dippers throughout and 53% were changeable over their measurements, compared with 3% of hypertensive individuals being nondippers throughout, 54% remaining dippers throughout and 41% changing their status. There was a weak but significant agreement in dipping status for both normotensive and hypertensive individuals (κ = 0.132, P < 0.001 and κ = 0.187, P < 0.001, respectively) when analysed over five ABPM measurements (n = 194).
Using the four categories of dipping (reverse, reduced, normal and extreme), the majority of normotensive and hypertensive individuals were again classed as normal dippers (51 and 44%, respectively). Both groups also showed a tendency to change category over the course of their measurements. In the normotensive group, 271 (84%) changed their dipping category, and in the hypertensive group, 141 (80%) changed their dipping category. In individuals with the full five measurements, only 11% of normotensive individuals maintained their original dipping category (κ = 0.107, P < 0.001). Similarly, only 11% of hypertensive individuals maintained their original dipping category over five ABPM measurements (κ = 0.160, P < 0.001).

DISCUSSION
To our knowledge, this is the largest study to examine reproducibility of serial ABPM measurements in a cohort of adults with minimal cardiovascular comorbidities. Reproducibility estimates are not dissimilar to those calculated by others. Our CoV estimates of 5.4 and 6.1% for daytime SBP and DBP, respectively, are close to the 5.5 and 4.9% calculated by Warren et al. [20] in a cohort of 163 individuals of similar age (although with a higher proportion of antihypertensive use) and lower than 7.4 and 6.3% calculated by Mansoor et al. [3] in their cohort of hypertensive patients (n = 25). Our night-time CoVs were slightly higher than those obtained by Mansoor et al. [3]: 7.0% compared with 6.3% for night-time SBP, and 8.4% compared with their 7.1% for night-time DBP [3]. Despite the large difference in baseline SBP day between the normotensive and hypertensive group, we did not demonstrate any marked differences in ABPM measurement reproducibility in normotensive versus hypertensive individuals. By using CoV as a measure of reproducibility (rather than SD, which is correlated to mean BP), we show that in our cohort, ABPM measurements were no more variable in stable hypertensive individuals than in normotensive individuals when the mean BP was accounted for. Variability of our night-time measurements generally exceeded that of daytime measurements, as also found by Bo et al. [21]. This could be attributable to inconsistency of nocturnal dipping patterns [22] or direct interruption of sleep due to the operation of the ABPM device. Poor sleep quality is associated with increased BPV [23] and with increased BP [24], but it is contentious whether ABPM devices impair sleep quality enough to produce a significant increase in nocturnal pressures [25,26]. We were not able to analyse the effect of sleep quality in this study, but sleep duration did not appear to have a significant effect on night-time variability.
When we analysed patterns of nocturnal dipping, we found little agreement above chance in categorisation of dipping status. This trend persisted whether we used four categories of classification, or a simplified dichotomous classification, and with little difference seen between normotensive and hypertensive individual groups. Although abnormal nocturnal dipping has been shown to be associated with adverse cardiovascular outcomes [27], its poor reproducibility shown by ourselves and others [21,22,28] may limit its use for stratifying risk. As many studies on nocturnal dip variability examine only two measurements, further large studies are needed to examine reproducibility of nocturnal dip over multiple measurements with an emphasis on determining subgroups particularly prone to high variation.
We have shown that BPV, an important predictor of cardiovascular risk, is positively associated with mean BP but were unable to demonstrate any significant associations with age, sex or concurrent arterial stiffness when the mean BP was accounted for. Arterial stiffening may be a long-term consequence rather than a cause of BPV [15], hence the lack of association seen in cross-sectional regression. Increased BMI was associated with a higher baseline BP and increased variability of night-time measurements, but not daytime pressures, which may reflect findings by others that higher BMI is associated with increased BPV, and disruption of normal nocturnal dipping patterns [29,30]. White participants appeared to have less variability in their SBP day measurements compared with nonwhite individuals, in agreement with other studies showing that African-Americans have higher BPV than white individuals, as well as higher mean ambulatory pressures [31], for which several physical and socioeconomic reasons have been suggested [32].
A secondary aim of this study was to examine if the variability of arterial stiffness, as measured by PWV, was superior to that of ABPM. Overall, PWV was found to have a CoV of 10.7% for the whole cohort, which is similar to that found by others in short-term studies [33], but higher than the CoV for the BP measures, which ranged from 5.4 to 8.4%. Coupled with the fact that PWV requires specialist equipment and user training, this suggests that, unless it is more strongly related to risk of clinical outcomes, it is not preferable as a surrogate measurement for long-term BP monitoring. PWV measurements appeared more variable in hypertensive compared with normotensive individuals, but this is to be expected given that mean and SD values of PWV were correlated, and PWV is itself highly correlated with concurrent BP.
Use of ABPM is becoming more widespread, as current guidelines recommend its use to confirm a new diagnosis of hypertension [34,35]. However, for long-term monitoring of BP, NICE still advises use of clinic BP measurement, with ABPM suggested as a confirmatory tool for individuals who could have white-coat or masked hypertension [35]. Reproducibility of repeated ABPM has been studied, but often in small cohorts and with a wide range of reproducibility indices used across the literature. Our cohort was composed of individuals with minimal cardiovascular morbidities who did not require initiation or alteration of antihypertensive medication during the study period. In such individuals, it could be hypothesized that variability of BP measurements should be minimal. However, we have shown that the within-individual variability of ABPM measurements is still large when considered in clinical context. An individual with borderline hypertension on clinic measurement may be given ABPM to confirm or refute the presence of true hypertension. If their true daytime SBP was 140 mmHg, however, a CoV of 5.4% for SBP day by ABPM implies that 95% of readings will normally occur within a range of 125-155 mmHg, making diagnosis uncertain. Similarly, for a true daytime diastolic pressure of 90 mmHg, 95% of measurements would occur within a range of 78-102 mmHg (based on a CoV of 6.1%). Night-time estimates may be subject to even greater variability, as we have noted that the CoV of night-time measurements is significantly higher than that of daytime measurements. Currently, NICE only recommends use of daytime ABPM to guide diagnosis [35], but future work could explore the use of night-time and 24-h BP to guide antihypertensive therapy, as nocturnal BP is correlated with cardiovascular outcomes [36,37], and variability of 24-h BP is less than daytime BP, as shown in this work and others [4,38,39,40].
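The systolic range quoted above can be reproduced with a short calculation. The sketch assumes that readings are approximately normally distributed around the true mean with SD equal to CoV multiplied by the mean, and that the 95% range is the mean plus or minus 1.96 SD; the function name is hypothetical.

```python
def reading_range_95(true_mean, cov, z=1.96):
    """Approximate range containing 95% of readings around a true mean.

    Assumes a normal distribution with SD = cov * true_mean; `cov` is a
    proportion (e.g. 0.054 for 5.4%).
    """
    sd = true_mean * cov
    return true_mean - z * sd, true_mean + z * sd


low, high = reading_range_95(140, 0.054)  # true daytime SBP of 140 mmHg
print(f"{low:.0f}-{high:.0f} mmHg")       # 125-155 mmHg
```

With a daytime diagnostic threshold of 135 mmHg, a range of this width spans both sides of the threshold, which is why a single ABPM session may not settle the classification.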
Clinicians should note that SD of measurements is proportional to mean pressure and precise assessment of BP in a hypertensive individual may therefore be subject to additional complexity. An additional consideration in the use of single ABPM measurements to guide treatment is the possibility of an adaptive response to the device, whereby the first use elicits an additional pressor response with subsequent values showing regression to the mean. Although we were unable to show evidence of adaptation in terms of SBP day , SBP night or DBP day , we did note some changes in DBP night over the course of sequential measurements and nocturnal ABPs have been shown to be susceptible to adaptation as well as daytime measurements [38,41].
Our recent work using Monte Carlo simulations of BP treatments showed that measurement error is the main cause for misclassification of BP target when undertaking stepwise titration of antihypertensive therapy [16,42]. Readings of low error are likely to improve BP control, a conclusion supported by general consensus [43,44]. It is interesting to note that the measurement margins calculated here are in excess of the likely response to antihypertensive monotherapy (~9.1 mmHg SBP, ~5.5 mmHg DBP), and may even exceed that expected for dual therapy in some instances [45], highlighting the limitations of single ABPM measurements.

Limitations
The present study is subject to several important limitations. Firstly, this study uses retrospective data from interventional studies, which were designed to detect differences from baseline in ABPM and PWV, rather than assess variability within a stable population over time. Furthermore, the three studies differ in design and so the extent to which their data are directly comparable must be considered. The analyses presented here were designed to mitigate these potential issues. Firstly, each study arm was only included if there was no significant change in measurements from baseline. This is a different approach to that used within each study, which generally compared intervention with control groups. In the CRESSIDA study this showed a significant 4.2 mmHg reduction in ambulatory daytime SBP in the intervention compared with the control group. Our approach was defined a priori and was designed to maximise the data available, albeit with a recognition that various interventions may have an unknown impact on measures of interest. For example, we note that subsequent analysis from the MARINA study has identified that genotype may have dictated an individual's response to the fish oils given [46]. However, even with a potential postintervention increase of up to 5 mmHg in endpoint SBP, our CoV estimates for SBP would not be significantly altered (calculation not shown). Secondly, we used IPD rather than summary data to provide more reliable results [47]. Thirdly, the limited number of repeat measurements for each participant may have inflated true values for individual variability but does approximate better to clinical practice than a high number of repeated ABPMs. The consistency of results between the three different studies provides some reassurance for our approach and the comparability of the datasets.
In conclusion, this study highlights that although ABPM is the gold standard for BP measurement and monitoring, variability between measurements may result in misclassification and incorrect treatment decisions. Within our analysis population, PWV measurement was not a more reproducible technique than ABPM when assessed as a CoV. Repeated ABPM may be necessary to refine antihypertensive therapy.