In critical care, the patient is surrounded by a multitude of devices for monitoring, diagnostics, and therapy. These devices may include:
- Monitoring devices (patient monitors) to measure and monitor vital signs.
- Therapeutic devices to support or replace impaired or failing organs (e.g., ventilators, renal replacement therapy).
- Therapeutic devices to administer medications and/or fluids to the patient.
Most therapeutic devices also have physiologic monitoring functionality. More and more devices combine monitoring/sensing and therapy into closed-loop controllers in which both monitoring malfunction and therapeutic malfunction can have catastrophic effects. Examples include dialysis machines, in which the ultrafiltration rate is controlled by measured intake and output, and ventilators, in which the airway pressures are regulated following the measured tidal volume.
In the terminology of industrial process analysis, monitors are sensors that provide health care professionals with patient-related data and information that supports those professionals in making appropriate therapeutic decisions. Therapeutic devices, in this terminology, are the actuators to perform therapeutic decisions.
All devices have alarm capabilities and generate optic and acoustic alarms to alert the staff to either a change in the patient’s condition or a malfunction of the equipment. Although alarms are an important, indispensable, and sometimes lifesaving feature of nearly all medical devices, not only can they be a nuisance, but they can also compromise quality and safety of care by their frequent false positive alarms. Moreover, even in the case of a true and valid alarm, further problems arise, as devices from different vendors may annunciate the same alarm instance in different ways, and different alarm instances may result in similar alarms from different devices. The alarming of therapeutic devices may not always match the criticality of the clinical situation; e.g., the alarms of an infusion device are always identical irrespective of the drug that is infused.
Thus there are two major issues: the correct identification of a situation that needs to be alarmed, which can be considered a detection and decision problem; and the consistent and unambiguous annunciation of this alarm, which can be considered a user interface and human factors problem. This article focuses on the issues of alarm detection, and gives an overview of the clinical situation, the underlying problems, methods that have been tried to solve these problems, and new approaches to solving the problems of false alarms.
Alarms from Medical Devices
From a clinical viewpoint, an alarm is an automatic warning that results from a measurement or any other acquisition of descriptors of a state (here, patient state) and shall indicate a (clinically) relevant deviation from a normal (physiological) state.
Alarms are expected to monitor vital organ function, essential device function, and to increase patient safety and quality of care by early detection of any abnormality (1).
Following this notion, there are different goals for device alarms that follow a certain hierarchy:
- Detection of life-threatening situations: the detection and annunciation of life-threatening situations was the origin of monitoring alarms. Examples of such situations are asystole, extremely low blood pressures, or hypoxia. In these situations, false negative alarms are not acceptable, as immediate action is needed to prevent major patient harm or death.
- Detection of life-threatening device malfunction: this capability is essential for all organ support devices, and must recognize malfunctions, such as disconnection from the patient, occlusion of the connection to the patient, disconnection from power, gas, or water supply, and internal malfunctions.
- Detection of imminent danger: the early detection of imminent danger, for instance, the gradual change of a monitored variable over time, may make the caregiver aware of a dangerous situation before it becomes life-threatening.
- Detection of imminent device malfunction: the early detection of device-related problems that may lead to malfunction is an integral part of many therapeutic devices. These warning mechanisms include simple things such as low battery warnings or complex algorithms and sensors to track the wear of respiratory valves.
- Diagnostic alarms: a diagnostic alarm could indicate a pathophysiologic condition (for instance, hypovolemia) rather than alarming for the out-of-range variables.
Of course, the final goal could be to detect changes early and suggest appropriate therapy. This would constitute an online clinical decision support system that goes far beyond the scope of physiologic monitoring.
Problems of Device Alarms in Critical Care
There are several studies addressing problems of alarms in clinical practice. The vast majority of these studies evaluate alarms from patient monitors. Only a few investigate alarms of therapeutic devices.
The Rate of False Alarms
In the medical literature, there are many references to studies into the rate of false alarms in critical care monitoring (Table 1). These studies show that up to 90% of all alarms in critical care monitoring are false positives. In many cases they result from measurement and movement artifacts. The vast majority of all threshold alarms in the intensive care unit (ICU) have no real clinical impact on the care of the critically ill (2–4). In one study, 72% of all alarms resulted in no medical action (1). The study showed that the positive predictive value was only 27% and the specificity only 58%. The negative predictive value and the sensitivity were 99% and 97%, respectively. These results are supported by earlier studies that report a rate of significant alarms as low as 10% (4) or 5% (3). Although 27% were induced alarms (i.e., caused by nursing procedures), 68% were truly false alarms (3). The positive predictive value ranged from 5% to 16% depending on the monitored variable.
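The four figures quoted above all derive from the same two-by-two table of alarms against clinically relevant events. The following minimal sketch shows the arithmetic. The counts are hypothetical and are not taken from any of the cited studies; the function name `alarm_metrics` is likewise only illustrative.

```python
def alarm_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from alarm counts.

    tp: alarms raised for a clinically relevant event
    fp: alarms raised without a relevant event (false positives)
    fn: relevant events without an alarm (false negatives)
    tn: observation periods with neither alarm nor event
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, chosen only to illustrate the pattern in the text.
metrics = alarm_metrics(tp=30, fp=70, fn=1, tn=99)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With counts of this shape, sensitivity and negative predictive value remain high while the positive predictive value is low, because the false positives dominate the alarms that are actually raised.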
Among therapeutic devices, only ventilator alarms have been investigated more closely. One study in pediatric patients showed that the sensitivity and specificity of ventilator alarms did not differ significantly from those of ECG monitor alarms (3). In this study, however, technically false and clinically false alarms were combined. In another study, the rate of interventions after ventilator alarms was not different from that for monitoring alarms, but these interventions, in the case of ventilator alarms, were typically modifications of therapy (e.g., suctioning, readjustment of ventilator settings), whereas interventions triggered by monitoring alarms mostly resulted in sensor readjustments (1).
Alarms from therapeutic devices are technically correct most of the time, because they result from technical measurements inside the device (e.g., respiratory circuit pressures, flow rates), but they may often be clinically irrelevant, despite the correct measurement. Alarms from monitoring devices, in contrast, are often also technically incorrect, because the acquisition of the physiologic signal may be distorted.
Clinical Perception of False Alarms
The multitude of false alarms leads to a dangerous desensitization of the intensive care staff toward true alarms (7). One clinical consequence of this poor alarm handling was shown in another study (8): asked how monitoring alarms should be set adequately, 23 cardiac anesthesiologists and cardiac surgeons named only heart rate and arterial blood pressure alarms and suggested that all other cardiovascular alarms be disabled. Moreover, the alarm limits were set extremely wide, so that even definitely dangerous situations would not result in an alarm. Even at these settings, the clinicians would tolerate an alarm for up to 10 minutes before taking action.
There are not only too many alarms, most of them false, but there are also different alarms, including different acoustic or optic annunciations, for the same clinical alarm situation. In one study, critical care nurses could identify only 38% of all audible alarms correctly (9), results that were confirmed by other studies (10).
Reasons for and Classification of False Alarms
False alarms can be caused by technical problems or artifacts. Another reason for false alarms is the clinical inappropriateness of the alarm itself, the alarm limit, or the alarm algorithm.
From a medical perspective, there are three categories of false alarms:
- Technically false alarms are situations in which alarms are called for a specific variable, although this variable in reality does not surpass any preset threshold. Examples are an asystole call because of a low voltage lead, motion artifacts, falsely low SpO2 readings in hypothermic patients, or a wet flow sensor of a ventilator.
- Clinically false alarms are situations in which the variable is actually beyond the preset alarm threshold, but this situation does not always have clinical relevance. Examples are the intermittently increased heart rate in patients with absolute arrhythmia, when the increased heart rate may only last for a few seconds, or during weaning from mechanical ventilation, when a patient takes a deep spontaneous breath exceeding the tidal volume alarm limit.
- False alarms through interventions are actually both technically and clinically false alarms. They are caused by medical or nursing interventions, such as moving or positioning the patient, drawing blood and flushing the a-line, or disconnecting the patient from the ventilator for endotracheal suctioning.
In one study, approximately one third of all alarms originated from the ventilator, another third from the cardiovascular monitor, and some 15% from pulse oximetry alone (1). Another study found that pulse oximetry causes more than 40% of all false alarms (3).
New Alarm Algorithms
In summary, clinical experience and scientific studies show that the quality of medical device alarms is unsatisfactory to the degree that the poor quality not only annoys caregivers and hampers their work but also affects quality of care and patient safety. Although these problems are multifaceted, one root cause is the poor quality of alarm-generating algorithms, leading to unacceptably frequent false-positive alarms. Therefore, new, improved solutions for alarm algorithms need to be sought.
Requirements for Alarms and Alarm Algorithms
The ideal requirements for alarms are straightforward:
- All life-threatening situations shall be detected and alarmed, irrespective of whether they are patient- or device-related.
- Patient-related alarms shall be differentiated from equipment-related alarms.
- Devices shall warn before life-threatening situations occur.
- Devices shall give physiologic/diagnostic information with reference to an alarm situation.
Sensitivity and negative predictive value for life-threatening situations should be very close to 100%, while positive predictive value and specificity must not be too low. The latter is actually the problem for most alarm algorithms used today in commercially available medical devices.
Methodological Approaches to Alarm Detection
Basically, there are three levels at which generation of alarms can be improved:
1. Physiologic/technical front end (signal acquisition)
2. Alarm-generating algorithms for each physiologic variable (alarm generation)
3. Validation of the generated alarms (alarm validation)
Every method that is used to enhance the clinical and technical quality of monitoring alarms must fulfill certain methodological criteria. These include:
- robustness against artifacts and missing values
- real-time application (i.e., efficient and fast algorithms)
- predictable behavior
- methodological rigor
An alarm-detection algorithm can either consider one variable at a time (univariate methods) or more than one variable at a time (multivariate methods). In this context, univariate alarms denote alarms resulting from the analysis of a single monitoring variable (e.g., heart rate or systolic blood pressure). The basic problem is that most alarms are based on simple numeric thresholds. In response to the problems of univariate alarms, two general approaches can be distinguished:
- Improvement of the univariate alarm algorithm (e.g., by improving the detection of patterns of change in a specific variable). In this case, the clinical problem is univariate, and the approach to solving this problem is also univariate.
- Incorporation of information derived from variables other than the one under observation. In this case, the clinical problem is still univariate, but the approach to solving this problem is multivariate.
Multivariate alarms, in this context, denote alarms resulting from the simultaneous analysis of more than one monitored variable (e.g., all hemodynamic variables).
Today, univariate alarm algorithms are predominantly used in clinical practice, with few exceptions. Systems have recently been introduced in which one signal source may be checked against another signal source providing information about the same alarm variable (e.g., heart rate derived from electrocardiogram, pulse oximetry waveform, and invasive arterial pressure waveform). Table 2 gives an overview of patient-related alarm features in patient monitors. In research applications, multivariate algorithms are tested. Multivariate can mean that either the results of univariate analyses are combined, for instance, by logical combination, or that truly multivariate analyses (e.g., principal component or factor analyses) are performed.
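The cross-checking of one signal source against another can be illustrated with a small sketch. It assumes that heart rate is available from up to three sources and that the median of the available readings is compared against fixed limits. The function name, the source names, the limits, and the median rule are all assumptions made for illustration and do not describe any vendor's algorithm.

```python
from statistics import median


def cross_checked_heart_rate_alarm(sources, low=40, high=140):
    """Decide on a heart rate alarm from several redundant sources.

    sources maps a source name to a heart rate in beats/min, or to None
    if that source is currently unavailable. The limits are hypothetical.
    """
    available = [rate for rate in sources.values() if rate is not None]
    if not available:
        # Design choice of this sketch: no usable source counts as an alarm.
        return True
    consensus = median(available)
    return not (low <= consensus <= high)


# ECG reports 0 (e.g., a lead problem) while the other sources agree.
print(cross_checked_heart_rate_alarm({"ecg": 0, "pleth": 78, "arterial": 80}))
# ECG and pulse oximetry both report 0 and the arterial line is unavailable.
print(cross_checked_heart_rate_alarm({"ecg": 0, "pleth": 0, "arterial": None}))
```

The clinical problem stays univariate (is the heart rate out of range?), but a single distorted source no longer triggers the alarm on its own.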
Although the latter approach is still in an experimental stage, presenting both methodological and interpretation problems (overview in 11), the logical combination of univariate analyses has been tried in clinical studies (12).
Detailed Methodological Review
Many different methods have been investigated for use in alarm systems. Most approaches for the analysis of real-time monitoring data originate from the fields of statistics and artificial intelligence.
The following paragraphs give an overview of the most important methods that have been applied to, or are under investigation for, patient monitoring.
Improved Signal Extraction. The robust extraction of an underlying signal in the form of a time-varying mean from noisy physiologic time series can be a valuable data preprocessing method. For local approximation of a linear trend, Davies et al. (13) applied different robust regression techniques, such as the repeated median (14) in a moving time window, and compared them in elementary data situations. The procedures were further improved by integrating online outlier replacement rules (15) and by eliminating the time delay of the estimation (16). A fast algorithm for the computation of the repeated median ensures applicability in online-monitoring situations (17).
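A minimal sketch of repeated median regression in a moving window may make the idea concrete. It uses the straightforward quadratic-time definition of the repeated median rather than the fast algorithm cited above, and it omits the outlier replacement rules. The window width and the choice to evaluate the fitted line at the most recent time point are assumptions of this sketch.

```python
from statistics import median


def repeated_median_fit(window):
    """Fit a line to window[0..n-1], observed at times 0..n-1.

    Slope: median over i of the median over j != i of the pairwise slopes.
    Intercept: median of the observations after removing the slope.
    """
    n = len(window)
    slope = median(
        median((window[j] - window[i]) / (j - i) for j in range(n) if j != i)
        for i in range(n)
    )
    intercept = median(window[i] - slope * i for i in range(n))
    return intercept, slope


def extract_signal(series, width=11):
    """Estimate the signal level at the newest point of each full window."""
    levels = []
    for end in range(width, len(series) + 1):
        intercept, slope = repeated_median_fit(series[end - width:end])
        levels.append(intercept + slope * (width - 1))
    return levels


# A linear trend with one gross artifact in the middle of the window.
data = [2 * t + 1 for t in range(11)]
data[5] = 1000
print(repeated_median_fit(data))
```

Because both the slope and the level are medians of medians, a single artifact in the window does not move the fitted line, which is the property that makes the method attractive as a preprocessing step.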
In a recent publication, an algorithm for preprocessing patient monitoring data was proposed that splits the signal into line segments of different lengths (18). This online-segmentation method combines linear least-squares regression for data approximation, the cumulative sum technique for determining whether the current approximation is still acceptable, and artifact rejection based on fixed thresholds. The adequacy of threshold alarms for oxygen saturation based on the resulting signal was investigated with clinical data.
Artifact Filters. Mäkivirta et al. (19) used median filters to eliminate noise and particularly artifacts before the actual analysis of monitoring data. Their dual-limit alarm system with two median filters achieved an increase in the proportion of true alarms and a decrease in the frequency of false alarms per hour. The sensitivity was not degraded.
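The principle of median filtering before a limit check can be sketched as follows. The pairing used here (a short filter checked against wide outer limits, a long filter checked against narrow inner limits) is an assumption of this sketch and is not confirmed by the description above; the filter widths and limits are hypothetical values.

```python
from statistics import median


def running_median(values, width):
    """Causal median filter: median of the last `width` samples at each time."""
    return [
        median(values[max(0, t - width + 1):t + 1]) for t in range(len(values))
    ]


def dual_limit_alarm(values, inner=(50, 120), outer=(40, 150), short=3, long=9):
    """Return one alarm flag per sample.

    A short filter guards the wide outer limits (fast reaction to large,
    sustained deviations); a long filter guards the narrow inner limits
    (slow reaction to moderate, persistent deviations).
    """
    fast = running_median(values, short)
    slow = running_median(values, long)
    return [
        not (outer[0] <= f <= outer[1]) or not (inner[0] <= s <= inner[1])
        for f, s in zip(fast, slow)
    ]


spike = [80] * 10 + [200] + [80] * 10   # single-sample artifact
shift = [80] * 10 + [200] * 10          # sustained large change
print(any(dual_limit_alarm(spike)), any(dual_limit_alarm(shift)))
```

A single-sample artifact is removed by both filters and raises no alarm, whereas a sustained change passes the short filter after two samples.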
Statistical Process Control. Methods of statistical process control are widely used for controlling industrial production processes in real time. Control charts like the Shewhart chart (20) are used in alarm systems to detect “out-of-control” states in a process. The first approach to analyze medical data with respect to the detection of systematic changes by means of statistical process control can be traced back to Trigg (21). Kennedy (22) used a modified version of Trigg’s Tracking Variable to detect the onset of changes in systolic blood pressure. With simulated data, the method identified 94.1% of all changes correctly, whereas anesthesiologists correctly detected 85%. An overview of other early process control procedures for patient monitoring is given by Hill and Endresen (23).
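As an illustration, the tracking signal can be written in its commonly described textbook form: the exponentially smoothed forecast error divided by the exponentially smoothed absolute forecast error. This sketch is not Kennedy's modified version, and the smoothing constant is an assumed value.

```python
def trigg_tracking_signal(values, alpha=0.2):
    """Return Trigg's tracking signal for each observation after the first.

    The signal lies between -1 and 1. Values near 0 indicate unbiased
    one-step forecasts; values near +1 or -1 indicate a systematic change.
    """
    forecast = values[0]
    smoothed_error = 0.0
    smoothed_abs_error = 0.0
    signals = []
    for y in values[1:]:
        error = y - forecast
        smoothed_error = alpha * error + (1 - alpha) * smoothed_error
        smoothed_abs_error = alpha * abs(error) + (1 - alpha) * smoothed_abs_error
        if smoothed_abs_error > 0:
            signals.append(smoothed_error / smoothed_abs_error)
        else:
            signals.append(0.0)
        # Simple exponential smoothing provides the next one-step forecast.
        forecast = alpha * y + (1 - alpha) * forecast
    return signals


# Hypothetical systolic pressures: alternating noise, then a steady rise.
pressures = [120, 122, 118, 122, 118, 122, 118, 124, 128, 132, 136, 140]
print([round(s, 2) for s in trigg_tracking_signal(pressures)])
```

An alarm rule would compare the absolute value of the signal with a limit below 1; that limit, like the smoothing constant, would have to be chosen and validated for the variable at hand.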
Conventional industrial control charts are based on a target reference value. An application of this concept to physiologic data is problematic because a unique target value often does not exist because of intraindividual (within a patient) and interindividual (between patients) variance. Another shortcoming of conventional process control methods is that they do not account for autocorrelations (i.e., temporal dependencies in the data) and do not discriminate between clinically relevant patterns such as outliers, level shifts, and trends.
Statistical Pattern Detection with Time-Series Analysis. The task of identifying and discriminating between different patterns of change in physiologic monitoring data can be addressed by the application of statistical time series analysis methods, techniques for the assessment of single or multiple variables in the course of time.
Time-series analysis techniques have been used in various psychological, sociological, epidemiological, and other studies (24). Since the late 1980s, time series methods have been increasingly applied in intensive care medicine. It has been shown that time series analysis is suitable for retrospective analysis of physiologic variables (25). The online, i.e., real-time, analysis of monitoring data from intensive care is a statistically similar but methodologically more demanding task.
Dynamic Linear Models and Kalman Filters. An early attempt to apply time series methods to online monitoring data was made by Smith et al. (26), who used a multiprocess Kalman filter to monitor patients after kidney transplantation. This approach leads to online-probabilities for the occurrence of changes in the examined time series and identifies the type of change. A validation of this procedure was performed by Trimble et al. (27). Other research using modeling of biomedical time series with multiprocess dynamic linear models is described by Gordon and Smith (28). Daumer and Falk (29) modified the procedure of Smith et al. with the goal of detecting change-points in the time series. So far none of these approaches has been implemented in commercial products. One of the reasons is their strong sensitivity to misspecification of the model parameters.
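The building block of these approaches can be sketched with the simplest dynamic linear model, a local level model filtered by a single Kalman filter. This is not the multiprocess filter of Smith et al., which runs several such models in parallel and weighs them by probability; the two variance parameters below are assumed values.

```python
def local_level_filter(values, obs_var=4.0, state_var=0.25):
    """Kalman filter for the model  y[t] = level[t] + noise,
    level[t] = level[t-1] + drift.

    Returns (filtered level, standardized innovation) for each observation
    after the first. obs_var and state_var are hypothetical and must be
    specified by the user.
    """
    level, variance = values[0], obs_var
    results = []
    for y in values[1:]:
        predicted_variance = variance + state_var
        innovation = y - level
        innovation_variance = predicted_variance + obs_var
        gain = predicted_variance / innovation_variance
        level = level + gain * innovation
        variance = (1 - gain) * predicted_variance
        results.append((level, innovation / innovation_variance ** 0.5))
    return results


series = [100] * 6 + [130]
for level, z in local_level_filter(series):
    print(round(level, 1), round(z, 1))
```

A large standardized innovation marks an observation that the model did not expect. Because this quantity depends directly on `obs_var` and `state_var`, the sketch also shows why misspecified model parameters distort the results.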
Autoregressive (AR) models and self-adjusting thresholds. In recent years, AR models have been applied for the analysis of intensive care data. An AR model of order p describes the observation at one time point as a linear transformation of the p previous observations plus a random error. It was shown that low-order AR models are particularly suitable for the modeling of physiologic variables in the steady state. Pattern detection (i.e., the identification of deviations from the steady state) can be done by comparing the incoming observations with confidence intervals for the predictions. This principle of self-adjusting thresholds accounts for the physiologic and pathophysiologic intra- and interindividual variability of the critically ill. This approach has shown promising performance in detecting clinically relevant patterns in monitoring time series (24,25,30).
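The principle of self-adjusting thresholds can be sketched with a first-order AR model fitted by least squares to a recent window of observations. The model order, the window, and the interval width `z` are assumptions of this sketch, and the interval ignores the uncertainty of the estimated coefficients.

```python
from statistics import mean


def fit_ar1(window):
    """Least-squares fit of x[t] = c + phi * x[t-1] + error."""
    previous, current = window[:-1], window[1:]
    mean_prev, mean_curr = mean(previous), mean(current)
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(previous, current))
    phi = sxy / sxx if sxx > 0 else 0.0
    const = mean_curr - phi * mean_prev
    residuals = [c - (const + phi * p) for p, c in zip(previous, current)]
    sigma = (sum(r * r for r in residuals) / max(len(residuals) - 2, 1)) ** 0.5
    return const, phi, sigma


def outside_prediction_interval(window, new_value, z=3.0):
    """True if new_value deviates from the one-step prediction by more than z * sigma."""
    const, phi, sigma = fit_ar1(window)
    prediction = const + phi * window[-1]
    return abs(new_value - prediction) > z * sigma


# Hypothetical steady-state heart rates in beats/min.
steady = [80, 82, 79, 81, 80, 83, 78, 81, 80, 82, 79, 81]
print(outside_prediction_interval(steady, 81))
print(outside_prediction_interval(steady, 120))
```

Because the interval is re-estimated from the patient's own recent data, its position and width adapt to the individual level and variability instead of relying on one fixed threshold for all patients.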
Phase-space embedding. Gather et al. (31) transformed the dynamical information of the medical time series into geometric information by regarding several consecutive observations as a point in the Euclidean space. Based on this so-called “phase-space representation,” a procedure was developed that can be regarded as a general Shewhart control chart for dependent data. The method is especially designed for the online identification of outliers and level shifts.
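The embedding step itself is simple and can be sketched as follows. The distance from a componentwise median is only a crude stand-in for the outlier rule of the published procedure, which is not reproduced here; the embedding dimension is an assumed value.

```python
from math import dist
from statistics import median


def embed(series, m=2):
    """Turn a time series into points of m consecutive observations."""
    return [tuple(series[t:t + m]) for t in range(len(series) - m + 1)]


def distance_from_center(points, new_point):
    """Euclidean distance of new_point from the componentwise median of points."""
    center = tuple(median(coordinate) for coordinate in zip(*points))
    return dist(center, new_point)


history = [80, 81, 80, 81, 80, 81]
cloud = embed(history, m=2)
print(cloud)
print(distance_from_center(cloud, (80, 81)))
print(distance_from_center(cloud, (83, 85)))
```

Points formed from steady-state data cluster in a small region of the plane, so a new point far from that region indicates an observation that does not fit the recent dynamics.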
In a study by Gather et al. (32), dynamic linear models, autoregressive models, and phase-space models were compared with respect to their ability to detect outliers, level changes, and trends in time series of physiologic monitoring data. It could be shown that different patterns are best detected with different time series methods.
Trend detection and curve fitting. In one recent publication, a trend detection algorithm was described (33). Here, a trend is defined by certain criteria (e.g., the change in the average heart rate between the current and the previous minute is more than a specified value) which are evaluated at every time point. Each outcome is assigned a score. A trend is detected when the sum of all scores exceeds a trigger value.
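A score-based rule of this kind can be sketched in a few lines. The two criteria, their scores, and the trigger value below are hypothetical and are not those of the cited algorithm; the sketch also assumes that the scores of all criteria are summed at the current time point.

```python
def trend_detected(minute_means, step=5.0, trigger=4):
    """Score hypothetical trend criteria on per-minute heart rate averages.

    minute_means holds one average per minute, most recent value last.
    """
    if len(minute_means) < 6:
        return False
    current = minute_means[-1]
    score = 0
    # Criterion 1: change against the previous minute exceeds `step`.
    if abs(current - minute_means[-2]) > step:
        score += 2
    # Criterion 2: change against five minutes ago exceeds twice `step`.
    if abs(current - minute_means[-6]) > 2 * step:
        score += 3
    return score > trigger


print(trend_detected([70, 70, 71, 70, 71, 72]))
print(trend_detected([70, 72, 75, 80, 86, 93]))
```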
In an earlier study, trends were detected by using least squares regression of median filtered values in a moving window. The reliability of a trend was determined by estimates of the error variance (34). This method was able to detect most of the clinically relevant trends.
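This combination of median filtering, least-squares regression, and an error-variance check can be sketched as follows. The filter width, the critical ratio `t_crit`, and the rule that a trend counts as reliable when the slope exceeds a multiple of its standard error are assumptions of this sketch, not details of the cited study.

```python
from statistics import mean, median


def running_median(values, width=3):
    """Causal median filter over the last `width` samples."""
    return [
        median(values[max(0, t - width + 1):t + 1]) for t in range(len(values))
    ]


def ls_trend(window):
    """Least-squares slope of window against time 0..n-1 and its standard error."""
    n = len(window)
    t_mean = (n - 1) / 2
    y_mean = mean(window)
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(window)) / sxx
    residuals = [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(window)]
    error_variance = sum(r * r for r in residuals) / (n - 2)
    return slope, (error_variance / sxx) ** 0.5


def reliable_trend(values, filter_width=3, t_crit=3.0):
    """Return (is_reliable, slope) for one window of at least three raw values."""
    slope, std_error = ls_trend(running_median(values, filter_width))
    return abs(slope) > t_crit * std_error, slope


print(reliable_trend([100 - 2 * t for t in range(12)]))
print(reliable_trend([80, 81, 80, 79, 80, 81, 80, 79, 80, 81]))
```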
Fried and Imhoff (35) adapted Brillinger’s (36) approach for retrospective detection of a monotonic trend to the online-monitoring context by analyzing the data in a moving time window. Time-varying autocorrelations were estimated online. The procedure provides reliable results in case of moderate positive autocorrelations.
Several curve-fitting methods have been applied for the analysis of medical data. Haimowitz et al. (37) developed the program TrenDx for diagnosing pediatric growth disorders and identifying clinically relevant trends in hemodynamics and blood gases in ICUs. The observed time series were compared with predefined patterns, so-called “trend templates.” The main disadvantages of this method are that for a precise analysis, a large number of trend templates must be defined, and that autocorrelations are not accounted for.
Multivariate Statistical Methods. In intensive care, a multitude of variables is measured over the course of time. A dimension reduction of the data on which decisions are based can have several advantages. On the one hand, it can lead to better interpretability of the data. It has been shown that humans, including experienced physicians, are not able to develop a systematic response to a problem involving more than seven variables (38). On the other hand, it is known that the analysis of high-dimensional data with statistical methods often suffers from the sparseness of the sample space. This phenomenon is called the “curse of dimensionality” (39). A good representation of the data in a space with lower dimension can therefore help to solve these problems.
One way to achieve a dimension reduction is to select a subset of the most important variables. In clinical practice this is routinely done by physicians who select a few variables subjectively according to their experience and base their decisions on them. A statistical approach is to use graphical models (40) that allow assessing linear and possibly time-lagged associations between variables. The practical value of this method for the analysis of medical data was investigated by Gather et al. (41). Known relationships within the hemodynamic system could be reidentified by graphical models calculated for critically ill patients.
Another way to reduce the dimension of data is to search for combinations of the observed variables which contain as much information as possible. Statistical methods that can be helpful in this respect are principal component analysis and factor analysis. Principal component analysis is aimed at finding a parsimonious joint description for all variables. Factor analysis assumes that there are a few latent variables (factors) that drive the series and cause the correlations between the observable variables. In a case study, Gather et al. (42) investigated the use of dynamical factor analysis for monitoring time series. Although a statistical reduction of dimension is feasible, the resulting factors may not be intuitively interpretable by the caregiver. Therefore, further research in this area is still needed.
Study results for statistical approaches are summarized in Table 3.
Artificial Intelligence Approaches
Several methods from the field of artificial intelligence have been applied in intensive care medicine and anesthesiology. An overview is given by Hanson and Marshall (43) and Krol and Reich (44). This section focuses on procedures developed for patient monitoring.
Knowledge-Based Approaches. Initially, rule-based systems were developed using expert knowledge in an explicit decision tree. Although this approach is cumbersome, several studies reported encouraging results (e.g., a significant reduction in the response time of the clinician toward the cause of the alarm) (45).
Koski et al. (46) reported on a knowledge-based system for alarm-based diagnoses (12). Although investigations indicated that this approach performs well with respect to the correct detection of pathological states, the research project did not result in further publications or in a commercial product despite significant industrial funding.
Knowledge Discovery Based on Machine Learning. Rule-based clinical decision support systems in critical care that rely on manually acquired expert knowledge (47) are not always validated against clinical data and may not represent actual clinical practice. Machine learning methods appear very promising for the revision, refinement, and extension of knowledge bases. Examples include knowledge-based systems with revision and learning capabilities or support vector machines.
Imhoff et al. (48) and Morik et al. (49) developed a system that combines pattern detection using phase space models (31), support vector machines (50) for learning state-action rules (rules for the appropriate intervention for a given physiological state) and a first-order logic representation in MOBAL (51) for modeling medical knowledge in terms of action-effect rules (rules for the expected effect of a given intervention). The knowledge base is constantly validated against patient data. The VIE-VENT system developed by Miksch et al. (52) for monitoring and therapy planning for mechanically ventilated newborn infants is comparable with the previous approach because it also combines numerical data and a knowledge base. Other approaches to learning decision rules from simulated or real-world data include clinical work by Müller et al. (53) based on theoretical work by Talmon (54) on multiclass nonparametric partitioning algorithms.
Neural Networks. Artificial neural networks can be used to make predictions for specific outcome variables on the basis of input observations. Neural networks attempt to imitate biological nervous systems and “learn” the input-output relationship from training data sets that contain examples of inputs together with the corresponding outputs. Based on a training data set that consists of measurements of physiological variables together with information about the respective state of the patient (e.g., obtained by annotating the observation period), the neural network learns which patterns of the input data indicate a stable patient state and which patterns mark deviations from stability.
Several studies examined the use of the neural network approach for the construction of alarm systems. Ulbricht et al. (55) used neural networks in an alarm system that reported suspicious patterns in cardiotocogram data. Farell et al. (56) and Orr and Westenskow (57) developed a neural network-based alarm system to identify several specific faults in an anesthesia breathing circuit. Tsien (58) used neural networks to detect events of interest in vital signs of pediatric ICU patients. The method has the drawbacks that the learning behavior is not reproducible, a long training phase is required, and training is not possible for primarily unstable patients.
Fuzzy Logic. The concept of fuzzy logic was introduced by Zadeh (59), and is now used in a variety of industrial applications, particularly in so-called “fuzzy control systems.” It has proven to be useful in situations where it is difficult to describe the process of interest with an exact mathematical model. Therefore, this approach may be suitable for intensive care medicine, where experience and intuition play an important role in decision-making (60). A basic component in fuzzy control systems is the definition of “fuzzy sets” for a measured variable, which divides the range of the variable into classes that are possibly overlapping so that an observed value can belong to more than one class with different membership degrees. This accounts for situations in which disjoint classes are hard to define. An example from intensive care is the difficulty of determining a precise alarm threshold for a monitored variable. After determining fuzzy sets for the variables of interest, rules can be defined that classify the different combinations of the sets according to expert knowledge and decide about an alarm. In the context of patient monitoring, fuzzy logic has been used for the development of alarm systems for anesthesia (61,62), for the monitoring of preterm infants (63), and for alarm validation (64,65).
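Overlapping fuzzy sets and a single rule can be sketched as follows. The class boundaries for heart rate and systolic blood pressure and the rule itself are hypothetical; they illustrate the mechanism and are not clinical recommendations or taken from the cited systems.

```python
def rising(x, a, b):
    """Membership that is 0 below a, 1 above b, and linear in between."""
    return min(1.0, max(0.0, (x - a) / (b - a)))


def falling(x, c, d):
    """Membership that is 1 below c, 0 above d, and linear in between."""
    return 1.0 - rising(x, c, d)


def heart_rate_sets(hr):
    """Degrees of membership of a heart rate in three overlapping classes."""
    return {
        "low": falling(hr, 45, 60),
        "normal": min(rising(hr, 45, 60), falling(hr, 100, 120)),
        "high": rising(hr, 100, 120),
    }


def alarm_degree(hr, systolic):
    """Rule: IF heart rate is high AND systolic pressure is low THEN alarm.

    The fuzzy AND is taken as the minimum of the two membership degrees.
    """
    pressure_low = falling(systolic, 80, 100)
    return min(heart_rate_sets(hr)["high"], pressure_low)


print(heart_rate_sets(110))
print(alarm_degree(110, 85))
```

A heart rate of 110 belongs partly to "normal" and partly to "high", so the rule fires to a degree rather than switching abruptly at a single threshold.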
Bayesian Networks. Bayesian networks, also called causal probabilistic networks, can be used in intensive care monitoring for calculating and updating probabilities for the occurrence of events of interest. Laursen (66), for example, designed a network that uses as input the observations of a set of physiological variables (e.g., mean arterial blood pressure, central venous pressure, etc.). Each time new observations of these variables are obtained, the probabilities for the events “cardiovascular patient event” and “measurement error” are updated using Bayes’ theorem, and form the output of the system. A disadvantage of Bayesian networks is that a large amount of prior information is needed, because the dependencies between the different variables have to be defined and quantified in terms of conditional probabilities.
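The updating step can be illustrated with a single application of Bayes' theorem. This is not a full network: the three hypotheses, the prior probabilities, and the likelihoods of the observed pattern are hypothetical numbers of the kind that would have to be specified in advance.

```python
def bayes_update(prior, likelihood):
    """Return posterior probabilities for mutually exclusive hypotheses.

    prior[h] is P(h); likelihood[h] is P(observation | h).
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


prior = {"stable": 0.90, "patient_event": 0.05, "measurement_error": 0.05}
# Hypothetical likelihoods of the pattern "mean arterial pressure drops
# while central venous pressure is unchanged" under each hypothesis.
likelihood = {"stable": 0.01, "patient_event": 0.60, "measurement_error": 0.30}

posterior = bayes_update(prior, likelihood)
for hypothesis, probability in posterior.items():
    print(f"{hypothesis}: {probability:.3f}")
```

The posterior of one update serves as the prior for the next set of observations. The need to supply every entry of `prior` and `likelihood` is the prior-information burden described above.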
Study results for artificial intelligence approaches are summarized in Table 4.
In summary, different methods from different, even diverse methodological fields have shown promising results, but none of these approaches has advanced into the mainstream of patient monitoring.
Monitoring is the serial evaluation of time-stamped data (67). Monitoring is done to automatically detect life-threatening events, to warn caregivers of imminent danger, or to identify diagnostic entities. Unfortunately, the rate of false alarms is staggering, with the majority of monitoring alarms being false positives (5,67). This has basically not changed over the last 20 years, despite significant advances in medical device technology. Some alarm and diagnostic features have benefited from these technological advances (e.g., in the digital detection of arrhythmia events or in signal processing for pulse oximetry). But most alarms are still simple threshold alarms generating many false positives without real clinical meaning (1,3).
Most of the official guidance documents and standards, such as IEC 60601-1-8, refer to alarm annunciation but not to the actual alarm detection (i.e., alarm algorithms) (68). A few variable- or device-specific documents (e.g., EN864 for capnometers and IEC 62D/60601-2-54 for pulse oximeters [69,70]) offer guidance with reference to alarm thresholds and temporal alarm behavior but still provide no specific help as to the actual methods of alarm detection.
We have tried to provide an overview of the current status of alarm algorithms in monitoring devices. Unfortunately, most vendors were not very responsive to a request for information about the alarm methodologies built into their products. Other authors have faced this problem. Therefore, the information in Table 2 must be regarded as incomplete, because it could only be compiled from the official user guides available on some major vendors’ websites. One may surmise that the reluctance to provide information may be an indication that most vendors do not have advanced alarm algorithms to offer, but this is speculative.
Although some manufacturers have tried to improve alarm algorithms in their products, there are obvious reasons why alarm handling has not seen the same progress as the rest of monitoring technology. One important reason is that in the current legal and regulatory environment, manufacturers have a vital interest in making alarm algorithms as sensitive as possible, so that no critical event goes undetected. This shifts the responsibility to caregivers, who often disable alarms because of their excessive false-positive rates. The result may be a situation in which all alarms in the operating room are turned off, which could have grave consequences in the case of a device malfunction, disconnection, etc. The root cause of patient harm in such a scenario, however, would be the poor quality of the alarm algorithms.
Another reason for the slow progress may be that the market does not reward better alarm algorithms enough to make them commercially worthwhile for manufacturers. Probably the most important reason is that no comprehensive alarm algorithm concepts are available that have proven to be better than, and at least as safe as, the existing systems in a general population of patients.
Statistical methods applied with the goal of patient monitoring offer possibilities for signal extraction, artifact filtering, process control, pattern detection, and dimension reduction. Signal extraction and artifact filtering can be seen as methods for preparing the physiologic time series for further analysis. Applying statistical process control to patient monitoring has been tried, but the usefulness of conventional methods is limited. Statistical methods of time series analysis are better suited for the recognition of clinically relevant patterns such as outliers, level shifts, and trends. Univariate methods have reached a level of performance where they can be used in clinical alarm systems. Several multivariate statistical methods, such as graphical models, principal component analysis, and factor analysis, have been applied to intensive care data to achieve dimension reduction. These methods have already proven valuable in retrospective analysis, providing a better understanding of dependencies in physiological data. Their applicability to clinical online monitoring is still limited by issues of interpretability (41,42).
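As an illustration of robust signal extraction, the following minimal sketch estimates the level and slope of a short window of measurements by repeated median regression, the technique of reference 14. The function name, the window contents, and the choice to report the level at the window center are assumptions made for this example. It is not the implementation used in any of the cited studies.

```python
from statistics import median
from typing import List, Sequence, Tuple


def repeated_median(window: Sequence[float]) -> Tuple[float, float]:
    """Estimate (level at window center, slope per sample) robustly.

    Slope: median over i of the median over j != i of pairwise slopes.
    Level: median of the observations after removing the fitted trend.
    """
    n = len(window)
    if n < 2:
        raise ValueError("window needs at least two samples")
    inner_medians: List[float] = []
    for i in range(n):
        pairwise = [(window[j] - window[i]) / (j - i) for j in range(n) if j != i]
        inner_medians.append(median(pairwise))
    slope = median(inner_medians)
    center = (n - 1) / 2
    level = median([window[i] - slope * (i - center) for i in range(n)])
    return level, slope


# Hypothetical window: a steady upward trend with one artifact (100).
level, slope = repeated_median([0, 1, 2, 100, 4])
print(level, slope)  # 2.0 1.0
```

The single outlier leaves both estimates unchanged, which is the property that makes such filters attractive as a preprocessing step before alarm detection.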
In the field of artificial intelligence, knowledge-based methods, machine learning, neural networks, fuzzy logic, and Bayesian networks have been applied in the context of patient monitoring. Knowledge-based methods allow the incorporation of medical expert knowledge, in the form of decision-making rules, into the monitoring process. A revision and expansion of such knowledge bases can be achieved by using machine learning techniques. These methods can be extremely helpful in development and maintenance of deterministic rule bases, which may be used for advanced clinical decision support. They may even be applicable to knowledge bases for online control of therapeutic devices.
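The idea of a deterministic rule base can be sketched in a few lines. The rules, variable names, and cut-off values below are hypothetical and have no clinical validity. They only show how expert knowledge might be encoded as explicit conditions over several variables rather than as independent single-variable thresholds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[State], bool]


# Hypothetical rules; cut-offs are illustrative assumptions only.
RULES: List[Rule] = [
    Rule("hypotension_with_tachycardia",
         lambda s: s["map"] < 60 and s["hr"] > 110),
    Rule("isolated_low_spo2",
         lambda s: s["spo2"] < 90 and s["hr"] <= 110),
]


def evaluate(state: State, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules whose condition holds for this state."""
    return [rule.name for rule in rules if rule.condition(state)]


# Hypothetical patient state: mean arterial pressure, heart rate, SpO2.
print(evaluate({"map": 55, "hr": 120, "spo2": 97}))
```

Because the same input always fires the same rules, the behavior of such a rule base is fully predictable, and individual rules can be revised without retraining anything.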
Neural networks can “learn” to predict the patient state from the values of the physiologic variables, but they are not suitable for patients who are unstable during the initial learning phase. Fuzzy logic has been used in intensive care monitoring because of its ability to model uncertainty and ambiguity, both of which often occur in the medical decision-making process. In Bayesian networks, probabilities for the occurrence of certain medical events can be calculated and continually updated by propagating new information through the net.
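The updating step in a Bayesian network rests on Bayes' rule. The following sketch is a single-node simplification rather than a full network: it revises the probability of an event each time a new finding arrives. All probabilities are invented for illustration, and the findings are assumed to be conditionally independent given the event.

```python
def bayes_update(prior: float, p_given_event: float, p_given_no_event: float) -> float:
    """Return the posterior P(event | finding) by Bayes' rule.

    prior             -- P(event) before the finding is observed
    p_given_event     -- P(finding | event)
    p_given_no_event  -- P(finding | no event)
    """
    numerator = p_given_event * prior
    evidence = numerator + p_given_no_event * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("finding has zero probability under both hypotheses")
    return numerator / evidence


# Hypothetical example: two findings arrive one after the other.
p = 0.10                       # assumed prior probability of the event
p = bayes_update(p, 0.9, 0.3)  # first finding observed
p = bayes_update(p, 0.8, 0.2)  # second finding observed
print(round(p, 3))
```

Each posterior serves as the prior for the next finding, which is how the probability of an event can be kept current as new measurements are propagated.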
Most of these methods, especially the widely used neural networks, are not fully deterministic and are therefore not completely predictable in their behavior (for instance, when they are trained on the currently monitored patient). This may not only cause clinical problems but can also be a significant obstacle to regulatory approval. Therefore, in most applications of neural or Bayesian networks, the learning capabilities are only used in the development of the decision functions. These functions are then “frozen” for the final clinical applications. But this process raises further questions about the validity of the historic learning approach:
- Which patient populations should be used for learning?
- What are the correct definitions of, for example, a truly positive alarm situation?
- How large a data set is needed?
- Does the training sample represent the wide range of patients for whom the final device/system will be used?
In summary, an array of methods from different fields appears promising for a new comprehensive alarm detection concept. But the inability of some of the investigated methods to satisfy methodological, computational, clinical, and regulatory requirements in a general patient population may explain why alarm handling in commercially available medical devices still lags behind the technological advances in this field. Therefore, further research into new alarm algorithms is urgently needed to lay the foundation of improved alarm handling.
This requires not only methodological rigor and a good understanding of the clinical problems to be solved but also adequate matching of statistical and computer science methods to clinical applications. It is important to see the broad picture of monitoring requirements in general patient populations. Many researchers have chosen one or two methods without prior analysis of alternative methods and applied them to one highly specific clinical problem. Moreover, without further improvements in signal acquisition and strengthening of signal extraction, many promising alarm detection and alarm validation methods cannot live up to their true potential. It is important to keep the sequence of signal acquisition/extraction, alarm detection, and alarm validation in mind when designing new studies of medical device alarms.
Besides methodological research, extensive validation and verification against “real-world” data are needed to assure the safety and efficacy of new algorithms. This requires the acquisition of clinically annotated monitoring and device data in adequately large clinical studies. It also requires close and trusting cooperation between researchers and industry. Moreover, the development of new alarm algorithms and, later, of new alarm annunciation concepts needs to receive the attention it deserves.
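Validation against annotated data typically reduces to counting how alarms and annotated events coincide. The sketch below computes sensitivity and positive predictive value from such counts. The function name and the example counts are hypothetical and are not taken from any of the cited studies.

```python
from typing import Tuple


def alarm_metrics(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Return (sensitivity, positive predictive value) for an alarm algorithm.

    true_pos  -- alarms that coincide with an annotated relevant event
    false_pos -- alarms without an annotated relevant event
    false_neg -- annotated relevant events that raised no alarm
    """
    if true_pos + false_neg == 0 or true_pos + false_pos == 0:
        raise ValueError("metrics undefined without events or without alarms")
    sensitivity = true_pos / (true_pos + false_neg)
    ppv = true_pos / (true_pos + false_pos)
    return sensitivity, ppv


# Hypothetical counts: every event is detected, but most alarms are false.
sensitivity, ppv = alarm_metrics(true_pos=10, false_pos=90, false_neg=0)
print(sensitivity, ppv)
```

An algorithm can reach perfect sensitivity and still have a low positive predictive value, which is why both figures need to be reported when a new algorithm is compared with existing threshold alarms.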
1. Chambrin MC, Ravaux P, Calvelo-Aros D, et al. Multicentric study of monitoring alarms in the adult intensive care unit (ICU): a descriptive analysis. Intensive Care Med 1999;25:1360–6.
2. O’Carroll TM. Survey of alarms in an intensive therapy unit. Anaesthesia 1986;41:742–4.
3. Lawless ST. Crying wolf: false alarms in a pediatric intensive care unit. Crit Care Med 1994;22:981–5.
4. Koski EM, Mäkivirta A, Sukuvaara T, Kari A. Frequency and reliability of alarms in the monitoring of cardiac postoperative patients. Int J Clin Monit Comput 1990;7:129–33.
5. Tsien CL, Fackler JC. Poor prognosis for existing monitors in the intensive care unit. Crit Care Med 1997;25:614–9.
6. Biot L, Carry PY, Perdrix JP, et al. Évaluation clinique de la pertinence des alarmes en réanimation. Ann Fr Réanim 2000;19:459–66.
7. Meredith C, Edworthy J. Are there too many alarms in the intensive care unit? An overview of the problems. J Adv Nurs 1995;21:15–20.
8. Koski EM, Mäkivirta A, Sukuvaara T, Kari A. Clinicians’ opinions on alarm limits and urgency of therapeutic responses. Int J Clin Monit Comput 1995;12:85–8.
9. Cropp AJ, Woods LA, Raney D, Bredle DL. Name that tone: the proliferation of alarms in the intensive care unit. Chest 1994;105:1217–20.
10. Momtahan K, Hétu R, Tansley B. Audibility and identification of auditory alarms in the operating room and intensive care unit. Ergonomics 1993;36:1159–76.
11. Fried R, Gather U, Imhoff M, Bauer M. Statistical methods in intensive care online monitoring, 2000. Technical Report 33/00, SFB 475. University of Dortmund. Available from http://www.sfb475.uni-dortmund.de/berichte/tr33-00.ps
12. Sukuvaara T, Koski EM, Mäkivirta A, Kari A. A knowledge-based alarm system for monitoring cardiac operated patients—technical construction and evaluation. Int J Clin Monit Comput 1993;10:117–26.
13. Davies PL, Fried R, Gather U. Robust signal extraction for on-line monitoring data. J Stat Plan Infer 2003;122:65–78.
14. Siegel AF. Robust regression using repeated medians. Biometrika 1982;68:242–4.
15. Fried R. Robust filtering of time series with trends. J Nonparametr Stat 2004;16:313–28 (Special Issue).
16. Gather U, Schettlinger K, Fried R. Online signal extraction by robust linear regression. Computational Statistics. In press.
17. Bernholt T, Fried R. Computing the update of the repeated median regression line in linear time. Inform Process Lett 2003;88:111–7.
18. Charbonnier S, Becq G, Biot L. On-line segmentation algorithm for continuously monitored data in intensive care units. IEEE Trans Biomed Eng 2004;51:484–92.
19. Mäkivirta A, Koski E, Kari A, Sukuvaara T. The median filter as a preprocessor for a patient monitor limit alarm system in intensive care. Comput Methods Programs Biomed 1991;34:139–44.
20. Shewhart WA. Economic control of quality of manufactured product. Princeton, NJ: D. Van Nostrand Reinhold, 1931.
21. Trigg DW. Monitoring a forecasting system. Operat Res Q 1964;15:271–4.
22. Kennedy RR. A modified Trigg’s tracking variable as an “advisory” alarm during anaesthesia. Int J Clin Monit Comput 1995;12:197–204.
23. Hill DW, Endresen J. Trend recording and forecasting in intensive care therapy. Br J Clin Equipment 1978; 5–14.
24. Imhoff M, Bauer M, Gather U, Löhlein D. Statistical pattern detection in univariate time series of intensive care on-line monitoring data. Intensive Care Med 1998;24:1305–14.
25. Imhoff M, Bauer M, Gather U, Löhlein D. Time series analysis in intensive care medicine. Appl Cardiopulmonary Pathophysiol 1997;6:263–81.
26. Smith AFM, West M, Gordon K, et al. Monitoring kidney transplant patients. Statistician 1983;32:46–54.
27. Trimble IM, West M, Knapp MS, et al. Detection of renal allograft rejection by computer. BMJ 1983;286:1695–9.
28. Gordon K, Smith AFM. Modeling and monitoring biomedical time series. J Am Stat Assoc 1990;85:328–37.
29. Daumer M, Falk M. Online change point detection (for state space models) using multi-process Kalman filters. Linear Algebra Appl 1998;284:125–35.
30. Imhoff M, Bauer M, Gather U, Fried R. Pattern detection in intensive care monitoring time series with autoregressive models: influence of the model order. Biometrical J 2002;44:746–61.
31. Gather U, Bauer M, Fried R. The identification of multiple outliers in online monitoring data. Estadistica 2002;54:289–338.
32. Gather U, Fried R, Imhoff M. Online classification of states in intensive care. In: Gaul W, Opitz O, Schader M, eds. Data analysis. Berlin: Springer, 2000:413–28.
33. Schoenberg R, Sands DZ, Safran C. Making ICU alarms meaningful: a comparison of traditional vs. trend-based algorithms. Proc AMIA Symp 1999; 379–83.
34. Koski EM, Mäkivirta A, Sukuvaara T, Kari A. Development of an expert system for haemodynamic monitoring: computerized symbolization of on-line monitoring data. Int J Clin Monit Comput 1991–92;8:289–93.
35. Fried R, Imhoff M. On the online detection of monotonic trends in time series. Biometrical J 2004;46:90–102.
36. Brillinger DR. Consistent detection of a monotonic trend superposed by a stationary time series. Biometrika 1989;76:23–30.
37. Haimowitz I, Le PP, Kohane IS. Clinical monitoring using regression-based trend templates. Artif Intell Med 1995;7:473–96.
38. Miller G. The magical number seven, plus or minus two: some limits to our capacity for processing information. Psychol Rev 1956;63:81–97.
39. Gather U, Becker C. The curse of dimensionality—a challenge for mathematical statistics. Jahresbericht der Deutschen Mathematiker-Vereinigung 2001;103:19–36.
40. Dahlhaus R. Graphical interaction models for multivariate time series. Metrika 2000;51:157–72.
41. Gather U, Imhoff M, Fried R. Graphical models for multivariate time series from intensive care monitoring. Stat Med 2002;21:2685–701.
42. Gather U, Fried R, Lanius V, Imhoff M. Online monitoring of high dimensional physiological time series—a case study. Estadistica 2001;53:259–98.
43. Hanson CW, Marshall BE. Artificial intelligence applications in the intensive care unit. Crit Care Med 2001;29:427–35.
44. Krol M, Reich DL. Expert systems in anaesthesiology. Drugs Today 1998;34:593–601.
45. Westenskow DR, Orr JA, Simon FH, et al. Intelligent alarms reduce anesthesiologist’s response time to critical faults. Anesthesiology 1992;77:1074–9.
46. Koski EM, Sukuvaara T, Mäkivirta A, Kari A. A knowledge-based alarm system for monitoring cardiac operated patients—assessment of clinical performance. Int J Clin Monit Comput 1994;11:79–83.
47. Morris AH. Computerized protocols and bedside decision support. Crit Care Clin 1999;15:523–45.
48. Imhoff M, Gather U, Morik K. Development of decision support algorithms for intensive care medicine: a new approach combining time series analysis and a knowledge base system with learning and revision capabilities. In: Burgard, Christaller, Cremers, eds. KI-99, Advances in Artificial Intelligence, 23rd Annual German Conference on Artificial Intelligence.
49. Morik K, Imhoff M, Brockhausen P, et al. Knowledge discovery and knowledge validation in intensive care. Artif Intell Med 2000;19:225–49.
50. Vapnik V. Statistical learning theory. New York: Wiley, 1998.
51. Morik K, Wrobel S, Kietz JU, Emde W. Knowledge acquisition and machine learning—theory, methods and applications. London: Academic Press, 1993.
52. Miksch S, Horn W, Popow C, Paky F. Utilizing temporal abstraction for data validation and therapy planning for artificially ventilated newborn infants. Artif Intell Med 1996;8:543–76.
53. Müller B, Hasman A, Blom JA. Evaluation of automatically learned intelligent alarm systems. Comput Methods Programs Biomed 1997;54:209–26.
54. Talmon JL. A multiclass nonparametric partitioning algorithm. Pattern Recognit Lett 1986;4:31–8.
55. Ulbricht C, Dorffner G, Lee A. Neural networks for recognizing patterns in cardiotocograms. Artif Intell Med 1998;12:271–84.
56. Farrell RM, Orr JA, Kuck K, Westenskow DR. Differential features for a neural network based anesthesia alarm system. Biomed Sci Instrum 1992;28:99–104.
57. Orr JA, Westenskow DR. A breathing circuit alarm system based on neural networks. J Clin Monit 1994;10:101–9.
58. Tsien CL. Event discovery in medical time-series data. Proc AMIA Symp 2000: 858–62.
59. Zadeh LA. Fuzzy sets. Inform Control 1965;8:338–53.
60. Bates JHT, Young MP. Applying fuzzy logic to medical decision making in the intensive care unit. Am J Respir Crit Care Med 2003;167:948–52.
61. Becker K, Thull B, Kasmacher-Leidinger H, et al. Design and validation of an intelligent patient monitoring and alarm system based on a fuzzy logic process model. Artif Intell Med 1997;11:33–53.
62. Lowe A, Jones RW, Harrison MJ. A graphical representation of decision support information in an intelligent anaesthesia monitor. Artif Intell Med 2001;22:173–91.
63. Wolf M, Keel M, von Siebenthal K, et al. Improved monitoring of preterm infants by fuzzy logic. Technol Health Care 1996;4:193–201.
64. Oberli C, Urzua J, Saez C, et al. An expert system for monitor alarm integration. J Clin Monit Comput 1999;15:29–35.
65. Zong W, Moody GB, Mark RG. Reduction of false arterial blood pressure alarms using signal quality assessment and relationships between the electrocardiogram and arterial blood pressure. Med Biol Eng Comput 2004;42:698–706.
66. Laursen P. Event detection on patient monitoring data using causal probabilistic networks. Method Inform Med 1994;33:111–5.
67. McIntosh N. Intensive care monitoring: past, present and future. Clin Med 2002;2:349–55.
68. IEC. Medical electrical equipment, Part 1-8: general requirements for safety – Collateral standard: general requirements, tests and guidance for alarm systems in medical electrical equipment and medical electrical systems. IEC/TRF 60601-1-8 - Ed. 1.0, 2004.
69. Medical electrical equipment. Capnometers for use with humans. Particular requirements. BS EN 864, 1997.
70. IEC. Medical electrical equipment, Part 2-54: particular requirements for the basic safety and essential performance of pulse oximeters for medical use. IEC 60601-1-2-Consol. Ed. 2.1, 2004.
1 The terms variable and parameter are often used interchangeably for the observed and measured instance. To avoid confusion, we follow the statistical definitions: in the context of statistics in this article the word variable is used to refer to a measurable factor, characteristic, or attribute of an individual or a system (e.g., heart rate, arterial blood pressure, or tidal volume measured by a medical device). A parameter, in this context, is a value determining the properties of a function or model describing the general behavior of a variable. Model parameters can be estimated from the data gathered by measuring variables.