Session P92.5
Searching by Example in Multi-Parameter Time Series Databases
LH Lehman*, M Saeed, GB Moody, RG Mark
Massachusetts Institute of Technology
Cambridge, MA, USA
Robust navigation and mining of physiologic time-series databases often require finding similar temporal patterns of physiological responses. In collections such as the MIMIC II database, we seek cases in which trends and interrelationships among vital signs exhibit patterns resembling those of a prototype (selected) case. Detection of these complex physiological patterns not only enables demarcation of important clinical events but can also elucidate hidden dynamical structures that may be suggestive of disease processes.
We represent time series segments by feature vectors that reflect the dynamical patterns of single- and multi-dimensional physiological time series. Features include regression slopes at varying time scales, maximum transient changes, autocorrelation coefficients of individual signals, and cross-correlations among multiple signals. We model the dynamical patterns with a Gaussian mixture model (GMM) learned with the expectation-maximization (EM) algorithm, and compute the similarity between segments using Mahalanobis distances.
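The feature-vector representation and distance-based ranking described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two-signal layout (`hr`, `bp`), the `window` parameter, and the choice of five features are all assumptions made for illustration. The GMM and EM steps are omitted; in their place the sketch uses a single diagonal covariance estimated from the candidates, which reduces the Mahalanobis distance to a variance-normalized Euclidean distance.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def slope(xs):
    """Least-squares regression slope of xs against the sample index."""
    n = len(xs)
    t_bar = (n - 1) / 2
    x_bar = mean(xs)
    num = sum((t - t_bar) * (x - x_bar) for t, x in enumerate(xs))
    den = sum((t - t_bar) ** 2 for t in range(n))
    return num / den

def correlation(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def features(hr, bp, window=4):
    """Hypothetical feature vector for one two-signal segment.

    Assumes len(hr) == len(bp) and len(hr) > window.
    """
    half = len(hr) // 2
    return [
        slope(hr),                     # slope over the whole segment
        slope(hr[half:]),              # slope over a shorter time scale
        max(abs(hr[i + window] - hr[i])
            for i in range(len(hr) - window)),  # maximum transient change
        correlation(hr[:-1], hr[1:]),  # lag-1 autocorrelation (Pearson estimate)
        correlation(hr, bp),           # zero-lag cross-correlation
    ]

def diag_mahalanobis(u, v, variances):
    """Mahalanobis distance under an assumed diagonal covariance."""
    return math.sqrt(sum((a - b) ** 2 / s
                         for a, b, s in zip(u, v, variances)))

def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from nearest to farthest."""
    variances = []
    for column in zip(*candidates):
        m = mean(column)
        var = sum((x - m) ** 2 for x in column) / len(column)
        variances.append(var if var > 0 else 1.0)  # guard constant features
    return sorted(range(len(candidates)),
                  key=lambda i: diag_mahalanobis(query, candidates[i], variances))
```

In use, `features` would be applied to the prototype segment and to every candidate segment, and `rank_by_similarity` would return the candidates in order of increasing distance from the prototype. A full covariance matrix, or one covariance per mixture component, would replace the diagonal assumption in a closer approximation of the method.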
We tested our method first using the UC-Irvine Control Chart data set, consisting of 600 synthetic time series (100 examples of each of six types). Using a six-component GMM to find the best matches for a randomly selected case, we found 94 members of the target type among the 99 nearest neighbors, on average. In another experiment, we examined heart rate and blood pressure trends from tilt tests of ten healthy human subjects (10 hours in all, containing 120 responses to six different interventions). Using our method to find matches for randomly selected sets of three responses to the same intervention, we were typically able to find 60% to 80% of the remaining responses of the target type among the nearest neighbors. Our results also suggest that judicious selection of patterns can improve the retrieval accuracy significantly. We have begun to explore a set of 2500 cases from the MIMIC II database (approximately 200,000 hours of multi-parameter time series) using our method. A pilot study of response to withdrawal of pressors illustrates the potential of our search-by-example method to identify relevant cases with known outcomes. (Abstract Control Number: 349)
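The retrieval figures reported above are fractions of target-type cases among the nearest neighbors of a query; for example, 94 of 99 corresponds to roughly 95%. A minimal sketch of such a score follows. The function name and the input layout (parallel lists of distances and type labels, with the query itself excluded) are assumptions for illustration, not the evaluation code used in the study.

```python
def fraction_of_target_type(distances, labels, target, k):
    """Fraction of the k nearest candidates whose label equals target.

    distances[i] is the distance from the query to candidate i, and
    labels[i] is that candidate's type. The query is assumed to be
    excluded from both lists.
    """
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    nearest = order[:k]
    hits = sum(1 for i in nearest if labels[i] == target)
    return hits / len(nearest)
```

With 99 remaining members of the target type in a data set of 600 series, setting `k = 99` makes this score equal to both the precision and the recall of the retrieval.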