Using Dynamic Time Warping and Agglomerative Clustering of ECG data to group distinct PQRST morphologies in patients with Chagas Disease

Carlos Magno Dantas de Figueirêdo Belém1 and João Paulo do Vale Madeiro2
1Universidade Federal do Ceará, 2Federal University of Ceará


Abstract

Chagas disease, a significant public health issue in Latin America, affects millions of people and carries a risk of severe cardiac abnormalities. In this study, we aimed to identify distinct clusters of PQRST morphologies in individuals diagnosed with Chagas cardiomyopathy using unsupervised learning. We derived a representative average heartbeat from a 5-minute segment of lead D-II, randomly chosen from each file in a database of 409 Holter recordings. Using Dynamic Time Warping (DTW), we computed a pairwise distance matrix for this dataset and then performed agglomerative hierarchical clustering on it. We evaluated a range of cluster counts (K) by silhouette score for three linkage methods. Visual inspection of the original signals in the resulting clusters confirmed that the different PQRST morphologies were coherently and meaningfully separated. Complete linkage with K = 20 yielded a silhouette score of 0.183. In this configuration, the grouping captured clinically relevant patterns, with clusters featuring prevalent notched S-waves, T-wave inversions, hyperacute T-waves, and net positive or net negative QRS complexes. Moreover, some clusters concentrated noisy or low-resolution ECGs. We also observed that increasing the number of clusters redistributes morphological characteristics, splitting the original groups into more homogeneous ones. This study highlights the potential of this clustering approach for stratifying ECG data for high-level analysis, segregating ECGs with notable noise levels, and providing a tool for rapid morphological categorization. Applying DTW to a single average heartbeat per recording keeps the process computationally feasible, a pertinent advantage given that the cost of DTW grows quadratically with sequence length.
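
The pipeline summarized above (pairwise DTW distances, agglomerative clustering on the precomputed matrix, and silhouette scoring) can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the absolute-difference local cost, the unconstrained warping path, and the synthetic Gaussian "beats" are all assumptions made for illustration, and only NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def dtw_distance(a, b):
    """Classic DTW, O(len(a) * len(b)), with absolute difference as local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def dtw_distance_matrix(beats):
    """Symmetric pairwise DTW distance matrix with a zero diagonal."""
    k = len(beats)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(beats[i], beats[j])
    return dist


def cluster_beats(dist, n_clusters, method="complete"):
    """Agglomerative clustering on a precomputed distance matrix."""
    tree = linkage(squareform(dist, checks=False), method=method)
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def silhouette(dist, labels):
    """Mean silhouette coefficient; assumes at least two clusters."""
    scores = []
    for i, own in enumerate(labels):
        same = labels == own
        if same.sum() == 1:
            scores.append(0.0)  # singleton clusters score 0 by convention
            continue
        # Mean distance to the other members of the same cluster.
        a = dist[i, same].sum() / (same.sum() - 1)
        # Smallest mean distance to any other cluster.
        b = min(dist[i, labels == other].mean() for other in set(labels) if other != own)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))


# Hypothetical toy data: three upright and three inverted bumps, slightly
# shifted in time, standing in for average heartbeats (not real ECG data).
t = np.linspace(0.0, 1.0, 60)


def bump(center, sign):
    return sign * np.exp(-((t - center) ** 2) / 0.005)


beats = [bump(c, +1) for c in (0.45, 0.50, 0.55)]
beats += [bump(c, -1) for c in (0.45, 0.50, 0.55)]

dist = dtw_distance_matrix(beats)
labels = cluster_beats(dist, n_clusters=2, method="complete")
score = silhouette(dist, labels)
print(labels, round(score, 3))
```

Because each recording contributes a single short average beat, the quadratic cost of `dtw_distance` stays small per pair; the number of pairs grows with the square of the number of recordings, but each comparison is cheap. Sweeping `n_clusters` and `method` and comparing the resulting `silhouette` values mirrors, in spirit, the model selection described above.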