A permutation entropy-based approach for early arrhythmia detection based on clustering models and deep learning assisted by genetic algorithms.

Zayd Isaac Valdez1, Luz Alexandra Díaz1, Roy Samuel Valdez2, Antonio Gabriel Ravelo-García3, Miguel Vizcardo4
1Escuela Profesional de Física, Universidad Nacional de San Agustín de Arequipa, 2Unidad de Posgrado de la Facultad de Medicina, Universidad Nacional de San Agustín de Arequipa, 3Institute for Technological Development and Innovation in Communications, Universidad de Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canaria, Spain; Interactive Technologies Institute (ITI/LARSyS and ARDITI), 9020-105 Funchal, Portugal, 4Universidad Nacional de San Agustín de Arequipa


Abstract

Arrhythmias are disturbances in the electrical activity of the heart that can lead to a wide range of clinical manifestations, from benign symptoms to life-threatening events. This study presents a methodology for the early detection and classification of arrhythmias based on the analysis of 2-second ECG fragments derived from long-duration recordings. The data source is the ECG Fragment Database for the Exploration of Dangerous Arrhythmia from PhysioNet, which includes multi-minute ECG recordings segmented into short, high-resolution fragments. The database categorizes records into six arrhythmia classes. These make it possible to identify life-threatening ventricular fibrillation (class 1) as well as warning signs such as torsade de pointes ventricular tachycardia (class 2) and high-frequency ventricular tachycardia (class 3). Classes 4, 5, and 6 represent another group of dangerous arrhythmias. The proposed pipeline begins with a signal preprocessing stage in which a two-step filter is applied. The first step uses Approximate Entropy (ApEn) to quantify signal regularity and detect high-noise segments; segments with entropy values above a defined threshold are corrected using polynomial interpolation. In the second step, a Normalized Least Mean Squares (NLMS) adaptive filter refines the signals using a reference channel. Subsequently, the Permutation Entropy (PE) of each 2-second segment is computed to extract features that capture the nonlinear and temporal complexity of the ECG signal. These features are then used to train a deep neural network classifier optimized to distinguish between the arrhythmia classes. To enhance the model's generalization capabilities and address class imbalance, an unsupervised clustering approach using k-means is employed, followed by a genetic algorithm-driven data augmentation strategy. This hybrid model, which combines entropy-based feature extraction with clustering, genetic algorithm-driven augmentation, and deep learning, achieves accuracy values exceeding 90%.
The system has been designed to contribute to early diagnostic workflows by providing rapid and reliable identification of high-risk ECG patterns.
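
As an illustration of the feature-extraction step, the following is a minimal sketch of a normalized permutation entropy computation for a single 2-second segment. It is not the authors' implementation: the function name `permutation_entropy`, the embedding order of 3, the delay of 1, and the 360 Hz sampling rate used for the synthetic test signals are all assumptions made for this example.

```python
import math
import numpy as np


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Permutation entropy of a 1-D signal (hypothetical helper).

    order and delay are illustrative defaults, not values from the study.
    """
    x = np.asarray(signal, dtype=float)
    n_vectors = x.size - (order - 1) * delay
    if n_vectors <= 0:
        raise ValueError("signal too short for the chosen order and delay")

    # Delay embedding: each row holds `order` samples spaced `delay` apart.
    idx = np.arange(n_vectors)[:, None] + np.arange(order) * delay
    embedded = x[idx]

    # The ordinal pattern of a row is the permutation that sorts it.
    patterns = np.argsort(embedded, axis=1, kind="stable")

    # Relative frequency of each distinct ordinal pattern.
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()

    # Shannon entropy of the pattern distribution, in bits.
    h = -np.sum(p * np.log2(p))
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        h /= math.log2(math.factorial(order))
    return float(h)


# Synthetic 2-second segments; fs = 360 Hz is an assumed sampling rate.
fs = 360
t = np.arange(2 * fs) / fs
rng = np.random.default_rng(0)

pe_sine = permutation_entropy(np.sin(2 * np.pi * 5 * t))
pe_noise = permutation_entropy(rng.standard_normal(t.size))
print(f"regular signal PE: {pe_sine:.3f}")
print(f"white noise PE:    {pe_noise:.3f}")
```

A regular oscillation visits only a few ordinal patterns and yields a low value, whereas white noise visits all patterns almost equally often and yields a value close to 1. In a pipeline like the one described, one such value (or a vector of values over several orders and delays) per fragment would serve as classifier input.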