An AutoML pipeline for the classification of noisy and clean electrocardiogram signals using Bayesian optimization

Lucia Billeci1, Lorenzo Bachi2, Chiara Podrecca3, Maurizio Varanini4, Pia Cincione5
1 Institute of Clinical Physiology, National Research Council of Italy (CNR); 2 Institute of Clinical Physiology, National Research Council of Italy (CNR-IFC); 3 Computer and Biomedical Engineering, University of Pavia, Pavia, Italy; 4 Institute of Clinical Physiology, National Research Council (CNR); 5 Medical Departments University of Foggia


Abstract

Introduction: The electrocardiogram (ECG) can be affected by well-known sources of noise, including powerline interference, muscle noise and motion artifacts. Despite the numerous noise indices proposed in the literature, no single index emerges as the best. Our study introduces an AutoML pipeline that leverages multiple indices for ECG noise detection.

Methods: Our dataset was obtained by merging multiple databases, and each of the 155643 2-s windows of signal was labeled as noisy or clean. Twelve noise indices were considered, including two novel ones. The dataset was first split into 55% for training, 15% for validation and 30% for testing, with patient-wise separation between the subsets. The training set was then partially balanced by randomly undersampling the majority class (clean). The AutoGluon tool was used for classification, and four different quality presets were tested. The model that offered the best trade-off between balanced accuracy (BA) and inference time underwent hyperparameter Bayesian optimization (BO), and the optimized model was evaluated on the test set.

Results: On the validation set, the Ensemble model of the Medium preset achieved the highest BA (0.952), while NeuralNetTorch also showed a high BA (0.949) and a very low inference time (0.138 s), making it a suitable choice for BO. After optimization, the top three models showed improved BA on both the validation and test sets, with slightly increased but still low inference times. Specifically, on the test set, post-BO accuracies ranged from 0.952 to 0.948, with inference times between 0.56 and 0.10 s, compared with pre-BO accuracies ranging from 0.946 to 0.944 and inference times between 2.64 and 0.13 s.

Conclusions: The proposed ML approach classifies ECG signals accurately and with a low inference time, suggesting that this pipeline could be adopted in a clinical scenario. Further research may be warranted to expand the database with more noisy and arrhythmic episodes.
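The data preparation and the evaluation metric described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the random seeds and the undersampling `ratio` are hypothetical (the abstract does not state the target class ratio), and the split here assigns fractions of patients rather than fractions of windows, which matches the 55/15/30 proportions only approximately when patients contribute different numbers of windows.

```python
import random
from collections import defaultdict


def split_by_patient(patient_ids, fractions=(0.55, 0.15, 0.30), seed=0):
    """Return window indices for train/val/test with no patient shared
    between subsets. `patient_ids[i]` is the patient of window i.
    Assumption: fractions are applied to the number of patients."""
    patients = sorted(set(patient_ids))
    random.Random(seed).shuffle(patients)
    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),
    }
    return {
        name: [i for i, p in enumerate(patient_ids) if p in members]
        for name, members in groups.items()
    }


def undersample_majority(indices, labels, majority_label, ratio=2.0, seed=0):
    """Partially balance a subset: keep every minority window and at most
    `ratio` times as many randomly chosen majority windows.
    Assumption: `ratio` is illustrative, not taken from the paper."""
    minority = [i for i in indices if labels[i] != majority_label]
    majority = [i for i in indices if labels[i] == majority_label]
    keep = min(len(majority), int(ratio * len(minority)))
    return sorted(minority + random.Random(seed).sample(majority, keep))


def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: the mean of the per-class recalls."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return sum(correct[c] / total[c] for c in total) / len(total)
```

Only the training indices would be passed to `undersample_majority`, so that the validation and test sets keep their natural class proportions. Balanced accuracy is a sensible metric in that setting because a classifier that always predicts the majority class scores only 0.5 on a two-class problem, regardless of how imbalanced the data are.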