Comparative Study on the Generalization Ability of Machine Learning Algorithms for PPG Quality Assessment

Santiago Mula1, Roberto Zangroniz1, Oscar Ayo-Martin2, Jose J Rieta3, Raul Alcaraz1
1University of Castilla-La Mancha, 2Department of Neurology, Complejo Hospitalario Universitario de Albacete, Universidad de Castilla-La Mancha, 3BioMIT.org, Universitat Politecnica Valencia


Abstract

Background and Aim.

Photoplethysmogram (PPG) signals are often too noisy to measure heart rate accurately. Machine learning (ML) methods for PPG quality assessment have gained considerable attention, but they have commonly been developed on datasets recorded under specific conditions and not tested on others. Hence, the aim of the present work is to compare the performance and generalization ability of such methods on datasets representative of different conditions.

Methodology.

Only datasets containing both PPG and ECG recordings were used, so that each PPG segment could be labeled as clean or noisy according to the difference in inter-beat interval between the two synchronous signals, using an error of 3 bpm as the threshold. Then, global and spectral features were extracted from the PPG segments, and Decision Tree (DT), SVM, and KNN classifiers were trained on WESAD, a dataset of wrist-captured PPG recordings acquired during daily activities. The models were validated on DaLiA, a similar dataset of wrist-captured PPG recordings, and on BIDMC, a dataset of finger-captured PPG recordings from ICU patients.
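The labeling rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the conversion of inter-beat intervals to a mean heart rate per segment, and the choice to treat an error of exactly 3 bpm as clean are all assumptions, since the abstract states only the 3 bpm threshold.

```python
from statistics import mean


def heart_rate_bpm(ibis_s):
    """Mean heart rate in bpm from inter-beat intervals given in seconds."""
    return 60.0 / mean(ibis_s)


def label_segment(ppg_ibis_s, ecg_ibis_s, threshold_bpm=3.0):
    """Label a PPG segment by comparing it with the synchronous ECG.

    Assumption: the error is the absolute difference between the mean
    heart rates derived from each signal's inter-beat intervals, and an
    error equal to the threshold still counts as clean.
    """
    error = abs(heart_rate_bpm(ppg_ibis_s) - heart_rate_bpm(ecg_ibis_s))
    return "clean" if error <= threshold_bpm else "noisy"
```

For example, a PPG segment with inter-beat intervals of 0.8 s (75 bpm) compared against an ECG reference at 1.0 s (60 bpm) gives an error of 15 bpm and would be labeled noisy under these assumptions.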

Results.

The DT-based model reached 100% accuracy (Acc) on training, with a 30% drop in sensitivity (Se) and specificity (Sp) on DaLiA and an imbalanced Se-Sp of 31-84% on BIDMC. The SVM-based model reached an Acc of 82% on training, with a 3% drop in Se and Sp on DaLiA and an Se-Sp of 78-21% on BIDMC. Finally, the KNN-based model yielded an Acc of 85% on both training and DaLiA, with an imbalanced Se-Sp of 27-96% on BIDMC.
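For reference, Se and Sp can be computed from true and predicted labels as in the sketch below. The abstract does not state which class is taken as positive, so treating "clean" as the positive class is an assumption here, and the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive="clean"):
    """Return (Se, Sp) for binary labels.

    Assumption: `positive` names the class counted as positive; the
    abstract does not specify whether that is clean or noisy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    # Guard against a class being absent from the evaluation set.
    se = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return se, sp
```

A large gap between the two values, as reported for BIDMC, indicates that a model favors one class: high Sp with low Se means most segments are assigned to the negative class regardless of their true label.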

Significance.

Whereas the DT-based model showed marked overfitting, with degraded performance on the two validation datasets, the SVM-based and KNN-based models performed well on the dataset whose PPG morphology was similar to that of the training set. Therefore, the choice of ML classifier and of the datasets used for training and validation requires special attention in PPG quality assessment.