Deep Learning-Based Talking Heart Eases the Identification of Abnormal Cardiac Sounds in Phonocardiogram Recordings

Mohanad Alkhodari, Leontios Hadjileontiadis, Ahsan Khandoker
Khalifa University


Introduction: The assessment of congenital heart diseases is achieved non-invasively through cardiac auscultation and the analysis of irregular heart sounds in phonocardiogram (PCG) recordings. However, the manual interpretation of thousands of patient recordings remains a stressful, time-consuming, and error-prone diagnostic task. Methods: Here, we propose a novel deep learning approach, named Talking Heart, based on 12 multi-head attention blocks that generate transformer-encoded features resembling a text-like representation of cardiac sounds, making it easier to automatically discriminate murmurs from normal heart sounds. In this approach, we used a wav2vec network pre-trained on 960 hours of speech data to extract convolutional neural network (CNN)-based features from the input PCG signals. We then passed these features through the multi-head attention and feed-forward blocks to generate the final transformer-encoded features. Lastly, we fed these features into a four-block generic temporal convolutional network (TCN) that applies sequential one-dimensional (1D) convolutions for training and prediction. Results: Our team, Care4MyHeart, achieved a challenge score of 1407 during the unofficial phase, and an overall 10-fold cross-validation area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), accuracy, F-measure, and challenge score of 0.84, 0.76, 0.74, 0.66, and 1528, respectively. Conclusion: This study paves the way towards implementing deep learning for the detection of abnormal heart sounds, so that congenital heart diseases can be treated at an early stage of a patient's life.
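To make the classifier stage concrete, the sketch below illustrates a generic TCN building block of the kind the abstract describes: a residual block of dilated causal 1D convolutions applied over a sequence of transformer-encoded feature vectors. This is a minimal NumPy illustration under assumed shapes and random weights, not the authors' implementation; the function names (`causal_conv1d`, `tcn_block`) and all dimensions are hypothetical.

```python
import numpy as np

def causal_conv1d(x, w, dilation=1):
    """Dilated causal 1D convolution.

    x: (T, C_in) feature sequence; w: (K, C_in, C_out) kernel.
    Output at time t depends only on inputs at times <= t,
    enforced by left-padding the sequence with zeros.
    """
    T, c_in = x.shape
    K, _, c_out = w.shape
    pad = (K - 1) * dilation
    xp = np.vstack([np.zeros((pad, c_in)), x])  # left-pad for causality
    y = np.zeros((T, c_out))
    for t in range(T):
        for k in range(K):
            # tap k reaches back k * dilation steps in time
            y[t] += xp[t + pad - k * dilation] @ w[K - 1 - k]
    return y

def tcn_block(x, w1, w2, dilation):
    """One residual TCN block: two dilated causal convs with ReLU,
    plus a skip connection when the channel count is unchanged."""
    h = np.maximum(causal_conv1d(x, w1, dilation), 0.0)
    h = np.maximum(causal_conv1d(h, w2, dilation), 0.0)
    return x + h if x.shape == h.shape else h
```

A four-block TCN, as in the abstract, would stack such blocks (typically with growing dilations, e.g. 1, 2, 4, 8) so the receptive field covers long stretches of the PCG-derived feature sequence before a final classification layer.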