We propose an automatic classification algorithm to identify heart murmurs in phonocardiogram (PCG) recordings that combines a neural network classifier trained directly on spectrograms of the raw signal with a gradient boosting classifier trained on handcrafted features. Models that combine information across feature types have found broad success, for example in the 2016 and 2020 PhysioNet Challenges. We hypothesize that the neural network will learn to identify heart murmur patterns in the time-frequency domain, with the boosting classifier augmenting the final decision with physician-inspired, human-interpretable information.
For each patient, the PCG recording at each of the four classic auscultation locations is downsampled to 1000 Hz and truncated to 5 seconds to enforce a uniform length. These time series are then transformed into mel-spectrograms and concatenated to form the input to a convolutional neural network (CNN) composed of three sequential blocks, each containing a convolution layer, leaky ReLU activation, max pooling, and dropout, followed by two fully connected layers trained with a cross-entropy loss. In parallel, 5 demographic features, 20 handcrafted time-domain features, and 25 frequency-domain features are fed into an XGBoost classifier with 75 trees and a maximum depth of 8. The final decision on murmur presence is made by soft voting between the XGBoost and CNN outputs.
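The two ends of this pipeline can be sketched as follows. This is a minimal illustration, not our actual code: the function names, the resampling routine, and the equal voting weights are assumptions; the 1000 Hz target rate, 5-second truncation, and soft voting between the two models come from the description above.

```python
import numpy as np
from scipy.signal import resample_poly

TARGET_FS = 1000      # Hz, per the preprocessing step above
CLIP_SECONDS = 5      # uniform recording length

def preprocess_pcg(signal, fs):
    """Downsample a raw PCG recording to 1 kHz and truncate to 5 s.

    `resample_poly` is one reasonable choice of resampler; the original
    method may differ.
    """
    down = resample_poly(signal, TARGET_FS, fs)
    return down[: TARGET_FS * CLIP_SECONDS]

def soft_vote(p_cnn, p_xgb, w_cnn=0.5):
    """Combine class-probability vectors from the CNN and XGBoost.

    Soft voting averages probabilities; the weight w_cnn = 0.5 is an
    assumed default, not a reported value.
    """
    p = w_cnn * np.asarray(p_cnn) + (1.0 - w_cnn) * np.asarray(p_xgb)
    return int(np.argmax(p))
```

For example, a 10-second recording sampled at 4 kHz would be reduced to 5000 samples at 1 kHz before the mel-spectrogram transform is applied.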
The cross-validated score achieved by our team (lubdub) during the unofficial phase is 1561. We plan to rework our approach into an end-to-end architecture, replacing XGBoost with a neural network that concatenates the physician-inspired features with the learned spectrogram features. We anticipate that this will improve performance, since the final classification layer can then learn the relationship between the two feature types.
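The planned fusion step amounts to a single classification head over the concatenated feature vector. The schematic below illustrates only that forward pass; the function name, dimensions, and plain softmax layer are assumptions for illustration, not the final architecture.

```python
import numpy as np

def fused_forward(spec_embedding, hand_features, W, b):
    """Classify from concatenated learned and handcrafted features.

    spec_embedding : vector produced by the CNN's spectrogram branch
    hand_features  : physician-inspired feature vector
    W, b           : weights of a final linear layer (assumed shape:
                     W is (n_classes, len(spec) + len(hand)))
    Returns softmax class probabilities.
    """
    z = np.concatenate([spec_embedding, hand_features])
    logits = W @ z + b
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()
```

Because the linear layer sees both feature types at once, its weights can capture interactions between them, which is the anticipated advantage over combining two independently trained classifiers by voting.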