Automated Identification of Label Errors in Large Electrocardiogram Datasets

Peter Doggart¹ and Alan Kennedy²
¹PulseAI, ²PulseAI Ltd


Introduction Training deep neural networks (DNNs) for automated electrocardiogram (ECG) interpretation requires large datasets. These datasets are commonly extracted at scale from electronic patient healthcare records in hospitals. Typically, a single physician over-reads the machine-generated interpretation as part of standard care. Incorrect interpretation of the ECG occurs frequently, which reduces the quality of the labels.

Methods We trained a DNN to identify six ECG rhythms based on morphology: Sinus Rhythm, Junctional Rhythm, Ectopic Atrial Rhythm, Atrial Flutter, Atrial Fibrillation and Ventricular Rhythm. The DNN was trained on a balanced dataset of 34,703 randomly sampled ECGs taken from a proprietary 12-lead database. We then applied confident learning techniques, using the DNN's predictions, to identify label errors in the publicly available PhysioNet PTB-XL database.
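The abstract does not state which confident learning variant was used, so the following is only a minimal sketch of one common formulation, assuming per-class thresholds equal to the mean predicted probability of each class over the examples carrying that label. The function names, the two-class toy data and the threshold rule are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def class_thresholds(labels: Sequence[int],
                     pred_probs: Sequence[Sequence[float]],
                     n_classes: int) -> List[float]:
    """Mean predicted probability of class j over the examples labelled j."""
    sums = [0.0] * n_classes
    counts = [0] * n_classes
    for y, p in zip(labels, pred_probs):
        sums[y] += p[y]
        counts[y] += 1
    # A class with no labelled examples gets threshold 1.0 (assumption),
    # so it is only "confident" when the model is certain.
    return [sums[j] / counts[j] if counts[j] else 1.0 for j in range(n_classes)]


def find_candidate_label_errors(labels: Sequence[int],
                                pred_probs: Sequence[Sequence[float]]) -> List[int]:
    """Return indices whose confidently predicted class differs from the given label."""
    n_classes = len(pred_probs[0])
    thresholds = class_thresholds(labels, pred_probs, n_classes)
    flagged = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        # Classes whose probability reaches that class's threshold.
        confident = [j for j in range(n_classes) if p[j] >= thresholds[j]]
        if not confident:
            continue  # the model is not confident about any class
        best = max(confident, key=lambda j: p[j])
        if best != y:
            flagged.append(i)
    return flagged


# Toy data (hypothetical): class 0 and class 1; example 2 is labelled 0
# but the model assigns it a high probability of class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]
candidates = find_candidate_label_errors(labels, pred_probs)
```

In a setting like the one described, `labels` would hold the database's rhythm labels mapped to the six classes and `pred_probs` the DNN's output probabilities for each ECG.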

Results The confident learning algorithm flagged 767 potential rhythm label errors among the 21,837 ECGs in the PTB-XL database (3.51%). The flagged ECGs were ranked by likelihood of label error, based on the predicted probabilities, and the top 100 were manually reviewed. Of these, 61 were found to be incorrectly labelled (61%). The most common errors were unreadable ECGs due to noise (25 ECGs), Atrial Flutter mislabelled as Atrial Fibrillation (17 ECGs) and Sinus Rhythm with Ectopy mislabelled as Atrial Fibrillation (6 ECGs).
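The ranking step can be sketched as follows. The abstract says only that ranking used the predicted probabilities, so the score here, the model's probability for the given label (often called self-confidence), is an assumption, as are the function name and toy data. The last two lines simply recompute the reported percentages from the reported counts.

```python
from typing import List, Sequence


def rank_for_review(flagged: Sequence[int],
                    labels: Sequence[int],
                    pred_probs: Sequence[Sequence[float]],
                    top_k: int) -> List[int]:
    """Order flagged examples from most to least suspicious and keep the first top_k.

    Suspicion score (assumed): probability the model assigns to the given label;
    a lower value means the label is more likely to be wrong.
    """
    ranked = sorted(flagged, key=lambda i: pred_probs[i][labels[i]])
    return ranked[:top_k]


# Toy data (hypothetical): three flagged examples out of four.
labels = [0, 1, 0, 1]
pred_probs = [[0.30, 0.70], [0.95, 0.05], [0.10, 0.90], [0.40, 0.60]]
flagged = [0, 1, 2]
review_queue = rank_for_review(flagged, labels, pred_probs, top_k=2)

# Percentages recomputed from the counts reported in the Results.
flagged_pct = round(100 * 767 / 21837, 2)    # share of PTB-XL ECGs flagged
confirmed_pct = round(100 * 61 / 100, 2)     # share of reviewed ECGs confirmed as errors
```

Reviewing only the head of the ranked list keeps the manual workload fixed (100 ECGs in the study) regardless of how many examples are flagged.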

Conclusion Removing incorrectly labelled ECGs is important for DNN development, especially for low-prevalence classes, which can be significantly affected by noisy labels. In this study, we demonstrated that confident learning techniques can be applied to automatically identify potential labelling errors in ECG datasets. Once identified, these labels can be manually corrected, or the ECGs can be excluded.