Semi-supervised Noise Label Learning On 12-lead ECG Records For Chagas Disease Detection

Chen Hao1, Jiang Songtao1, Hui Fang2, Yu Chen1, Baotong Liu3, Jing Qin4, Lu Liu5
1Dalian University, 2Loughborough University, UK, 319824348634, 413516059892, 515941125620


Abstract

Electrocardiograms (ECGs) provide a way to identify potential cases of Chagas disease, a parasitic disease prevalent in Central and South America. Although 12-lead ECG records are useful for screening the disease, the available patient records are scarce and many of them are self-reported, so they contain a large number of false labels. To obtain a generalisable and accurate ECG analysis model, we propose a semi-supervised noise label learning approach that addresses these challenges by leveraging ECG records from several datasets: the SaMi-Trop dataset, a small dataset with reliable labels; the PTB-XL dataset, which contains a large amount of healthy-group data; and the CODE-15% dataset, which contains self-reported disease cases. After training a base model on samples from the SaMi-Trop dataset, we apply Jensen-Shannon (JS) divergence to align the data distributions of samples from these heterogeneous data sources. We then fine-tune the model using data from the healthy group together with pseudo-labelled data from the self-reported group, filtered by confidence level. On the validation set from the CODE-15% dataset, our approach achieves a TPR@5% score of 0.9064.
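The two ingredients named above, JS divergence and confidence-filtered pseudo-labelling, can be illustrated with a minimal sketch. This is not the implementation used in this work: the function names `js_divergence` and `select_pseudo_labels`, the use of discrete histograms, and the 0.9 threshold are all assumptions made only for illustration.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions.

    Inputs are non-negative weights (e.g. histograms); they are normalised here.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        # KL(a || b), skipping zero-probability bins of a.
        # b is the mixture, so b > 0 wherever a > 0.
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def select_pseudo_labels(probs, threshold=0.9):
    """Keep only samples the model is confident about (hypothetical rule).

    probs: predicted probability of the positive class for each sample.
    Returns the indices of the retained samples and their pseudo-labels.
    """
    probs = np.asarray(probs, dtype=float)
    confident = (probs >= threshold) | (probs <= 1.0 - threshold)
    indices = np.flatnonzero(confident)
    labels = (probs[indices] >= 0.5).astype(int)
    return indices, labels
```

In this sketch the divergence is zero for identical distributions and reaches ln 2 for distributions with disjoint support, so a smaller value indicates better-aligned sources. The threshold trades the number of retained pseudo-labelled samples against their expected label noise.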