Preliminary Program

Detecting Chagas Disease from the ECG with Sharpness Aware Minimization and Domain Adversarial Learning

Jad Haidamous¹, Philip Hempel², Maurice Rohr³, Tizian Claus Dege⁴, Marcus Vollmer⁵, Nicolai Spicher⁶, Christoph Hoog Antink⁴
¹Technical University Darmstadt, ²Department of Medical Informatics, University Medical Center Goettingen, ³Technical University of Darmstadt, ⁴TU Darmstadt, ⁵Institute of Bioinformatics, University Medicine Greifswald; DZHK (German Centre of Cardiovascular Research), Partner Site Greifswald, ⁶Department of Medical Informatics, University Medical Center Goettingen, Germany

Abstract

Aims: Serological testing for Chagas disease is time-consuming and available only in limited quantities. To help prioritize patients for serological testing, this work aimed to develop an algorithm that detects signs of Chagas disease in electrocardiograms (ECGs). Because serologically validated training data are scarce, self-reported data with potentially incorrect labels must also be used. The models were therefore designed to be robust against label noise.

Methods: Our model combined a pre-trained transformer backbone with a trainable dense classification head. The transformer was trained on the CODE-15% and PhysioNet/Computing in Cardiology Challenge 2021 datasets using the reduced label set (27 classes). This ensured that the transformer learned useful ECG features from reliable labels instead of learning from scratch using the noisy Chagas labels. The classification head was trained on the SaMi-Trop, CODE-15%, and PTB-XL datasets with binary Chagas labels. The SaMi-Trop labels are serologically validated, the CODE-15% labels are self-reported, and the PTB-XL recordings are assumed negative based on geography. Finally, we propose to model the relationship between the noisy labels and the unobservable clean labels through a label transition matrix T(x).

T(x) = P(Z|Y,X=x) (1)

where X, Z, and Y denote the random variables of the ECG signals, noisy labels, and clean labels, respectively. We aim to train the model with forward correction, which applies T(x) to the predicted clean-label probabilities so that the loss is computed against the observed noisy labels.
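As an illustration of how forward correction could use Equation (1), the following minimal Python sketch computes the negative log-likelihood of an observed noisy label for a single ECG. It is not the authors' implementation: the function name, the index convention T[i][j] = P(Z=j | Y=i, X=x), and all numerical values are assumptions chosen for illustration, and estimating T(x) itself is not shown.

```python
import math


def forward_corrected_nll(clean_probs, transition, noisy_label):
    """Negative log-likelihood of a noisy label under forward correction.

    clean_probs[i]   : model estimate of P(Y=i | X=x)
    transition[i][j] : assumed T(x)[i][j] = P(Z=j | Y=i, X=x)
    noisy_label      : observed, possibly incorrect, label z
    """
    # Marginalize over the unobserved clean label:
    # P(Z=z | X=x) = sum_i P(Z=z | Y=i, X=x) * P(Y=i | X=x)
    noisy_prob = sum(
        transition[i][noisy_label] * clean_probs[i]
        for i in range(len(clean_probs))
    )
    # Clamp to avoid log(0) when the predicted probability vanishes.
    return -math.log(max(noisy_prob, 1e-12))


# Hypothetical binary example (class 0 = negative, class 1 = Chagas).
T = [[0.9, 0.1],   # clean negative: 10% assumed to be labeled positive
     [0.3, 0.7]]   # clean positive: 30% assumed to be labeled negative
clean_probs = [0.2, 0.8]  # made-up output of the classification head

loss = forward_corrected_nll(clean_probs, T, noisy_label=1)
```

With an identity transition matrix the expression reduces to the ordinary cross-entropy on the given label, so the correction only changes training where label noise is assumed.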

Results: The algorithm without T(x) achieved a score of 0.45 on the hidden test set in the unofficial phase. The estimation of T(x) still needs to be implemented.

Conclusion: We tackled the challenge by combining a transformer backbone, trained on reliable ECG labels from public datasets, with a dense classification head. The approach showed promise in the unofficial phase with a challenge score of 0.45 (rank 28/141) but leaves room for improvement. By incorporating the proposed extensions, we aim to further reduce the effect of label noise.