Vector Signals from the Heart: A Foundation Model Approach to Chagas Detection

Sehun Kim
Samsung Medical Center


Abstract

Background: Chagas disease can lead to severe cardiac complications, yet serological testing, its diagnostic gold standard, is difficult to scale. Electrocardiograms (ECGs), which can reflect Chagas cardiomyopathy, offer a promising non-invasive alternative for automated screening. Recent advances in self-supervised learning have enabled ECG foundation models that capture general-purpose physiological features (e.g., heart rate, QRS duration, age, sex, arrhythmias), making them adaptable to diverse diagnostic tasks.

Method: We adapted ECG-JEPA, a self-supervised ECG foundation model pretrained on large-scale unlabeled ECGs, for Chagas disease detection. To reduce redundancy, the model takes only 8 of the 12 standard leads as input (I, II, V1–V6); the remaining limb leads (III, aVR, aVL, aVF) are linear combinations of leads I and II via Einthoven's law and the Goldberger relations. ECG-JEPA serves as an encoder that maps each ECG to a fixed-length representation vector. We added a linear classifier on top of the encoder output and fine-tuned the entire model (encoder and classifier) with cross-entropy loss on the Challenge training set. To align with the Challenge metric, which focuses on identifying the top 5% of highest-risk patients, we implemented a hybrid re-ranking strategy. First, we ranked patients by the Chagas probability predicted by the model. Separately, we computed the cosine distance between each test sample and the mean vector of Chagas-positive training samples in the representation space, yielding a similarity-based ranking. The final ranking was obtained by averaging the two rank lists, effectively ensembling probabilistic and representation-based signals.
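
The hybrid re-ranking step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names (`ranks_descending`, `cosine_distance_to_centroid`, `hybrid_rerank`), the equal weighting of the two rank lists, and the tie-breaking by input order are all assumptions made for the example. The inputs are assumed to be the classifier probabilities for the test samples, the encoder representations of the test samples, and the encoder representations of the Chagas-positive training samples.

```python
import numpy as np


def ranks_descending(scores):
    """Return a rank per sample, where rank 0 is the highest score.

    Ties are broken by input order (an assumption for this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(len(scores))
    return ranks


def cosine_distance_to_centroid(embeddings, centroid):
    """Cosine distance (1 - cosine similarity) from each row to the centroid."""
    num = embeddings @ centroid
    denom = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(centroid)
    return 1.0 - num / denom


def hybrid_rerank(probs, test_emb, pos_train_emb):
    """Average the probability-based and similarity-based rank lists.

    Returns sample indices ordered from highest to lowest estimated risk.
    """
    # Mean representation of the Chagas-positive training samples.
    centroid = pos_train_emb.mean(axis=0)
    dist = cosine_distance_to_centroid(test_emb, centroid)

    rank_prob = ranks_descending(probs)   # higher probability = higher risk
    rank_sim = ranks_descending(-dist)    # smaller distance = higher risk

    mean_rank = (rank_prob + rank_sim) / 2.0
    return np.argsort(mean_rank, kind="stable")
```

Averaging ranks rather than raw scores sidesteps the fact that probabilities and cosine distances live on different scales, which is one plausible reason to combine the two signals this way.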

Results: Re-ranking improved the 10-fold cross-validated Challenge score from 0.433 to 0.441. Transfer learning substantially outperformed training from scratch (0.441 vs. 0.102). On the hidden validation set, our model achieved a score of 0.409. These findings highlight the potential of ECG-based foundation models for scalable, non-invasive screening of Chagas disease, particularly in resource-constrained settings where serological testing is limited.
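
To make the reported scores easier to interpret, the sketch below computes a top-5% score from a ranking. It assumes the Challenge score is the fraction of all Chagas-positive cases that fall within the top 5% of the ranked list; the abstract states only that the metric focuses on the top 5% of high-risk patients, so this definition, the function name `top_fraction_recall`, and the rounding of the cutoff are assumptions for illustration.

```python
import numpy as np


def top_fraction_recall(labels, order, fraction=0.05):
    """Fraction of positive cases found in the top `fraction` of a ranking.

    labels: binary array (1 = Chagas-positive), indexed by sample.
    order:  sample indices sorted from highest to lowest estimated risk.
    """
    labels = np.asarray(labels)
    # Size of the top slice; rounding down with a floor of 1 is an assumption.
    k = max(1, int(np.floor(fraction * len(labels))))
    top = np.asarray(order)[:k]
    n_pos = labels.sum()
    if n_pos == 0:
        return float("nan")  # undefined when there are no positive cases
    return float(labels[top].sum() / n_pos)
```

Under this reading, a score of 0.441 would mean that roughly 44% of the positive cases were ranked within the top 5% of patients.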