Feature-Reduced Ensemble Model for ECG-Based Chagas Disease Diagnosis

Sanghyun Ham and Eunseo Choi
Kyung Hee University (Biomedical Engineering)


Abstract

Team KHU_BME developed a machine learning (ML) model for early detection of Chagas disease from large-scale 12-lead ECG data. Chagas disease, caused by Trypanosoma cruzi, is often underdiagnosed in endemic regions where access to molecular or serological testing is limited. ECG offers a low-cost, non-invasive alternative that can capture the conduction abnormalities of chronic Chagas cardiomyopathy. We extracted an initial set of 109 morphological, temporal, and spectral features and reduced it to 44 clinically relevant features, such as QRS duration and rsR′ patterns. This feature reduction improved generalizability, efficiency, and interpretability. Class imbalance was addressed with the synthetic minority oversampling technique (SMOTE), and hyperparameters were tuned for the Random Forest, XGBoost, and Logistic Regression classifiers. The ensemble model achieved a Challenge score of 0.139, an AUROC of 0.817, an AUPRC of 0.718, an accuracy of 0.746, and an F-measure of 0.645 on our held-out test set, and a Challenge score of 0.094 on the official test set. These results suggest that ECG-based ML with feature reduction is a feasible and efficient screening tool for Chagas disease in resource-limited settings.
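
The pipeline summarized above (feature reduction, minority oversampling, and a three-classifier ensemble) can be illustrated with a minimal sketch. It is not the team's implementation, and it rests on several assumptions: the data are synthetic, the 44 retained columns are an arbitrary placeholder for the clinically selected features, a simplified SMOTE-style interpolation (the hypothetical helper smote_like_oversample) stands in for a library SMOTE, scikit-learn's GradientBoostingClassifier stands in for XGBoost, and soft voting is assumed as the way the classifiers are combined.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def smote_like_oversample(X, y, k=5, seed=0):
    """Hypothetical helper: balance a binary dataset by interpolating
    between minority samples and their nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()
    X_min = X[y == minority]
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[picked] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


# Synthetic stand-in for the 109 extracted ECG features (assumption).
X, y = make_classification(
    n_samples=600, n_features=109, n_informative=20,
    weights=[0.9, 0.1], random_state=0,
)

# Placeholder for the 44 clinically selected features (assumption).
selected = np.arange(44)
X = X[:, selected]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Oversample the training split only, so the test split stays untouched.
X_bal, y_bal = smote_like_oversample(X_train, y_train)

# Soft-voting ensemble; GradientBoostingClassifier replaces XGBoost here.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ],
    voting="soft",
)
ensemble.fit(X_bal, y_bal)

proba = ensemble.predict_proba(X_test)[:, 1]
auroc = roc_auc_score(y_test, proba)
print(f"AUROC on the synthetic hold-out split: {auroc:.3f}")
```

Oversampling is applied after the train/test split so that synthetic samples never leak into evaluation; the scores this sketch prints describe the synthetic data only and are unrelated to the results reported above.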