Improving ECG diagnosis of left ventricular hypertrophy with contrastive ECG-echocardiogram learning

Mously Dior Diaw1, Stéphane Papelier2, Alexandre Durand-Salmon2, Julien Oster3
1Banook Group - IADI, U1254, Inserm, Université de Lorraine, 2Banook Group, 3Inserm


Abstract

Left ventricular hypertrophy (LVH), commonly diagnosed with echocardiography (echo), refers to a thickening of the muscle of the heart's left ventricle. ECG criteria for LVH have been proposed previously but have low sensitivity. Our study investigated whether deep learning (DL) techniques could allow better ECG-based LVH screening, specifically by leveraging multimodal (ECG-echo) representation learning.

Building upon a DL-based LVH classifier trained on the UK Biobank (UKB), we propose to pretrain an ECG encoder with a loss function (CE-CEEL) defined as the sum of the cross-entropy (CE) loss and a contrastive ECG-echo loss (CEEL), which, similarly to the language-image CLIP loss, aligns ECG and echo embeddings. To assess this approach, we extracted from PhysioNet's MIMIC-IV 1990 patients with ECG-echo pairs, among whom 518 (26 %) had LVH. This subset was then split into 5 sets S1−5 consisting of 695, 200, 695, 200 and 200 patients, respectively. The ECG encoder, initialized with the weights from UKB training, was first pretrained with CE-CEEL on S1/S2 (training/validation, batch size=16), with the EchoNet-LVH model used as the image encoder. The pretrained weights were then used to initialize a model trained with CE alone on S3/S4 (batch size=200). In a separate experiment, the ECG encoder was also trained with CE alone on S1−3/S4. During training, the model yielding the best AUC was saved.
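
As an illustration of the CE-CEEL objective described above, the following is a minimal sketch in plain Python, not the implementation used in the study. It assumes a symmetric CLIP-style contrastive term over L2-normalized embeddings, a binary cross-entropy on predicted LVH probabilities, and an unweighted sum of the two. The function names, the temperature value (0.07) and the toy embeddings are hypothetical choices made for this example only.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def contrastive_ecg_echo_loss(ecg_emb, echo_emb, temperature=0.07):
    """Symmetric CLIP-style loss: row i of ecg_emb is paired with row i of echo_emb.

    The temperature is an assumed hyperparameter, not a value from the abstract.
    """
    ecg = [l2_normalize(v) for v in ecg_emb]
    echo = [l2_normalize(v) for v in echo_emb]
    n = len(ecg)
    # Cosine similarity between every ECG and every echo embedding in the batch.
    logits = [[sum(a * b for a, b in zip(e, h)) / temperature for h in echo]
              for e in ecg]
    logits_t = [list(col) for col in zip(*logits)]
    # The matching pair sits on the diagonal in both directions.
    ecg_to_echo = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    echo_to_ecg = -sum(log_softmax(logits_t[i])[i] for i in range(n)) / n
    return 0.5 * (ecg_to_echo + echo_to_ecg)


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted LVH probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
    return -total / len(labels)


def ce_ceel_loss(probs, labels, ecg_emb, echo_emb, temperature=0.07):
    """CE-CEEL: classification loss plus the contrastive alignment loss."""
    return (binary_cross_entropy(probs, labels)
            + contrastive_ecg_echo_loss(ecg_emb, echo_emb, temperature))


# Toy batch of two patients with 2-dimensional embeddings (made-up values).
ecg_batch = [[1.0, 0.0], [0.0, 1.0]]
echo_aligned = [[1.0, 0.0], [0.0, 1.0]]
echo_swapped = [[0.0, 1.0], [1.0, 0.0]]
aligned = contrastive_ecg_echo_loss(ecg_batch, echo_aligned)
swapped = contrastive_ecg_echo_loss(ecg_batch, echo_swapped)
```

In this toy batch, `aligned` is much smaller than `swapped`, which is the behaviour the contrastive term is meant to reward: each ECG embedding should be closest to the echo embedding of the same patient.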

On S5, used for final testing, our multimodal approach yielded a higher AUC (0.644) than the unimodal models, whether retrained on MIMIC (0.582) or not (0.511), and than traditional ECG rules (e.g. AUC=0.549 for the Sokolow-Lyon index). These results suggest that aligning ECG and echo embeddings could improve ECG-based LVH classification.
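
For readers unfamiliar with the rule-based baseline, the sketch below shows how a Sokolow-Lyon index can be computed from lead amplitudes and scored with an AUC. It uses the standard definition (S-wave amplitude in V1 plus the larger of the R-wave amplitudes in V5 and V6, with LVH suggested at 3.5 mV or more). The function names and the amplitude values are hypothetical, and the abstract does not describe how amplitudes were measured in the study.

```python
def sokolow_lyon_index(s_v1_mv, r_v5_mv, r_v6_mv):
    """Sokolow-Lyon voltage in mV: |S in V1| + max(R in V5, R in V6)."""
    return abs(s_v1_mv) + max(r_v5_mv, r_v6_mv)


def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up amplitudes (mV) for four patients: (S in V1, R in V5, R in V6).
amplitudes = [(-1.5, 2.0, 2.3), (-0.8, 1.2, 1.0), (-1.0, 1.1, 1.4), (-2.0, 2.1, 1.9)]
lvh_labels = [1, 0, 1, 0]  # hypothetical echo-derived LVH labels
indices = [sokolow_lyon_index(*a) for a in amplitudes]
flagged = [idx >= 3.5 for idx in indices]  # standard 3.5 mV threshold
auc = roc_auc(indices, lvh_labels)
```

Using the continuous index as the score, rather than the thresholded flag, is what allows an AUC to be reported for a voltage criterion alongside the model outputs.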