A Multi-Stage Deep Learning Approach for Digitization of Paper ECGs and Automated Cardiovascular Disease Classification

Hong-Cheol Yoon1, Dong-Kyu Kim1, Hyun-Seok Kim2, Woo-Young Seo3, Chang-Hoe Heo4, Sung-Hoon Kim1
1Department of Anesthesiology and Pain Medicine, Asan Medical Center, Brain Korea 21 Project, University of Ulsan College of Medicine, 2Department of Anesthesiology and Pain Medicine, University of Ulsan College of Medicine, 3Biomedical Engineering Research Center, Asan Institute for Life Science, Asan Medical Center, 4Medical AI Research Team, Signal House Co., Ltd


Abstract

Background: Large amounts of paper-printed electrocardiogram (ECG) data are still used worldwide, particularly in developing countries. Digitizing and classifying paper ECGs are crucial to increasing access to cardiac care globally and understanding the diversity of cardiovascular diseases (CVDs).

Methods: We (BAPORLab) propose a multi-stage deep learning approach that combines object detection and signal extraction models to digitize ECG images into waveforms, followed by a classification model for diagnosis. The proposed method consists of two stages. First, for waveform reconstruction, we trained a YOLOX model to detect ECG lead and text regions using bounding box annotations from synthetic ECG images. A U-Net architecture was used to train a signal extraction model that separates the ECG signal from the gridlines: the model takes cropped ECG lead images as input, with waveform images rendered from the original signal plot and gridline images obtained via Otsu's thresholding as targets. The model outputs a two-channel image containing the denoised ECG waveform and the gridlines. These output images were converted into reconstructed waveforms using a sliding window algorithm, with amplitudes normalized by the scale derived from the detected gridlines. Second, for CVD classification, we trained an EfficientNetV2-Small model that takes the reconstructed ECG waveforms as input and classifies each ECG as a normal or abnormal case.
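The column-wise sliding window step can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: `px_per_mv` stands in for the pixel-to-millivolt scale estimated from the detected gridlines, and `baseline_row` for the lead's isoelectric row; both names are hypothetical.

```python
import numpy as np

def reconstruct_waveform(waveform_mask: np.ndarray,
                         px_per_mv: float,
                         baseline_row: float,
                         window: int = 3) -> np.ndarray:
    """Convert a binary waveform image (1 = signal pixel) into a 1-D
    signal in millivolts by scanning columns with a sliding window.

    px_per_mv would be derived from the gridline channel (e.g. the
    median large-box spacing, 0.5 mV per large box on standard paper).
    """
    n_rows, n_cols = waveform_mask.shape
    samples = np.full(n_cols, np.nan)
    for col in range(n_cols):
        # gather signal pixels in a small horizontal window around col
        lo = max(0, col - window // 2)
        hi = min(n_cols, col + window // 2 + 1)
        rows, _ = np.nonzero(waveform_mask[:, lo:hi])
        if rows.size:
            # image rows grow downward, so flip the sign for amplitude
            samples[col] = (baseline_row - rows.mean()) / px_per_mv
    # linearly interpolate columns where no waveform pixel was found
    ok = ~np.isnan(samples)
    samples = np.interp(np.arange(n_cols), np.nonzero(ok)[0], samples[ok])
    return samples
```

Averaging pixel rows within a small window smooths over the stroke thickness of the printed trace; interpolation bridges short gaps left where gridline removal also erased waveform pixels.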

Results: We trained on 8,000 generated ECG images and validated on 2,000. Our approach achieved a signal-to-noise ratio (SNR) of -1.651 for the reconstruction task and an F-measure of 0.826 for the classification task in internal validation. On the leaderboard, we achieved an SNR of 0.00 and an F-measure of 0.46.

Conclusion: Our methodology employs neural networks to reconstruct and classify ECG signals, introducing a novel technique for extracting signals from waveform images. This approach enables the digitization of large paper ECG archives, facilitating research on cardiovascular disease.