Text-to-ECG Generation and Image Style Transfer Helps ECG Images Digitalization and Classification

Zisheng Liang1, Shanwei Zhang2, Kexin Wang2, Qinghao Zhao3, Deyun Zhang4, Shijia Geng4, Jun Li5, Yuxi Zhou6, Shenda Hong7
1National Institute of Health Data Science, Peking University, Beijing, China, 2Tianjin University of Technology, 3Department of Cardiology, Peking University People's Hospital, 4Heartvoice Medical Technology, 5Jilin University, 6Peking University, 7Georgia Institute of Technology


Abstract

Recent advancements have led to the development of algorithms for interpreting ECG time series. However, the continued prevalence of ECG images (including photos, screen captures, and scans of paper-based ECGs), especially in underdeveloped countries, underscores the need for digitization and affordable data analysis to ensure comprehensive cardiac care and to capture the diverse manifestations of cardiovascular diseases worldwide. As part of the George B. Moody PhysioNet Challenge 2024, we propose a series of simple yet effective methods for the digitization and classification of ECG images. First, we generate training samples with the ECG-Image-Kit and refine them using neural style transfer for data augmentation. Then, we employ a U-Net architecture to perform ECG digitization, trained on these large-scale ECG images paired with their corresponding ground-truth time series. Next, we pre-train a RegNet model for ECG classification on large-scale ECG time series data from open-source datasets. This pre-trained classifier is then fine-tuned on the digitized ECG time series derived from ECG images. Additionally, we devise an adaptable preprocessing strategy to address the varying lengths and asynchronization that arise when digitizing diverse ECG image layouts. Our team, PKU NIHDS, achieved an SNR of 0.00 on the reconstruction task and an F-measure of 0.52 on the classification task on the hidden test set.
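The abstract does not detail the preprocessing strategy. As a purely illustrative sketch of how varying lengths and asynchronization might be handled, the Python snippet below places each digitized lead segment at its start offset inside a zero-filled track of fixed length, so that all leads share a common time axis. The function name align_leads, the (start, samples) input format, and the zero fill value are assumptions made for this example and are not taken from the authors' implementation.

```python
from typing import Dict, List, Tuple


def align_leads(
    segments: Dict[str, Tuple[int, List[float]]],
    target_len: int,
    fill: float = 0.0,
) -> Dict[str, List[float]]:
    """Place each lead segment at its start offset in a fixed-length track.

    segments maps a lead name to (start_index, samples); this format is a
    hypothetical convention chosen for the sketch.
    """
    aligned: Dict[str, List[float]] = {}
    for lead, (start, samples) in segments.items():
        if start < 0:
            raise ValueError(f"negative start offset for lead {lead}")
        # Begin with a track of the common length, filled with the pad value.
        track = [fill] * target_len
        # Clip the segment so it never runs past the common window.
        end = min(start + len(samples), target_len)
        if start < target_len:
            track[start:end] = samples[: end - start]
        aligned[lead] = track
    return aligned


# Example: two leads of different lengths recorded over different windows.
example = align_leads(
    {"I": (0, [1.0, 2.0]), "aVR": (2, [3.0, 4.0, 5.0])},
    target_len=4,
)
```

In this sketch every lead ends up with exactly target_len samples, so layouts in which leads cover different time windows can be fed to a classifier that expects a fixed input shape. A mask marking the padded positions could be returned alongside the tracks if the downstream model should ignore them.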