A Semantic Segmentation-based Digitization of ECG Papers

Davyd Melo1, João Paulo Madeiro1, Luis Rigo Jr.2, Cláudia Pessoa1, José Macêdo1, Danielo Gomes1
1Federal University of Ceará, 2Universidade Federal do Espirito Santo


Abstract

The electrocardiogram (ECG) is crucial for identifying biomarkers of cardiovascular diseases (CVD) in a non-invasive and inexpensive manner. While digital ECGs have become more prevalent and offer many advantages, paper ECGs may still be necessary in certain situations, e.g. equipment failure, accessibility, familiarity, or legal requirements. Healthcare providers must weigh the benefits and limitations of each method and choose the most appropriate option for their patients. Once ECG papers are converted into digital format, they can be used to train machine learning models to diagnose heart conditions accurately. A collection of digital ECGs paired with their corresponding diagnoses allows such models to automate the diagnostic process, making it faster and more efficient. The current prototype analyzes a batch of scanned ECG papers obtained from a smartwatch and the Samsung Health app, using a combination of machine learning and image processing techniques. The system first identifies the regions of interest on the scanned papers using a statistical windowing process. It then uses a labeling tool to separate the ECG signal from other elements (e.g. the background), by means of a Random Forest model trained on the responses of a Gabor filter bank. Finally, the ECG waveform in millivolt units is extracted through an iterative process that uses the ECG baseline as a reference; the baseline is identified with the Hough transform. When the proposed digitization pipeline was tested on a set of images, it achieved a mean squared error of 0.0034, and the estimated signals closely matched the actual ECG data (Pearson coefficient = 0.92).
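
The last two steps of the pipeline, conversion of the segmented trace to millivolts and comparison against a reference signal, can be illustrated with a minimal sketch. It is not the implementation evaluated here: the function names, the column-wise averaging of trace pixels, and the `mv_per_pixel` scale factor are assumptions made for illustration, and the baseline row is taken as given rather than estimated with the Hough transform. On standard ECG paper 10 mm corresponds to 1 mV, so `mv_per_pixel` would follow from the scan resolution.

```python
import numpy as np


def mask_to_millivolts(mask, baseline_row, mv_per_pixel):
    """Turn a binary trace mask (rows x cols) into one voltage sample per column.

    Hypothetical helper: `mask` is True where a pixel was labeled as ECG signal,
    `baseline_row` is the image row of the isoelectric line, and `mv_per_pixel`
    is the vertical scale of the scan (all three are assumed inputs).
    """
    mask = np.asarray(mask, dtype=bool)
    rows = np.arange(mask.shape[0])[:, None]
    counts = mask.sum(axis=0)
    # Mean row index of the trace pixels in each column (NaN for empty columns).
    with np.errstate(invalid="ignore", divide="ignore"):
        centre = (mask * rows).sum(axis=0) / counts
    centre = np.where(counts > 0, centre, np.nan)
    # Image rows grow downward, so a trace above the baseline is a positive voltage.
    return (baseline_row - centre) * mv_per_pixel


def _valid_pairs(reference, estimate):
    """Keep only the samples that are finite in both signals."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    keep = np.isfinite(reference) & np.isfinite(estimate)
    return reference[keep], estimate[keep]


def mse(reference, estimate):
    """Mean squared error between the reference and the digitized signal."""
    reference, estimate = _valid_pairs(reference, estimate)
    return float(np.mean((reference - estimate) ** 2))


def pearson(reference, estimate):
    """Pearson correlation coefficient between the two signals."""
    reference, estimate = _valid_pairs(reference, estimate)
    return float(np.corrcoef(reference, estimate)[0, 1])
```

Given a mask produced by the segmentation step and a reference recording resampled to one sample per image column, `mse(reference, mask_to_millivolts(mask, baseline_row, mv_per_pixel))` and the matching `pearson` call would yield the two figures of merit reported above.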