Multi-step Approach for the Extraction of RR Intervals from Scanned Paper ECGs

Marcus Vollmer1, Marc Dörner2, Stefanie Schreiber3, Stefan Vielhaber3, Lars Kaderali4
1Institute of Bioinformatics, University Medicine Greifswald; DZHK (German Centre of Cardiovascular Research), Partner Site Greifswald, 2Department of Consultation-Liaison-Psychiatry and Psychosomatic Medicine, University Hospital Zurich, University of Zurich; German Center for Neurodegenerative Diseases (DZNE) within the Helmholtz Association, 3Otto-von-Guericke University, 4Institute of Bioinformatics, University Medicine Greifswald


Abstract

Background: Even with modern ML/AI technology, re-digitization of scanned paper ECGs often fails because of significant artifacts on the paper and issues with brightness, speed, and saturation, which sometimes render R peaks invisible. Another problem is that available software cannot account for inaccurately scanned images, for example when the number of pixels that make up one second of ECG recording varies.

Methods: For a retrospective analysis, routine clinical resting ECGs from patients with amyotrophic lateral sclerosis (ALS) and cerebral amyloid angiopathy (CAA) were reexamined. The scanned paper ECGs were used to compute RR intervals so that heart rate variability could be compared between the two groups. Here we propose a multi-step MATLAB-based approach to extract precise RR intervals. Three regions of interest (ROIs) were selected from the scanned RGB image via a graphical user interface, one for each of the following tasks: a) processing the standard ECG grid, b) processing an ECG waveform, c) checking the ECG duration. The grid ROI was Gaussian filtered, and a peak finder was applied to the column sums of the redness values to determine the grid spacing. The ECG ROI was transformed to mask the background color and suppress grid lines. In addition, morphological erosion and dilation were applied, and the R peak locations were read by peak finding in the weighted column sums of gray values. Mean RR intervals were compared to the device information printed on the paper using Bland-Altman analysis.
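The grid-spacing step and the conversion from pixel positions to RR intervals can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. The redness score, the peak threshold, the 25 mm/s paper speed, the 1 mm grid spacing, and all function names are assumptions made for illustration only.

```python
import math
from statistics import median


def gaussian_smooth(values, sigma=1.0):
    """Smooth a 1-D sequence with a truncated, normalized Gaussian kernel."""
    radius = max(1, int(3 * sigma))
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp indices at the borders
            acc += w * values[j]
        smoothed.append(acc)
    return smoothed


def find_peaks(values, min_height):
    """Return indices of local maxima that reach at least min_height."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] >= min_height and values[i - 1] < values[i] >= values[i + 1]
    ]


def column_redness(rgb_rows):
    """Column sums of an assumed redness score, R - (G + B) / 2, floored at 0."""
    sums = [0.0] * len(rgb_rows[0])
    for row in rgb_rows:
        for x, (r, g, b) in enumerate(row):
            sums[x] += max(0.0, r - (g + b) / 2)
    return sums


def grid_spacing_px(rgb_rows, sigma=1.0):
    """Median distance in pixels between neighboring vertical grid lines."""
    profile = gaussian_smooth(column_redness(rgb_rows), sigma)
    threshold = (max(profile) + min(profile)) / 2  # assumed mid-range threshold
    peaks = find_peaks(profile, threshold)
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return median(gaps)


def rr_intervals_ms(r_peak_columns, px_per_grid, paper_speed_mm_s=25.0, grid_mm=1.0):
    """Convert R peak column positions to RR intervals in milliseconds.

    Assumes a paper speed of 25 mm/s and grid lines every 1 mm.
    """
    px_per_second = px_per_grid * paper_speed_mm_s / grid_mm
    return [
        1000.0 * (b - a) / px_per_second
        for a, b in zip(r_peak_columns, r_peak_columns[1:])
    ]
```

Estimating the pixel spacing from the grid in each scan, rather than trusting a nominal scanner resolution, is what lets the time axis adapt to inaccurately scanned images.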

Results: The proposed approach has been successfully tested on clinical paper ECGs from 48 ALS and 70 CAA patients. Validation against the printed RR interval information, which is based on the digital ECG, shows high accuracy of the method (median difference: 2.6 ms, interquartile range: -0.6 ms to 6.1 ms).
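The agreement statistics reported here can be computed along the following lines. This is a hedged sketch, not the authors' analysis code. The sign convention (extracted minus printed), the 1.96 SD limits of agreement, the quartile method, and the function name are assumptions.

```python
from statistics import mean, quantiles, stdev


def bland_altman_summary(extracted_ms, printed_ms):
    """Summarize agreement between extracted and printed mean RR intervals.

    Differences are taken as extracted minus printed (an assumed convention).
    """
    diffs = [e - p for e, p in zip(extracted_ms, printed_ms)]
    bias = mean(diffs)
    sd = stdev(diffs)
    q1, med, q3 = quantiles(diffs, n=4, method="inclusive")
    return {
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,  # lower limit of agreement
        "upper_loa": bias + 1.96 * sd,  # upper limit of agreement
        "median": med,
        "q1": q1,
        "q3": q3,
    }
```

Reporting the median and quartiles alongside the Bland-Altman bias keeps the summary robust to a few recordings with large extraction errors.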

Conclusion: This multi-step approach can be applied to extract precise RR intervals from scanned paper ECGs, enabling automated analysis and comparison of heart rate variability between different patient groups.