Background: Even with modern ML/AI technology, redigitizing scanned paper ECGs often fails because of significant artifacts introduced during printing, editing and scanning. In addition, available software does not correct for image distortion or inconsistent scanning speeds.
Methods: In a retrospective analysis, clinical paper ECGs from patients with amyotrophic lateral sclerosis (ALS) and cerebral amyloid angiopathy (CAA) were reexamined to compare heart rate variability between the two groups. Here, we propose a multi-step MATLAB-based approach to extract accurate RR intervals. Image rotation was verified using the Radon transform. Three regions of interest (ROI) were selected from the scanned RGB image: a) the standard ECG grid, which was Gaussian-filtered and converted to a 1D vector to determine the number of pixels per second, b) an ECG waveform, which was processed with image erosion and dilation to mask the background color and suppress the grid lines so that the R peak locations could be detected, and c) a region used to verify the ECG duration. Mean RR intervals were compared with the device information printed on the paper using Bland-Altman analysis.
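The grid calibration and the conversion of R peak positions into RR intervals can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. It assumes a paper speed of 25 mm/s and fine grid lines 1 mm apart (neither is stated above), and it estimates the grid period by autocorrelation of the 1D profile, which is a technique substituted here for illustration. All function names, constants and input values are hypothetical.

```python
# Hypothetical sketch, not the authors' MATLAB implementation.
PAPER_SPEED_MM_PER_S = 25.0  # assumption: common ECG paper speed
GRID_SPACING_MM = 1.0        # assumption: fine grid lines are 1 mm apart


def grid_period_px(profile, min_lag=2):
    """Estimate the grid line spacing in pixels from a 1D darkness profile.

    Picks the lag that maximizes the biased (unnormalized) autocorrelation
    of the mean-centered profile; the bias favors the shortest period.
    """
    n = len(profile)
    mu = sum(profile) / n
    centered = [v - mu for v in profile]
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, n // 2 + 1):
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def pixels_per_second(period_px):
    """Convert the grid period (pixels per grid line) to pixels per second."""
    return period_px / GRID_SPACING_MM * PAPER_SPEED_MM_PER_S


def rr_intervals_ms(r_peak_cols, px_per_s):
    """Convert R peak column positions (pixels) to RR intervals in ms."""
    return [(b - a) * 1000.0 / px_per_s
            for a, b in zip(r_peak_cols, r_peak_cols[1:])]


# Synthetic input: one dark grid line every 12 pixels, made-up R peak columns.
profile = [1.0 if i % 12 == 0 else 0.0 for i in range(240)]
period = grid_period_px(profile)
px_per_s = pixels_per_second(period)
rr = rr_intervals_ms([100, 340, 580, 820], px_per_s)
print(period, px_per_s, rr)
```

On this synthetic input the sketch recovers a period of 12 pixels, which corresponds to 300 pixels per second under the stated assumptions, and RR intervals of 800 ms. A real scan would additionally need the filtering and grid suppression steps named above before either function is applied.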
Results: The proposed approach was successfully tested on clinical paper ECGs from 48 ALS and 70 CAA patients. Validation against the printed RR interval information, which is based on the digital ECG, showed high accuracy of the method (median difference: 2.6 ms, interquartile range: -0.6 ms to 6.1 ms).
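The agreement statistics reported here (median and quartiles of the paired differences) can be computed as in the following Python sketch. The function name and the input values are hypothetical and are not study data. The classical Bland-Altman bias and 95% limits of agreement (mean difference plus or minus 1.96 standard deviations) are included for comparison, although the abstract does not state whether they were used.

```python
from statistics import mean, median, quantiles, stdev


def bland_altman_summary(extracted_ms, printed_ms):
    """Summarize paired differences (extracted minus printed), in ms."""
    if len(extracted_ms) != len(printed_ms):
        raise ValueError("both inputs must have the same length")
    diffs = [e - p for e, p in zip(extracted_ms, printed_ms)]
    # Quartiles of the differences (inclusive method, Python 3.8+).
    q1, _, q3 = quantiles(diffs, n=4, method="inclusive")
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "median": median(diffs),
        "q1": q1,
        "q3": q3,
        "bias": bias,                   # classical Bland-Altman mean difference
        "loa_lower": bias - 1.96 * sd,  # lower 95% limit of agreement
        "loa_upper": bias + 1.96 * sd,  # upper 95% limit of agreement
    }


# Made-up mean RR values in ms, for illustration only.
extracted = [802, 905, 760, 1010, 655]
printed = [800, 900, 761, 1004, 650]
summary = bland_altman_summary(extracted, printed)
print(summary)
```

Reporting the median and quartiles rather than the mean and standard deviation is a reasonable choice when the differences are skewed or contain outliers, which is plausible for values extracted from scanned images.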