Impact of Pre-Processing Decisions on Automated ECG Classification Accuracy

Adrian Cornely and Grace Mirsky
Benedictine University


Electrocardiography is well established as an effective clinical tool for detecting and diagnosing cardiac arrhythmias and other abnormalities. The objective of the 2021 PhysioNet/Computing in Cardiology Challenge was for teams to develop automated classification algorithms for reduced-lead ECGs using a large dataset acquired from several geographically separated sites. Although proper pre-processing is widely recognized as important to the success of classification algorithms, there is no universal agreement on the appropriate pre-processing steps for automated ECG classification. We examined papers from the top 15 finishers and the bottom ten finishers in the Challenge to determine which pre-processing steps each team applied. To assess the generalizability of the algorithms, we also examined the standard deviation of the scores across the different test sets.
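As a minimal sketch of the generalizability measure described above, the spread of one team's scores across test sets can be computed as follows. The function name `score_spread`, the test-set labels, and all score values are hypothetical and purely illustrative; they are not Challenge results. The choice of the population standard deviation (rather than the sample standard deviation) is also an assumption.

```python
import statistics


def score_spread(scores_by_test_set):
    """Return (mean, population standard deviation) of a team's scores.

    `scores_by_test_set` maps a test-set label to that team's score on it.
    A smaller standard deviation suggests more consistent performance
    across test sets, i.e. better generalizability.
    """
    values = list(scores_by_test_set.values())
    return statistics.mean(values), statistics.pstdev(values)


# Hypothetical, illustrative scores only (not actual Challenge results).
team_a = {"test_set_1": 0.60, "test_set_2": 0.58, "test_set_3": 0.59}
team_b = {"test_set_1": 0.70, "test_set_2": 0.40, "test_set_3": 0.55}

mean_a, sd_a = score_spread(team_a)
mean_b, sd_b = score_spread(team_b)
print(f"Team A: mean={mean_a:.3f}, sd={sd_a:.3f}")
print(f"Team B: mean={mean_b:.3f}, sd={sd_b:.3f}")
```

In this illustration, the team with the smaller standard deviation would be read as the one whose performance varies less from one test set to another.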

The most commonly used pre-processing steps were resampling to a consistent sampling rate, applying a bandpass filter, normalizing the signal, and enforcing a fixed signal length. The top 15 teams shared a number of these pre-processing steps, whereas most of the bottom ten teams did not apply all of them. Among the bottom ten teams, fewer than half used a bandpass filter, and only three applied some type of normalization. About half of the bottom ten teams used a fixed sampling rate, and most of the teams used a fixed signal length, typically achieved by cropping or zero-padding the signals. This investigation underscores the importance of appropriate pre-processing for strong classification accuracy and, more importantly, the need for a universal approach to pre-processing across research in automated ECG classification.
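The four steps named above can be sketched as a single function. This is an illustrative sketch using NumPy and SciPy, not the pipeline of any particular team. The function name `preprocess_ecg`, the target rate of 500 Hz, the 0.5 to 40 Hz passband, the third-order Butterworth design, z-score normalization, and the 10 s target length are all assumptions chosen for the example and are not taken from the surveyed papers.

```python
from math import gcd

import numpy as np
from scipy.signal import butter, resample_poly, sosfiltfilt


def preprocess_ecg(signal, fs_in, fs_out=500, band=(0.5, 40.0), length_s=10.0):
    """Resample, bandpass filter, normalize, and fix the length of one ECG lead.

    All default parameter values are assumptions for illustration only.
    """
    x = np.asarray(signal, dtype=float)

    # 1. Resample to a consistent sampling rate (polyphase resampling).
    if fs_in != fs_out:
        g = gcd(int(fs_in), int(fs_out))
        x = resample_poly(x, int(fs_out) // g, int(fs_in) // g)

    # 2. Zero-phase Butterworth bandpass filter (assumed design).
    sos = butter(3, band, btype="bandpass", fs=fs_out, output="sos")
    x = sosfiltfilt(sos, x)

    # 3. Z-score normalization; only center the signal if it is flat.
    std = x.std()
    x = (x - x.mean()) / std if std > 0 else x - x.mean()

    # 4. Fixed signal length: crop long signals, zero-pad short ones at the end.
    n = int(round(length_s * fs_out))
    if x.size >= n:
        x = x[:n]
    else:
        x = np.pad(x, (0, n - x.size))
    return x


# Usage with a synthetic 7 s recording sampled at 1000 Hz (not real ECG data).
t = np.arange(0, 7.0, 1 / 1000)
synthetic = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.sin(2 * np.pi * 60 * t)
processed = preprocess_ecg(synthetic, fs_in=1000)
print(processed.shape)
```

Because normalization is applied before padding in this sketch, the padded samples remain exactly zero; a pipeline that pads first and normalizes afterwards would produce different values, which is one example of how ordering decisions can differ between teams.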