Influence of the Training Set Composition on the Estimation Performance of Linear ECG-Lead Transformations

Daniel Guldenring1, Dewar Finlay2, Raymond Bond2, Alan Kennedy2, Peter Doggart3, Ghalib Janjua4, James McLaughlin2
1HS Kempten, 2Ulster University, 3PulseAI, 4Robert Gordon University


Abstract

Linear ECG-lead transformations (LELTs) are used to estimate unrecorded target leads by applying a LELT matrix to a number of recorded basis leads. LELT matrices are commonly developed using a training dataset and linear regression analysis. In this research, we assess the influence of the training set composition on the estimation performance of a LELT.
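As an illustration only, one common way to write such a transformation is given below. The symbols B, T and M and the use of ordinary least squares without an intercept term are assumptions made for this sketch; the abstract does not state the exact regression formulation that was used.

```latex
% Assumed notation (not taken from the original text):
%   B : n x 4 matrix of basis-lead samples (I, II, V2, V5)
%   T : n x 4 matrix of recorded target-lead samples (V1, V3, V4, V6)
%   M : 4 x 4 LELT matrix
\hat{T} = B\,M,
\qquad
M = \arg\min_{M'} \lVert B\,M' - T \rVert_F^{2}
  = \left(B^{\top} B\right)^{-1} B^{\top} T
```

Under this formulation, M depends on the training data only through B and T, which is why the composition of the training set can shape the resulting matrix.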

Our research was performed using 12-lead ECG data obtained from 225 subjects with left ventricular hypertrophy (LVH) and 225 normal subjects. First, the basis leads (I, II, V2 and V5) and the target leads (V1, V3, V4 and V6) of the LELT matrix under investigation were extracted from the 12-lead ECGs. Second, three training sets with different compositions were assembled, each using ECG data from 170 subjects: one based upon 170 ECGs from normal subjects (TRAINnorm), one based upon 170 ECGs from subjects with LVH (TRAINlvh) and one based upon 85 ECGs from normal subjects and 85 ECGs from subjects with LVH (TRAINmix). Third, the LELT matrices Mnorm, Mlvh and Mmix were developed using the data in TRAINnorm, TRAINlvh and TRAINmix respectively. Fourth, the LELT matrices were used to derive the target leads in the test sets TESTnorm (55 normal ECGs) and TESTlvh (55 ECGs from subjects with LVH). Fifth, the estimation performance of each LELT matrix was quantified using the mean of the root-mean-squared-error (RMSE) values calculated between the estimated and the recorded target leads.
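The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins drawn from a random generator (no real ECGs), each group is assumed to follow its own linear lead relationship, the regression is assumed to be ordinary least squares without an intercept, and all function and variable names (fit_lelt, rmse_per_lead, mean_rmse, make_group) are hypothetical.

```python
import numpy as np

def fit_lelt(ecgs):
    """Least-squares LELT matrix M such that target ≈ basis @ M.

    ecgs: list of (basis, target) pairs, each array of shape (n_samples, 4).
    """
    basis = np.vstack([b for b, _ in ecgs])
    target = np.vstack([t for _, t in ecgs])
    M, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return M  # shape (n_basis_leads, n_target_leads)

def rmse_per_lead(basis, target, M):
    """RMSE between estimated and recorded target leads for one ECG."""
    err = basis @ M - target
    return np.sqrt(np.mean(err ** 2, axis=0))

def mean_rmse(ecgs, M):
    """Mean of the per-ECG RMSE values over a test set, one value per lead."""
    return np.mean([rmse_per_lead(b, t, M) for b, t in ecgs], axis=0)

# --- Synthetic stand-in data (assumption: NOT real ECG recordings) ---
rng = np.random.default_rng(0)

def make_group(n_ecgs, true_M, n_samples=200):
    """Generate fake (basis, target) pairs following one linear relation."""
    group = []
    for _ in range(n_ecgs):
        b = rng.normal(size=(n_samples, 4))
        t = b @ true_M + 0.01 * rng.normal(size=(n_samples, 4))
        group.append((b, t))
    return group

true_norm = rng.normal(size=(4, 4))                    # hypothetical relation
true_lvh = true_norm + 0.5 * rng.normal(size=(4, 4))   # hypothetical relation

train_norm = make_group(170, true_norm)
train_lvh = make_group(170, true_lvh)
train_mix = train_norm[:85] + train_lvh[:85]
test_norm = make_group(55, true_norm)
test_lvh = make_group(55, true_lvh)

M_norm, M_lvh, M_mix = (fit_lelt(s) for s in (train_norm, train_lvh, train_mix))

for name, test in (("TESTnorm", test_norm), ("TESTlvh", test_lvh)):
    for label, M in (("Mnorm", M_norm), ("Mlvh", M_lvh), ("Mmix", M_mix)):
        print(name, label, mean_rmse(test, M))
```

On this toy data the matrix trained on one group should perform best on that group's test set, with the mixed matrix falling in between; the numbers it prints are unrelated to the RMSE values reported in this study.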

For target lead V3, the mean RMSE values associated with {Mnorm; Mlvh; Mmix} were found to be {95.8 µV; 167.2 µV; 120.3 µV} and {213.2 µV; 148.4 µV; 165.7 µV} using the data in TESTnorm and TESTlvh respectively. Similar observations were made for the remaining target leads. The findings of our research indicate that unbalanced training sets can lead to biased LELT matrices with reduced estimation performance.