Explainable AI (xAI) enables the identification of features relevant to deep neural networks for detecting cardiovascular diseases in the electrocardiogram (ECG). However, the link between explanations and diagnostic criteria, which is essential for clinical use, remains limited. xECGArch comprises two convolutional neural networks (CNNs) with different temporal focus and the xAI method deep Taylor decomposition (DTD). A systematic validation demonstrated that the short-term CNN self-learns morphological features, while the long-term CNN (LT-CNN) focuses on QRS complexes. Relevance changes in the QRS complexes correlate with rhythm, but it is unclear whether QRS morphology is a decisive factor. To investigate this, we used transfer learning to make the LT-CNN focus specifically on rhythm for atrial fibrillation (AF) detection in 10 s single-lead ECGs. The LT-CNN was first trained to detect R peaks using 9,675 ECGs from the Icentia11k database and tested on 1,320 unseen ECGs. The pretrained model was then retrained to differentiate between AF and non-AF using the xECGArch dataset, comprising 8,868 ECGs, with the weights of the first 4 to 9 of 10 convolutional layers frozen. We performed hyperparameter optimization in 5-fold cross-validation for both pretraining and retraining. The LT-CNN achieved a sample-accurate F1 score of 91.5% for R peak detection and, according to DTD, clearly focused on R peaks. The number of frozen layers affected the LT-CNN's ability to detect AF in the xECGArch test dataset (n = 986 ECGs): the F1 score improved from 83.2% to 93.3% as the number of frozen layers decreased from 9 to 4. According to DTD, the focus shifts from the surrounding area to the R peaks as the number of frozen layers increases. Our findings demonstrate the efficacy of transfer learning in training AI to focus on particular characteristics, thereby enhancing the interpretability of the models, a prerequisite for trustworthiness and the use of AI in clinical practice.
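To make the sample-accurate F1 score for R peak detection concrete, the following is a minimal sketch of how such a metric could be computed. It assumes that "sample-accurate" means a detected peak counts as a true positive only when it falls on exactly the annotated sample index, with no tolerance window; this reading is an assumption, not a definition taken from the study. The function name `sample_accurate_f1` and the example indices are hypothetical.

```python
def sample_accurate_f1(true_peaks, predicted_peaks):
    """F1 score over R peak positions given as sample indices.

    Assumption: a prediction is correct only if it matches an
    annotated peak at exactly the same sample index.
    """
    true_set = set(true_peaks)
    pred_set = set(predicted_peaks)

    tp = len(true_set & pred_set)   # peaks found at the exact sample
    fp = len(pred_set - true_set)   # detections with no annotated peak
    fn = len(true_set - pred_set)   # annotated peaks that were missed

    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample indices for illustration (not data from the study).
annotated = [120, 370, 620, 880]
detected = [120, 371, 620, 880, 990]

score = sample_accurate_f1(annotated, detected)
print(f"F1 = {score:.3f}")
```

In this illustration, the detection at sample 371 is counted as a miss even though it lies one sample from the annotated peak at 370, which shows how strict a zero-tolerance criterion is compared with the tolerance windows often used in beat detection.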