Transformer Network with Time Prior for Predicting Clinical Outcome from EEG of Cardiac Arrest Patients

Maurice Rohr1, Tobias Schilke1, Laurent Willems2, Christoph Reich1, Sebastian Dill1, Gökhan Güney1, Christoph Hoog Antink3
1Technische Universität Darmstadt, 2Goethe-University Frankfurt am Main, 3TU Darmstadt


Abstract

Prognostication in patients with hypoxic encephalopathy (HE) after cardiopulmonary resuscitation is a challenging aspect of modern neurocritical care. Apart from clinical and laboratory diagnostics, electroencephalography (EEG) is of particular diagnostic importance. Several morphologic patterns have been shown to be associated with poor functional outcome after CPR, e.g., burst suppression patterns with identical bursts or isoelectric EEG. Beyond the low specificity of these visually detected EEG patterns, poor interrater reliability in their detection is often a problem. Computationally learned features based on repetitive or continuous EEG may have the potential to significantly improve the prognostication of functional outcome in HE patients. In this study, we plan to present a model based on the transformer architecture for predicting the outcome from EEG at arbitrary timepoints after cardiac arrest. The model incorporates a time prior, which can be described as enforcing a monotonic increase in the model's belief about the true outcome. The prior is expected to improve model generalization and is initially implemented as a reward term in the cost function. For later reference, we implemented a simple convolutional neural network (submitted) with 13 convolutional layers with ReLU activations, interleaved with batch normalization and max pooling. A sigmoid output layer is used and optimized with cross-entropy loss. From layer six on, dropout is employed after each layer. We use a private stratified train/dev/test split of 60/20/20. The input is the zero-padded, unfiltered 18-channel EEG data of the most recent 5 hours. The current ranked challenge metric is 0.0 (team "IWillSurvive"). Subsequent qualitative inspection showed that using the raw, 18-channel EEG signal from zero-padded 5-hour or even 72-hour data leads to difficult model tuning for a range of deep learning models (LSTM, CNN, multilayer perceptrons). Using feature representations such as the spectrogram in the 0.4-30 Hz range improved training.
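
The time prior can be illustrated as a penalty on any decrease of the predicted probability of the true outcome across consecutive recordings of the same patient, which is equivalent to rewarding a monotonic increase. The following is a minimal PyTorch sketch; the function name, the weighting factor lam, and the exact form of the term are illustrative assumptions, not the implementation used in this work.

    import torch
    import torch.nn.functional as F

    def time_prior_loss(logits, label, lam=0.1):
        # logits: shape (T,), model outputs for one patient at T consecutive timepoints
        # label: scalar outcome in {0, 1}, shared by all timepoints of this patient
        probs = torch.sigmoid(logits)                      # belief in outcome 1 at each timepoint
        targets = torch.full_like(probs, float(label))
        bce = F.binary_cross_entropy(probs, targets)       # standard per-timepoint loss
        p_true = probs if label == 1 else 1.0 - probs      # belief in the *true* outcome
        decrease = F.relu(p_true[:-1] - p_true[1:]).sum()  # any drop of that belief over time
        return bce + lam * decrease                        # penalize non-monotonic belief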
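
As a structural sketch of the baseline network described above, the following PyTorch snippet stacks 13 convolutional blocks with ReLU activations, batch normalization, max pooling, dropout from layer six on, and a sigmoid output. Layer widths, kernel sizes, and the dropout rate are assumptions; only the overall structure follows the text.

    import torch.nn as nn

    def make_baseline_cnn(in_channels=18, width=32, n_layers=13, dropout_from=6, p_drop=0.2):
        # 13 conv blocks (Conv1d -> BatchNorm -> ReLU -> MaxPool), dropout from layer six on
        layers, ch = [], in_channels
        for i in range(1, n_layers + 1):
            layers += [
                nn.Conv1d(ch, width, kernel_size=5, padding=2),
                nn.BatchNorm1d(width),
                nn.ReLU(),
                nn.MaxPool1d(kernel_size=2),
            ]
            if i >= dropout_from:
                layers.append(nn.Dropout(p=p_drop))
            ch = width
        # global pooling and sigmoid output, trained with binary cross-entropy
        layers += [nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(width, 1), nn.Sigmoid()]
        return nn.Sequential(*layers)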
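
A spectrogram representation restricted to the 0.4-30 Hz band could be computed along the following lines (a SciPy sketch; the sampling rate and STFT parameters are assumptions, only the frequency band follows the text).

    import numpy as np
    from scipy.signal import spectrogram

    def eeg_spectrogram(eeg, fs=100.0, f_lo=0.4, f_hi=30.0, nperseg=1024):
        # eeg: array of shape (channels, samples); returns log-power spectrograms
        # restricted to the f_lo-f_hi band, shape (channels, n_freqs, n_frames)
        f, t, sxx = spectrogram(eeg, fs=fs, nperseg=nperseg, noverlap=nperseg // 2, axis=-1)
        band = (f >= f_lo) & (f <= f_hi)
        return np.log1p(sxx[:, band, :]), f[band], t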