Optimal Fluid and Vasopressor Interventions in Septic ICU Patients Through Reinforcement Learning Model

Maximiliano Mollura1, Cristian Drudi1, Li-wei Lehman2, Riccardo Barbieri1
1Politecnico di Milano, 2Massachusetts Institute of Technology


Introduction: Timely management of sepsis is of primary importance in the intensive care unit (ICU). Fluids and vasopressors are the cornerstone of treatment for sepsis-induced hemodynamic instability. However, optimal treatment strategies that are both personalized and standardized are still missing. Goal: This study evaluates the ability of a reduced set of cardiovascular features to determine optimal actions with a reinforcement learning approach. Methods: Data were extracted from the MIMIC-III (PhysioNet) database, which collects electronic health records of ICU patients. Patients' trajectories were modeled as a Markov decision process with a target reward based on 90-day mortality. The performance of a model using a reduced set of cardiovascular features (CARDIO), including heart rate, systolic blood pressure, diastolic blood pressure, shock index, oxygen saturation, and mechanical ventilation, was compared with that of a random policy model (RANDOM) and of a model using a full set of 48 clinical variables, including physiologic variables, laboratory measurements, and ventilation parameters (FULL). Results: The CARDIO model achieved the highest performance, with a 95% lower bound (LB) of the estimated policy value equal to 96.17, compared with 86.00 for the FULL model and 82.62 for the RANDOM policy model. Conclusions: Results show that information from cardiovascular features and ongoing treatments has the potential to determine the optimal dosage of fluids and vasopressors for septic patients admitted to the ICU when reinforcement learning tools are used to develop medical decision support systems.
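
As an illustration of three quantities named in the abstract, the following minimal Python sketch computes the shock index (heart rate divided by systolic blood pressure), a mortality-based terminal reward, and a 95% lower bound on a policy value. The function names, the reward magnitude of 100, and the use of a one-sided percentile bootstrap for the lower bound are assumptions made for this example; the abstract does not state how the reward was scaled or how the bound was estimated.

```python
import random


def shock_index(heart_rate, systolic_bp):
    """Shock index: heart rate (bpm) divided by systolic blood pressure (mmHg)."""
    if systolic_bp <= 0:
        raise ValueError("systolic_bp must be positive")
    return heart_rate / systolic_bp


def terminal_reward(survived_90_days, magnitude=100.0):
    """Reward at the end of a trajectory, based on 90-day mortality.

    The magnitude of 100 is an assumption, not a value from the abstract.
    """
    return magnitude if survived_90_days else -magnitude


def lower_bound_95(values, n_boot=2000, seed=0):
    """One-sided 95% lower bound of the mean of per-trajectory value estimates.

    Assumption: a percentile bootstrap, i.e. the 5th percentile of the
    means of resampled data. The study may have used another estimator.
    """
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement, record each resample's mean, then sort.
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    return means[int(0.05 * n_boot)]


if __name__ == "__main__":
    # Hypothetical patient: heart rate 90 bpm, systolic pressure 120 mmHg.
    print(shock_index(90, 120))
    # Hypothetical per-trajectory values: nine survivors, one death.
    estimates = [terminal_reward(True)] * 9 + [terminal_reward(False)]
    print(lower_bound_95(estimates))
```

Reporting a lower bound rather than the mean gives a conservative figure for comparing policies, which matters when policy values are estimated from retrospective data rather than from deployment.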