Session P92.2
How a Decision System Trained on a Large Database Recognizes New Cases: Prelude before Clinical Implementation
R Mlynarski*, A Wlodyka, G Ilczuk, E Pilat, W Kargul
Upper Silesian Cardiology Centre
Katowice, Poland
In the research presented here, a joint team of cardiologists and software specialists evaluated previously created multi-stage decision systems trained on real clinical data. A set of methods covering all stages necessary for a complete decision system was created and implemented in user-friendly software based on Java 6.0; our specific results have been presented over the last three years (including CinC in Lyon, Valencia and Durham). However, the question of how a decision system trained on a large database recognizes new cases remains unanswered.
Methods: The decision system presented was trained using the medical records of 5425 patients hospitalized in the Electrocardiology Dept. in 2003-2006. Using these data as input, decision rules were generated (rough sets, Grzymala-Busse’s MLEM2 algorithm; our own implementation). Their accuracy was validated (10-times repeated 10-fold cross-validation) with a set of new data: 198 patients hospitalized in 2007. In addition, the medical relevance of the generated rules was checked by cardiology experts.
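The validation scheme named above, 10-times repeated 10-fold cross-validation, can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' Java implementation. The function names, the `fit`/`predict` callables and the majority-class baseline are assumptions introduced here; the baseline is a stand-in for a rule learner such as MLEM2.

```python
import random
from collections import Counter
from statistics import mean


def repeated_kfold_indices(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Each repeat reshuffles the sample order, then deals the samples
    into k disjoint test folds that together cover every sample once.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for f in range(k):
            test_idx = order[f::k]          # every k-th sample, offset f
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield r, f, train_idx, test_idx


def repeated_cv_accuracy(records, labels, fit, predict,
                         k=10, repeats=10, seed=0):
    """Mean accuracy over all repeats * k train/test splits."""
    scores = []
    for _, _, train_idx, test_idx in repeated_kfold_indices(
            len(records), k, repeats, seed):
        model = fit([records[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        hits = sum(predict(model, records[i]) == labels[i]
                   for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores)


# Hypothetical stand-in for a rule learner: always predict the
# most frequent decision seen in the training split.
def fit_majority(train_records, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, record):
    return model
```

Calling `repeated_cv_accuracy(records, labels, fit_majority, predict_majority)` gives a baseline against which a rule-based classifier could be compared; evaluating on a separate later cohort, such as the 2007 patients, would be a further hold-out step outside this loop.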
Results: Selected results are presented as: decision > accuracy of prediction in % (train dataset / test dataset): number of rules. For pacemaker implantation > (87.92 / 85.32): 15; DDD type > (89.43 / 75.95): 21; VVI type > (83.56 / 70.95): 43. Pharmacotherapy was tested using the example of ACE inhibitors > (79.67 / 68.05): 11 and beta-blockers > (78.45 / 73.54): 23. This stage of the software development includes, but is not limited to, importing data from a clinical database infrastructure, preprocessing data (removing noisy and irrelevant data, converting free-text reports to binary attributes, joining binary attributes into group attributes), generating decision rules using our own implementation of the MLEM2 algorithm, and finally visualizing the results.
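Two of the preprocessing steps listed above, converting free-text reports to binary attributes and joining binary attributes into group attributes, can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the attribute names, the keyword lists, the grouping and the simple substring matching are hypothetical and are not taken from the authors' software or from any clinical vocabulary.

```python
# Hypothetical keyword lists: attribute name -> phrases that indicate it.
KEYWORDS = {
    "av_block": ["av block", "atrioventricular block"],
    "sinus_bradycardia": ["sinus bradycardia"],
    "atrial_fibrillation": ["atrial fibrillation"],
}

# Hypothetical grouping: group attribute -> member binary attributes.
GROUPS = {
    "bradyarrhythmia": ["av_block", "sinus_bradycardia"],
}


def text_to_binary(report, keywords=KEYWORDS):
    """Map a free-text report to 0/1 attributes by substring match.

    Deliberately naive: it ignores negation ("no AV block") and
    spelling variants, which a real system would have to handle.
    """
    text = report.lower()
    return {name: int(any(phrase in text for phrase in phrases))
            for name, phrases in keywords.items()}


def join_groups(binary, groups=GROUPS):
    """Set a group attribute to 1 if any of its members is 1."""
    return {group: int(any(binary.get(member, 0) for member in members))
            for group, members in groups.items()}


report = "Patient admitted with complete atrioventricular block after syncope."
binary = text_to_binary(report)
grouped = join_groups(binary)
```

Grouping by logical OR reduces the number of attributes the rule generator has to consider, at the cost of losing the distinction between the members of a group.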
Conclusions: The results presented seem to confirm that a decision system trained on a large dataset works well on new data, which may confirm its usefulness in clinical practice. An acceptable level of accuracy in predicting doctors’ decisions was achieved. For more specialized areas (such as pacemaker implantation) where strict guidelines are available, the accuracy was higher. (Abstract Control Number: 300)