Machine Learning to Assess the Risk of Multidrug-Resistant Gram-Negative Bacilli Infections in Febrile Neutropenic Hematological Patients

Infect Dis Ther. 2021 Apr 16. doi: 10.1007/s40121-021-00438-2. Online ahead of print.

ABSTRACT

INTRODUCTION: We aimed to assess risk factors for multidrug-resistant Gram-negative bacilli (MDR-GNB) from a large amount of data retrieved from electronic health records (EHRs) and determine whether machine learning (ML) may be useful in assessing the risk of MDR-GNB infection at febrile neutropenia (FN) onset.

METHODS: Retrospective study of almost 7 million pieces of structured data from all consecutive episodes of FN in hematological patients in a tertiary hospital in Barcelona (January 2008-December 2017). Conventional multivariate analysis was performed, and ML algorithms (random forest, gradient boosting machine [GBM], XGBoost, and generalized linear model [GLM]) were applied.

RESULTS: A total of 3235 episodes of FN in 349 patients were documented; MDR-GNB caused 180 (5.6%) infections in 132 patients. The most frequent MDR-GNBs were MDR-Pseudomonas aeruginosa (53%) and extended-spectrum beta-lactamase-producing Enterobacterales (46%). According to conventional logistic regression analysis, independent factors associated with MDR-GNB infection were age older than 45 years (OR 2.07; 95% CI 1.31-3.24), prior antibiotics (2.62; 1.39-4.92), first-ever FN in this hospitalization (2.94; 1.33-6.52), prior hospitalizations for FN (1.72; 1.02-2.89), at least 15 prior hospital visits (2.65; 1.31-5.33), high-risk hematological diseases (3.62; 1.12-11.67), and hospitalization in a room formerly occupied by patients with MDR-GNB isolation (1.69; 1.20-2.38). ML algorithms achieved the following AUC and F1 scores, respectively, for MDR-GNB prediction: random forest, 0.79 and 0.9711; GBM, 0.79 and 0.9705; XGBoost, 0.79 and 0.9670; and GLM, 0.78 and 0.9716.
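
For readers less familiar with how odds ratios are reported, the sketch below shows the usual relation between a logistic regression coefficient and its odds ratio with a Wald-type 95% confidence interval: OR = exp(beta), CI = exp(beta ± 1.96 × SE). It is an illustration only, not the authors' analysis code. The function names are hypothetical, and the standard error is back-calculated from the rounded interval reported for age older than 45 years, assuming a Wald interval that is symmetric on the log scale.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    beta: coefficient on the log-odds scale
    se:   standard error of beta
    z:    normal quantile (1.96 for a 95% interval)
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=1.96):
    """Back-calculate the log-scale standard error from a reported CI.

    Assumes the interval is a Wald interval, symmetric on the log scale.
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Reported in the abstract: age older than 45 years, OR 2.07 (95% CI 1.31-3.24)
beta = math.log(2.07)          # coefficient implied by the reported OR
se = se_from_ci(1.31, 3.24)    # standard error implied by the reported CI
or_, lo, hi = odds_ratio_ci(beta, se)
print(f"OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Because the published figures are rounded, the reconstructed interval matches the reported one only approximately.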

CONCLUSION: Data generated in EHRs proved useful in assessing risk factors for MDR-GNB infections in patients with FN. The large number of variables analyzed allowed us to identify new factors related to MDR-GNB infection, as well as to train ML algorithms to predict infection. This information may help clinicians make better clinical decisions.

PMID:33860912 | DOI:10.1007/s40121-021-00438-2