Paper review: machine learning prediction models for mechanically ventilated patients (data source: MIMIC-III)

Machine Learning Prediction Models for Mechanically Ventilated Patients: Analyses of the MIMIC-III Database

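Abstract

Objective:

Mechanically ventilated ICU patients have high mortality. Many severity scores already exist, so the study builds on them and adds further variables to produce a new scoring system.

Data source:

Medical Information Mart for Intensive Care (MIMIC-III) database.

Models:

K-nearest neighbors (KNN), logistic regression, bagging, decision tree, random forest, Extreme Gradient Boosting (XGBoost), and neural network.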

The data were split into a 70% training set and a 30% testing set; the models were evaluated with calibration plots and areas under the receiver operating characteristic curves (AUCs).

Methods

Inclusion criteria:

Adult ICU patients treated with invasive mechanical ventilation during ICU stay were included.

Exclusion criteria:

Subjects aged younger than 18 years or older than 90 years or who lack information on the outcome measure were excluded.
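The inclusion and exclusion criteria above can be sketched as a simple filter. This is only an illustration: the record layout (`age`, `ventilated`, `hospital_death`) is hypothetical and does not match the real MIMIC-III table or column names, and the sample rows are made up.

```python
from typing import Optional, TypedDict


class IcuStay(TypedDict):
    # Hypothetical record layout, not the real MIMIC-III schema.
    age: float
    ventilated: bool                 # invasive mechanical ventilation during the ICU stay
    hospital_death: Optional[bool]   # None means the outcome is missing


def is_eligible(stay: IcuStay) -> bool:
    """Apply the inclusion and exclusion criteria quoted above."""
    if not stay["ventilated"]:
        return False  # inclusion: must have been invasively ventilated
    if stay["age"] < 18 or stay["age"] > 90:
        return False  # exclusion: younger than 18 or older than 90
    if stay["hospital_death"] is None:
        return False  # exclusion: outcome measure not recorded
    return True


# Made-up example rows.
stays: list[IcuStay] = [
    {"age": 67, "ventilated": True, "hospital_death": False},
    {"age": 17, "ventilated": True, "hospital_death": False},
    {"age": 91, "ventilated": True, "hospital_death": True},
    {"age": 55, "ventilated": False, "hospital_death": False},
    {"age": 72, "ventilated": True, "hospital_death": None},
    {"age": 90, "ventilated": True, "hospital_death": True},
]

cohort = [s for s in stays if is_eligible(s)]
print(len(cohort))  # 2: only the 67-year-old and the 90-year-old remain
```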

Definition of medical conditions:

Medical conditions were defined by ICD-9 codes (13), using code from the MIT-LCP/mimic-code repository on GitHub (MIMIC Code Repository: code shared by the research community for the MIMIC family of databases).

Severity scores used:

The severity of respiratory, coagulation, liver, cardiovascular, central nervous system, or renal failure referred to the SOFA score of the specific organ (scores 0–4). The first day indicates the first 24 h of ICU admission. The SOFA, SAPS II, and OASIS scores refer to the first scores after ICU admission. After the extraction of the data, subjects who met the exclusion criteria were excluded.

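Statistical methods:

Patients were divided into a survivor group and a non-survivor group. The two groups were compared with the standard tests: the t-test for continuous variables and the chi-square test for count data.

The continuous variables are presented as the median and interquartile range (IQR) and compared using the t-test. The counting data are presented as numbers and percentages and compared using the chi-square test.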
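As a rough illustration of the two tests named above, the sketch below computes a t statistic and a Pearson chi-square statistic with the standard library only. The paper does not say which t-test variant was used; Welch's unequal-variance form is an assumption here, and all numbers are made up.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """t statistic for two independent samples (Welch form, assumed here)."""
    standard_error = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = group, columns = condition present/absent."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Made-up counts: 30 of 100 non-survivors vs 10 of 100 survivors have a condition.
stat = chi_square_2x2(30, 70, 10, 90)
print(stat)                          # 12.5
print(chi_square_p_value_df1(stat))  # well below 0.001

# Made-up continuous measurements for the two groups.
print(welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```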

Machine learning

K-nearest neighbors (KNN), logistic regression, bagging, decision tree, random forest, extreme gradient boosting (XGBoost), and neural network.

The data were split into a 70% training set and a 30% testing set; the models were evaluated with calibration plots and areas under the receiver operating characteristic curves (AUCs).
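A 70/30 split of this kind can be sketched as below. The function name, the fixed seed, and the plain random shuffle are assumptions for illustration; the notes do not say whether the original split was stratified by outcome.

```python
import random


def split_70_30(rows, seed=0, train_fraction=0.7):
    """Shuffle the rows reproducibly and cut them into two parts:
    the larger part for fitting the models, the smaller for evaluating them."""
    rng = random.Random(seed)       # local generator, so global state is untouched
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


patient_ids = list(range(100))      # stand-in for real patient records
train_ids, test_ids = split_70_30(patient_ids, seed=42)
print(len(train_ids), len(test_ids))  # 70 30
```

Fixing the seed keeps the split reproducible, which matters when several models are compared on the same held-out patients.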

Results

Figure 2. Receiver operating characteristic (ROC) curves of the seven models. KNN, k-nearest neighbors; XGBoost, extreme gradient boosting.

The KNN, logistic regression, decision tree, random forest, neural network, bagging, and XGBoost models were established with the training set; the AUCs of the testing set were 0.806, 0.818, 0.743, 0.819, 0.780, 0.803, and 0.821, respectively (Figure 2).
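For readers unfamiliar with the AUC values reported above: the AUC equals the probability that a randomly chosen positive case (here, a death) receives a higher predicted score than a randomly chosen negative case, with ties counted as half. The minimal sketch below computes it by comparing all pairs; the labels and scores are made up, and real libraries use a faster rank-based method.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Ties between a positive and a negative score count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = died in hospital, 0 = survived.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```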

Figure 3. Calibration plots of the seven models. KNN, k-nearest neighbors; XGBoost, extreme gradient boosting.

The calibration curves of all the models, except that of the neural network, performed well. Among the seven models, XGBoost performed best, with the highest receiver operating characteristic (ROC) and the best calibration curve (Figure 3).

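The x-axis is the predicted event rate (Predicted Probability) and the y-axis is the observed event rate (Actual Rate). Both range from 0 to 1 and can be read as event rates (percentages). The dashed diagonal is the reference line on which the predicted value equals the observed value.

Ideally, the calibration curve lies on the diagonal (predicted probability equals empirical probability). The curve is not necessarily monotonically increasing, for example when there are many bins or when the classifier is weak. Logistic regression typically gives a calibration curve very close to the diagonal, whereas an under-confident model gives a sigmoid-shaped curve.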
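The points on a calibration plot can be computed by grouping predictions into equal-width bins and comparing, within each bin, the mean predicted probability with the observed event rate. The sketch below assumes equal-width bins and uses made-up data; it is not the plotting code used in the paper.

```python
def calibration_points(labels, probabilities, n_bins=5):
    """Return one (mean predicted probability, observed event rate) pair
    per non-empty equal-width bin on [0, 1]."""
    prob_sums = [0.0] * n_bins
    event_counts = [0] * n_bins
    counts = [0] * n_bins
    for y, p in zip(labels, probabilities):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        prob_sums[index] += p
        event_counts[index] += y
        counts[index] += 1
    return [
        (prob_sums[i] / counts[i], event_counts[i] / counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


# Made-up predictions: 1 = event occurred, 0 = it did not.
labels = [0, 0, 0, 1, 1, 1]
probabilities = [0.1, 0.1, 0.3, 0.3, 0.9, 0.9]
for predicted, observed in calibration_points(labels, probabilities):
    print(f"predicted {predicted:.2f}  observed {observed:.2f}")
```

A well-calibrated model gives points close to the diagonal, that is, `predicted` roughly equal to `observed` in every bin. With many bins, each bin holds few patients and the observed rates become noisy, which is one reason the curve need not be monotonic.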

The significance of the predictors in the XGBoost model is presented in Figure 4. In the SHAP methodology, the top five predictors were age, respiratory dysfunction, SAPS II score, maximum hemoglobin, and minimum lactate (the importance values were 0.410, 0.309, 0.302, 0.209, and 0.194, respectively).
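One common way to turn per-patient SHAP values into a single importance number per predictor is the mean absolute SHAP value. Whether the importance values quoted above were computed exactly this way is an assumption; the sketch below only illustrates the aggregation step, with made-up SHAP values and hypothetical feature names.

```python
def mean_abs_importance(shap_rows, feature_names):
    """Rank features by the mean absolute value of their SHAP contributions.
    shap_rows holds one list of per-feature SHAP values per patient."""
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    means = [total / len(shap_rows) for total in totals]
    return sorted(zip(feature_names, means), key=lambda pair: pair[1], reverse=True)


# Made-up SHAP values for three patients and three hypothetical features.
feature_names = ["age", "lactate_min", "wbc"]
shap_rows = [
    [0.4, -0.1, 0.0],
    [-0.2, 0.3, 0.1],
    [0.6, -0.2, -0.1],
]
for name, importance in mean_abs_importance(shap_rows, feature_names):
    print(f"{name}: {importance:.3f}")
```

Taking the absolute value first means a predictor that pushes risk up for some patients and down for others still counts as important.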

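Figure 4. Importance of the predictors in the XGBoost model.

CHF, chronic heart failure;
Diabetes_complicated, diabetes with complications;
Diabetes_uncomplicated, diabetes without complications;
Diasbp, diastolic blood pressure;
Hypertension_complicated, hypertension with complications;
Hypertension_uncomplicated, hypertension without complications;
OASIS, Oxford Acute Severity of Illness Score;
Organ_failure, any organ failure;
Perivasc, perivascular disease;
SAPS II, Simplified Acute Physiology Score II;
sCardiovascular, severe cardiovascular failure;
sCNS, severe central nervous system failure;
sCoagulation, severe coagulation failure;
SOFA, Sequential Organ Failure Assessment;
sRenal, severe renal failure;
sRespiration, severe respiratory failure;
Sysbp, systolic blood pressure;
Tempc, temperature;
WBC, white blood cell.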

Table 2. Confusion matrix of the XGBoost model.

The confusion matrix of the XGBoost model is presented in Table 2. The SHAP plot and a decision tree of the XGBoost model are in the Supplementary Material.

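About the confusion matrix

Each column of a confusion matrix represents a predicted class, and the column total is the number of instances predicted as that class.

Each row represents the true class, and the row total is the number of instances that actually belong to that class. Each value within a column is the number of instances of a given true class that were predicted as that column's class.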
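The row and column convention described above (rows are true classes, columns are predicted classes) can be sketched as follows. The labels are made up and are not the values from Table 2.

```python
def confusion_matrix(actual, predicted, labels=(0, 1)):
    """Build a matrix where matrix[i][j] counts instances whose true class is
    labels[i] and whose predicted class is labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Made-up example: 1 = died in hospital, 0 = survived.
actual = [0, 0, 0, 1, 1, 1, 1, 0]
predicted = [0, 1, 0, 1, 0, 1, 1, 0]
matrix = confusion_matrix(actual, predicted)
print(matrix)  # [[3, 1], [1, 3]]

(tn, fp), (fn, tp) = matrix
print("sensitivity:", tp / (tp + fn))  # 0.75
print("specificity:", tn / (tn + fp))  # 0.75
```

Row totals recover the number of true survivors and deaths, while column totals recover how many patients the model assigned to each class.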

Limitations

Firstly, our models were retrospectively established based on a single-center database.

Secondly, there were missing data in our research.

Thirdly, external validation has not been employed in this study.

Fourthly, our study only focused on hospital mortality, while other important outcome measures such as ventilator-free days within 28 days and long-term mortalities still needed further investigation.

Lastly, we did not exclude patients who were withdrawn from care, which may also introduce bias.

Conclusion

Our results suggest that age, respiratory dysfunction, SAPS II score, maximum hemoglobin, and minimum lactate might be closely associated with hospital mortality in mechanically ventilated ICU patients. The XGBoost model performs better than the KNN, logistic regression, bagging, decision tree, random forest, and neural network models in our study. Further external validations are needed to test the generalization of our models and predictors.

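Supplement

XGBoost is an optimized distributed gradient boosting library designed to be efficient, flexible, and portable; it can be seen as an advanced form of gradient boosting.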
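To show the basic idea that XGBoost builds on, here is a minimal sketch of plain gradient boosting for squared loss, using one-split decision stumps on a single feature. It is not XGBoost itself, which adds regularization, second-order gradient information, and many engineering optimizations; all function names and data are made up for illustration.

```python
def fit_stump(xs, residuals):
    """Find the single threshold that best splits the residuals (lowest squared error)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # splitting at the largest x leaves one side empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]  # (threshold, left value, right value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Made-up data: a step from 1 to 5 between x = 4 and x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 1, 1, 5, 5, 5, 5]
model = fit_gradient_boosting(xs, ys)
print(round(predict(model, 2), 2), round(predict(model, 7), 2))
```

Each round corrects only a fraction (`learning_rate`) of the remaining error, so the prediction approaches the targets gradually instead of jumping to them in one step.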