[Prognostic model of small sample critical diseases based on transfer learning]

Author: Jing Xia; Su Pan; Molei Yan; Guolong Cai; Jing Yan; Gangmin Ning
Year of publication: 2020
Subject:
Source: Sheng Wu Yi Xue Gong Cheng Xue Za Zhi
ISSN: 1001-5515
Description: Small samples of critical diseases in clinical practice can produce prognostic models that overfit, have large prediction errors, and are unstable. To address this problem, a long short-term memory transfer algorithm (transLSTM) is proposed. Based on the idea of transfer learning, the algorithm exploits the correlation between diseases to transfer information between the prognostic models of different diseases. It builds an effective model for a small-sample target disease with the aid of large datasets from related diseases, thereby improving prediction performance and reducing the number of target training samples required. The transLSTM algorithm first uses samples of related diseases to pretrain part of the model parameters, and then fine-tunes the whole network with the target training samples. Tests on the MIMIC-Ⅲ database showed that, compared with the traditional LSTM classification algorithm, the transLSTM algorithm achieved an AUROC 0.02-0.07 higher and an AUPRC 0.05-0.14 higher, while needing only 39%-64% of the training iterations of the traditional algorithm. Applied to sepsis, a transLSTM model trained on only 100 samples achieved mortality prediction performance comparable to that of a traditional model trained on 250 samples. In small-sample settings, the transLSTM algorithm offers significant advantages, with higher prediction accuracy and faster training. It realizes the application of transfer learning to prognostic models of critical diseases with small samples.
Database: OpenAIRE
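
Note: the two-stage schedule summarized in the abstract (pretrain part of the parameters on related-disease samples, then adjust the whole network on the target samples) can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the authors' implementation. The abstract does not say which parameters are pretrained, so transferring the LSTM layer while re-initialising the output head is an assumption, as are the names `LSTMClassifier`, `fit` and `transfer_and_finetune`, the layer sizes, the optimizer, and the synthetic data that stands in for MIMIC-Ⅲ records.

```python
import torch
from torch import nn


class LSTMClassifier(nn.Module):
    """LSTM encoder plus a linear head that emits one mortality logit."""

    def __init__(self, n_features: int, hidden_size: int = 16):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, time, features); h_n[-1] is the final hidden state.
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1]).squeeze(-1)


def fit(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
        epochs: int, lr: float = 1e-2) -> None:
    """Full-batch training of all parameters with binary cross-entropy."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()


def transfer_and_finetune(source_xy, target_xy, n_features,
                          pre_epochs=20, fine_epochs=20):
    # Stage 1: pretrain on the large cohort of a related disease.
    source_model = LSTMClassifier(n_features)
    fit(source_model, *source_xy, epochs=pre_epochs)

    # Transfer only part of the parameters (assumed here: the LSTM layer).
    # The output head keeps its fresh random initialisation.
    target_model = LSTMClassifier(n_features)
    target_model.lstm.load_state_dict(source_model.lstm.state_dict())

    # Stage 2: fine-tune the whole network on the small target cohort.
    fit(target_model, *target_xy, epochs=fine_epochs)
    return source_model, target_model


# Synthetic stand-in data: 200 source and 30 target sequences of 10 time steps.
torch.manual_seed(0)
n_features = 4
source_x = torch.randn(200, 10, n_features)
source_y = (source_x.mean(dim=(1, 2)) > 0).float()
target_x = torch.randn(30, 10, n_features)
target_y = (target_x.mean(dim=(1, 2)) > 0).float()

source_model, target_model = transfer_and_finetune(
    (source_x, source_y), (target_x, target_y), n_features
)
```

Calling `torch.sigmoid(target_model(target_x))` then gives per-patient mortality probabilities. Freezing the transferred layer for the first few epochs is a common variant, but the abstract describes adjusting the whole network, so the sketch leaves every parameter trainable in the second stage.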