Application of Multi-Layered Gradient Boosting Decision Trees in Pharmaceutical Classification
Author: | DU Shishuai, QIU Tian, LI Lingqiao, HU Jinquan, ZHENG Anbing, FENG Yanchun, HU Changqin, YANG Huihua |
---|---|
Language: | Chinese |
Year of publication: | 2020 |
Subject: | |
Source: | Jisuanji kexue yu tansuo, Vol 14, Iss 2, Pp 260-273 (2020) |
Document type: | article |
ISSN: | 1673-9418 14749750 |
DOI: | 10.3778/j.issn.1673-9418.1901069 |
Description: | Near-infrared spectroscopy is highly effective in pharmaceutical analysis. For high-dimensional, non-linear, small-scale near-infrared data, traditional drug identification algorithms lack sufficient feature learning ability, while neural network-based methods suffer from local optima and over-fitting and tend to ignore sample imbalance. To address these shortcomings, a pharmaceutical classification approach using multi-layered gradient boosting decision trees based on feature selection and cost-sensitive learning (CS_FGBDT) is proposed. First, the raw data are preprocessed with Savitzky-Golay smoothing and a first derivative. Second, a random forest is used to adaptively extract features from the preprocessed spectra, and the feature map is constructed by multi-layered gradient boosting trees. Then the negative effect of sample imbalance is minimized by incorporating cost-sensitive learning. In comparative experiments on two imbalanced datasets, one of capsules and one of tablets, the model shows higher prediction accuracy and stability, indicating that it is an effective method for drug identification. |
Database: | Directory of Open Access Journals |
External link: |
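
The three stages named in the abstract (Savitzky-Golay preprocessing with a first derivative, random-forest feature selection, and cost-sensitive gradient boosting) can be sketched roughly as follows. This is a minimal illustration, not the authors' CS_FGBDT implementation: it assumes SciPy and scikit-learn, substitutes a single `GradientBoostingClassifier` for the multi-layered gradient boosting trees, approximates cost-sensitive learning with balanced sample weights, and runs on synthetic data. The function names, window length, polynomial order, and number of retained features are all hypothetical choices.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.utils.class_weight import compute_sample_weight


def preprocess(spectra, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing and first derivative along the wavelength axis."""
    return savgol_filter(spectra, window_length=window_length,
                         polyorder=polyorder, deriv=1, axis=1)


def select_features(X, y, n_keep=20, seed=0):
    """Return the indices of the n_keep wavelengths a random forest ranks highest."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    top = np.argsort(forest.feature_importances_)[-n_keep:]
    return np.sort(top)


def fit_cost_sensitive_gbdt(X, y, seed=0):
    """Fit a single GBDT (stand-in for the multi-layered model) with
    balanced sample weights, so minority-class errors cost more."""
    weights = compute_sample_weight(class_weight="balanced", y=y)
    model = GradientBoostingClassifier(n_estimators=50, random_state=seed)
    model.fit(X, y, sample_weight=weights)
    return model


# Synthetic, imbalanced stand-in for near-infrared spectra (assumption:
# 90 majority-class and 10 minority-class samples, 100 wavelengths each).
rng = np.random.default_rng(0)
y = np.array([0] * 90 + [1] * 10)
X_raw = rng.normal(size=(100, 100)).cumsum(axis=1)
X_raw[y == 1, 40:60] += np.linspace(0.0, 3.0, 20)  # class-specific band

X_pre = preprocess(X_raw)
keep = select_features(X_pre, y)
model = fit_cost_sensitive_gbdt(X_pre[:, keep], y)
pred = model.predict(X_pre[:, keep])
```

Balanced sample weights are only one way to make the loss cost-sensitive; an explicit misclassification cost matrix, as the term usually implies, would need a custom weighting scheme that the abstract does not specify.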