Automatic classification of RDoC positive valence severity with a neural network
Author: | Ben Wellner, Cheryl Clark, Rachel Davis, John S. Aberdeen, Lynette Hirschman |
---|---|
Publication year: | 2017 |
Subject: |
Normalization (statistics); Medical informatics; Computer science; Ordinal scale; Health informatics; Feature selection; Engineering and technology; Overfitting; Machine learning; Article; Automation; Medical and health sciences; Clinical medicine; Electrical engineering, electronic engineering, information engineering; Humans; General & internal medicine; Multinomial logistic regression; Artificial neural network; Percentage point; Mutual information; Computer Science Applications; Neural networks (computer); Artificial intelligence |
Source: | Journal of Biomedical Informatics. 75:S120-S128 |
ISSN: | 1532-0464 |
Description: | Objective: Our objective was to develop a machine learning-based system to determine the severity of Positive Valence symptoms for a patient, based on information included in their initial psychiatric evaluation. Severity was rated by experts on an ordinal scale of 0–3 as follows: 0 (absent = no symptoms), 1 (mild = modest significance), 2 (moderate = requires treatment), 3 (severe = causes substantial impairment). Materials and methods: We treated the task of assigning Positive Valence severity as a text classification problem. During development, we experimented with regularized multinomial logistic regression classifiers, gradient boosted trees, and feedforward, fully connected neural networks. We found both regularization and feature selection via mutual information to be very important in preventing models from overfitting the data. Our best configuration was a neural network with three fully connected hidden layers with rectified linear unit activations. Results: Our best-performing system achieved a score of 77.86%. The evaluation metric is an inverse normalization of the Mean Absolute Error, presented as a percentage between 0 and 100, where 100 is the highest performance. Error analysis showed that 90% of the system errors involved neighboring severity categories. Conclusion: Machine learning text classification techniques with feature selection can be trained to recognize broad differences in Positive Valence symptom severity with a modest amount of training data (in this case 600 documents, 167 of which were unannotated). An increase in the amount of annotated data can increase the accuracy of symptom severity classification by several percentage points. Additional features and/or a larger training corpus may further improve accuracy. |
Database: | OpenAIRE |
External link: |
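
The abstract above describes its evaluation metric only as "an inverse normalization of the Mean Absolute Error" on a 0 to 100 scale, without giving the formula. As a rough illustration, the sketch below assumes the simplest reading: the MAE is divided by the largest possible error on the 0–3 severity scale (3), and the result is subtracted from 1 and multiplied by 100. The function names, the `max_error` parameter, and the normalization itself are assumptions made for this example; the official metric may normalize differently (for example, per severity class). The second helper, also hypothetical, computes the share of errors that fall on neighboring categories, the quantity the error analysis reports.

```python
from typing import Sequence

# Largest gap between two labels on the 0-3 ordinal scale from the abstract.
MAX_SEVERITY_GAP = 3


def inverse_normalized_mae(
    gold: Sequence[int],
    predicted: Sequence[int],
    max_error: float = MAX_SEVERITY_GAP,
) -> float:
    """Return 100 * (1 - MAE / max_error).

    Assumed normalization: the record does not state the exact formula.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one label pair is required")
    mae = sum(abs(g - p) for g, p in zip(gold, predicted)) / len(gold)
    return 100.0 * (1.0 - mae / max_error)


def neighboring_error_share(gold: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of misclassifications that are off by exactly one category."""
    gaps = [abs(g - p) for g, p in zip(gold, predicted) if g != p]
    if not gaps:
        return 0.0  # no errors at all
    return sum(1 for gap in gaps if gap == 1) / len(gaps)


if __name__ == "__main__":
    gold = [0, 1, 2, 3]
    predicted = [1, 1, 0, 3]  # one neighboring error, one two-step error
    print(inverse_normalized_mae(gold, predicted))
    print(neighboring_error_share(gold, predicted))
```

Under this reading, a system that always predicts the correct category scores 100, and one that is always wrong by the full width of the scale scores 0.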