Bug Severity Prediction Using a Hierarchical One-vs.-Remainder Approach

Authors:	Nonso Alex Nnamoko, Daniel Campbell, Luis Adrián Cabrera-Diego, Yannis Korkontzelos
Language:	English
Subject:	Exploit; Computer science; business.industry; BitTorrent tracker; Decision tree learning; 020207 software engineering; 02 engineering and technology; Software maintenance; 010501 environmental sciences; Machine learning; computer.software_genre; 01 natural sciences; Multiclass classification; 0202 electrical engineering, electronic engineering, information engineering; Artificial intelligence; Macro; Remainder; business; computer; Classifier (UML); 0105 earth and related environmental sciences
Source:	Lecture Notes in Computer Science; Natural Language Processing and Information Systems: 24th International Conference on Applications of Natural Language to Information Systems, NLDB 2019, Salford, UK, June 26–28, 2019, Proceedings; ISBN: 9783030232801
ISSN:	0302-9743 1611-3349
DOI:	10.1007/978-3-030-23281-8_20
Description:	Assigning a severity level to each reported bug is a critical part of software maintenance, because it supports an efficient resolution process. In many bug trackers, e.g. Bugzilla, this is time-consuming, because bug reporters must manually assign one of seven severity levels to each bug. In addition, some bug types are reported more often than others, which leads to a skewed distribution of severity labels. Machine learning techniques can predict the label of a newly reported bug automatically. However, learning from imbalanced data in a multi-class task remains one of the major difficulties for machine learning classifiers. In this paper, we propose a hierarchical classification approach that exploits the class imbalance in the training data to reduce classification bias. Specifically, we designed a classification tree that consists of multiple binary classifiers organised hierarchically: at each level, instances of the most dominant class are trained against the remaining classes, and are then excluded from the training data for the next level of the tree. We used the FastText classifier to compare the hierarchical and standard classification approaches. On 93,051 bug reports from 38 Eclipse open-source products, the hierarchical approach performed relatively well, achieving a Micro F-Score of 65% and a Macro F-Score of 45%.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::127af4c022079c2740d3659d133d389b
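
Illustrative note: the one-vs.-remainder hierarchy outlined in the description can be sketched roughly as follows. This is a minimal sketch based only on the abstract, not the authors' implementation. The names `TinyBinaryNB`, `train_hierarchy` and `predict_hierarchy` are hypothetical, the small naive Bayes model is an assumed stand-in for the FastText classifier used in the paper, and the bug reports are invented toy data.

```python
import math
from collections import Counter


class TinyBinaryNB:
    """Hypothetical stand-in for FastText: a two-class bag-of-words naive Bayes."""

    def fit(self, texts, is_dominant):
        # is_dominant[i] is True when texts[i] belongs to the dominant class.
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(is_dominant)
        for text, flag in zip(texts, is_dominant):
            self.word_counts[flag].update(text.lower().split())
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def predict(self, text):
        scores = {}
        n_docs = sum(self.doc_counts.values())
        for flag in (True, False):
            total = sum(self.word_counts[flag].values()) + len(self.vocab)
            score = math.log(self.doc_counts[flag] / n_docs)
            for word in text.lower().split():
                # Add-one smoothing so unseen words do not zero out a class.
                score += math.log((self.word_counts[flag][word] + 1) / total)
            scores[flag] = score
        return scores[True] > scores[False]


def train_hierarchy(texts, labels):
    """Train one binary classifier per level: dominant class vs. remainder."""
    data = list(zip(texts, labels))
    levels = []
    while len({label for _, label in data}) > 1:
        dominant = Counter(label for _, label in data).most_common(1)[0][0]
        clf = TinyBinaryNB().fit(
            [text for text, _ in data],
            [label == dominant for _, label in data],
        )
        levels.append((dominant, clf))
        # The dominant class is not used to train the next level.
        data = [(text, label) for text, label in data if label != dominant]
    fallback = data[0][1]  # the single class left at the bottom of the tree
    return levels, fallback


def predict_hierarchy(model, text):
    """Walk down the levels; stop at the first classifier that accepts."""
    levels, fallback = model
    for label, clf in levels:
        if clf.predict(text):
            return label
    return fallback


# Invented toy reports; the labels reuse three Bugzilla severity names.
texts = [
    "button label misaligned in dialog",
    "typo in dialog label",
    "button colour wrong in dialog",
    "label text truncated in dialog",
    "editor crash on save",
    "crash when opening editor",
    "build fails cannot start workspace",
]
labels = ["normal"] * 4 + ["major"] * 2 + ["blocker"]

model = train_hierarchy(texts, labels)
print([label for label, _ in model[0]], model[1])
print(predict_hierarchy(model, "editor crash on startup"))
```

Each level sees a less skewed label distribution than the one above it, because the largest class has already been split off; that is the intuition the description gives for reduced classification bias. Swapping the stand-in for a real text classifier would only require an object with comparable `fit` and `predict` methods.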