SVM Machine Learning Classifier to Automate the Extraction of SRS Elements

Authors:	Aysh Alhroob, Ayad Tareq Imam, Wael Jumah Alzyadat
Publication year:	2021
Subject:	General Computer Science; Requirements engineering; Computer science; Software requirements specification; User requirements document; Information extraction; Semantic role labeling; Named-entity recognition; Software requirements; Artificial intelligence; Natural language processing; Natural language
Source:	International Journal of Advanced Computer Science and Applications. 12
ISSN:	2156-5570 2158-107X
DOI:	10.14569/ijacsa.2021.0120322
Description:	Extracting software entities such as system, use case, and actor from an English natural-language description of a user's software requirements is a linguistic and semantic task in natural language processing. Researchers in linguistics and computation regard entity extraction as a complicated and challenging problem because of the ambiguities of natural languages. This paper presents a named entity recognition (NER) method called SyAcUcNER (System Actor Use-Case Named Entity Recognizer) for extracting the system, actor, and use case entities from unstructured English descriptions of user software requirements. SyAcUcNER uses a Machine Learning (ML) approach, the Support Vector Machine (SVM), as its classifier, and it uses a semantic role labeling process to tag the words in the requirements text. SyAcUcNER is the first work to define the structure of an NER specialized for requirements engineering, the first to use a specialized NER model to extract actor and use case entities from English-language requirements descriptions, and the first to use an SVM to specify the semantic meanings of words in a particular domain of discourse, namely the Software Requirements Specification (SRS). The performance of SyAcUcNER, which uses WEKA's SVM, is evaluated using a binomial technique; run on text corpora from assorted sources, it achieves weighted averages of 76.2% for precision, 76% for recall, and 72.1% for the F-measure.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::310cee48af06600317fa5825faa06e87 https://doi.org/10.14569/ijacsa.2021.0120322
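
The description reports weighted averages for precision, recall, and the F-measure. As a rough illustration of how such figures are commonly computed, the sketch below averages per-class scores weighted by each class's share of the gold labels. The label names, the toy data, and the function name `weighted_prf` are hypothetical and are not taken from the paper; the paper's own evaluation was done in WEKA, and this sketch only assumes that its averages are support-weighted in the same way.

```python
from collections import Counter


def weighted_prf(gold, pred):
    """Return support-weighted precision, recall and F-measure.

    Each class's score is weighted by its share of the gold labels.
    This is an assumed, common definition of "weighted average".
    """
    support = Counter(gold)
    total = len(gold)
    w_precision = w_recall = w_f = 0.0
    for label in sorted(support):
        # True positives: tokens where gold and prediction both equal label.
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        predicted = sum(1 for p in pred if p == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / support[label]
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        weight = support[label] / total
        w_precision += weight * precision
        w_recall += weight * recall
        w_f += weight * f_measure
    return w_precision, w_recall, w_f


# Hypothetical toy data: ten tokens labelled with three entity classes.
gold = ["ACTOR"] * 4 + ["USECASE"] * 3 + ["SYSTEM"] * 3
pred = ["ACTOR", "ACTOR", "ACTOR", "ACTOR",
        "ACTOR", "USECASE", "USECASE",
        "ACTOR", "SYSTEM", "ACTOR"]

p, r, f = weighted_prf(gold, pred)
print(f"precision={p:.3f} recall={r:.3f} f_measure={f:.3f}")
# prints: precision=0.829 recall=0.700 f_measure=0.681
```

In this toy case the weighted F-measure falls below both the weighted precision and the weighted recall. That is possible because the F-measure is computed per class before averaging, and it is consistent with the pattern in the reported figures (72.1% against 76.2% and 76%).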