What is the Message About? Automatic Multi-label Classification of Open Source Repository Messages into Content Types
Authors: | Yannis Korkontzelos, Luis Adrián Cabrera-Diego, Daniel Campbell |
---|---|
Year of publication: | 2020 |
Subject: |
Multi-label classification
0209 industrial biotechnology; Information retrieval; business.industry; Computer science; Tracking system; 02 engineering and technology; 020901 industrial engineering & automation; Open source; Software; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; business; Classifier (UML); Natural language |
Source: | Advances in Intelligent Systems and Computing ISBN: 9783030442880 AICV |
DOI: | 10.1007/978-3-030-44289-7_49 |
Description: | Users of Open Source Software (OSS) projects discuss a diverse range of topics online. The content of a post often corresponds to one or more context-sensitive content types, e.g. a suggestion for a solution, a request for further clarification, or an indication that a proposed solution did not work. Detecting content types can provide several benefits for software developers. For instance, content types can be used as indicators that summarise the content of the messages. These indicators can be exploited as part of a developer-centric knowledge mining platform, allowing developers and project managers to create action alerts concerning new bugs found outside of a bug tracker, or they can be combined with other metrics to assess the quality of an OSS project. We present a multi-label classifier able to classify messages exchanged on communication channels about OSS, together with detailed evaluation results. We experimented with two state-of-the-art multi-label classification approaches, HOMER (Hierarchy Of Multilabel classifiER) and RAkEL (RAndom k-labELsets), as these met the technical requirements of the CROSSMINER project. A manually annotated threaded corpus of posts from newsgroup discussions, bug tracking systems and forums related to Eclipse projects was also used. The results are promising and indicate the potential to attract novel and deeper research for this task. |
Database: | OpenAIRE |
External link: |