Data Extraction and Preprocessing of code snippets on Stack Overflow for Static Code Analysis

Authors: Ndukwe, Ifeanyi G.; Licorish, Sherlock A.; MacDonell, Stephen G.; Tahir, Amjed
Publication year: 2022
Subject:
DOI: 10.5281/zenodo.7156176
Description: Stack Overflow is noteworthy for its value to software practitioners. However, few formal datasets exist to facilitate proper benchmarking research. Our study catalogues the data extraction process used to collect massive amounts of Java code snippets from Stack Overflow. We also document the static code analysis we carried out, to encourage further community investigations of code quality. A potential research agenda is then outlined.
Database: OpenAIRE