Dataset for Software Engineering Learning Resources

Authors: Muddassira Arshad, Muhammad Murtaza Yousaf, Syed Mansoor Sarwar
Publication year: 2023
Source: International Conference on Scientific and Innovative Studies. 1:118-123
ISSN: 2980-1931
DOI: 10.59287/icsis.588
Description: In the current digital age, an abundance of digital resources is readily available to learners. With the ongoing COVID pandemic and prevalent economic crises, a significant number of learners prefer to engage in self-learning. To develop customized self-learning applications and guide learners to utilize resources based on their learning preferences, a dataset containing learning resources and their prerequisite relationships is required. Several learning resource datasets exist for Machine Learning (ML), Information Retrieval (IR), and Natural Language Processing (NLP). To contribute to this area, we present the Software Engineering Learning Resource Dataset (SELRD), which is a publicly available dataset specifically designed for learning Software Engineering (SE). We have extracted the data for SELRD from multiple sources, including edX, my-mooc, and textbooks. The SE learning resources (SELR) are organized based on topics, and the dataset includes 602 SELRs referring to 302 topics. We have extracted the content from lectures and books available in presentation files (pptx) and Portable Document Format (PDF) using Python libraries. Additionally, we have computed the expected reading time for each SELR, which would help learners by indicating the time required to read each resource. The SELRD comprises 692 prerequisite pairs, including 592 positive pairs and 100 negative pairs. This data can be used along with machine learning algorithms to generate learning paths that would facilitate self-learners. Additionally, the SELRD can also serve as a repository of SE learning resources. In the future, we plan to add best practices and examples for each SELR, making it even more useful for learners.
Database: OpenAIRE
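
Note: The description mentions an expected reading time per resource and prerequisite pairs that can drive learning paths. The record does not say how either is computed, so the following is only a minimal sketch under stated assumptions: reading time is taken as word count divided by an assumed speed of 200 words per minute, and a learning path is taken as a topological ordering of the positive (prerequisite, topic) pairs. The function names, the pair layout, and the topic names in the usage note are hypothetical and are not taken from SELRD.

```python
import math
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed average reading speed; the record does not state the rate used.
WORDS_PER_MINUTE = 200


def expected_reading_minutes(text: str, wpm: int = WORDS_PER_MINUTE) -> int:
    """Estimate reading time in whole minutes, rounded up, from a word count."""
    words = len(text.split())
    return math.ceil(words / wpm)


def learning_path(positive_pairs):
    """Order topics so every prerequisite precedes the topic that needs it.

    positive_pairs is an iterable of (prerequisite, topic) tuples; this
    layout is an assumption made for illustration.
    """
    sorter = TopologicalSorter()
    for prerequisite, topic in positive_pairs:
        # add(node, *predecessors): the prerequisite must come first.
        sorter.add(topic, prerequisite)
    # Raises graphlib.CycleError if the pairs contain a cycle.
    return list(sorter.static_order())
```

For example, calling learning_path with the hypothetical pairs ("Requirements", "Design") and ("Design", "Testing") places Requirements before Design and Design before Testing. A topological sort stands in here for the machine learning approaches the description alludes to; it only shows how positive pairs constrain an ordering.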