An Empirical Investigation on the Challenges in Scientific Workflow Systems Development
Authors: | Alam, Khairul; Roy, Chanchal; Roy, Banani; Mittal, Kartik |
---|---|
Year of publication: | 2024 |
Subject: | |
Document type: | Working Paper |
Description: | Scientific Workflow Systems (SWSs) are advanced software frameworks that drive modern research by orchestrating complex computational tasks and managing extensive data pipelines. These systems offer a range of essential features, including modularity, abstraction, interoperability, workflow composition tools, resource management, error handling, and comprehensive documentation. Using these frameworks accelerates the development of scientific computing, resulting in more efficient and reproducible research outcomes. However, developing a user-friendly, efficient, and adaptable SWS poses several challenges. This study explores these challenges through an in-depth analysis of interactions on Stack Overflow (SO) and GitHub, key platforms where developers and researchers discuss and resolve issues. In particular, we use topic modeling (BERTopic) to understand the topics SWS developers discuss on these platforms. We identified 10 topics that developers discuss on SO (e.g., Workflow Creation and Scheduling, Data Structures and Operations, Workflow Execution) and found that workflow execution is the most challenging. By analyzing GitHub issues, we identified 13 topics (e.g., Errors and Bug Fixing, Documentation, Dependencies) and found that data structures and operations is the most difficult. We also found topics common to SO and GitHub, such as data structures and operations, task management, and workflow scheduling. Additionally, we categorized each topic by question type (How, Why, What, and Others). We observed that the How type consistently dominates across all topics, indicating a need for procedural guidance among developers. The dominance of the How type is also evident in domains such as chatbot and mobile development. Our study will guide future research in proposing tools and techniques to help the community overcome the challenges developers face when developing SWSs. Comment: 36 pages, 8 figures |
Database: | arXiv |
External link: |