How do Machine Learning Models Change?

Authors:	Castaño, Joel; Cabañas, Rafael; Salmerón, Antonio; Lo, David; Martínez-Fernández, Silverio
Publication year:	2024
Subject:	Computer Science - Software Engineering; Computer Science - Machine Learning
Document type:	Working Paper
Description:	The proliferation of Machine Learning (ML) models and their open-source implementations has transformed Artificial Intelligence research and applications. Platforms like Hugging Face (HF) enable the development, sharing, and deployment of these models, fostering an evolving ecosystem. While previous studies have examined aspects of models hosted on platforms like HF, a comprehensive longitudinal study of how these models change remains underexplored. This study addresses this gap by utilizing both repository mining and longitudinal analysis methods to examine over 200,000 commits and 1,200 releases from over 50,000 models on HF. We replicate and extend an ML change taxonomy for classifying commits and utilize Bayesian networks to uncover patterns in commit and release activities over time. Our findings indicate that commit activities align with established data science methodologies, such as CRISP-DM, emphasizing iterative refinement and continuous improvement. Additionally, release patterns tend to consolidate significant updates, particularly in documentation, distinguishing between granular changes and milestone-based releases. Furthermore, projects with higher popularity prioritize infrastructure enhancements early in their lifecycle, and those with intensive collaboration practices exhibit improved documentation standards. These and other insights enhance the understanding of model changes on community platforms and provide valuable guidance for best practices in model maintenance.
Database:	arXiv
External link:	http://arxiv.org/abs/2411.09645