The raise of machine learning hyperparameter constraints in Python code
Author: | Ingkarat Rak-amnouykit, Ana Milanova, Guillaume Baudart, Martin Hirzel, Julian Dolby |
---|---|
Contributors: | Rensselaer Polytechnic Institute (RPI), Parallélisme de Kahn Synchrone (Parkas), Département d'informatique - ENS Paris (DI-ENS), École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL), Institut National de Recherche en Informatique et en Automatique (Inria), Centre National de la Recherche Scientifique (CNRS), Inria de Paris, IBM T. J. Watson Research Centre |
Language: | English |
Publication year: | 2022 |
Subject: | |
Source: | ISSTA 2022 - 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, Jul 2022, Virtual, South Korea. ⟨10.1145/3533767.3534400⟩ |
DOI: | 10.1145/3533767.3534400 |
Description: | Machine-learning operators often have correctness constraints that cut across multiple hyperparameters and/or data. Violating these constraints causes the operator to raise runtime exceptions, but those are usually documented only informally or not at all. This paper presents the first interprocedural weakest-precondition analysis for Python to extract hyperparameter constraints. The analysis is mostly static, but to make it tractable for typical Python idioms in machine-learning libraries, it selectively switches to the concrete domain for some cases. This paper demonstrates the analysis by extracting hyperparameter constraints for 181 operators from a total of 8 ML libraries, where it achieved high precision and recall and found real bugs. Our technique advances static analysis for Python and is a step towards safer and more robust machine learning. |
Database: | OpenAIRE |
External link: |
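
To make the kind of constraint mentioned in the description concrete, here is a minimal sketch in Python. It is not taken from the paper or from any real library: `ToyClassifier`, its `solver` and `penalty` hyperparameters, and `fit_precondition` are all hypothetical names chosen for illustration. The class raises an exception when two hyperparameters are combined incorrectly or when the data is inconsistent, and the hand-written function states the condition under which `fit` raises nothing, which is roughly the kind of precondition a weakest-precondition analysis would aim to recover automatically.

```python
class ToyClassifier:
    """Hypothetical operator used only for illustration."""

    def __init__(self, solver="lbfgs", penalty="l2"):
        self.solver = solver
        self.penalty = penalty

    def fit(self, X, y):
        # Constraint that cuts across two hyperparameters.
        if self.solver == "lbfgs" and self.penalty == "l1":
            raise ValueError("solver 'lbfgs' does not support penalty 'l1'")
        # Constraint that involves the data.
        if len(X) != len(y):
            raise ValueError("X and y must have the same number of rows")
        return self


def fit_precondition(solver, penalty, n_rows_X, n_rows_y):
    """Hand-written condition under which ToyClassifier.fit raises no exception.

    It is the negation of the two raise guards above, joined with 'and'.
    """
    return not (solver == "lbfgs" and penalty == "l1") and n_rows_X == n_rows_y
```

In this toy case the precondition is simply the conjunction of the negated `raise` guards. In real libraries the checks are spread across helper functions and modules, which is presumably why the description stresses that the analysis is interprocedural.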