Hyperparameter Optimization of Topological Features for Machine Learning Applications

Authors: Franco Marinozzi, Hugh K. Haddox, Marcio Gameiro, Hamed Eramian, Christopher J. Tralie, Jed Singer, Steve Haase, Nicholas Leiby, Gilberto Bini, Devin Strickland, Rossella Bedini, Fabiano Bini, Gabe Rocklin, John Harer, Scott Novotney, Matt Vaughn, Francis C. Motta
Year of publication: 2019
Subject:
Source: ICMLA
DOI: 10.1109/icmla.2019.00185
Description: This paper describes a general pipeline for generating optimal vector representations of topological features of data for use with machine learning algorithms. This pipeline can be viewed as a costly black-box function defined over a complex configuration space, each point of which specifies both how features are generated and how predictive models are trained on those features. We propose using state-of-the-art Bayesian optimization algorithms to inform the choice of topological vectorization hyperparameters while simultaneously choosing learning model parameters. We demonstrate the need for and effectiveness of this pipeline using two difficult biological learning problems, and illustrate the nontrivial interactions between topological feature generation and learning model hyperparameters.
Database: OpenAIRE
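
Illustrative note: the description treats the whole pipeline as a black-box function over a joint configuration space, where one point fixes both the topological vectorization and the learning model. The sketch below is a minimal toy version of that idea and is not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic persistence diagrams, the histogram vectorization with hypothetical hyperparameters `n_bins` and `weight_power`, the k-nearest-neighbour model with hyperparameter `k`, and all function names. Plain random search stands in for the Bayesian optimization algorithms the paper proposes, so only the joint search structure is shown.

```python
import random


def make_diagram(rng, label):
    """Toy persistence diagram: a list of (birth, death) pairs (synthetic data)."""
    points = []
    for _ in range(6):
        birth = rng.uniform(0.0, 1.0)
        points.append((birth, birth + rng.uniform(0.0, 0.2)))  # short-lived noise
    if label == 1:
        birth = rng.uniform(0.0, 1.0)
        points.append((birth, birth + rng.uniform(0.6, 1.0)))  # one long-lived feature
    return points


def make_dataset(rng, n_samples):
    """Balanced synthetic dataset of diagrams with binary labels."""
    labels = [i % 2 for i in range(n_samples)]
    return [make_diagram(rng, y) for y in labels], labels


def vectorize(diagram, n_bins, weight_power):
    """Histogram of persistence values on [0, 1]; each point adds persistence**weight_power."""
    vec = [0.0] * n_bins
    for birth, death in diagram:
        persistence = death - birth
        index = min(int(persistence * n_bins), n_bins - 1)
        vec[index] += persistence ** weight_power
    return vec


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training vectors (squared Euclidean distance)."""
    neighbours = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    votes = sum(y for _, y in neighbours[:k])
    return 1 if 2 * votes > k else 0


def pipeline_score(config, train, val):
    """The black box: vectorize with `config`, classify with k-NN, return validation accuracy."""
    (train_diagrams, train_y), (val_diagrams, val_y) = train, val
    train_X = [vectorize(d, config["n_bins"], config["weight_power"]) for d in train_diagrams]
    val_X = [vectorize(d, config["n_bins"], config["weight_power"]) for d in val_diagrams]
    correct = sum(
        knn_predict(train_X, train_y, x, config["k"]) == y
        for x, y in zip(val_X, val_y)
    )
    return correct / len(val_y)


def sample_config(rng):
    """One point of the joint space: feature hyperparameters and a model hyperparameter."""
    return {
        "n_bins": rng.randint(2, 20),
        "weight_power": rng.uniform(0.0, 2.0),
        "k": rng.choice([1, 3, 5, 7]),
    }


def random_search(n_trials=20, seed=0):
    """Random search over the joint space (a stand-in for Bayesian optimization)."""
    rng = random.Random(seed)
    train = make_dataset(rng, 40)
    val = make_dataset(rng, 40)
    best_config, best_score = None, -1.0
    for _ in range(n_trials):
        config = sample_config(rng)
        score = pipeline_score(config, train, val)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    config, score = random_search()
    print(config, score)
```

Because `pipeline_score` takes a single configuration and returns a single number, the random sampler could be swapped for a Bayesian optimizer without touching the feature or model code, which is the property the description relies on.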