Analyzing Data Selection Techniques with Tools from the Theory of Information Losses

Authors: Brandon Foggo, Nanpeng Yu
Language: English
Publication year: 2019
Subject:
Description: In this paper, we present and illustrate new tools for rigorously analyzing training data selection methods. These tools focus on the information-theoretic losses that occur when sampling data. We use this framework to prove that two methods, Facility Location Selection and Transductive Experimental Design, reduce these losses. The two proofs are meant to serve as generalizable theoretical examples of applying Information Theoretic Deep Learning Theory to data selection and active learning. Both analyses yield insight into their respective methods and increase their interpretability. In the case of Transductive Experimental Design, the analysis also greatly increases the method's scope.
This paper has now been published in the conference proceedings of IEEE Big Data 2021.
Database: OpenAIRE