Continuous Sweep for Binary Quantification Learning

Authors: Kloos, Kevin; Karch, Julian D.; Meertens, Quinten A.; de Rooij, Mark
Year of publication: 2023
Subject:
Document type: Working Paper
Description: A quantifier is a supervised machine learning algorithm that estimates the class prevalence in a dataset rather than labeling its individual observations. We introduce Continuous Sweep, a new parametric binary quantifier inspired by the well-performing Median Sweep, an ensemble method based on Adjusted Count estimators. We modify two aspects of Median Sweep: 1) we use parametric class distributions instead of empirical distributions for the true and false positive rates; 2) we use the mean instead of the median of a set of Adjusted Count estimates. These two modifications allow for a theoretical analysis of the bias and variance of Continuous Sweep. Furthermore, the expressions for the bias and variance can be used to define optimal decision boundaries for the set of Adjusted Count estimates used in the ensemble. We show in three simulation studies that Continuous Sweep outperforms the quantifiers in the group Classify, Count, and Correct, including Median Sweep, and is competitive with the two best quantifiers from the group Distribution Matchers. An empirical data set is also analysed with these quantifiers, showing similar performance.
Database: arXiv
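
Illustration: The description contrasts taking the median and taking the mean of a set of Adjusted Count estimates. The following is a minimal Python sketch of that idea and not the authors' implementation. It uses the standard Adjusted Count correction, (CC - FPR) / (TPR - FPR), with empirical true and false positive rates throughout, so the mean variant shown here is only a stand-in and is not Continuous Sweep itself, which relies on parametric class distributions. The synthetic Gaussian scores, the threshold grid, the function names, and the `min_gap` filter of 0.25 are all assumptions made for this example.

```python
import numpy as np


def adjusted_count(test_scores, tpr, fpr, threshold):
    """Adjusted Count estimate of the positive-class prevalence at one threshold."""
    cc = np.mean(test_scores >= threshold)  # raw Classify-and-Count rate
    return (cc - fpr) / (tpr - fpr)


def sweep_estimates(train_scores, train_labels, test_scores, thresholds, min_gap=0.25):
    """Adjusted Count estimates over a grid of thresholds.

    TPR and FPR are estimated empirically from the labelled training scores.
    Thresholds with TPR - FPR <= min_gap are skipped because the correction
    becomes unstable there (min_gap is an assumed, illustrative value).
    """
    pos = train_scores[train_labels == 1]
    neg = train_scores[train_labels == 0]
    estimates = []
    for t in thresholds:
        tpr = np.mean(pos >= t)
        fpr = np.mean(neg >= t)
        if tpr - fpr > min_gap:
            estimates.append(adjusted_count(test_scores, tpr, fpr, t))
    return np.array(estimates)


# Synthetic classifier scores (assumption): negatives ~ N(0, 1), positives ~ N(2, 1).
rng = np.random.default_rng(0)
train_scores = np.concatenate([rng.normal(2.0, 1.0, 5000), rng.normal(0.0, 1.0, 5000)])
train_labels = np.concatenate([np.ones(5000, dtype=int), np.zeros(5000, dtype=int)])

# Unlabelled test set with a true positive prevalence of exactly 0.3.
test_scores = np.concatenate([rng.normal(2.0, 1.0, 3000), rng.normal(0.0, 1.0, 7000)])

thresholds = np.linspace(-2.0, 4.0, 61)
estimates = sweep_estimates(train_scores, train_labels, test_scores, thresholds)

# Median aggregation (Median Sweep style) versus mean aggregation.
median_estimate = float(np.clip(np.median(estimates), 0.0, 1.0))
mean_estimate = float(np.clip(np.mean(estimates), 0.0, 1.0))

print(f"median of Adjusted Count estimates: {median_estimate:.3f}")
print(f"mean of Adjusted Count estimates:   {mean_estimate:.3f}")
```

With well-separated score distributions, both aggregates should land close to the true prevalence of 0.3. The interest in the mean, according to the description, is that it permits a theoretical analysis of bias and variance.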