Robust multiple-instance learning ensembles using random subspace instance selection

Authors: Marc-André Carbonneau, Ghyslain Gagnon, Eric Granger, Alexandre J. Raymond
Year of publication: 2016
Subject:
Source: Pattern Recognition. 58:83-99
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2016.03.035
Description: Many real-world pattern recognition problems can be modeled using multiple-instance learning (MIL), where instances are grouped into bags, and each bag is assigned a label. State-of-the-art MIL methods provide a high level of performance when strong assumptions are made regarding the underlying data distributions and the proportion of positive to negative instances in positive bags. In this paper, a new method called Random Subspace Instance Selection (RSIS) is proposed for the robust design of MIL ensembles without any prior assumptions on the data structure or the proportion of instances in bags. First, instance selection probabilities are computed based on training data clustered in random subspaces. A pool of classifiers is then generated using the training subsets created with these selection probabilities. By using RSIS, MIL ensembles are more robust to many data distributions and noise, and are not adversely affected by the proportion of positive instances in positive bags, because training instances are repeatedly selected in a probabilistic manner. Moreover, RSIS allows the identification of positive instances on an individual basis, as required in many practical applications. Results obtained with several real-world and synthetic databases show the robustness of MIL ensembles designed with the proposed RSIS method over a range of witness rates, noisy features and data distributions, compared to reference methods in the literature. Highlights: A new method, Random Subspace Instance Selection, is proposed to design MIL ensembles. The method yields ensembles that are robust to variations of witness rate, data distributions and noise. The method yields state-of-the-art results on several benchmark data sets.
Database: OpenAIRE
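
Illustrative note: the two steps named in the abstract (selection probabilities from clustering in random subspaces, then training subsets drawn with those probabilities) can be sketched roughly as below. This is not the authors' implementation. The scoring rule (fraction of a cluster's instances that come from positive bags), the use of plain k-means, the choice to draw one instance per positive bag, and every function and parameter name are assumptions made only for illustration; the record itself does not specify any of them.

```python
import random


def kmeans(points, k, rng, iters=10):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def selection_probabilities(bags, labels, n_subspaces=20, subspace_dim=2, k=3, seed=0):
    """Per-bag instance selection probabilities (assumed scoring rule).

    bags: list of bags, each a list of feature vectors; labels: 0/1 per bag.
    """
    rng = random.Random(seed)
    instances = [x for bag in bags for x in bag]
    from_positive = [lab for bag, lab in zip(bags, labels) for _ in bag]
    n_features = len(instances[0])
    scores = [0.0] * len(instances)
    for _ in range(n_subspaces):
        # Project every instance onto a random subset of the features.
        dims = rng.sample(range(n_features), subspace_dim)
        projected = [[x[d] for d in dims] for x in instances]
        assign = kmeans(projected, k, rng)
        for c in range(k):
            members = [i for i, a in enumerate(assign) if a == c]
            if members:
                # Assumption: a cluster dominated by positive-bag instances
                # is more likely to contain true positive instances.
                purity = sum(from_positive[i] for i in members) / len(members)
                for i in members:
                    scores[i] += purity
    # Normalise the accumulated scores within each bag.
    probs, start = [], 0
    for bag in bags:
        s = scores[start:start + len(bag)]
        total = sum(s)
        probs.append([v / total for v in s] if total > 0 else [1.0 / len(bag)] * len(bag))
        start += len(bag)
    return probs


def draw_training_subset(bags, labels, probs, rng):
    """Draw one instance per positive bag; use all negative-bag instances."""
    X, y = [], []
    for bag, label, p in zip(bags, labels, probs):
        if label == 1:
            X.append(rng.choices(bag, weights=p, k=1)[0])
            y.append(1)
        else:
            X.extend(bag)
            y.extend([0] * len(bag))
    return X, y
```

Calling `draw_training_subset` repeatedly with the same probabilities gives different training subsets, and each subset could train one base classifier of the pool; which base classifier to use and how to combine the pool are left open here.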