Concept Matching for Low-Resource Classification

Authors: Federico Errica, Ludovic Denoyer, Fabrizio Silvestri, Fabio Petroni, Vassilis Plachouras, Bora Edizel, Sebastian Riedel
Publication year: 2021
Subject:
Source: IJCNN
DOI: 10.1109/ijcnn52387.2021.9533640
Description: In many applications that rely on machine learning, the availability of labelled data is of primary importance. However, when tackling new tasks, labels are usually missing and must be collected from scratch by the users. In this work, we address the problem of learning classifiers when labels are very scarce. We do so by learning multiple vectors, called prototypes, that represent relevant semantic concepts for the task at hand. We propose a theoretically inspired mechanism that computes the probability of a match between each prototype and the input elements, and we combine these probabilities to increase the expressiveness of the classifier. Moreover, by leveraging low-cost extra annotations in the training data, a simple error-boosting technique guides the learning process and provides substantial performance improvements. Empirical results confirm the benefits of the proposed approach on both balanced and unbalanced datasets. Our methodology is thus of practical use when gathering and labelling new examples is more expensive than annotating the examples we already have.
Database: OpenAIRE
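
The description mentions matching probabilities between learned prototypes and input elements, combined into a classifier. The record does not give the formulas, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' method: the match probability is assumed to be a sigmoid of a dot product, the per-prototype combination is assumed to be a noisy-OR over input elements, and the classifier is assumed to be a linear layer with softmax. All function names and parameters are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def match_probabilities(elements, prototypes):
    """Return, for each prototype, the probability that it matches at
    least one input element.

    Assumption: element-level match probability is sigmoid(p . e), and
    elements are combined with a noisy-OR. Neither choice is taken from
    the paper's description.
    """
    probs = []
    for p in prototypes:
        no_match = 1.0  # probability that no element matches this prototype
        for e in elements:
            no_match *= 1.0 - sigmoid(dot(p, e))
        probs.append(1.0 - no_match)
    return probs


def classify(elements, prototypes, weights, bias):
    """Map prototype match probabilities to class probabilities.

    Assumption: a linear layer followed by a softmax.
    """
    feats = match_probabilities(elements, prototypes)
    logits = [dot(w, feats) + b for w, b in zip(weights, bias)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [x / total for x in exps]


if __name__ == "__main__":
    # Toy input: two element embeddings and two hand-picked prototypes.
    elements = [[1.0, 0.0], [0.0, 1.0]]
    prototypes = [[4.0, 0.0], [0.0, -4.0]]
    weights = [[2.0, -2.0], [-2.0, 2.0]]
    bias = [0.0, 0.0]
    print(match_probabilities(elements, prototypes))
    print(classify(elements, prototypes, weights, bias))
```

In this toy setup the first prototype aligns with one of the elements and the second aligns with none, so the first match probability is the larger of the two. In a trained model the prototypes and classifier weights would be learned from the scarce labels rather than set by hand.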