Semi-Supervised Learning With Label Proportion
Author: | Hong Tao, Chenping Hou, Dewen Hu, Ningzhao Sun, Tingjin Luo, Wenzhang Zhuge |
---|---|
Year of publication: | 2023 |
Subject: | Computer science; Supervised learning; Semi-supervised learning; Extension (predicate logic); Type (model theory); Machine learning; Computer Science Applications; Submodular set function; Consistency (database systems); ComputingMethodologies_PATTERNRECOGNITION; Cardinality; Computational Theory and Mathematics; Minification; Artificial intelligence; Information Systems |
Source: | IEEE Transactions on Knowledge and Data Engineering. 35:877-890 |
ISSN: | 2326-3865 1041-4347 |
DOI: | 10.1109/tkde.2021.3076457 |
Description: | Label scarcity is a common and major challenge in traditional supervised learning. Semi-supervised learning (SSL) leverages unlabeled samples to compensate for the lack of label information. Like annotation, label proportion is another type of prior information and plays a significant role in classification tasks. Label proportions are also easier to obtain than labels. For example, only a small number of patients in a hospital database have a confirmed cancer diagnosis, positive or negative, while the proportion of patients with cancer can generally be estimated from historical records. How to incorporate such prior information on label proportion is crucial but rarely studied in the literature. Traditional SSL methods often ignore this prior information, which inevitably degrades performance. To solve this problem, we propose a novel method, SSL with Label Proportion (SSLLP). Our approach preserves label consistency and label proportion by imposing cardinality bound constraints. The resulting problem is equivalent to a mixed-integer constrained submodular minimization, which is difficult to solve directly. We therefore transform the original problem into a convex one via the Lovász extension and design an efficient algorithm to solve it. Extensive experiments show that our method outperforms several state-of-the-art methods. |
Database: | OpenAIRE |
External link: |
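
The description mentions relaxing a submodular minimization into a convex problem via the Lovász extension. As a minimal illustration of that general idea only, and not of the SSLLP algorithm itself, the sketch below evaluates the Lovász extension of a graph-cut function using the standard sorting formula. The weight matrix `W`, the choice of a cut function, and the helper names `cut_value` and `lovasz_extension` are assumptions made for this example; none of them come from the record.

```python
import numpy as np


def cut_value(W, S):
    """Graph-cut set function: total weight of edges between S and its complement.

    Cut functions with non-negative weights are a standard example of a
    submodular function (assumed here purely for illustration).
    """
    n = W.shape[0]
    mask = np.zeros(n, dtype=bool)
    mask[np.array(sorted(S), dtype=int)] = True
    return float(W[mask][:, ~mask].sum())


def lovasz_extension(F, x):
    """Evaluate the Lovász extension of a set function F (with F(empty) = 0) at x.

    Sort coordinates in decreasing order, grow the set one element at a
    time, and weight each marginal gain of F by the coordinate value.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(-x)  # indices by decreasing x
    chosen = set()
    prev = F(chosen)
    value = 0.0
    for i in order:
        chosen.add(int(i))
        cur = F(chosen)
        value += x[i] * (cur - prev)
        prev = cur
    return value


# Hypothetical 3-node path graph with edge weights 1 (nodes 0-1) and 2 (nodes 1-2).
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 2.0, 0.0]])
F = lambda S: cut_value(W, S)

# On a 0/1 indicator vector the extension agrees with the set function.
on_indicator = lovasz_extension(F, [1.0, 1.0, 0.0])
# On a fractional point it gives the convex relaxation's value.
on_fraction = lovasz_extension(F, [0.2, 0.9, 0.5])
print(on_indicator, F({0, 1}), on_fraction)
```

For a cut function this extension coincides with the weighted total variation, the sum of `W[i, j] * |x[i] - x[j]|` over edges, which is convex in `x`. That is why minimizing it over a continuous domain is tractable, whereas the original set-valued problem is combinatorial. The cardinality bound constraints described in the record are not modeled in this sketch.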