Description: |
Deep neural networks (DNNs) have proven highly effective across many computational tasks, but their success depends largely on access to large, accurately labeled datasets. Obtaining such data can be difficult and costly in real-world scenarios. Common alternatives, such as search engines and crowdsourcing, often yield datasets with inaccurate, or “noisy,” labels. This noise can significantly degrade the generalization and reliability of DNNs. Traditional methods for learning with noisy labels mitigate this problem by training DNNs selectively on reliable data, but they often underutilize the available data. Data augmentation techniques are useful, but they do not directly address the noisy-label problem and offer limited benefit in such contexts. This paper proposes ConfidentMix, a confidence-guided Mixup that serves as a data augmentation strategy based on label confidence. Our method dynamically adjusts the intensity of data augmentation according to label confidence, protecting DNNs from the detrimental effects of noisy labels while maximizing what is learned from the most reliable portions of the dataset. By combining label confidence assessment with customized data augmentation, ConfidentMix improves model resilience and generalizability. Our results on CIFAR-10 and CIFAR-100, standard benchmarks with synthetic noise, demonstrate the superiority of ConfidentMix in high-noise environments. Furthermore, extensive experiments on Clothing1M and mini-WebVision confirm that ConfidentMix surpasses state-of-the-art methods in handling real-world noise. |