Generalized Cross Entropy Loss for Training Deep Neural Networks with
  Noisy Labels

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

Deep neural networks (DNNs) have achieved tremendous success in a variety of applications across many disciplines. Yet, their superior performance comes with the expensive cost of requiring correctly annotated large-scale datasets. Moreover, due to DNNs' rich capacity, errors in training labels can hamper performance. To combat this problem, mean absolute error (MAE) has recently been proposed as a noise-robust alternative to the commonly-used categorical cross entropy (CCE) loss. However, as we show in this paper, MAE can perform poorly with DNNs and challenging datasets. Here, we present a theoretically grounded set of noise-robust loss functions that can be seen as a generalization of MAE and CCE. Proposed loss functions can be readily applied with any existing DNN architecture and algorithm, while yielding good performance in a wide range of noisy label scenarios. We report results from experiments conducted with CIFAR-10, CIFAR-100 and FASHION-MNIST datasets and synthetically generated noisy labels.

Related collections

Most cited references 10

Record: found
Abstract: found
Article: not found

Classification in the presence of label noise: a survey.

Benoît Frénay, Michel Verleysen (2014)

Label noise is an important issue in classification, with many potential negative consequences. For example, the accuracy of predictions may decrease, whereas the complexity of inferred models and the number of necessary training samples may increase. Many works in the literature have been devoted to the study of label noise and the development of techniques to deal with label noise. However, the field lacks a comprehensive survey on the different types of label noise, their consequences and the algorithms that consider label noise. This paper proposes to fill this gap. First, the definitions and sources of label noise are considered and a taxonomy of the types of label noise is proposed. Second, the potential consequences of label noise are discussed. Third, label noise-robust, label noise cleansing, and label noise-tolerant algorithms are reviewed. For each category of approaches, a short discussion is proposed to help the practitioner to choose the most suitable technique in its own particular field of application. Eventually, the design of experiments is also discussed, what may interest the researchers who would like to test their own algorithms. In this paper, label noise consists of mislabeled instances: no additional information is assumed to be available like e.g., confidences on labels.

0 comments Cited 198 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: found

Is Open Access

Learning Deconvolution Network for Semantic Segmentation

Hyeonwoo Noh, Bohyung Han, Seunghoon Hong (2015)

We propose a novel semantic segmentation algorithm by learning a deconvolution network. We learn the network on top of the convolutional layers adopted from VGG 16-layer net. The deconvolution network is composed of deconvolution and unpooling layers, which identify pixel-wise class labels and predict segmentation masks. We apply the trained network to each proposal in an input image, and construct the final semantic segmentation map by combining the results from all proposals in a simple manner. The proposed algorithm mitigates the limitations of the existing methods based on fully convolutional networks by integrating deep deconvolution network and proposal-wise prediction; our segmentation method typically identifies detailed structures and handles objects in multiple scales naturally. Our network demonstrates outstanding performance in PASCAL VOC 2012 dataset, and we achieve the best accuracy (72.5%) among the methods trained with no external data through ensemble with the fully convolutional network.

0 comments Cited 186 times – based on 0 reviews

Preprint

     Review now

Bookmark

Record: found
Abstract: found
Article: found

Is Open Access

Deep Networks with Stochastic Depth

, , … (2016)

Very deep convolutional networks with hundreds or more layers have lead to significant reductions in error on competitive benchmarks like the ImageNet or COCO tasks. Although the unmatched expressiveness of the many deep layers can be highly desirable at test time, training very deep networks comes with its own set of challenges. The gradients can vanish, the forward flow often diminishes and the training time can be painfully slow even on modern computers. In this paper we propose stochastic depth, a training procedure that enables the seemingly contradictory setup to train short networks and obtain deep networks. We start with very deep networks but during training, for each mini-batch, randomly drop a subset of layers and bypass them with the identity function. The resulting networks are short (in expectation) during training and deep during testing. Training Residual Networks with stochastic depth is compellingly simple to implement, yet effective. We show that this approach successfully addresses the training difficulties of deep networks and complements the recent success of Residual and Highway Networks. It reduces training time substantially and improves the test errors on almost all data sets significantly (CIFAR-10, CIFAR-100, SVHN). Intriguingly, with stochastic depth we can increase the depth of residual networks even beyond 1200 layers and still yield meaningful improvements in test error (4.91%) on CIFAR-10.

0 comments Cited 51 times – based on 0 reviews

Preprint

     Review now

Bookmark

All references

Author and article information

Journal

Publication date Created: 20 May 2018

Article

ArXiV ID: 1805.07836

SO-VID: 308bae6e-0761-4d16-b89f-e05d6f165a2d

License:

http://arxiv.org/licenses/nonexclusive-distrib/1.0/

History

Custom metadata

Categories cs.LG cs.CV stat.ML

ScienceOpen disciplines: Computer vision & Pattern recognition,Machine learning,Artificial intelligence

Data availability:

ScienceOpen disciplines: Computer vision & Pattern recognition, Machine learning, Artificial intelligence

Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels

Read this article at

Abstract

Related collections

Semantic Knowledge Base

Most cited references 10

Classification in the presence of label noise: a survey.

Learning Deconvolution Network for Semantic Segmentation

Deep Networks with Stochastic Depth

Author and article information

Journal

Article

History

Custom metadata

Comments

Comment on this article

Similar content 61

Cited by 3

Most referenced authors 318