Self-informed neural network structure learning

Abstract

We study large-scale, multi-label visual recognition with a large number of possible classes. We propose a method for augmenting a trained neural network classifier with auxiliary capacity, designed to significantly improve an already well-performing model while minimally increasing its computational footprint. Using the predictions of the network itself as a descriptor for assessing visual similarity, we partition the label space into groups of visually similar entities. We then augment the network with auxiliary hidden-layer pathways that connect only to these groups of label units. With the augmented model, we report a significant improvement in mean average precision on a large-scale object recognition task, while increasing the number of multiply-adds by less than 3%.
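
The two steps described in the abstract (grouping labels by the network's own predictions, then adding group-specific auxiliary pathways) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the cosine similarity over prediction columns, the greedy threshold grouping, the ReLU hidden layer, the additive combination with the base logits, and all function and parameter names (`label_similarity`, `greedy_groups`, `augmented_logits`, `threshold`) are hypothetical.

```python
import numpy as np

def label_similarity(preds):
    """Cosine similarity between label columns of an (n_examples, n_labels) matrix."""
    cols = preds / (np.linalg.norm(preds, axis=0, keepdims=True) + 1e-12)
    return cols.T @ cols

def greedy_groups(sim, threshold):
    """Assumed grouping rule: join the first group whose seed label is
    similar enough, otherwise start a new group."""
    seeds = []
    assign = np.empty(sim.shape[0], dtype=int)
    for j in range(sim.shape[0]):
        for g, s in enumerate(seeds):
            if sim[j, s] >= threshold:
                assign[j] = g
                break
        else:
            seeds.append(j)
            assign[j] = len(seeds) - 1
    return assign

def augmented_logits(features, base_logits, assign, aux_params):
    """Add each group's auxiliary pathway output only to that group's labels."""
    out = base_logits.copy()
    for g, (W1, W2) in enumerate(aux_params):
        idx = np.flatnonzero(assign == g)
        hidden = np.maximum(features @ W1, 0.0)  # small ReLU hidden layer
        out[:, idx] += hidden @ W2               # one W2 column per label in group g
    return out

# Toy predictions: labels 0 and 1 tend to fire together, as do labels 2 and 3.
preds = np.array([[0.9, 0.8, 0.1, 0.0],
                  [0.8, 0.9, 0.0, 0.1],
                  [0.1, 0.0, 0.9, 0.7],
                  [0.0, 0.1, 0.8, 0.9]])
assign = greedy_groups(label_similarity(preds), threshold=0.5)

rng = np.random.default_rng(0)
d, h = 5, 3  # feature width and auxiliary hidden width (arbitrary toy sizes)
features = rng.normal(size=(4, d))
base_logits = rng.normal(size=(4, 4))
aux_params = [(rng.normal(size=(d, h)),
               rng.normal(size=(h, int(np.sum(assign == g)))))
              for g in range(assign.max() + 1)]
out = augmented_logits(features, base_logits, assign, aux_params)
print(assign, out.shape)
```

Because each auxiliary output matrix has columns only for the labels in its group, the added cost grows with the group sizes rather than with the full label space, which is the kind of restricted connectivity the abstract describes.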

Most cited references

Adaptive Mixtures of Local Experts

Robert Jacobs, Michael Jordan, Steven J. Nowlan … (1991)

Model compression

Rich Caruana, Cristian Buciluǎ, Alexandru Niculescu-Mizil (2006)

Deep Learning of Representations: Looking Forward

(2013)

Deep learning research aims at discovering learning algorithms that discover multiple levels of distributed representations, with higher levels representing more abstract concepts. Although the study of deep learning has already led to impressive theoretical results, learning algorithms and breakthrough experiments, several challenges lie ahead. This paper proposes to examine some of these challenges, centering on the questions of scaling deep learning algorithms to much larger models and datasets, reducing optimization difficulties due to ill-conditioning or local minima, designing more efficient and powerful inference and sampling procedures, and learning to disentangle the factors of variation underlying the observed data. It also proposes a few forward-looking research directions aimed at overcoming these challenges.