Building high-level features using large scale unsupervised learning

Abstract

We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state-of-the-art.
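The headline figures in the abstract can be sanity-checked with a little arithmetic. The sketch below is illustrative only and does not come from the paper: the function name `implied_baseline` and all variable names are hypothetical, and it assumes that "70% relative improvement" means the new accuracy equals the previous accuracy multiplied by 1.70.

```python
def implied_baseline(new_accuracy: float, relative_improvement: float) -> float:
    """Return the baseline b such that new_accuracy == b * (1 + relative_improvement)."""
    return new_accuracy / (1.0 + relative_improvement)


# Figures quoted in the abstract.
new_accuracy = 0.158          # 15.8% on 20,000 ImageNet categories
relative_improvement = 0.70   # "70% relative improvement" (interpretation assumed above)
machines = 1_000
cores = 16_000
num_images = 10_000_000
image_side = 200              # images are 200x200 pixels

baseline = implied_baseline(new_accuracy, relative_improvement)
cores_per_machine = cores // machines
total_pixels = num_images * image_side * image_side  # pixel positions, ignoring colour channels

print(f"implied previous accuracy: {baseline:.1%}")   # prints "implied previous accuracy: 9.3%"
print(f"cores per machine: {cores_per_machine}")      # prints "cores per machine: 16"
print(f"pixel positions in dataset: {total_pixels:,}")
```

Under that reading, the previous state of the art works out to roughly 9.3% accuracy, and the cluster averages 16 cores per machine.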

Most cited references

How does the brain solve visual object recognition?

James J. DiCarlo, Davide Zoccolan, Nicole C. Rust (2012)

Mounting evidence suggests that 'core object recognition,' the ability to rapidly recognize objects despite substantial appearance variation, is solved in the brain via a cascade of reflexive, largely feedforward computations that culminate in a powerful neuronal representation in the inferior temporal cortex. However, the algorithm that produces this solution remains poorly understood. Here we review evidence ranging from individual neurons and neuronal populations to behavior and computational models. We propose that understanding this algorithm will require using neuronal and psychophysical data to sift through many computational models, each based on building blocks of small, canonical subnetworks with a common functional goal.

Invariant visual representation by single neurons in the human brain.

R. Quian Quiroga, L. Reddy, G. Kreiman … (2005)

It takes a fraction of a second to recognize a person or an object even when seen under strikingly different conditions. How such a robust, high-level representation is achieved by neurons in the human brain is still unclear. In monkeys, neurons in the upper stages of the ventral visual pathway respond to complex images such as faces and objects and show some degree of invariance to metric properties such as the stimulus size, position and viewing angle. We have previously shown that neurons in the human medial temporal lobe (MTL) fire selectively to images of faces, animals, objects or scenes. Here we report on a remarkable subset of MTL neurons that are selectively activated by strikingly different pictures of given individuals, landmarks or objects and in some cases even by letter strings with their names. These results suggest an invariant, sparse and explicit code, which might be important in the transformation of complex visual percepts into long-term and more abstract memories.

Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position

Kunihiko Fukushima, Sei Miyake (1982)

Author and article information

Journal

Publication date Created: 2011-12-28

Publication date Updated: 2012-07-12

Article

arXiv ID: 1112.6209

SO-VID: 4fba1ad6-fdde-464b-acf4-623bb40b7756

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Custom metadata

Categories: cs.LG

ScienceOpen disciplines: Artificial intelligence
