Is Open Access

Unsupervised Learning of Visual Representations using Videos

Preprint

Author(s): Xiaolong Wang, Abhinav Gupta

Publication date Created: 2015-05-04

Abstract

Is strong supervision necessary for learning a good visual representation? Do we really need millions of semantically-labeled images to train a Convolutional Neural Network (CNN)? In this paper, we present a simple yet surprisingly powerful approach for unsupervised learning of CNN. Specifically, we use hundreds of thousands of unlabeled videos from the web to learn visual representations. Our key idea is that visual tracking provides the supervision. That is, two patches connected by a track should have similar visual representation in deep feature space since they probably belong to the same object or object part. We design a Siamese-triplet network with a ranking loss function to train this CNN representation. Without using a single image from ImageNet, just using 100K unlabeled videos and the VOC 2012 dataset, we train an ensemble of unsupervised networks that achieves 52% mAP (no bounding box regression). This performance comes tantalizingly close to its ImageNet-supervised counterpart, an ensemble which achieves a mAP of 54.4%. We also show that our unsupervised network can perform competitively in other tasks such as surface-normal estimation.
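To make the ranking-loss idea concrete, here is a minimal sketch of a triplet ranking loss in plain Python. The abstract does not specify the distance function or the margin, so the cosine distance, the `margin=0.5` default, and the function names `cosine_distance` and `triplet_ranking_loss` are illustrative assumptions rather than the authors' implementation. The anchor and positive stand for features of two patches connected by a track; the negative stands for features of an unrelated patch.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def triplet_ranking_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style ranking loss over one (anchor, positive, negative) triplet.

    The loss is zero once the anchor is closer to the tracked (positive)
    patch than to the unrelated (negative) patch by at least `margin`.
    The distance and margin are assumptions made for this sketch.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D "features": the tracked patch matches the anchor, the other does not.
well_ranked = triplet_ranking_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
badly_ranked = triplet_ranking_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

In a Siamese-triplet setup, the three feature vectors would come from three copies of the same CNN with shared weights, and the loss would be averaged over many triplets mined from tracked video patches.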

Author and article information

Publication date Updated: 2015-10-06

Article

arXiv ID: 1505.00687

SO-VID: 3bbdf6f1-1470-4faf-9d28-41e96b25c192

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Categories: cs.CV

ScienceOpen disciplines: Computer vision & Pattern recognition
