      Net2Net: Accelerating Learning via Knowledge Transfer

      Preprint


          Abstract

          We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains very many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state of the art accuracy rating on the ImageNet dataset.
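          The abstract's central idea, function-preserving transformations, can be sketched for plain fully connected ReLU layers. The helper names below (net2deeper, net2wider) and the NumPy formulation are illustrative assumptions rather than the paper's reference implementation; the sketch ignores biases, convolutional layers, and the practical details the authors discuss. It only shows why widening a layer by replicating units (and splitting their outgoing weights) or deepening it with an identity-initialized layer leaves the network's output unchanged.

              import numpy as np

              def net2deeper(weight):
                  # Identity-initialized weight matrix that can be inserted after an
                  # existing ReLU layer without changing the network's output,
                  # since relu(relu(h)) == relu(h).
                  n_units = weight.shape[1]
                  return np.eye(n_units)

              def net2wider(w_in, w_out, new_width, rng=np.random.default_rng(0)):
                  # Widen a hidden layer by replicating randomly chosen units and
                  # splitting their outgoing weights, so the widened network
                  # computes the same function as the original (biases omitted).
                  old_width = w_in.shape[1]
                  mapping = np.concatenate([np.arange(old_width),
                                            rng.integers(0, old_width, new_width - old_width)])
                  counts = np.bincount(mapping, minlength=old_width)   # replication factors
                  wider_in = w_in[:, mapping]                          # copy incoming weights
                  wider_out = w_out[mapping, :] / counts[mapping, None]  # split outgoing weights
                  return wider_in, wider_out

              # Quick check that both transformations preserve the function.
              x = np.random.randn(4, 8)
              w1, w2 = np.random.randn(8, 16), np.random.randn(16, 3)
              relu = lambda h: np.maximum(h, 0.0)

              y_old = relu(relu(x @ w1) @ w2)

              w1_wide, w2_wide = net2wider(w1, w2, new_width=24)
              assert np.allclose(y_old, relu(relu(x @ w1_wide) @ w2_wide))

              w_id = net2deeper(w1)
              assert np.allclose(y_old, relu(relu(relu(x @ w1) @ w_id) @ w2))

          Because the transformed network starts out computing exactly the same function as its smaller parent, training can resume immediately from that point rather than from scratch, which is the acceleration the abstract describes.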

          Author and article information

          Article type: Preprint
          Dates: 2015-11-17, 2016-04-23
          arXiv identifier: 1511.05641
          License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
          Custom metadata: ICLR 2016 submission
          Subject categories: cs.LG; Artificial intelligence
