
      An Introductory Review of Deep Learning for Prediction Models With Big Data

Review article


          Abstract

          Deep learning models represent a new learning paradigm in artificial intelligence (AI) and machine learning. Recent breakthrough results in image analysis and speech recognition have generated massive interest in the field, because applications in many other domains that provide big data also seem possible. On the downside, the mathematical and computational methodology underlying deep learning models is very challenging, especially for interdisciplinary scientists. For this reason, we present in this paper an introductory review of deep learning approaches, including Deep Feedforward Neural Networks (D-FFNN), Convolutional Neural Networks (CNNs), Deep Belief Networks (DBNs), Autoencoders (AEs), and Long Short-Term Memory (LSTM) networks. These models form the major core architectures of deep learning currently in use and should belong in any data scientist's toolbox. Importantly, these core architectural building blocks can be composed flexibly, in an almost Lego-like manner, to build new application-specific network architectures. Hence, a basic understanding of these network architectures is important in order to be prepared for future developments in AI.
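
          The abstract's remark about composing building blocks in an almost Lego-like manner can be made concrete with a short sketch. The following is purely illustrative and not from the paper: assuming TensorFlow/Keras is available, it stacks two of the reviewed architectures, a CNN feature extractor and an LSTM, into a single prediction model; all shapes and hyperparameters are invented.

```python
# Illustrative only: a CNN front end composed with an LSTM, assuming
# TensorFlow/Keras is installed; all shapes and hyperparameters are invented.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10, 64, 64, 1)),          # 10 frames of 64x64 grayscale
    layers.TimeDistributed(layers.Conv2D(16, 3, activation="relu")),  # CNN per frame
    layers.TimeDistributed(layers.MaxPooling2D()),
    layers.TimeDistributed(layers.Flatten()),
    layers.LSTM(32),                                # temporal aggregation
    layers.Dense(1, activation="sigmoid"),          # binary prediction head
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```

          Here the CNN is applied to each frame of a short image sequence and the LSTM aggregates the per-frame features over time; swapping either block for another core architecture (for example, an autoencoder front end) changes the model without changing the overall recipe.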

          Most cited references (142)


          Deep learning. LeCun, Bengio, and Hinton, Nature, 2015.

          Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state of the art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
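
          The passage describes backpropagation as the mechanism by which each layer's internal parameters are adjusted using an error signal propagated down from the layer above. A minimal NumPy sketch of this layer-wise update for a two-layer network follows; it is illustrative only, and the data, dimensions, and learning rate are invented.

```python
# Illustrative only: backpropagation through a two-layer network in NumPy.
# Data, dimensions, and the learning rate are invented for the example.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))                  # 8 samples, 4 input features
y = rng.normal(size=(8, 1))                  # regression targets
W1 = rng.normal(size=(4, 5)) * 0.5           # layer-1 weights
W2 = rng.normal(size=(5, 1)) * 0.5           # layer-2 weights

for _ in range(100):
    # Forward pass: each layer computes its representation
    # from the representation in the previous layer.
    h = np.tanh(X @ W1)                      # hidden representation
    y_hat = h @ W2                           # output
    err = y_hat - y                          # d(0.5 * squared error)/d(y_hat)

    # Backward pass: the error signal indicates how each layer's
    # internal parameters should change, propagated layer by layer.
    grad_W2 = h.T @ err
    grad_h = err @ W2.T
    grad_W1 = X.T @ (grad_h * (1.0 - h**2))  # chain rule through tanh

    W1 -= 0.01 * grad_W1                     # gradient-descent updates
    W2 -= 0.01 * grad_W2
```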

            Long Short-Term Memory. Hochreiter and Schmidhuber, Neural Computation, 1997.

            Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
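
            To make the gating mechanism concrete, below is a minimal NumPy sketch of a single LSTM time step. Note it uses the now-standard formulation with a forget gate (a later addition by Gers et al., 1999, not part of the 1997 paper); the parameter layout and shapes are assumptions for illustration.

```python
# Illustrative only: one LSTM time step in NumPy, in the modern formulation
# with a forget gate (added after the 1997 paper). Shapes are assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step; W, U, b stack the parameters of all four gates."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b        # pre-activations for all gates at once
    i = sigmoid(z[0*n:1*n])           # input gate (opens access to the cell)
    f = sigmoid(z[1*n:2*n])           # forget gate
    o = sigmoid(z[2*n:3*n])           # output gate (gates the readout)
    g = np.tanh(z[3*n:4*n])           # candidate cell update
    # Additive cell update: the "constant error carousel" through which
    # error can flow over many steps largely unchanged.
    c = f * c_prev + i * g
    h = o * np.tanh(c)                # gated hidden state
    return h, c

# Run a few steps on random data.
n, m = 3, 2
rng = np.random.default_rng(0)
W, U, b = rng.normal(size=(4*n, m)), rng.normal(size=(4*n, n)), np.zeros(4*n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(size=(5, m)):
    h, c = lstm_step(x, h, c, W, U, b)
```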

              Going deeper with convolutions. Szegedy et al., IEEE CVPR, 2015.


                Author and article information

                Journal: Frontiers in Artificial Intelligence (Front. Artif. Intell.)
                Publisher: Frontiers Media S.A.
                ISSN: 2624-8212
                Published: 28 February 2020
                Volume: 3, Article: 4
                Affiliations
                1. Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
                2. Institute of Biosciences and Medical Technology, Tampere, Finland
                3. School of Management, University of Applied Sciences Upper Austria, Steyr, Austria
                4. Department of Biomedical Computer Science and Mechatronics, University for Health Sciences, Medical Informatics and Technology (UMIT), Hall in Tyrol, Austria
                5. College of Artificial Intelligence, Nankai University, Tianjin, China
                Author notes

                Edited by: Fabrizio Riguzzi, University of Ferrara, Italy

                Reviewed by: Karthik Soman, University of California, San Francisco, United States; Arnaud Fadja Nguembang, University of Ferrara, Italy

                *Correspondence: Frank Emmert-Streib, v@bio-complexity.com

                This article was submitted to Machine Learning and Artificial Intelligence, a section of the journal Frontiers in Artificial Intelligence

                Article
                DOI: 10.3389/frai.2020.00004
                PMCID: PMC7861305
                PMID: 33733124
                Copyright © 2020 Emmert-Streib, Yang, Feng, Tripathi and Dehmer.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 24 October 2019
                Accepted: 31 January 2020
                Page count
                Figures: 17, Tables: 3, Equations: 35, References: 154, Pages: 23, Words: 15917
                Categories
                Artificial Intelligence
                Review

                Keywords: deep learning, artificial intelligence, machine learning, neural networks, prediction models, data science
