
      Adam: A Method for Stochastic Optimization

Article type: journal-article

          Abstract

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has low memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and for problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. We discuss some connections to related algorithms that inspired Adam. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
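
As a concrete illustration of the update rule summarized in the abstract, the following is a minimal NumPy sketch of a single Adam step using the default hyper-parameters suggested in the paper (step size 1e-3, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8). The function name adam_step and the toy quadratic objective are illustrative choices, not part of the paper.

import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters theta, given gradient grad at step t >= 1 (illustrative sketch)."""
    m = beta1 * m + (1 - beta1) * grad        # biased estimate of the first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2   # biased estimate of the second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = ||theta||^2, whose gradient is 2 * theta.
theta = np.array([1.0, -2.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 1001):
    grad = 2.0 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
print(theta)  # near [0., 0.] after 1000 steps

The AdaMax variant mentioned at the end of the abstract would replace the second-moment estimate v with an infinity-norm accumulator, u = max(beta2 * u, |grad|), and divide the bias-corrected first moment by u instead of by sqrt(v_hat) + eps.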


Published as a conference paper at the 3rd International Conference on Learning Representations (ICLR), San Diego, 2015.

          Author and article information

Journal: arXiv
Publication date: December 2014
History: original arXiv version submitted 22 December 2014; revised versions posted through 30 January 2017
DOI: 10.48550/ARXIV.1412.6980
License: arXiv.org perpetual, non-exclusive license
Subject: FOS: Computer and information sciences; Machine Learning (cs.LG)
