
      Variationally Inferred Sampling through a Refined Bound

      research-article
      Entropy
      MDPI
      variational inference, MCMC, stochastic gradients, neural networks


          Abstract

In this work, a framework to boost the efficiency of Bayesian inference in probabilistic models is introduced by embedding a Markov chain sampler within a variational posterior approximation. We call this framework “refined variational approximation”. Its strengths are its ease of implementation and the automatic tuning of sampler parameters, leading to a faster mixing time through automatic differentiation. Several strategies to approximate the evidence lower bound (ELBO) are also introduced. Its efficient performance is showcased experimentally using state-space models for time-series data, a variational autoencoder for density estimation and a conditional variational autoencoder as a deep Bayes classifier.
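To make the idea above concrete, the sketch below is a rough illustration under our own assumptions, not the authors' implementation: it draws a sample from a Gaussian variational approximation, refines it with a few differentiable Langevin-type steps towards a toy log posterior (log_joint, a stand-in we define here), and tunes both the variational parameters and the sampler step size by automatic differentiation of a crude ELBO-like surrogate. The paper's refined bound is more careful than this surrogate.

# Rough PyTorch sketch of refining a variational sample with differentiable
# MCMC-style steps; log_joint, the step count and the surrogate objective are
# illustrative assumptions, not the paper's exact construction.
import torch

torch.manual_seed(0)

def log_joint(z):
    # Toy unnormalized log posterior: an anisotropic Gaussian.
    return -0.5 * (((z - torch.tensor([2.0, -1.0])) ** 2) / torch.tensor([0.5, 2.0])).sum()

mu = torch.zeros(2, requires_grad=True)           # variational mean
rho = torch.zeros(2, requires_grad=True)          # pre-softplus variational scale
log_eta = torch.tensor(-3.0, requires_grad=True)  # log step size of the sampler
opt = torch.optim.Adam([mu, rho, log_eta], lr=0.05)

for _ in range(500):
    opt.zero_grad()
    sigma = torch.nn.functional.softplus(rho)
    z0 = mu + sigma * torch.randn(2)              # reparameterized draw from q
    z, eta = z0, log_eta.exp()
    for _ in range(5):                            # a few Langevin-type refinement steps
        grad_z = torch.autograd.grad(log_joint(z), z, create_graph=True)[0]
        z = z + eta * grad_z + (2 * eta).sqrt() * torch.randn(2)
    # Crude ELBO-like surrogate: score the refined sample against the initial q.
    log_q = torch.distributions.Normal(mu, sigma).log_prob(z0).sum()
    loss = -(log_joint(z) - log_q)
    loss.backward()                               # gradients flow to mu, rho and log_eta
    opt.step()

print(mu.detach(), torch.nn.functional.softplus(rho).detach(), log_eta.exp().item())

Because the refinement steps are written with differentiable operations, the gradient of the surrogate with respect to the step size is obtained for free by backpropagation, which is the sense in which sampler parameters are tuned automatically.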


Most cited references (60)


          Stan: A Probabilistic Programming Language

          Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.
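As a quick illustration of the workflow and interfaces described above, the following sketch fits a simple normal model through pystan. It assumes the PyStan 2.x API and a made-up toy dataset; it is not code taken from the article.

# Illustrative sketch of the Stan workflow via the pystan interface
# (assumes the PyStan 2.x API: pystan.StanModel / sampling).
import numpy as np
import pystan

model_code = """
data {
  int<lower=0> N;        // number of observations
  vector[N] y;           // observed data
}
parameters {
  real mu;               // location
  real<lower=0> sigma;   // scale
}
model {
  mu ~ normal(0, 10);       // weakly informative priors
  sigma ~ cauchy(0, 2.5);
  y ~ normal(mu, sigma);    // likelihood
}
"""

y = np.random.normal(1.5, 0.7, size=100)         # toy data
sm = pystan.StanModel(model_code=model_code)     # compile the model
fit = sm.sampling(data={"N": len(y), "y": y},    # full Bayesian inference via NUTS
                  iter=2000, chains=4)
print(fit)                                       # posterior summaries and diagnostics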

            A tutorial on hidden Markov models and selected applications in speech recognition


              Adam: A Method for Stochastic Optimization

(2015)
We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has low memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
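The update rule described above is compact enough to sketch directly. The NumPy version below is an illustration, not the authors' reference implementation; the function defaults follow the hyper-parameter suggestions in the paper, and the toy problem and its larger step size are our own assumptions.

# Minimal NumPy sketch of the Adam update: adaptive estimates of the first and
# second moments of the gradient, with bias correction.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters theta given gradient grad at step t >= 1."""
    m = beta1 * m + (1 - beta1) * grad          # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = ||theta||^2 with a step size suited to this toy problem.
theta = np.array([5.0, -3.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 1001):
    grad = 2 * theta                            # gradient of the toy objective
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
print(theta)                                    # ends up near the minimizer at the origin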

                Author and article information

Journal
Entropy (Basel)
MDPI
ISSN: 1099-4300
Published: 19 January 2021 (January 2021 issue)
Volume 23, Issue 1, Article 123
Affiliations
[1] Institute of Mathematical Sciences (ICMAT), 28049 Madrid, Spain; david.rios@icmat.es
[2] Statistical and Applied Mathematical Sciences Institute, Durham, NC, USA
[3] School of Management, University of Shanghai for Science and Technology, Shanghai 201206, China
Author information
ORCID: https://orcid.org/0000-0003-0349-0714
Article
Publisher ID: entropy-23-00123
DOI: 10.3390/e23010123
PMC: 7832329
PMID: 33477766
                © 2021 by the authors.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

History
Received: 24 December 2020
Accepted: 13 January 2021
                Categories
                Article

variational inference, MCMC, stochastic gradients, neural networks
