      A weakly informative default prior distribution for logistic and other regression models

      Preprint


          Abstract

          We propose a new prior distribution for classical (nonhierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-\(t\) prior distributions on the coefficients. As a default choice, we recommend the Cauchy distribution with center 0 and scale 2.5, which in the simplest setting is a longer-tailed version of the distribution attained by assuming one-half additional success and one-half additional failure in a logistic regression. Cross-validation on a corpus of datasets shows the Cauchy class of prior distributions to outperform existing implementations of Gaussian and Laplace priors. We recommend this prior distribution as a default choice for routine applied use. It has the advantage of always giving answers, even when there is complete separation in logistic regression (a common problem, even when the sample size is large and the number of predictors is small), and also automatically applying more shrinkage to higher-order interactions. This can be useful in routine data analysis as well as in automated procedures such as chained equations for missing-data imputation. We implement a procedure to fit generalized linear models in R with the Student-\(t\) prior distribution by incorporating an approximate EM algorithm into the usual iteratively weighted least squares. We illustrate with several applications, including a series of logistic regressions predicting voting preferences, a small bioassay experiment, and an imputation model for a public health data set.
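The two ingredients named in the abstract, rescaling a nonbinary input to mean 0 and standard deviation 0.5 and placing a Cauchy prior with center 0 and scale 2.5 on its coefficient, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy data are invented, the model has a single coefficient and no intercept, the sample standard deviation is used for rescaling, the function names are hypothetical, and a brute-force grid search stands in for the approximate EM algorithm within iteratively weighted least squares that the abstract describes.

```python
import math
import statistics


def rescale(values):
    """Shift and scale a nonbinary input to mean 0 and (sample) sd 0.5."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [0.5 * (v - mean) / sd for v in values]


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))


def log_likelihood(beta, xs, ys):
    """Log-likelihood of the toy model logit P(y = 1) = beta * x."""
    return sum(log_sigmoid(beta * x if y == 1 else -beta * x)
               for x, y in zip(xs, ys))


def log_cauchy_prior(beta, scale=2.5):
    """Log density of a Cauchy(0, scale) prior, up to an additive constant."""
    return -math.log1p((beta / scale) ** 2)


def posterior_mode(xs, ys, lo=-50.0, hi=50.0, step=0.01):
    """Grid search for the posterior mode (illustrative, not the paper's algorithm)."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda b: log_likelihood(b, xs, ys) + log_cauchy_prior(b))


# Invented toy data with complete separation: y = 1 exactly when x > 3.5.
raw_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0, 0, 0, 1, 1, 1]

x = rescale(raw_x)
beta_hat = posterior_mode(x, y)
print("posterior mode:", beta_hat)
```

On these separated data the likelihood alone keeps increasing as the coefficient grows, so no finite maximum likelihood estimate exists, whereas the posterior mode under the Cauchy prior falls strictly inside the search interval. This is the "always giving answers" property the abstract refers to, shown only for this one-coefficient toy case.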


                Author and article information

                Date: 26 January 2009
                DOI: 10.1214/08-AOAS191
                arXiv: 0901.4011

                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                Annals of Applied Statistics 2008, Vol. 2, No. 4, 1360-1383
                Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org); available at http://dx.doi.org/10.1214/08-AOAS191
                arXiv category: stat.AP
