      Variable selection – A review and recommendations for the practicing statistician

          Abstract

          Statistical models support medical research by facilitating individualized outcome prognostication conditional on independent variables or by estimating effects of risk factors adjusted for covariates. Theory of statistical models is well‐established if the set of independent variables to consider is fixed and small. Hence, we can assume that effect estimates are unbiased and the usual methods for confidence interval estimation are valid. In routine work, however, it is not known a priori which covariates should be included in a model, and often we are confronted with the number of candidate variables in the range 10–30. This number is often too large to be considered in a statistical model. We provide an overview of various available variable selection methods that are based on significance or information criteria, penalized likelihood, the change‐in‐estimate criterion, background knowledge, or combinations thereof. These methods were usually developed in the context of a linear regression model and then transferred to more generalized linear models or models for censored survival data. Variable selection, in particular if used in explanatory modeling where effect estimates are of central interest, can compromise stability of a final model, unbiasedness of regression coefficients, and validity of p‐values or confidence intervals. Therefore, we give pragmatic recommendations for the practicing statistician on application of variable selection methods in general (low‐dimensional) modeling problems and on performing stability investigations and inference. We also propose some quantities based on resampling the entire variable selection process to be routinely reported by software packages offering automated variable selection algorithms.
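          As an illustration of what "resampling the entire variable selection process" can look like in practice, the following is a minimal sketch, not the authors' implementation. It assumes one possible stability quantity, a bootstrap inclusion frequency (the share of bootstrap resamples in which a variable is selected), and it assumes AIC-based backward elimination in a Gaussian linear model as the selection method. The function names, the simulated data, and the number of resamples are hypothetical choices made for this example only.

```python
import math
import random


def solve_least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(p)]
    for col in range(p):
        # Partial pivoting for numerical stability
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def aic(X, y, columns):
    """AIC (up to an additive constant) of a linear model with intercept."""
    design = [[1.0] + [row[c] for c in columns] for row in X]
    beta = solve_least_squares(design, y)
    rss = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
              for row, yi in zip(design, y))
    n = len(y)
    return n * math.log(rss / n) + 2 * (len(columns) + 1)


def backward_elimination(X, y):
    """Drop one variable at a time for as long as doing so lowers the AIC."""
    selected = list(range(len(X[0])))
    best = aic(X, y, selected)
    while selected:
        candidates = [(aic(X, y, [c for c in selected if c != d]), d)
                      for d in selected]
        candidate_aic, drop = min(candidates)
        if candidate_aic >= best:
            break
        best = candidate_aic
        selected.remove(drop)
    return selected


def inclusion_frequencies(X, y, n_boot=100, seed=0):
    """Repeat the whole selection on bootstrap resamples and count selections."""
    rng = random.Random(seed)
    n, p = len(y), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        for c in backward_elimination(Xb, yb):
            counts[c] += 1
    return [c / n_boot for c in counts]


# Simulated data (hypothetical): only the first of four predictors affects y.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(100)]
y = [2.0 * row[0] + data_rng.gauss(0, 1) for row in X]

freqs = inclusion_frequencies(X, y)
for j, f in enumerate(freqs):
    print(f"x{j}: selected in {f:.0%} of bootstrap resamples")
```

          In this sketch the selection procedure is rerun from scratch on every resample, so the frequencies reflect the instability of the selection itself rather than of a single fitted model. A variable with a strong true effect should be selected in nearly every resample, while pure noise variables are selected only some of the time.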

                Author and article information

                Contributors
                georg.heinze@meduniwien.ac.at
                Journal
                Biom J
                10.1002/(ISSN)1521-4036
                BIMJ
                Biometrical Journal. Biometrische Zeitschrift
                John Wiley and Sons Inc. (Hoboken)
                0323-3847
                1521-4036
                02 January 2018
                May 2018
                Volume: 60
                Issue: 3 (doiID: 10.1002/bimj.v60.3)
                Pages: 431-449
                Affiliations
                [1] Section for Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna 1090, Austria
                Author notes
                [*] Correspondence

                Georg Heinze, Section for Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Spitalgasse 23, Vienna 1090, Austria

                Email: georg.heinze@meduniwien.ac.at

                Author information
                http://orcid.org/0000-0003-1147-8491
                Article
                BIMJ1842
                10.1002/bimj.201700067
                5969114
                29292533
                d775b055-2c9e-45ac-b5b5-2330b6c78ae9
                © 2017 The Authors. Biometrical Journal Published by WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim

                This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.

                History
                : 18 April 2017
                : 13 November 2017
                : 17 November 2017
                Page count
                Figures: 2, Tables: 6, Pages: 19, Words: 13069
                Categories
                Review Article
                Biometry in Practice

                Quantitative & Systems biology
                change‐in‐estimate criterion, penalized likelihood, resampling, statistical model, stepwise selection
