
      Purposeful selection of variables in logistic regression

      research-article


          Abstract

          Background

          The main problem in many model-building situations is to choose, from a large set of covariates, those that should be included in the "best" model. A decision to keep a variable in the model might be based on clinical or statistical significance. Several variable selection algorithms exist, but they are mechanical and as such carry some limitations. Hosmer and Lemeshow describe a purposeful selection of covariates in which an analyst makes a variable selection decision at each step of the modeling process.

          Methods

          In this paper we introduce an algorithm which automates that process. We conduct a simulation study to compare the performance of this algorithm with three well documented variable selection procedures in SAS PROC LOGISTIC: FORWARD, BACKWARD, and STEPWISE.

          Results

          We show that this approach is advantageous when the analyst is interested in risk factor modeling and not just prediction. In addition to significant covariates, this variable selection procedure can retain important confounding variables, potentially resulting in a slightly richer model. Application of the macro is further illustrated with the Hosmer and Lemeshow Worcester Heart Attack Study (WHAS) data.
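The retention idea described above, keeping a covariate either because it is significant or because it acts as a confounder, can be sketched as a simple decision rule. The following is a minimal Python illustration, not the authors' SAS macro. The names `DropCheck`, `max_relative_change`, and `retain`, and the thresholds `alpha=0.10` and `change_threshold=0.15`, are assumptions chosen for illustration and are not taken from this abstract. The coefficients and p-values are assumed to come from logistic models fitted elsewhere.

```python
from dataclasses import dataclass


@dataclass
class DropCheck:
    """Evidence gathered when one covariate is tentatively removed (hypothetical helper)."""
    name: str
    p_value: float        # p-value of the covariate in the fuller model
    betas_with: dict      # coefficients of the other covariates, covariate included
    betas_without: dict   # coefficients of the other covariates, covariate removed


def max_relative_change(betas_with, betas_without):
    """Largest absolute change in any remaining coefficient, relative to the fuller model."""
    changes = [
        abs(betas_without[k] - betas_with[k]) / abs(betas_with[k])
        for k in betas_with
        if betas_with[k] != 0.0
    ]
    return max(changes, default=0.0)


def retain(check, alpha=0.10, change_threshold=0.15):
    """Return (keep, reason). Both thresholds are illustrative assumptions."""
    if check.p_value < alpha:
        return True, "significant"
    if max_relative_change(check.betas_with, check.betas_without) > change_threshold:
        return True, "confounder"
    return False, "dropped"


# Hypothetical numbers: "age" is not significant, but removing it shifts the
# "smoking" coefficient from 0.80 to 1.00, a 25% change.
check = DropCheck("age", 0.30, {"smoking": 0.80}, {"smoking": 1.00})
print(retain(check))  # (True, 'confounder')
```

A purely significance-based rule would discard "age" in this example; the change-in-estimate check is what keeps it in the model.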

          Conclusion

          An analyst who needs an algorithm that helps guide the retention of both significant and confounding covariates should consider this macro as an alternative tool.


          Most cited references (16)


          Applied Logistic Regression


            10.1162/153244303322753616

            (2000)

              The impact of confounder selection criteria on effect estimation.

              Much controversy exists regarding proper methods for the selection of variables in confounder control. Many authors condemn any use of significance testing, some encourage such testing, and others propose a mixed approach. This paper presents the results of a Monte Carlo simulation of several confounder selection criteria, including change-in-estimate and collapsibility test criteria. The methods are compared with respect to their impact on inferences regarding the study factor's effect, as measured by test size and power, bias, mean-squared error, and confidence interval coverage rates. In situations in which the best decision (of whether or not to adjust) is not always obvious, the change-in-estimate criterion tends to be superior, though significance testing methods can perform acceptably if their significance levels are set much higher than conventional levels (to values of 0.20 or more).

                Author and article information

                Journal
                Source Code Biol Med
                Source Code for Biology and Medicine
                BioMed Central
                1751-0473
                2008
                16 December 2008
                Volume: 3
                Article: 17
                Affiliations
                [1] Biostatistics, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
                [2] Biostatistics, University of Massachusetts, Amherst, MA 01003, USA
                Article
                1751-0473-3-17
                DOI: 10.1186/1751-0473-3-17
                PMCID: PMC2633005
                PMID: 19087314
                353cc813-50e2-450b-905e-f399b1ed6c4d
                Copyright © 2008 Bursac et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 22 August 2008
                Accepted: 16 December 2008
                Categories
                Research

                Bioinformatics & Computational biology
