      A two-step method for variable selection in the analysis of a case-cohort study

      research-article

          Abstract

          Background

          Accurate detection and estimation of true exposure-outcome associations is important in aetiological analysis; when there are multiple potential exposure variables of interest, methods for detecting the subset of variables most likely to have true associations with the outcome of interest are required. Case-cohort studies often collect data on a large number of variables which have not been measured in the entire cohort (e.g. panels of biomarkers). There is a lack of guidance on methods for variable selection in case-cohort studies.

          Methods

          We describe and explore the application of three variable selection methods to data from a case-cohort study. These are: (i) selecting variables based on their level of significance in univariable (i.e. one-at-a-time) Prentice-weighted Cox regression models; (ii) stepwise selection applied to Prentice-weighted Cox regression; and (iii) a two-step method which first applies a Bayesian variable selection algorithm, using multivariable logistic regression, to obtain posterior probabilities of selection for each variable, and then estimates effects using Prentice-weighted Cox regression.
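
As an illustration of the Prentice weighting that all three methods rely on, the sketch below builds delayed-entry risk intervals for a case-cohort sample. It is a minimal sketch under stated assumptions, not the authors' R package: the function name `prentice_intervals`, the record fields (`id`, `time`, `event`, `subcohort`), and the small offset `eps` are hypothetical and introduced here only for illustration. It encodes the usual convention that subcohort members are at risk throughout their follow-up, whereas cases sampled outside the subcohort enter the risk set only at their own event time.

```python
from typing import Dict, List


def prentice_intervals(subjects: List[Dict], eps: float = 1e-6) -> List[Dict]:
    """Return (start, stop, event) risk intervals under Prentice weighting.

    Each input record is a dict with hypothetical keys:
      id        - subject identifier
      time      - exit time (event or censoring)
      event     - 1 for a case, 0 for a censored subject
      subcohort - True if the subject was sampled into the subcohort
    """
    intervals = []
    for s in subjects:
        if s["subcohort"]:
            # Subcohort members contribute to every risk set while under follow-up.
            start = 0.0
        elif s["event"] == 1:
            # Non-subcohort cases contribute only to the risk set at their own event time,
            # so they enter just before it (eps is an illustrative offset).
            start = max(s["time"] - eps, 0.0)
        else:
            # Non-cases outside the subcohort are not part of the case-cohort sample.
            continue
        intervals.append(
            {"id": s["id"], "start": start, "stop": s["time"], "event": s["event"]}
        )
    return intervals


# Toy cohort (made-up values, for illustration only).
cohort = [
    {"id": "a", "time": 5.0, "event": 0, "subcohort": True},
    {"id": "b", "time": 3.0, "event": 1, "subcohort": True},
    {"id": "c", "time": 4.0, "event": 1, "subcohort": False},
    {"id": "d", "time": 6.0, "event": 0, "subcohort": False},
]
rows = prentice_intervals(cohort)
```

Intervals of this form could then be passed to a Cox regression routine that accepts delayed-entry (start, stop) data; which routine to use is left open here.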

          Results

          Across nine different simulation scenarios, the two-step method demonstrated higher sensitivity and a lower false discovery rate than the one-at-a-time and stepwise methods. In an application of the methods to data from the EPIC-InterAct case-cohort study, the two-step method identified two additional fatty acids as being associated with incident type 2 diabetes, compared with the one-at-a-time and stepwise methods.
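
To make the two performance measures concrete, the following sketch computes sensitivity and false discovery rate for one simulated selection. The abstract does not give the paper's exact definitions, so the standard ones are assumed here: sensitivity is the share of truly associated variables that were selected, and false discovery rate is the share of selected variables that are not truly associated. The function name and variable labels are hypothetical.

```python
from typing import Iterable, Tuple


def sensitivity_and_fdr(selected: Iterable[str], truth: Iterable[str]) -> Tuple[float, float]:
    """Return (sensitivity, false discovery rate) for one variable selection.

    selected - variables chosen by a selection method
    truth    - variables with a true association (known in a simulation)
    """
    selected_set, truth_set = set(selected), set(truth)
    true_positives = len(selected_set & truth_set)
    # Sensitivity: fraction of true signals that were selected.
    sensitivity = true_positives / len(truth_set) if truth_set else 0.0
    # FDR: fraction of selections that are false positives (0 if nothing was selected).
    fdr = (len(selected_set) - true_positives) / len(selected_set) if selected_set else 0.0
    return sensitivity, fdr


# Made-up example: four true signals, three variables selected, two of them correct.
sens, fdr = sensitivity_and_fdr(
    selected=["x1", "x2", "x5"],
    truth=["x1", "x2", "x3", "x4"],
)
```

In this toy case the sensitivity is 2/4 and the false discovery rate is 1/3; in a simulation study these quantities would typically be averaged over replicates.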

          Conclusions

          The two-step method enables more powerful and accurate detection of exposure-outcome associations in case-cohort studies. An R package is available to enable researchers to apply this method.

                Author and article information

                Journal
                Int J Epidemiol
                International Journal of Epidemiology
                Oxford University Press
                0300-5771
                1464-3685
                April 2018
                10 November 2017
                Volume: 47
                Issue: 2
                Pages: 597-604
                Affiliations
                [1 ]MRC Biostatistics Unit, Cambridge, UK
                [2 ]MRC Epidemiology Unit, Cambridge, UK
                Author notes
                Corresponding author. MRC Biostatistics Unit, Cambridge Institute of Public Health, Forvie Site, Robinson Way, Cambridge Biomedical Campus, Cambridge, CB2 0SR, UK. E-mail: paul.newcombe@mrc-bsu.cam.ac.uk
                Article
                dyx224
                10.1093/ije/dyx224
                5913627
                29136145
                2d684f5f-3c77-4b0f-86cc-7d9786c951bc
                © The Author 2017. Published by Oxford University Press on behalf of the International Epidemiological Association.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 2 October 2017
                : 6 October 2017
                Page count
                Pages: 8
                Funding
                Funded by: Medical Research Council 10.13039/501100000265
                Award ID: MC_UP_0801/1
                Funded by: Medical Research Council 10.13039/501100000265
                Award ID: MC_U105260558
                Funded by: Medical Research Council 10.13039/501100000265
                Award ID: MC_UU_12015/1
                Categories
                Methods

                Public health
                case-cohort study,survival analysis,variable selection,bayesian variable selection,type 2 diabetes,fatty acids
