      SNP Selection in Genome-Wide Association Studies via Penalized Support Vector Machine with MAX Test

      research-article


          Abstract

          One of the main objectives of a genome-wide association study (GWAS) is to develop a prediction model for a binary clinical outcome using single-nucleotide polymorphisms (SNPs). Such a model can be used for diagnostic and prognostic purposes and for a better understanding of the relationship between the disease and SNPs. Penalized support vector machine (SVM) methods have been widely used toward this end. However, since investigators often ignore the genetic models of SNPs, the final model loses efficiency in predicting the clinical outcome. To overcome this problem, we propose a two-stage method in which the genetic model of each SNP is first identified using the MAX test and a prediction model is then fitted using a penalized SVM method. We apply the proposed method to various penalized SVMs and compare the performance of SVMs using various penalty functions. The results from simulations and real GWAS data analysis show that the proposed method performs better than prediction methods that ignore the genetic models in terms of prediction power and selectivity.
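The first stage of the two-stage method can be illustrated with a minimal sketch. It assumes the MAX test is taken as the largest absolute Cochran-Armitage trend statistic over the recessive, additive, and dominant score vectors, and that the model attaining the maximum is used to recode the SNP; the abstract does not spell out these details. The names `MODEL_SCORES`, `trend_statistic`, `max_test_model`, and `recode` are hypothetical, and the genotype counts in the test are made-up illustrative numbers, not data from the article.

```python
from math import sqrt

# Assumed genotype scores for 0, 1, and 2 copies of the risk allele
# under each candidate genetic model (hypothetical naming).
MODEL_SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def trend_statistic(case_counts, control_counts, scores):
    """Cochran-Armitage trend Z statistic for one SNP and one score vector.

    case_counts and control_counts hold the numbers of subjects with
    0, 1, and 2 copies of the risk allele.
    """
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)
    n_total = n_cases + n_controls
    totals = [r + s for r, s in zip(case_counts, control_counts)]

    numerator = sum(
        x * (n_controls * r - n_cases * s)
        for x, r, s in zip(scores, case_counts, control_counts)
    )
    sum_x = sum(x * n for x, n in zip(scores, totals))
    sum_xx = sum(x * x * n for x, n in zip(scores, totals))
    variance = (n_cases * n_controls / n_total) * (n_total * sum_xx - sum_x ** 2)

    # A degenerate table (e.g. a monomorphic SNP) carries no trend signal.
    if variance <= 0:
        return 0.0
    return numerator / sqrt(variance)


def max_test_model(case_counts, control_counts):
    """Return the model with the largest |Z| and all three statistics."""
    stats = {
        model: trend_statistic(case_counts, control_counts, scores)
        for model, scores in MODEL_SCORES.items()
    }
    best = max(stats, key=lambda model: abs(stats[model]))
    return best, stats


def recode(genotypes, model):
    """Recode raw genotypes (0, 1, 2) with the scores of the chosen model."""
    scores = MODEL_SCORES[model]
    return [scores[g] for g in genotypes]
```

In the second stage, the recoded genotype columns would be passed as features to a penalized SVM of the reader's choice; that step is omitted here because the abstract does not name a specific implementation.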


                Author and article information

                Journal
                Title: Computational and Mathematical Methods in Medicine
                Abbreviated title: Comput Math Methods Med (CMMM)
                Publisher: Hindawi Publishing Corporation
                ISSN: 1748-670X, 1748-6718
                Publication date: 24 September 2013
                Volume: 2013
                Article ID: 340678
                Affiliations
                1. Department of Statistics and Information Science, Dongguk University, Gyeongju 780-714, Republic of Korea
                2. Samsung Cancer Research Institute, Samsung Medical Center, Seoul 137-710, Republic of Korea
                3. Department of Medical Oncology and Hematology, Princess Margaret Hospital, University of Toronto, Toronto, ON, Canada M5G 2M9
                4. Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA
                Author notes

                Academic Editor: Wenqing He

                Author information
                http://orcid.org/0000-0003-2126-9378
                Article
                DOI: 10.1155/2013/340678
                PMCID: 3794570
                PMID: 24174989
                0fddfb8c-fc6d-4152-93a8-18649c6e1f93
                Copyright © 2013 Jinseog Kim et al.

                This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 22 May 2013
                Revised: 14 August 2013
                Accepted: 22 August 2013
                Categories
                Research Article

                Applied mathematics
