      Is Open Access

      MultiGWAS: An integrative tool for Genome Wide Association Studies in tetraploid organisms

      research-article


          Abstract

          Genome‐wide association studies (GWASs) are essential to determine the genetic bases of either ecological or economic phenotypic variation across individuals within populations of model and nonmodel organisms. For this research question, it is common to replicate the GWAS with different parameters and models to validate the reproducibility of the results. However, straightforward methodologies that manage both replication and tetraploid data are still missing. To solve this problem, we designed MultiGWAS, a tool that performs GWAS for diploid and tetraploid organisms by executing in parallel four software packages, two designed for polyploid data (GWASpoly and SHEsis) and two designed for diploid data (GAPIT and TASSEL). MultiGWAS has several advantages. It runs either in the command line or in a graphical interface; it manages different genotype formats, including VCF. Moreover, it allows control for population structure, relatedness, and several quality control checks on genotype data. Besides, MultiGWAS can test for additive and dominant gene action models, and, through a proprietary scoring function, select the best model to report its associations. Finally, it generates several reports that facilitate identifying false associations from both the significant and the best‐ranked association Single Nucleotide Polymorphisms (SNPs) among the four software packages. We tested MultiGWAS with public tetraploid potato data for tuber shape and several simulated data sets under both additive and dominant models. These tests demonstrated that MultiGWAS is better at detecting reliable associations than using each of the four software packages individually. Moreover, the parallel analysis with polyploid and diploid software, which only MultiGWAS offers, demonstrates its utility in understanding the best genetic model behind the SNP association in tetraploid organisms. Therefore, MultiGWAS proved to be an excellent alternative for wrapping GWAS replication in diploid and tetraploid organisms in a single analysis environment.

          Abstract

          Genome‐wide association studies (GWASs) are essential to determine the genetic bases of either ecological or economic phenotypic variation across individuals within populations of model and nonmodel organisms. Replication is a good practice to assess results, but straightforward methodologies that manage both replication and tetraploid data are still missing. To solve this problem, we designed MultiGWAS, a tool that performs GWAS for diploid and tetraploid organisms by executing in parallel four software packages, two for polyploid data (GWASpoly and SHEsis) and two for diploid data (GAPIT and TASSEL). MultiGWAS includes (1) the input and preprocessing of genomic data in different formats (including VCF files), (2) association analysis by running the GWAS software in parallel, (3) postprocessing and summarizing of their results, and (4) reporting using graphical and tabular views. MultiGWAS identifies both the highest scoring and shared associations between the four software packages, which helps users decide more intuitively on possible true or false associations. MultiGWAS can test for additive and dominant gene action models, and, through a proprietary scoring function, select the best model to report its associations. We tested MultiGWAS with public tetraploid potato data for tuber shape and several simulated data sets under both additive and dominant models. These tests demonstrated that MultiGWAS is better at detecting reliable associations than using each of the four software packages individually.
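
          The idea of running several tools in parallel and then looking for shared associations can be illustrated with a minimal Python sketch. It is not MultiGWAS code: the `run_tool` and `shared_associations` helpers, the SNP identifiers, and the `min_tools` threshold are all hypothetical, and each "tool" is a stand-in that returns a fixed set of hits rather than calling the real software.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for one GWAS package: a real run would invoke the
# external software; here it only returns the SNPs it "flags" as significant.
def run_tool(name, significant_snps):
    return name, set(significant_snps)

def shared_associations(results, min_tools=2):
    """Map each SNP flagged by at least `min_tools` tools to the sorted tool names."""
    tools_by_snp = {}
    for tool, snps in results.items():
        for snp in snps:
            tools_by_snp.setdefault(snp, []).append(tool)
    return {
        snp: sorted(tools)
        for snp, tools in tools_by_snp.items()
        if len(tools) >= min_tools
    }

# Made-up hits, for illustration only (not results from the article).
toy_hits = {
    "GWASpoly": ["snp_A", "snp_B"],
    "SHEsis": ["snp_A", "snp_C"],
    "GAPIT": ["snp_B"],
    "TASSEL": ["snp_A", "snp_D"],
}

# Run the four stand-in tools concurrently, then collect results by tool name.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(lambda item: run_tool(*item), toy_hits.items()))

shared = shared_associations(results)
for snp, tools in sorted(shared.items()):
    print(snp, "reported by", ", ".join(tools))
```

          In this toy setup, SNPs reported by two or more tools are kept as candidate shared associations. The scoring function the abstract mentions for choosing the best gene action model is not described here, so the sketch does not attempt it.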


                Author and article information

                Contributors
                phreyes@agrosavia.co
                Journal
                Ecol Evol
                10.1002/(ISSN)2045-7758
                ECE3
                Ecology and Evolution
                John Wiley and Sons Inc. (Hoboken)
                2045-7758
                12 May 2021
                June 2021
                Volume: 11
                Issue: 12 (doiID: 10.1002/ece3.v11.12)
                Pages: 7411-7426
                Affiliations
                [ 1 ] Corporación Colombiana de Investigación Agropecuaria (AGROSAVIA) CI Tibaitatá Bogota Colombia
                [ 2 ] Corporación Colombiana de Investigación Agropecuaria (AGROSAVIA) CI El Mira Tumaco Colombia
                Author notes
                [*] Correspondence

                Paula H. Reyes‐Herrera, Corporación Colombiana de Investigación Agropecuaria (AGROSAVIA), CI Tibaitatá, Kilómetro 14, Vía a Mosquera, 250047 Bogota, Colombia.

                Email: phreyes@agrosavia.co

                Author information
                https://orcid.org/0000-0003-0502-4266
                Article
                ECE37572
                10.1002/ece3.7572
                8216910
                34188823
                dd49fbf8-6178-4389-b462-6a77215a8edc
                © 2021 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd.

                This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

                History
                : 22 March 2021
                : 20 November 2020
                : 23 March 2021
                Page count
                Figures: 11, Tables: 0, Pages: 16, Words: 10269
                Funding
                Funded by: AGROSAVIA
                Funded by: Departamento Administrativo de Ciencia, Tecnología e Innovación
                Award ID: 811‐2019
                Categories
                Original Research

                Evolutionary Biology
                gapit, gwas on polyploids, gwaspoly, shesis, snp, software, tassel
