      Assessing the Pathogenicity of Insertion and Deletion Variants with the Variant Effect Scoring Tool (VEST‐Indel)

      research-article


          ABSTRACT

          Insertion/deletion variants (indels) alter protein sequence and length, yet are highly prevalent in healthy populations, presenting a challenge to bioinformatics classifiers. Commonly used features—DNA and protein sequence conservation, indel length, and occurrence in repeat regions—are useful for inference of protein damage. However, these features can cause false positives when predicting the impact of indels on disease. Existing methods for indel classification suffer from low specificities, severely limiting clinical utility. Here, we further develop our variant effect scoring tool (VEST) to include the classification of in‐frame and frameshift indels (VEST‐indel) as pathogenic or benign. We apply 24 features, including a new “PubMed” feature, to estimate a gene's importance in human disease. When compared with four existing indel classifiers, our method achieves a drastically reduced false‐positive rate, improving specificity by as much as 90%. This approach of estimating gene importance might be generally applicable to missense and other bioinformatics pathogenicity predictors, which often fail to achieve high specificity. Finally, we tested all possible meta‐predictors that can be obtained from combining the four different indel classifiers using Boolean conjunctions and disjunctions, and derived a meta‐predictor with improved performance over any individual method.
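The meta-predictor search described at the end of the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes each of the four classifiers gives a binary call (1 = pathogenic, 0 = benign) and treats "all combinations using conjunctions and disjunctions" as the non-constant monotone Boolean functions of four inputs. The toy votes and labels are hypothetical, and ranking rules by sensitivity + specificity is an assumption, since the abstract does not state the selection criterion.

```python
from itertools import product

N = 4  # number of base classifiers (four in the abstract)
INPUTS = list(product([0, 1], repeat=N))  # all 16 possible vote patterns

def is_monotone(table):
    """True if raising any single vote from 0 to 1 never lowers the output."""
    for x, fx in table.items():
        for i in range(N):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if fx > table[y]:
                    return False
    return True

def monotone_rules():
    """Yield each non-constant monotone rule as a dict: vote pattern -> call.

    These are exactly the rules that can be written with AND and OR alone.
    """
    for outputs in product([0, 1], repeat=len(INPUTS)):
        if 0 < sum(outputs) < len(INPUTS):  # skip always-0 and always-1
            table = dict(zip(INPUTS, outputs))
            if is_monotone(table):
                yield table

def sensitivity_specificity(rule, votes, labels):
    """Score a rule; assumes labels contain both classes."""
    tp = fn = tn = fp = 0
    for v, y in zip(votes, labels):
        pred = rule[v]
        if y == 1:
            tp += pred
            fn += 1 - pred
        else:
            fp += pred
            tn += 1 - pred
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data: one tuple of four classifier calls per variant.
votes = [(1, 1, 1, 1), (1, 1, 0, 1), (1, 0, 0, 0), (0, 1, 1, 0),
         (1, 1, 0, 0), (0, 0, 0, 0), (0, 0, 1, 0), (1, 0, 1, 1)]
labels = [1, 1, 0, 0, 1, 0, 0, 1]  # 1 = pathogenic, 0 = benign

rules = list(monotone_rules())
and_all = {x: int(all(x)) for x in INPUTS}  # strictest rule: all four agree
or_all = {x: int(any(x)) for x in INPUTS}   # loosest rule: any one flags

# Assumed criterion: maximise sensitivity + specificity.
best = max(rules, key=lambda r: sum(sensitivity_specificity(r, votes, labels)))
best_score = sensitivity_specificity(best, votes, labels)
```

Requiring all four classifiers to agree trades sensitivity for specificity, and flagging on any single call does the reverse; the exhaustive search looks for a rule between those extremes. In practice the chosen rule would need to be evaluated on held-out variants, since picking the best of many rules on one data set is optimistic.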


                Author and article information

                Journal
                Human Mutation (Hum. Mutat.; HUMU)
                Journal DOI: 10.1002/(ISSN)1098-1004
                Publisher: John Wiley and Sons Inc. (Hoboken)
                ISSN: 1059-7794 (print), 1098-1004 (electronic)
                Published: 26 October 2015
                Issue date: January 2016
                Volume: 37
                Issue: 1 (doiID: 10.1002/humu.2016.37.issue-1)
                Pages: 28-35
                Affiliations
                [1] Department of Biomedical Engineering and Institute for Computational Medicine, The Johns Hopkins University, Baltimore, Maryland
                [2] Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, UK
                [3] In Silico Solutions, Fairfax, Virginia
                [4] Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland
                Author notes
                [*] Correspondence to: Rachel Karchin, Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD. E‐mail: karchin@jhu.edu
                [†] These authors contributed equally to this work.

                Article
                HUMU22911
                DOI: 10.1002/humu.22911
                PMC ID: 5057310
                PubMed ID: 26442818
                d469bb87-1c7e-40d0-8525-f2396a133358
                © 2015 The Authors. Human Mutation published by Wiley Periodicals, Inc.

                This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivatives License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.

                History
                Received: 09 July 2015
                Accepted: 14 September 2015
                Page count
                Pages: 8
                Categories
                Informatics

                Human biology
                insertion deletion variant, indel, in‐frame frameshift, bioinformatics pathogenicity predictor, meta‐predictor
