A general framework for estimating the relative pathogenicity of human genetic variants

Abstract

Our capacity to sequence human genomes has exceeded our ability to interpret genetic variation. Current genomic annotations tend to exploit a single information type (e.g. conservation) and/or are restricted in scope (e.g. to missense changes). Here, we describe Combined Annotation Dependent Depletion (CADD), a framework that objectively integrates many diverse annotations into a single, quantitative score. We implement CADD as a support vector machine trained to differentiate 14.7 million high-frequency human derived alleles from 14.7 million simulated variants. We pre-compute “C-scores” for all 8.6 billion possible human single nucleotide variants and enable scoring of short insertions/deletions. C-scores correlate with allelic diversity, annotations of functionality, pathogenicity, disease severity, experimentally measured regulatory effects, and complex trait associations, and highly rank known pathogenic variants within individual genomes. The ability of CADD to prioritize functional, deleterious, and pathogenic variants across many functional categories, effect sizes and genetic architectures is unmatched by any current annotation.
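The training setup described in the abstract (a linear classifier separating observed variants from simulated ones, with the resulting decision value used as a score) can be illustrated with a toy sketch. This is not the authors' implementation: the two features, the synthetic data generator, the hyperparameters, and all function names below are assumptions made for illustration, and the raw decision value here is not a published C-score.

```python
import random


def make_toy_variants(n, seed=0):
    """Return synthetic (features, label) pairs. NOT real annotation data.

    label -1 stands in for the 'observed' class, +1 for the 'simulated' class.
    The two features are hypothetical stand-ins for annotation values.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.choice([-1, 1])
        shift = 1.0 if label == 1 else -1.0
        # Assumption: the simulated class has a higher mean on both toy features.
        x = [rng.gauss(shift, 1.0), rng.gauss(0.5 * shift, 1.0)]
        data.append((x, label))
    return data


def score(w, b, x):
    """Linear decision value; higher means 'looks more like the simulated class'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(data, epochs=20, eta=0.01, lam=0.01, seed=0):
    """Fit a linear SVM by stochastic subgradient descent on the
    L2-regularized hinge loss (constant step size, for simplicity)."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            violated = y * score(w, b, x) < 1
            # Regularization shrinks the weights on every step.
            w = [wj - eta * lam * wj for wj in w]
            if violated:
                # Hinge-loss subgradient step for margin violations.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def accuracy(w, b, data):
    """Fraction of variants whose score sign matches their label."""
    hits = sum(1 for x, y in data if (score(w, b, x) > 0) == (y == 1))
    return hits / len(data)


if __name__ == "__main__":
    train = make_toy_variants(400, seed=1)
    held_out = make_toy_variants(2000, seed=2)
    w, b = train_linear_svm(train)
    print("held-out accuracy:", accuracy(w, b, held_out))
```

In this sketch a single real-valued score falls out of the classifier for any variant with feature values, which mirrors the idea of integrating several annotations into one number; everything beyond that idea is a simplification.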

Author and article information

Contributors

Martin Kircher

Daniela M. Witten

Preti Jain

Brian J. O’Roak

Gregory M. Cooper

Jay Shendure

Journal

Journal ID (nlm-journal-id): 9216904

Journal ID (pubmed-jr-id): 2419

Journal ID (nlm-ta): Nat Genet

Journal ID (iso-abbrev): Nat. Genet.

Title: Nature genetics

ISSN (Print): 1061-4036

ISSN (Electronic): 1546-1718

Publication date Nihms-submitted: 28 February 2014

Publication date (Electronic): 02 February 2014

Publication date (Print): March 2014

Publication date PMC-release: 01 September 2014

Volume: 46

Issue: 3

Pages: 310-315

Affiliations

[1] Department of Genome Sciences, University of Washington, Seattle, WA, USA

[2] Department of Biostatistics, University of Washington, Seattle, WA, USA

[3] HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA

Author notes

[#] To whom correspondence should be addressed: shendure@uw.edu, gcooper@hudsonalpha.org

[*] These authors contributed equally to this work

[4] Present address: Department of Molecular & Medical Genetics, Oregon Health & Science University, Portland, OR, USA

Article

Manuscript ID: NIHMS555958

DOI: 10.1038/ng.2892

PMC ID: 3992975

PubMed ID: 24487276

SO-VID: b979f824-9408-43e1-bb10-1bee3550d2f8

License:

Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
