Identity-by-descent filtering of exome sequence data for disease–gene identification in autosomal recessive disorders

Abstract

Motivation: Next-generation sequencing and exome-capture technologies are currently revolutionizing the way geneticists screen for disease-causing mutations in rare Mendelian disorders. However, identifying the causal mutation is challenging because of the sheer number of variants found in an individual exome. Although databases such as dbSNP or HapMap can be used to reduce the number of candidate genes by filtering out common variants, the set that remains is still on the order of dozens of genes.

Results: Our algorithm uses a non-homogeneous hidden Markov model that employs local recombination rates to identify chromosomal regions that are identical by descent (IBD = 2) in children of consanguineous or non-consanguineous parents solely based on genotype data of siblings derived from high-throughput sequencing platforms. Using simulated and real exome sequence data, we show that our algorithm is able to reduce the search space for the causative disease gene to a fifth or a tenth of the entire exome.
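
The idea of a non-homogeneous hidden Markov model can be illustrated with a minimal Python sketch. It is not the authors' R script: it assumes a simplified two-state chain (IBD = 2 versus not), observations reduced to whether two siblings carry the same genotype at a marker, and illustrative values for the hypothetical parameters `switch_rate`, `err` and `p_match_non_ibd2`. The function names are invented for this example. The only point it is meant to show is that the transition probabilities are recomputed for every marker pair from the genetic distance between them, which is how local recombination rates would enter such a model.

```python
import math

def transition_matrix(d_morgans, switch_rate=4.0, p_ibd2=0.25):
    """2x2 transition matrix between states 0 (not IBD2) and 1 (IBD2).

    The probability of leaving the current state grows with the genetic
    distance between neighbouring markers, so the chain is non-homogeneous.
    switch_rate is an assumed illustrative value, not taken from the article.
    """
    d = max(d_morgans, 1e-8)              # avoid log(0) for coincident markers
    stay = math.exp(-switch_rate * d)     # no state-changing crossover
    stationary = (1.0 - p_ibd2, p_ibd2)
    return [[stay * (i == j) + (1.0 - stay) * stationary[j] for j in (0, 1)]
            for i in (0, 1)]

def viterbi_ibd2(concordant, gen_pos, err=0.01, p_match_non_ibd2=0.6, p_ibd2=0.25):
    """Most likely state path (1 = IBD2) for a sibling pair.

    concordant[t] is 1 if both siblings have the same genotype at marker t.
    gen_pos[t] is the marker position in Morgans on a genetic map.
    err and p_match_non_ibd2 are assumed emission parameters.
    """
    emit = [
        (1.0 - p_match_non_ibd2, p_match_non_ibd2),  # state 0: P(obs=0), P(obs=1)
        (err, 1.0 - err),                            # state 1: mismatches are errors
    ]
    prior = (1.0 - p_ibd2, p_ibd2)
    score = [math.log(prior[s]) + math.log(emit[s][concordant[0]]) for s in (0, 1)]
    back = []
    for t in range(1, len(concordant)):
        trans = transition_matrix(gen_pos[t] - gen_pos[t - 1], p_ibd2=p_ibd2)
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[r] + math.log(trans[r][s]) for r in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + math.log(emit[s][concordant[t]]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Toy data: 20 markers with alternating agreement, then 20 concordant markers,
# spaced 0.01 Morgan apart.
concordant = [1, 0] * 10 + [1] * 20
gen_pos = [0.01 * i for i in range(40)]
path = viterbi_ibd2(concordant, gen_pos)
```

On this toy input the decoded path labels the run of concordant markers as IBD = 2 and the mixed stretch as not IBD = 2. Candidate variants outside the IBD = 2 segments could then be discarded, which is the filtering step the abstract describes.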

Availability: An R script and an accompanying tutorial are available at http://compbio.charite.de/index.php/ibd2.html.

Contact: peter.robinson@charite.de

Author and article information

Journal

Journal ID (nlm-ta): Bioinformatics

Journal ID (publisher-id): bioinformatics

Journal ID (hwp): bioinfo

Title: Bioinformatics

Publisher: Oxford University Press

ISSN (Print): 1367-4803

ISSN (Electronic): 1367-4811

Publication date (Print): 15 March 2011

Publication date (Electronic): 28 January 2011

Publication date PMC-release: 28 January 2011

Volume: 27

Issue: 6

Pages: 829-836

Affiliations

¹Institute for Medical and Human Genetics, ²Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, Berlin, ³Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany and ⁴Department of Pediatrics, University of Washington, Seattle, WA 98195, USA

Author notes

* To whom correspondence should be addressed.

† The authors wish it to be known that, in their opinion, the first three authors should be regarded as joint First Authors.

Associate Editor: Jeffrey Barrett

Article

Publisher ID: btr022

DOI: 10.1093/bioinformatics/btr022

PMC ID: 3051326

PubMed ID: 21278187

SO-VID: cf5924c5-d89d-4ca4-a3df-b6b967ac26c5

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 9 October 2010

Date revision received: 13 December 2010

Date accepted: 11 January 2011
