Targeted Capture and Massively Parallel Sequencing of Twelve Human Exomes

Abstract

Genome-wide association studies suggest that common genetic variants explain only a small fraction of heritable risk for common diseases, raising the question of whether rare variants account for a significant fraction of unexplained heritability 1, 2. While DNA sequencing costs have fallen dramatically 3, they remain far from what is necessary for rare and novel variants to be routinely identified at a genome-wide scale in large cohorts. We have therefore sought to develop second-generation methods for targeted sequencing of all protein-coding regions ('exomes'), to reduce costs while enriching for discovery of highly penetrant variants. Here we report on the targeted capture and massively parallel sequencing of the exomes of twelve humans. These include eight HapMap individuals representing three populations 4, and four unrelated individuals with a rare dominantly inherited disorder, Freeman-Sheldon syndrome (FSS) 5. We demonstrate the sensitive and specific identification of rare and common variants in over 300 megabases (Mb) of coding sequence. Using FSS as a proof-of-concept, we show that candidate genes for monogenic disorders can be identified by exome sequencing of a small number of unrelated, affected individuals. This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of nonsynonymous variants by predicted functional impact.

Most cited references 18

The complete genome of an individual by massively parallel DNA sequencing.

David Wheeler, Maithreyan Srinivasan, Michael Egholm … (2008)

The association of genetic variation with disease and drug response, and improvements in nucleic acid technologies, have given great optimism for the impact of 'genomic medicine'. However, the formidable size of the diploid human genome, approximately 6 gigabases, has prevented the routine application of sequencing methods to deciphering complete individual human genomes. To realize the full potential of genomics for human health, this limitation must be overcome. Here we report the DNA sequence of a diploid genome of a single individual, James D. Watson, sequenced to 7.4-fold redundancy in two months using massively parallel sequencing in picolitre-size reaction vessels. This sequence was completed in two months at approximately one-hundredth of the cost of traditional capillary electrophoresis methods. Comparison of the sequence to the reference genome led to the identification of 3.3 million single nucleotide polymorphisms, of which 10,654 cause amino-acid substitution within the coding sequence. In addition, we accurately identified small-scale (2-40,000 base pair (bp)) insertion and deletion polymorphism as well as copy number variation resulting in the large-scale gain and loss of chromosomal segments ranging from 26,000 to 1.5 million base pairs. Overall, these results agree well with recent results of sequencing of a single individual by traditional methods. However, in addition to being faster and significantly less expensive, this sequencing technology avoids the arbitrary loss of genomic sequences inherent in random shotgun sequencing by bacterial cloning because it amplifies DNA in a cell-free system. As a result, we further demonstrate the acquisition of novel human sequence, including novel genes not previously identified by traditional genomic sequencing. This is the first genome sequenced by next-generation technologies. Therefore it is a pilot for the future challenges of 'personalized genome sequencing'.

Mapping and sequencing of structural variation from eight human genomes.

Jeffrey M. Kidd, Gregory M. Cooper, William F Donahue … (2008)

Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale--particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation--a standard for genotyping platforms and a prelude to future individual genome sequencing projects.

DNA sequencing of a cytogenetically normal acute myeloid leukemia genome

Timothy Ley, Elaine R. Mardis, Li Ding … (2008)

Lay Summary: Acute myeloid leukemia is a highly malignant hematopoietic tumor that affects about 13,000 adults yearly in the United States. The treatment of this disease has changed little in the past two decades, since most of the genetic events that initiate the disease remain undiscovered. Whole genome sequencing is now possible at a reasonable cost and timeframe to utilize this approach for unbiased discovery of tumor-specific somatic mutations that alter the protein-coding genes. Here we show the results obtained by sequencing a typical acute myeloid leukemia genome and its matched normal counterpart, obtained from the patient’s skin. We discovered 10 genes with acquired mutations; two were previously described mutations thought to contribute to tumor progression, and 8 were novel mutations present in virtually all tumor cells at presentation and relapse, whose function is not yet known. Our study establishes whole genome sequencing as an unbiased method for discovering initiating mutations in cancer genomes, and for identifying novel genes that may respond to targeted therapies.

We used massively parallel sequencing technology to sequence the genomic DNA of tumor and normal skin cells obtained from a patient with a typical presentation of FAB M1 Acute Myeloid Leukemia (AML) with normal cytogenetics. 32.7-fold ‘haploid’ coverage (98 billion bases) was obtained for the tumor genome, and 13.9-fold coverage (41.8 billion bases) was obtained for the normal sample. Of 2,647,695 well-supported Single Nucleotide Variants (SNVs) found in the tumor genome, 2,588,486 (97.7%) also were detected in the patient’s skin genome, limiting the number of variants that required further study. For the purposes of this initial study, we restricted our downstream analysis to the coding sequences of annotated genes: we found only eight heterozygous, non-synonymous somatic SNVs in the entire genome. All were novel, including mutations in protocadherin/cadherin family members (CDH24 and PCLKC), G-protein coupled receptors (GPR123 and EBI2), a protein phosphatase (PTPRT), a potential guanine nucleotide exchange factor (KNDC1), a peptide/drug transporter (SLC15A1), and a glutamate receptor gene (GRINL1B). We also detected previously described, recurrent somatic insertions in the FLT3 and NPM1 genes. Based on deep readcount data, we determined that all of these mutations (except FLT3) were present in nearly all tumor cells at presentation, and again at relapse 11 months later, suggesting that the patient had a single dominant clone containing all of the mutations. These results demonstrate the power of whole genome sequencing to discover novel cancer-associated mutations.

Author and article information

Journal

Journal ID (nlm-journal-id): 0410462

Journal ID (pubmed-jr-id): 6011

Journal ID (nlm-ta): Nature

Title: Nature

ISSN (Print): 0028-0836

ISSN (Electronic): 1476-4687

Publication date Nihms-submitted: 12 March 2010

Publication date (Electronic): 16 August 2009

Publication date (Print): 10 September 2009

Publication date PMC-release: 23 March 2010

Volume: 461

Issue: 7261

Pages: 272-276

Affiliations

[1] Department of Genome Sciences, University of Washington, Seattle, WA 98195

[2] Department of Pediatrics, University of Washington, Seattle, WA 98195

[3] Agilent Technologies, Santa Clara, CA 95051

[4] Howard Hughes Medical Institute, Seattle, WA 98195

Author notes

Author Contributions The project was conceived and experiments planned by S.B.N., E.H.T., A.B., E.E.E., M.B., D.A.N., and J.S. Experiments were performed by S.B.N., E.H.T., C.L., and M.W. Algorithm development and data analysis were performed by S.B.N., P.D.R., S.D.F., A.W.B., T.S., M.B., D.A.N., and J.S. The manuscript was written by S.B.N. and J.S.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare competing financial interests: details accompany the full HTML version of the paper at www.nature.com/nature.

Correspondence and requests for material should be addressed to J.S. (shendure@u.washington.edu) or S.B.N. (sarahng@u.washington.edu).

Article

Manuscript ID: nihpa128791

DOI: 10.1038/nature08250

PMC ID: 2844771

PubMed ID: 19684571

SO-VID: f32e8d5d-2aa5-4a03-91a0-935e353bd6f7

Funding

Funded by: National Human Genome Research Institute : NHGRI

Funded by: National Heart, Lung, and Blood Institute : NHLBI

Award ID: R21 HG004749-01 ||HG

Funded by: National Human Genome Research Institute : NHGRI

Funded by: National Heart, Lung, and Blood Institute : NHLBI

Award ID: R01 HL094976-01 ||HL
