Accuracy and quality of massively parallel DNA pyrosequencing

Abstract

Error rates were estimated for the Roche GS20 massively parallel pyrosequencing system, and several factors were identified that can be used to remove low-quality reads, improving the accuracy to 99.75% or better.

Background

Massively parallel pyrosequencing systems have increased the efficiency of DNA sequencing, although the published per-base accuracy of a Roche GS20 is only 96%. In genome projects, highly redundant consensus assemblies can compensate for sequencing errors. In contrast, studies of microbial diversity that catalogue differences between PCR amplicons of ribosomal RNA genes (rDNA) or other conserved gene families cannot take advantage of consensus assemblies to detect and minimize incorrect base calls.
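To illustrate why redundancy helps in assembly projects, the sketch below estimates how often a simple majority vote over overlapping reads would call the wrong base. It assumes errors are independent between reads and that the consensus fails only when more than half of the reads are wrong at a position; both are simplifying assumptions, not statements from the article, and the function name is hypothetical.

```python
from math import comb


def majority_error_probability(per_base_accuracy: float, coverage: int) -> float:
    """Probability that a strict majority of `coverage` reads is wrong at one position.

    Assumes independent errors with the same per-base error rate in every read.
    """
    p_err = 1.0 - per_base_accuracy
    threshold = coverage // 2 + 1  # smallest number of wrong reads that forms a majority
    return sum(
        comb(coverage, k) * p_err**k * (1.0 - p_err) ** (coverage - k)
        for k in range(threshold, coverage + 1)
    )


if __name__ == "__main__":
    # 0.96 is the published per-base accuracy quoted in the Background.
    for depth in (1, 3, 5, 9):
        print(depth, majority_error_probability(0.96, depth))
```

With a single read the result is simply the raw error rate, which is the situation for unassembled amplicon tags: there is no redundancy to vote away a miscall.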

Results

We performed an empirical study of the per-base error rate for the Roche GS20 system using sequences of the V6 hypervariable region from cloned microbial ribosomal DNA (tag sequencing). We calculated a 99.5% accuracy rate in unassembled sequences, and identified several factors that can be used to remove a small percentage of low-quality reads, improving the accuracy to 99.75% or better.
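The accuracy figures above are per-base rates, so they can be sketched as one minus the number of erroneous bases divided by the total number of bases. The example below is a minimal illustration, not the authors' pipeline: the `Read` type, the toy reads, and the filter criterion (discarding reads that contain an ambiguous `N` call) are all assumptions chosen for illustration, since the abstract does not name the factors that were used.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Read:
    sequence: str  # base calls for one read
    errors: int    # differences from the known reference, assumed precomputed


def per_base_accuracy(reads: Iterable[Read]) -> float:
    """Return 1 - (total errors / total bases) over all reads."""
    reads = list(reads)
    total_bases = sum(len(r.sequence) for r in reads)
    total_errors = sum(r.errors for r in reads)
    if total_bases == 0:
        raise ValueError("no bases to score")
    return 1.0 - total_errors / total_bases


def passes_filter(read: Read) -> bool:
    """Hypothetical quality criterion: reject reads with an ambiguous base call."""
    return "N" not in read.sequence


# Toy data invented for illustration; these are not values from the study.
toy_reads = [
    Read("ACGTACGTAC", 0),
    Read("ACGTACGTAC", 0),
    Read("ACGNACGTAC", 2),
    Read("ACGTACGTAC", 0),
]

before = per_base_accuracy(toy_reads)
kept = [r for r in toy_reads if passes_filter(r)]
after = per_base_accuracy(kept)
```

In terms of the reported numbers, moving from 99.5% to 99.75% accuracy corresponds to halving the per-base error rate, from 0.5% to 0.25%.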

Conclusion

By using objective criteria to eliminate low-quality data, the accuracy of individual GS20 sequence reads in molecular ecological applications can surpass that of traditional capillary methods.

Author and article information

Journal

Journal ID (nlm-ta): Genome Biol

Title: Genome Biology

Publisher: BioMed Central

ISSN (Print): 1465-6906

ISSN (Electronic): 1465-6914

Publication date (Print): 2007

Publication date (Electronic): 20 July 2007

Volume: 8

Issue: 7

Page: R143

Affiliations

[1] Josephine Bay Paul Center, Marine Biological Laboratory at Woods Hole, MBL Street, Woods Hole, MA 02543, USA

Article

Publisher ID: gb-2007-8-7-r143

DOI: 10.1186/gb-2007-8-7-r143

PMC ID: 2323236

PubMed ID: 17659080

SO-VID: bbe89180-773c-4983-be2e-dd946a477940

License:

This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 1 March 2007

Date revision received: 2 May 2007

Date accepted: 20 July 2007
