      Base-Calling of Automated Sequencer Traces Using Phred. II. Error Probabilities

      Genome Research
      Cold Spring Harbor Laboratory


          Abstract

          Elimination of the data processing bottleneck in high-throughput sequencing will require both improved accuracy of data processing software and reliable measures of that accuracy. We have developed and implemented in our base-calling program phred the ability to estimate a probability of error for each base-call, as a function of certain parameters computed from the trace data. These error probabilities are shown here to be valid (correspond to actual error rates) and to have high power to discriminate correct base-calls from incorrect ones, for read data collected under several different chemistries and electrophoretic conditions. They play a critical role in our assembly program phrap and our finishing program consed.
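          The abstract describes per-base error probabilities and the claim that they are valid, meaning they correspond to actual error rates. The sketch below illustrates both ideas under stated assumptions: it uses the widely used phred quality convention Q = -10 log10(P), which the abstract itself does not state, and it compares predicted against observed error rates for base-calls grouped by quality. The function names and the toy data are hypothetical and are not taken from phred.

```python
import math
from collections import defaultdict


def error_prob_to_quality(p):
    """Convert an error probability to a quality value, Q = -10 * log10(p).

    Assumption: the conventional phred quality definition; not quoted
    from the abstract above.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(p)


def quality_to_error_prob(q):
    """Inverse conversion: p = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def observed_error_rates(calls):
    """Group base-calls by quality and compare predicted vs. observed error.

    `calls` is an iterable of (quality, is_error) pairs, where `is_error`
    says whether the call disagreed with a trusted reference sequence.
    Returns {quality: (n_calls, observed_rate, predicted_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])  # quality -> [n_calls, n_errors]
    for quality, is_error in calls:
        counts[quality][0] += 1
        counts[quality][1] += int(is_error)
    return {
        q: (n, errs / n, quality_to_error_prob(q))
        for q, (n, errs) in sorted(counts.items())
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only: 100 calls at Q10 with 10
    # errors, and 1000 calls at Q20 with 10 errors.
    toy_calls = (
        [(10, True)] * 10 + [(10, False)] * 90
        + [(20, True)] * 10 + [(20, False)] * 990
    )
    for q, (n, obs, pred) in observed_error_rates(toy_calls).items():
        print(f"Q{q}: n={n} observed={obs:.4f} predicted={pred:.4f}")
```

          If the error probabilities are valid in the abstract's sense, the observed and predicted rates in each group should agree closely once enough calls have been collected; large gaps would indicate miscalibration.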


          Most cited references


          Consed: A Graphical Tool for Sequence Finishing


            DNA sequencing with dye-labeled terminators and T7 DNA polymerase: effect of dyes and dNTPs on incorporation of dye-terminators and probability analysis of termination fragments.

            The incorporation of fluorescently labeled dideoxynucleotides by T7 DNA polymerase is optimized by the use of Mn2+, fluorescein analogs and four 2'-deoxyribonucleoside 5'-O-(1-thiotriphosphates) (dNTP alpha S's). The one-tube extension protocol was tested on single-stranded templates, as well as PCR fragments which were made single-stranded by digestion with T7 gene 6 exonuclease. Dye primer sequencing using four dNTP alpha S's was shown to give uniform termination patterns which were comparable to four dNTPs. Efficiency of the polymerase also appeared to improve with the dNTP alpha S's. A mathematical model was developed to predict the pattern of termination based on enzyme activity and ratios of ddNTP/dNTPs. This method can be used to optimize sequencing reactions and to estimate enzyme discrimination constants of chain terminators.

              AmpliTaq DNA polymerase, FS dye-terminator sequencing: analysis of peak height patterns.

              Taq DNA polymerases in which the phenylalanine is substituted by a tyrosine at position 667 (Taq F667Y) are members of a new class of DNA polymerases that incorporate chain-terminating dideoxyribonucleoside triphosphates (ddNTPs) much more efficiently than the wild-type Taq DNA polymerase. Improved incorporation of ddNTPs into DNA during cycle sequencing using AmpliTaq DNA polymerase, FS (Taq-FS, a member of the Taq F667Y family), and dye-labeled primers results in nearly uniform peak heights in the sequencing trace. This is not the case when dye-labeled ddNTPs are used in Taq-FS cycle sequencing reactions. While the rate of dye-terminator incorporation is more efficient with Taq-FS, the peak pattern is still highly variable and different from that produced by the wild-type enzyme. We have systematically examined pairs of sequence-tagged sites that vary at only a single nucleotide to determine how base changes influence the peak heights of neighboring bases in sequencing traces generated by the Taq-FS dye-terminator chemistry. In 31 of 64 possible 3-base windows (48%), we find that the peak height of a particular base can be predicted by knowing just one or two bases 5' to the base in question. We have also compared and contrasted the peak patterns produced by the Taq-FS enzyme with those previously identified for the wild-type enzyme. Establishing the patterns in peak heights within local sequence contexts can improve the accuracy of base-calling and the identification of polymorphisms/mutations when using the Taq-FS dye-terminator cycle-sequencing chemistry.

                Author and article information

                Journal: Genome Research (Genome Res.)
                Publisher: Cold Spring Harbor Laboratory
                ISSN: 1088-9051, 1549-5469
                Publication date: March 1, 1998
                Volume: 8
                Issue: 3
                Pages: 186-194
                Type: Article
                DOI: 10.1101/gr.8.3.186
                PMID: 9521922
                © 1998