      Sequence-specific error profile of Illumina sequencers

      research-article


          Abstract

          We identified the sequence-specific starting positions of consecutive miscalls in the mapping of reads obtained from the Illumina Genome Analyser (GA). Detailed analysis of the miscall pattern indicated that the underlying mechanism involves sequence-specific interference of the base elongation process during sequencing. The two major sequence patterns that trigger this sequence-specific error (SSE) are: (i) inverted repeats and (ii) GGC sequences. We speculate that these sequences favor dephasing by inhibiting single-base elongation, by: (i) folding single-stranded DNA and (ii) altering enzyme preference. This phenomenon is a major cause of sequence coverage variability and of the unfavorable bias observed for population-targeted methods such as RNA-seq and ChIP-seq. Moreover, SSE is a potential cause of false single-nucleotide polymorphism (SNP) calls and also significantly hinders de novo assembly. This article highlights the importance of recognizing SSE and its underlying mechanisms in the hope of enhancing the potential usefulness of the Illumina sequencers.
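          The two trigger patterns named in the abstract, GGC sequences and inverted repeats, can be located in a reference sequence with a simple scan. The following is a minimal sketch and not the authors' method: the function names, the arm length `arm`, and the maximum loop size `max_loop` are illustrative assumptions, and an inverted repeat is approximated here as a short segment followed within a few bases by its exact reverse complement.

```python
# Illustrative sketch only: names and thresholds are assumptions,
# not taken from the article.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement; unknown bases map to 'N'."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(seq))


def find_ggc(seq):
    """Return 0-based start positions of every GGC occurrence (overlaps allowed)."""
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "GGC"]


def find_inverted_repeats(seq, arm=4, max_loop=6):
    """Return (left_start, right_start, left_arm) tuples.

    A hit is reported when a segment of length `arm` is followed, after a
    loop of 0..max_loop bases, by its exact reverse complement. Such a pair
    could in principle let single-stranded DNA fold back on itself.
    """
    hits = []
    n = len(seq)
    for i in range(n - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for loop in range(max_loop + 1):
            j = i + arm + loop
            if j + arm > n:
                break  # right arm would run past the end of the sequence
            if seq[j:j + arm] == target:
                hits.append((i, j, left))
    return hits


if __name__ == "__main__":
    print(find_ggc("AGGCGGC"))
    print(find_inverted_repeats("GATCCAAAGGATC", arm=5))
```

          Positions reported by such a scan would only be candidates for closer inspection; whether miscalls actually begin there has to be checked against the mapped reads.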


                Author and article information

                Journal
                Nucleic Acids Research (Nucleic Acids Res)
                Oxford University Press
                ISSN: 0305-1048, 1362-4962
                July 2011
                14 May 2011
                Volume: 39
                Issue: 13
                Article: e90
                Affiliations
                1 Graduate School of Information Science, 2 Graduate School of Biological Sciences, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan, 3 Biological Science Laboratories, Kao Corporation, 2606 Akabane, Ichikai, Haga, Tochigi 321-3497, 4 Department of Bioscience, Tokyo University of Agriculture, 5 Genome Research Center, NODAI Research Institute, Tokyo University of Agriculture, 1-1-1 Sakuragaoka Setagaya-ku, Tokyo, 156-8502, Japan and 6 Department of Chemical Engineering and Material Science, University of Minnesota, 223 Amundson Hall, 421 Washington Avenue S.E., Minneapolis, MN 55455, USA
                Author notes
                *To whom correspondence should be addressed. Tel: +81 743 72 5396; Fax: +81 743 72 5258; Email: kensuke-nm@is.naist.jp; kenske@mac.com
                Correspondence may also be addressed to Shigehiko Kanaya. Tel: +81 743 72 5952; Fax: +81 743 72 5390; Email: skanaya@gtc.naist.jp
                Article
                Publisher ID: gkr344
                DOI: 10.1093/nar/gkr344
                PMCID: PMC3141275
                PMID: 21576222
                © The Author(s) 2011. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 3 February 2011
                Revised: 25 April 2011
                Accepted: 26 April 2011
                Page count
                Pages: 13
                Categories
                Methods Online

                Genetics
