
      HTQC: a fast quality control toolkit for Illumina sequencing data

      product-review


          Abstract

          Background

          The Illumina sequencing platform is widely used in genome research. Quality assessment and control of sequence reads are needed before downstream analysis. However, software that provides both efficient quality assessment and versatile filtration methods is still lacking.

          Results

          We have developed a toolkit named HTQC (short for High-Throughput Quality Control) for quality control of sequence reads. It consists of six programs for read quality assessment, read filtration, and generation of graphical reports.
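To make "read quality assessment" concrete, here is a minimal sketch of one common statistic, the mean Phred quality at each sequencing cycle. It is not HTQC's code or interface: the function names `parse_fastq` and `mean_quality_per_cycle` are hypothetical, and the sketch assumes well-formed four-line FASTQ records with Phred+33 quality encoding.

```python
from collections import defaultdict

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding


def parse_fastq(lines):
    """Yield (name, sequence, quality_string) from an iterable of FASTQ lines.

    Assumes every record is exactly four lines: header, sequence, '+', quality.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header[1:], seq, qual


def mean_quality_per_cycle(records):
    """Return a list whose i-th entry is the mean Phred score at cycle i."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for _name, _seq, qual in records:
        for cycle, char in enumerate(qual):
            totals[cycle] += ord(char) - PHRED_OFFSET
            counts[cycle] += 1
    return [totals[c] / counts[c] for c in sorted(counts)]


example = ["@r1", "ACGT", "+", "IIII", "@r2", "ACG", "+", "5I#"]
per_cycle = mean_quality_per_cycle(parse_fastq(example))
```

A drop in the per-cycle mean toward the 3' end of the reads is the kind of signal that would typically motivate trimming or filtering.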

          Conclusions

          The HTQC toolkit generates read quality assessments faster than existing tools. These assessments provide guidance for the read filtration utilities, which let users choose among different strategies for removing low-quality reads.
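The abstract does not say which filtration strategies HTQC offers, so the following is only an illustrative sketch of two strategies commonly used for low-quality reads: discarding a read whose mean quality is below a threshold, and trimming a low-quality 3' tail. The function names, the default threshold of 20, and the Phred+33 encoding are all assumptions made for this example.

```python
PHRED_OFFSET = 33  # assumption: Phred+33 quality encoding


def phred_scores(qual):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(c) - PHRED_OFFSET for c in qual]


def passes_mean_quality(qual, min_mean=20):
    """Strategy 1: keep a read only if its mean quality is at least min_mean."""
    scores = phred_scores(qual)
    return bool(scores) and sum(scores) / len(scores) >= min_mean


def trim_low_quality_tail(seq, qual, min_q=20):
    """Strategy 2: cut bases from the 3' end while their quality is below min_q."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]
```

Whole-read filtering is simpler and keeps read lengths uniform, while tail trimming retains more data at the cost of variable read lengths; which is preferable depends on the downstream analysis.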

          Most cited references (2)


          PIQA: pipeline for Illumina G1 genome analyzer data quality assessment

          Summary: PIQA is a quality analysis pipeline designed to examine genomic reads produced by Next Generation Sequencing technology (Illumina G1 Genome Analyzer). A short statistical summary, together with tile-by-tile and cycle-by-cycle graphical representations of cluster density, quality scores and nucleotide frequencies, allows easy identification of various technical problems, including defective tiles, mistakes in sample/library preparation and abnormalities in the frequencies of appearance of sequenced genomic reads. PIQA is written in the R statistical programming language and is compatible with the bustard, fastq and scarf Illumina G1 Genome Analyzer data formats. Availability: The PIQA pipeline, installation instructions and examples are available at the supplementary web site (http://bioinfo.uh.edu/PIQA). Contact: yfofanov@bioinfo.uh.edu

            BIGpre: A Quality Assessment Package for Next-Generation Sequencing Data

            The emergence of next-generation sequencing (NGS) technologies has significantly improved sequencing throughput and reduced costs. However, the short read length, duplicate reads and massive volume of data make data processing much more difficult and complicated than with first-generation sequencing technology. Although some software packages have been developed to assess data quality, those packages either are not easily available to users or require bioinformatics skills and computer resources. Moreover, almost all of the quality assessment software currently available does not take sequencing errors into account when assessing duplicates in NGS data. Here, we present a new user-friendly quality assessment software package called BIGpre, which works for both Illumina and 454 platforms. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read GC-content distribution, and base Ns quality. More importantly, BIGpre incorporates associated programs that detect and remove duplicate reads after taking sequencing errors into account, and that trim low-quality reads from raw data. BIGpre is primarily written in Perl and integrates graphical capability from the statistics package R. This package produces both tabular and graphical summaries of data quality for sequencing datasets from Illumina and 454 platforms. Processing hundreds of millions of reads within minutes, this package provides immediate diagnostic information that users can act on when preparing sequencing data for downstream analyses. BIGpre is freely available at http://bigpre.sourceforge.net/.

              Author and article information

              Contributors
              Journal
              BMC Bioinformatics
              BioMed Central
              1471-2105
              2013
              31 January 2013
              Volume: 14
              Article number: 33
              Affiliations
              [1] CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, No. 1 West Beichen Road, Chaoyang District, Beijing, China
              [2] Beijing Institutes of Life Sciences, Chinese Academy of Sciences, No. 1 West Beichen Road, Chaoyang District, Beijing, China
              Article
              1471-2105-14-33
              DOI: 10.1186/1471-2105-14-33
              PMCID: PMC3571943
              PMID: 23363224
              f637a8d7-44ad-48a7-8e86-52ae5e72380c
              Copyright ©2013 Yang et al.; licensee BioMed Central Ltd.

              This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

              History
              Received: 7 September 2012
              Accepted: 27 January 2013
              Categories
              Software

              Bioinformatics & Computational biology
