
      A Genetic Algorithm for Diploid Genome Reconstruction Using Paired-End Sequencing


          Abstract

          The genomes of many species are diploid, consisting of paternal and maternal haplotypes. The differences between these two haplotypes range from single nucleotide polymorphisms (SNPs) to large-scale structural variations (SVs). Existing genome assemblers for next-generation sequencing platforms attempt to reconstruct one consensus sequence, which is a mosaic of the two parental haplotypes. Reconstructing the paternal and maternal haplotypes is an important task in linkage analysis and association studies. This study designed and implemented HapSVAssembler on the basis of a genetic algorithm (GA) and paired-end sequencing. The proposed method builds a consensus sequence, identifies various types of heterozygous variants, and reconstructs the paternal and maternal haplotypes by solving an optimization problem with a GA. Experimental results indicate that HapSVAssembler has high accuracy and contiguity under various sequencing coverages, error rates, and insert sizes. The program was tested on pilot sequencing of a highly heterozygous genome, and 12,781 heterozygous SNPs and 602 hemizygous SVs were identified. We observe that, although the number of SVs is much smaller than that of SNPs, the genomic regions occupied by SVs are much larger, implying that heterozygosity computed using SNPs or the k-mer spectrum may be underestimated.
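
To make the idea of "reconstructing haplotypes by solving an optimization problem with a GA" more concrete, the following is a minimal sketch of a genetic algorithm for phasing heterozygous sites. It is not HapSVAssembler's actual algorithm: the abstract does not specify the objective function or the genetic operators, so this sketch assumes a minimum error correction (MEC) style objective, tournament selection, one-point crossover, and bit-flip mutation. All names (`mec_score`, `run_ga`, `toy_reads`) and parameter values are hypothetical, and the reads are toy data invented for illustration.

```python
import random


def mec_score(hap, reads):
    """MEC-style cost: for each read, count mismatches against the closer
    of `hap` and its complement, then sum over all reads."""
    total = 0
    for read in reads:
        # Each read maps a heterozygous site index to an observed allele (0 or 1).
        mismatches = sum(1 for site, allele in read.items() if hap[site] != allele)
        total += min(mismatches, len(read) - mismatches)
    return total


def run_ga(reads, n_sites, pop_size=30, generations=100,
           mutation_rate=0.05, seed=0):
    """Search for a binary haplotype with a low MEC-style cost.
    Assumes n_sites >= 2 so that a crossover point exists."""
    rng = random.Random(seed)

    def cost(hap):
        return mec_score(hap, reads)

    # Random initial population of candidate haplotypes.
    pop = [[rng.randint(0, 1) for _ in range(n_sites)] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the best individual over unchanged.
        next_pop = [min(pop, key=cost)[:]]
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = min(rng.sample(pop, 3), key=cost)
            p2 = min(rng.sample(pop, 3), key=cost)
            # One-point crossover.
            cut = rng.randint(1, n_sites - 1)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - a if rng.random() < mutation_rate else a for a in child]
            next_pop.append(child)
        pop = next_pop
    best = min(pop, key=cost)
    return best, cost(best)


# Toy, error-free reads over 6 heterozygous sites (invented for illustration).
# Reads alternate between the haplotype 0,1,1,0,1,0 and its complement.
toy_reads = [
    {0: 0, 1: 1, 2: 1},
    {1: 0, 2: 0, 3: 1},
    {2: 1, 3: 0, 4: 1},
    {3: 1, 4: 0, 5: 1},
    {4: 1, 5: 0},
]

best, score = run_ga(toy_reads, n_sites=6)
print("best haplotype:", best, "complement:", [1 - a for a in best], "cost:", score)
```

Because a haplotype and its complement describe the same pair of parental sequences, the cost function treats them as equivalent. A real assembler would additionally need to handle sequencing errors, paired-end insert sizes, and structural variants, none of which are modeled here.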


                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS ONE
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 18 November 2016
                Volume: 11
                Issue: 11
                Affiliations
                [1] Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan
                [2] Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
                [3] Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan, Taiwan
                [4] Institute of Biomedical Sciences, National Chung Hsing University, Taichung, Taiwan
                Xiamen University, CHINA
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                • Conceptualization: YTH CKT.

                • Methodology: YTH CKT SYC.

                • Resources: CSL MTC JWC.

                • Software: SYC YTH.

                • Writing – original draft: SYC YTH.

                • Writing – review & editing: YTH CKT.

                Manuscript ID: PONE-D-16-32260
                DOI: 10.1371/journal.pone.0166721
                PMCID: PMC5115803
                PMID: 27861560
                © 2016 Ting et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                Counts
                Figures: 20, Tables: 3, Pages: 24
                Funding
                YTH was supported in part by the Ministry of Science and Technology (MOST) with grant numbers 103-2923-E-194-001-MY3 and 104-2221-E-194-048-MY2.
                Categories
                Article type: Research Article
                • Biology and Life Sciences > Evolutionary Biology > Population Genetics > Haplotypes
                • Biology and Life Sciences > Genetics > Population Genetics > Haplotypes
                • Biology and Life Sciences > Population Biology > Population Genetics > Haplotypes
                • Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > DNA Construction > DNA Library Construction > Genomic Library Construction
                • Research and Analysis Methods > Molecular Biology Techniques > DNA Construction > DNA Library Construction > Genomic Library Construction
                • Biology and Life Sciences > Computational Biology > Genome Analysis > Sequence Assembly Tools
                • Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Sequence Assembly Tools
                • Biology and Life Sciences > Developmental Biology > Genomic Imprinting
                • Biology and Life Sciences > Genetics > Epigenetics > Genomic Imprinting
                • Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Sequencing Techniques > Genome Sequencing
                • Research and Analysis Methods > Molecular Biology Techniques > Sequencing Techniques > Genome Sequencing
                • Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Gene Mapping > Chromosome Mapping
                • Research and Analysis Methods > Molecular Biology Techniques > Gene Mapping > Chromosome Mapping
                • Biology and Life Sciences > Computational Biology > Genome Complexity
                • Biology and Life Sciences > Genetics > Genomics > Genome Complexity
                • Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis > Sequence Alignment
                • Research and Analysis Methods > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis > Sequence Alignment
                Custom metadata
                All relevant data are within the paper and its Supporting Information files.
