      Is Open Access

      MinCall - MinION end2end convolutional deep learning basecaller

      Preprint


          Abstract

          Oxford Nanopore Technologies' MinION is the first portable DNA sequencing device. It can produce long reads; reads of over 100 kbp have been reported. However, it has a significantly higher error rate than other sequencing methods. In this study, we present MinCall, an end2end basecaller model for the MinION. The model is based on deep learning and is implemented with convolutional neural networks (CNNs). To improve performance, it uses recent deep learning techniques and architectures, including batch normalization and Connectionist Temporal Classification (CTC) loss. The best-performing model achieves a 91.4% median match rate on an E. coli dataset using R9 pore chemistry and 1D reads.
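
          The abstract names CTC loss but does not say how the network output is turned into bases. As an illustration only, the sketch below shows greedy (best-path) CTC decoding, assuming the network emits one score vector per signal frame over the labels A, C, G, T plus a blank. The label order, the blank index, and the function name `ctc_greedy_decode` are assumptions made for this example and are not taken from MinCall, which may decode differently (for instance with a beam search).

```python
from typing import List, Sequence

ALPHABET = "ACGT"        # assumed label order for this sketch
BLANK = len(ALPHABET)    # assumed: the last index (4) is the CTC blank


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]]) -> str:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    bases: List[str] = []
    previous = BLANK
    for scores in frame_scores:
        # Index of the highest-scoring label in this frame.
        label = max(range(len(scores)), key=scores.__getitem__)
        # Emit a base only when the label changes and is not the blank.
        if label != previous and label != BLANK:
            bases.append(ALPHABET[label])
        previous = label
    return "".join(bases)


# Toy input: the per-frame best labels are A, A, blank, A, C, C, blank, G.
path = [0, 0, 4, 0, 1, 1, 4, 2]
frames = [[0.9 if i == label else 0.025 for i in range(5)] for label in path]
print(ctc_greedy_decode(frames))  # AACG
```

          The blank between the two runs of A is what lets CTC represent a repeated base: without it, consecutive identical labels are merged into one.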


          Most cited references (4)


          A first look at the Oxford Nanopore MinION sequencer.

          Oxford Nanopore's third-generation single-molecule sequencing platform promises to decrease costs for reagents and instrumentation. After a 2-year hiatus following the initial announcement, the first devices have been released as part of an early access program. We explore the performance of this platform by resequencing the lambda phage genome, and amplicons from a snake venom gland transcriptome. Although the handheld MinION sequencer can generate more than 150 megabases of raw data in one run, at most a quarter of the resulting reads map to the reference, with less than average 10% identity. Much of the sequence consists of insertion/deletion errors, or is seemingly without similarity to the template. Using the lambda phage data as an example, although the reads are long, averaging 5 kb, at best 890 ± 1932 bases per mapped read could be matched to the reference without soft clipping. In the course of a 36 h run on the MinION, it was possible to resequence the 48 kb lambda phage reference at 16× coverage. Currently, substantially larger projects would not be feasible using the MinION. Without increases in accuracy, which would be required for applications such as genome scaffolding and phasing, the current utility of the MinION appears limited. Library preparation requires access to a molecular laboratory, and is of similar complexity and cost to that of other next-generation sequencing platforms. The MinION is an exciting step in a new direction for single-molecule sequencing, though it will require dramatic decreases in error rates before it lives up to its promise.

            De novo sequencing and variant calling with nanopores using PoreSeq

            The single-molecule accuracy of nanopore sequencing has been an area of rapid academic and commercial advancement, but remains challenging for the de novo analysis of genomes. We introduce here a novel algorithm for the error correction of nanopore data, utilizing statistical models of the physical system in order to obtain high accuracy de novo sequences at a range of coverage depths. We demonstrate the technique by sequencing M13 bacteriophage DNA to 99% accuracy at moderate coverage as well as its use in an assembly pipeline by sequencing E. coli and λ DNA at a range of coverages. We also show the algorithm’s ability to accurately classify sequence variants at far lower coverage than existing methods.

              DNA base-calling from a nanopore using a Viterbi algorithm.

              Nanopore-based DNA sequencing is the most promising third-generation sequencing method. It has superior read length, speed, and sample requirements compared with state-of-the-art second-generation methods. However, base-calling still presents substantial difficulty because the resolution of the technique is limited compared with the measured signal/noise ratio. Here we demonstrate a method to decode 3-bp-resolution nanopore electrical measurements into a DNA sequence using a Hidden Markov model. This method shows tremendous potential for accuracy (~98%), even with a poor signal/noise ratio.

                Author and article information

                Date: 22 April 2019
                arXiv: 1904.10337

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                Custom metadata
                2nd international workshop on deep learning for precision medicine, ECML-PKDD 2017
                q-bio.GN cs.LG

                Artificial intelligence, Genetics
