      Is Open Access

      A Scheduling Algorithm for Computational Grids that Minimizes Centralized Processing in Genome Assembly of Next-Generation Sequencing Data

      research-article


          Abstract

          Improvements in genome sequencing techniques have resulted in the generation of huge volumes of data. As a consequence, the genome assembly stage demands even more computational power, since the incoming sequence files contain large amounts of data. To speed up the process, it is often necessary to distribute the workload among a group of machines. However, this requires hardware and software solutions specially configured for this purpose. Grid computing tries to simplify this process of aggregating resources, but does not always offer the best possible performance because of the heterogeneity and decentralized management of its resources. Thus, it is necessary to develop software that takes these peculiarities into account. To this end, we developed an algorithm that optimizes the operation of the de novo assembly software ABySS in grids. We ran ABySS with and without our algorithm in the grid simulator SimGrid. Tests showed that our algorithm is viable, flexible, and scalable even in a heterogeneous environment, and that it improved the genome assembly time in computational grids without changing assembly quality.

          Most cited references (4)


          Complete genome sequence of Corynebacterium pseudotuberculosis I19, a strain isolated from a cow in Israel with bovine mastitis.

          This work reports the completion and annotation of the genome sequence of Corynebacterium pseudotuberculosis I19, isolated from an Israeli dairy cow with severe clinical mastitis. To present the whole-genome sequence, a de novo assembly approach using 33 million short (25-bp) mate-paired SOLiD reads only was applied. Furthermore, the automatic, functional, and manual annotations were attained with the use of several algorithms in a multistep process.

            On the efficacy, efficiency and emergent behavior of task replication in large distributed systems


              A special-purpose processor for gene sequence analysis.

              Advances in computational biology have occurred primarily in the areas of software and algorithm development; new designs of hardware to support biological computing are extremely scarce. This is due, we believe, to the presence of a non-trivial knowledge gap between molecular biologists and computer designers. The existence of this gap is unfortunate, as it has long been known that for certain problems, special-purpose computers can achieve significant cost/performance gains over general-purpose machines. We describe one such computer here: a custom accelerator for gene sequence analysis. The accelerator implements a version of the Needleman-Wunsch algorithm for nucleotide sequence alignment. Sequence lengths are constrained only by available memory; the product of sequence lengths in the current implementation can be up to 2^22. The machine is implemented as two NuBus boards connected to a Mac IIf/x, using a mixture of TTL and FPGA technology clocked at 10 MHz. The boards are completely functional, and yield a 15-fold performance improvement over an unassisted host.

                Author and article information

                Journal
                Frontiers in Genetics (Front. Genet.)
                Frontiers Research Foundation
                ISSN: 1664-8021
                Published: 19 March 2012
                Volume 3, Article 38
                Affiliations
                [1] Institute of Exact and Natural Sciences, Federal University of Pará, Pará, Brazil
                [2] Institute of Biological Sciences, Federal University of Pará, Pará, Brazil
                [3] Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
                Author notes

                Edited by: Raya Khanin, Memorial Sloan-Kettering Cancer Center, USA

                Reviewed by: Mario Inostroza-Ponta, Universidad de Santiago de Chile, Chile; Jan Aerts, Leuven University, Belgium

                *Correspondence: Vasco Azevedo, Institute of Biological Sciences, Federal University of Minas Gerais, Av. Antônio Carlos, 6627 – Pampulha, CEP 31270-901, Belo Horizonte, Minas Gerais, Brazil. e-mail: vascoariston@gmail.com

                This article was submitted to Frontiers in Bioinformatics and Computational Biology, a specialty of Frontiers in Genetics.

                Article
                DOI: 10.3389/fgene.2012.00038
                PMCID: PMC3306921
                PMID: 22461785
                bacbb57c-a8f4-46d6-afde-f2eb8e85f43d
                Copyright © 2012 Lima, Cerdeira, Bol, Schneider, Silva, Azevedo and Abelém.

                This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.

                History
                Received: 16 November 2011
                Accepted: 27 February 2012
                Page count
                Figures: 4, Tables: 0, Equations: 0, References: 15, Pages: 4, Words: 2913
                Categories
                Genetics
                Original Research

                Genetics
                genome assembly, task scheduling, ngs, computational grids
