54
views
0
recommends
+1 Recommend
1 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Background

          Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise.

          Results

          We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud ( http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic.

          Conclusions

          This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints, and explore the value added to the research community through the suite of services and resources provided by our implementation.

          Related collections

          Most cited references21

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences

          Increased reliance on computational approaches in the life sciences has revealed grave concerns about how accessible and reproducible computation-reliant results truly are. Galaxy http://usegalaxy.org, an open web-based platform for genomic research, addresses these problems. Galaxy automatically tracks and manages data provenance and provides support for capturing the context and intent of computational methods. Galaxy Pages are interactive, web-based documents that provide users with a medium to communicate a complete computational analysis.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            The generic genome browser: a building block for a model organism system database.

            The Generic Model Organism System Database Project (GMOD) seeks to develop reusable software components for model organism system databases. In this paper we describe the Generic Genome Browser (GBrowse), a Web-based application for displaying genomic annotations and other features. For the end user, features of the browser include the ability to scroll and zoom through arbitrary regions of a genome, to enter a region of the genome by searching for a landmark or performing a full text search of all features, and the ability to enable and disable tracks and change their relative order and appearance. The user can upload private annotations to view them in the context of the public ones, and publish those annotations to the community. For the data provider, features of the browser software include reliance on readily available open source components, simple installation, flexible configuration, and easy integration with other components of a model organism system Web site. GBrowse is freely available under an open source license. The software, its documentation, and support are available at http://www.gmod.org.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: found
              Is Open Access

              Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data

              A large number of computational methods have been developed for analyzing differential gene expression in RNA-seq data. We describe a comprehensive evaluation of common methods using the SEQC benchmark dataset and ENCODE data. We consider a number of key features, including normalization, accuracy of differential expression detection and differential expression analysis when one condition has no detectable expression. We find significant differences among the methods, but note that array-based methods adapted to RNA-seq data perform comparably to methods designed for RNA-seq. Our results demonstrate that increasing the number of replicate samples significantly improves detection power over increased sequencing depth.
                Bookmark

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS One
                PLoS ONE
                plos
                plosone
                PLoS ONE
                Public Library of Science (San Francisco, CA USA )
                1932-6203
                26 October 2015
                2015
                : 10
                : 10
                : e0140829
                Affiliations
                [1 ]Victorian Life Sciences Computation Initiative (VLSCI), University of Melbourne, Melbourne, Victoria, Australia
                [2 ]Department of Biology, Johns Hopkins University, Baltimore, Maryland, United States of America
                [3 ]Centre for Computing and Informatics (CIR), Rudjer Boskovic Institute (RBI), Zagreb, Croatia
                [4 ]Research Computing Centre, University of Queensland, Brisbane, Queensland, Australia
                [5 ]Queensland Facility for Advanced Bioinformatics (QFAB), University of Queensland, Brisbane, Queensland, Australia
                CNRS UMR7622 & University Paris 6 Pierre-et-Marie-Curie, FRANCE
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: EA CS NG IM SG MP AL. Performed the experiments: EA CS NG IM DB MC SG YK MP RH AL. Analyzed the data: EA CS NG IM DB MC SG YK MP RH AL. Contributed reagents/materials/analysis tools: EA CS NG IM DB SG YK MP. Wrote the paper: EA CS NG AL.

                Article
                PONE-D-15-22212
                10.1371/journal.pone.0140829
                4621043
                26501966
                e10911a7-1246-442c-b0eb-dd2301c99beb
                Copyright @ 2015

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited

                History
                : 21 May 2015
                : 29 September 2015
                Page count
                Figures: 5, Tables: 2, Pages: 20
                Funding
                This work was supported by a grant from The National eResearch Collaboration Tools and Resources project (NeCTAR; http://nectar.org.au). NeCTAR is an Australian Government Super Science project, financed by the Education Investment Fund. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript..
                Categories
                Research Article
                Custom metadata
                All relevant data are within the paper.

                Uncategorized
                Uncategorized

                Comments

                Comment on this article