
      From Peer-Reviewed to Peer-Reproduced in Scholarly Publishing: The Complementary Roles of Data Models and Workflows in Bioinformatics

      research-article


          Abstract

          Motivation

          Reproducing the results from a scientific paper can be challenging due to the absence of data and the computational tools required for their analysis. In addition, details relating to the procedures used to obtain the published results can be difficult to discern due to the use of natural language when reporting how experiments have been performed. The Investigation/Study/Assay (ISA), Nanopublications (NP), and Research Objects (RO) models are conceptual data modelling frameworks that can structure such information from scientific papers. Computational workflow platforms can also be used to reproduce analyses of data in a principled manner. We assessed the extent to which ISA, NP, and RO models, together with the Galaxy workflow system, can capture the experimental processes and reproduce the findings of a previously published paper reporting on the development of SOAPdenovo2, a de novo genome assembler.

          Results

          Executable workflows were developed using Galaxy, which reproduced results that were consistent with the published findings. A structured representation of the information in the SOAPdenovo2 paper was produced by combining the use of ISA, NP, and RO models. By structuring the information in the published paper using these data and scientific workflow modelling frameworks, it was possible to explicitly declare elements of experimental design, variables, and findings. The models served as guides in the curation of scientific information and this led to the identification of inconsistencies in the original published paper, thereby allowing its authors to publish corrections in the form of an erratum.

          Availability

          SOAPdenovo2 scripts, data, and results are available through the GigaScience Database: http://dx.doi.org/10.5524/100044; the workflows are available from GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk; and the representations using the ISA, NP, and RO models are available through the SOAPdenovo2 case study website: http://isa-tools.github.io/soapdenovo2/. Contact: philippe.rocca-serra@oerc.ox.ac.uk and susanna-assunta.sansone@oerc.ox.ac.uk.


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 8 July 2015
                Volume 10, Issue 7, e0127612
                Affiliations
                [1] Oxford e-Research Centre, University of Oxford, 7 Keble Road, OX1 3QG, United Kingdom
                [2] GigaScience, BGI HK Research Institute, 16 Dai Fu Street, Tai Po Industrial Estate, Hong Kong, People’s Republic of China
                [3] InfoLab21, Lancaster University, Bailrigg, Lancaster, LA1 4WA, United Kingdom
                [4] Nuffield Department of Medicine, Experimental Medicine Division, John Radcliffe Hospital, Headley Way, Headington, Oxford, OX3 9DU, United Kingdom
                [5] Department of Human Genetics, Leiden University Medical Center, P.O. Box 9600, 2300 RC Leiden, The Netherlands
                [6] HKU-BGI Bioinformatics Algorithms and Core Technology Research Laboratory & Department of Computer Science, University of Hong Kong, Pokfulam, Hong Kong, People’s Republic of China
                [7] School of Biomedical Sciences and CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Shatin, Hong Kong, People’s Republic of China
                University of Illinois-Chicago, UNITED STATES
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: SAS AGB PRS JZ MR. Performed the experiments: AGB PRS PL. Analyzed the data: AGB PRS PL. Contributed reagents/materials/analysis tools: RL TWL TLL PL. Wrote the paper: PRS AGB SAS PL JZ MR SCE MSAG. Proposed the idea after an initial meeting with JZ, MR, AGB, and PRS: SAS. Selected the publication and worked with its authors (RL, TWL): SCE PL. Did ISA-Tab, linkedISA RDF, NPs representation and SPARQL queries over linkedISA and NPs: PRS AGB. Reviewed the NPs: MR MT. Submitted terms to OBI: PRS. Wrote linkedISA, NanoMaton software and prepared dedicated website and triple store: AGB. Re-implemented the published SOAPdenovo2 analyses as Galaxy workflows with help from SCE, RL, and TWL: PL TLL. Created the Research Object with input from MSAG and PRS: JZ. Wrote the manuscript first draft: PRS. Contributed to the final version, read it, and approved it: PRS AGB PL JZ MSAG MR MT EH RK RL TLL TWL SCE SAS. Contributed to the review of the nanopublications produced by PRS and AGB: EH RK.

                Article
                Manuscript: PONE-D-14-54465
                DOI: 10.1371/journal.pone.0127612
                PMCID: PMC4495984
                PMID: 26154165
                Copyright © 2015

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 4 December 2014
                Accepted: 16 April 2015
                Page count
                Figures: 3, Tables: 2, Pages: 20
                Funding
                SAS, PRS, and AGB received funding from the European Union Coordination of Standards in Metabolomics (COSMOS) FP7 E9RXDC00, the British Biotechnology and Biological Sciences Research Council BB/L024101/1, BB/I025840/1, and the University of Oxford e-Research Centre. The work done by PL, SCE, and TLL on GigaGalaxy and the implementation of the SOAPdenovo2 workflows were supported by funding from the joint Chinese University of Hong Kong (CUHK)/Beijing Genomics Institute (BGI) Innovation Institute of Trans-omics and School of Biomedical Sciences, The Chinese University of Hong Kong (CUHK) and the China National GeneBank (CNGB). MSAG and JZ are supported by the European Union Workflow4ever project (EU Wf4Ever STREP, 270129), funded under European Union Framework Program 7 (EU-FP7 ICT-2009.4.1). MR, MT, EH, and RK are supported by the European Union Workflow4ever project (EU Wf4Ever STREP, 270129) funded under European Union Framework Program 7 (EU-FP7 ICT-2009.4.1), the Innovative Medicines Initiative Joint Undertaking (IMI-JU) project Open PHACTS (grant agreement no. 115191), and the European Union RD-Connect (EU FP7/2007-2013, grant agreement no. 305444). SOAPdenovo2 was developed with the support of the State Key Development Program for Basic Research of China-973 Program (2011CB809203); National High Technology Research and Development Program of China-863 program (2012AA02A201); the National Natural Science Foundation of China (90612019); the Shenzhen Key Laboratory of Trans-omics Biotechnologies (CXB201108250096A); and the Shenzhen Municipal Government of China (JC201005260191A and CXB201108250096A). Tak-Wah Lam was partially supported by RGC General Research Fund 10612042.
                Categories
                Research Article
                Custom metadata
                All data are available from GitHub (http://isa-tools.github.io/soapdenovo2/), the GitHub code repository (https://github.com/ISA-tools/soapdenovo2), GitHub/Zenodo (http://dx.doi.org/10.5281/zenodo.18403), and GigaScience’s GigaDB (http://dx.doi.org/10.5524/100148).
