Better reporting for better research: a checklist for reproducibility

Abstract

How easy is it to reproduce or replicate the findings of a published paper? In 2013 one researcher, Phil Bourne, asked just this: how easy would it be to reproduce the results of a computational biology paper? [1] The answer: 280 h. Such a number is surprising, given the theoretical reproducibility of computational research and given that Bourne was attempting to reproduce work done in his own lab. Now at the National Institutes of Health (NIH) as Associate Director of Data Sciences, Bourne is concerned with the reproducibility of all NIH-funded work, not just his own, and the problem is large. In addition to work in computational biology (which theoretically should be more easily reproducible than “wet lab” work), hallmark papers in fields from cancer to psychology have been flagged as largely unreproducible [2, 3]. Closer to home, GigaScience has carried out similar work to quantify reproducibility in its content. Although the paper examined had been scrutinized and tested by seven referees, it still took about half a man-month’s worth of resources to reproduce the results reported in just one of its tables [4].

“Reproducibility” is now increasingly on the radar of funders and is making the rounds in the wider media as well, with concerns about reproducibility making headlines in The Economist [5] and The New York Times [6], amongst other outlets.

Why is this important? It is critical to note that irreproducible work doesn’t necessarily mean that fraud occurred, nor even that the findings are incorrect; likewise, reproducible research can still be incorrect. While this key point is well understood by most scientists, it is not always easy to explain to the general public. However, as most research is paid for by taxpayers, public trust in research is essential. We (researchers, funders, and publishers) must do a better job of communicating this message to the public. We must better explain that science is an activity that continually builds on and verifies itself.
But we also must develop policies that better support this process: policies, for example, that promote transparency and allow for improved verification of research. Clearly important for clinical research, verification is equally important for preclinical research, something in which we all have an equal stake. No one can innovate new drugs overnight, no matter how rich they are, no matter which doctor they see. Better, more robust preclinical research benefits us all (endnote a). Our ability to rely on published data for potential therapeutics is critical, and recently the reliability of those data has been called into question [7]. One well-publicised example was brought to light in an oncology study of preclinical research findings in which researchers were able to confirm only 11% of the findings [8, 9]. Although the relevance of more robust research is clear in the area of oncology, it is also important for more exploratory research that might never make it to the preclinical setting. Funding and time are both increasingly limited, and the waste generated by follow-up work based on irreproducible research is high. A recent study by Freedman et al. estimated this waste at approximately $28 billion a year for preclinical research in the United States alone [10].

Funder update

The NIH have recently taken bold steps to begin to tackle the need for better design, more appropriate analysis, and greater transparency in the conduct and reporting of research. In January 2014 the NIH announced they would fund more training for scientists in data management and restructure their grant review process to better value other research objects, such as data [11]. But it is peer review and the editorial policies and practices of journals that have come under the greatest scrutiny, and in June 2014 a set of guidelines for reporting preclinical research was proposed by the NIH to meet the perceived need for more stringent standards [12].
These guidelines ask journals to ensure, for example, that authors have included a minimum set of information on study design, that statistical checks have been carried out by reviewers, and that authors have given enough information to enable animal strains, cell lines, reagents, and so on, to be uniquely identified (for a full list of requirements, see the NIH Principles and Guidelines for Reporting Preclinical Research).

BioMed Central author and reviewer checklist

Journals clearly have an important part to play in helping to ensure, as far as possible, that experimental design and analysis are appropriate and that reporting standards are met. This month BioMed Central will launch a trial checklist for authors and referees with these explicit aims. BioMed Central has long supported transparency in reporting for both biology and medicine, even working with Editorial Board Members on developing and endorsing standards such as MIQE précis [13] and the EQUATOR Network guidelines, such as PRISMA [14]. The trial checklist builds on these accepted standards and the principles behind them, formalising, tailoring and standardising these efforts across journals. The checklist addresses three areas of reporting: experimental design and statistics, resources, and availability of data and materials [15].

Some of the NIH Guidelines were straightforward to implement, given that they reflected policies long in place at BioMed Central. However, we used the new guidelines as an opportunity to integrate both them and our long-standing policies into our internal systems and workflows, giving them the best chance of being adhered to by authors and reviewers. Authors will be asked on submission to confirm that they have included the information asked for in the checklist, or to give reasons for any instances where it is not made available or not applicable (endnote b).
Likewise, reviewers will be asked to confirm that the information has been satisfactorily reported and reviewed. This also has the aim of making editors’ jobs more straightforward. With a clear and simple checklist of what information to include in the manuscript, less time should be spent liaising with authors. Plans are also in place to integrate our new checklist into BioMed Central Roadshows and Author Workshops (http://roadshow.biomedcentral.com/), helping to ensure researchers are made aware of the reporting standards before publication.

BioMed Central is not the first to implement reporting guidelines, with the Center for Open Science [16] (endnote c) and our colleagues at Nature [17] also recently announcing similar initiatives. Implementing reporting guidelines, whether through a checklist or another means, is not simple. Exploratory research that does not have the immediate practical implications of preclinical research often does not easily adhere to the criteria of reproducibility. For this reason we are implementing the checklist first as a trial, for which we will collect feedback and monitor its success. In the first instance, the checklist will be rolled out across a small group of selected journals: BMC Biology, BMC Neuroscience, Genome Biology, and GigaScience. In 6 months’ time, we plan to review the data we have collected around this trial, checking whether reporting has increased and collating author, editor, and reviewer feedback on the trial, with the aim of rolling out the checklist (with any revisions) across all BioMed Central journals. We have designed the checklist to act as an aid to authors, editors, and reviewers rather than a burden on submission, and we look forward to hearing your thoughts as the trial progresses.

Endnotes

a. For further discussion of this in the context of clinical trial transparency and reliability, see Ben Goldacre’s Bad Pharma.
b. To better support our authors in adhering to this checklist, we have also recently revised our section on data availability, detailing where authors can deposit their data and how to cite their data in their manuscript. We also have in-house staff available to work with authors to find a home for their data. http://www.biomedcentral.com/about/editorialpolicies#DataandMaterialRelease.
c. The Center for Open Science, together with stakeholders from research, has recently devised an easy-to-use set of guidelines based on eight standards and three levels of adherence. With this checklist, all journals will adhere to level 2 requirements. At present, all BioMed Central journals adhere to level 1 requirements. http://www.sciencemag.org/content/348/6242/1422.figures-only.

Most cited references

MIQE précis: Practical implementation of minimum standard guidelines for fluorescence-based quantitative real-time PCR experiments

Stephen A. Bustin, Jean-François Beaulieu, Jim Huggett … (2010)

The conclusions of thousands of peer-reviewed publications rely on data obtained using fluorescence-based quantitative real-time PCR technology. However, the inadequate reporting of experimental detail, combined with the frequent use of flawed protocols is leading to the publication of papers that may not be technically appropriate. We take the view that this problem requires the delineation of a more transparent and comprehensive reporting policy from scientific journals. This editorial aims to provide practical guidance for the incorporation of absolute minimum standards encompassing the key assay parameters for accurate design, documentation and reporting of qPCR experiments (MIQE précis) and guidance on the publication of pure 'reference gene' articles.

Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome

Daniel Garijo, Sarah Kinnings, Li Xie … (2013)

How easy is it to reproduce the results found in a typical computational biology paper? Either through experience or intuition the reader will already know that the answer is with difficulty or not at all. In this paper we attempt to quantify this difficulty by reproducing a previously published paper for different classes of users (ranging from users with little expertise to domain experts) and suggest ways in which the situation might be improved. Quantification is achieved by estimating the time required to reproduce each of the steps in the method described in the original paper and make them part of an explicit workflow that reproduces the original results. Reproducing the method took several months of effort, and required using new versions and new software that posed challenges to reconstructing and validating the results. The quantification leads to “reproducibility maps” that reveal that novice researchers would only be able to reproduce a few of the steps in the method, and that only expert researchers with advance knowledge of the domain would be able to reproduce the method in its entirety. The workflow itself is published as an online resource together with supporting software and data. The paper concludes with a brief discussion of the complexities of requiring reproducibility in terms of cost versus benefit, and a desiderata with our observations and guidelines for improving reproducibility. This has implications not only in reproducing the work of others from published papers, but reproducing work from one’s own laboratory.

From Peer-Reviewed to Peer-Reproduced in Scholarly Publishing: The Complementary Roles of Data Models and Workflows in Bioinformatics

Alejandra González-Beltrán, Peter Li, Jun Zhao … (2015)

Motivation: Reproducing the results from a scientific paper can be challenging due to the absence of data and the computational tools required for their analysis. In addition, details relating to the procedures used to obtain the published results can be difficult to discern due to the use of natural language when reporting how experiments have been performed. The Investigation/Study/Assay (ISA), Nanopublications (NP), and Research Objects (RO) models are conceptual data modelling frameworks that can structure such information from scientific papers. Computational workflow platforms can also be used to reproduce analyses of data in a principled manner. We assessed the extent by which ISA, NP, and RO models, together with the Galaxy workflow system, can capture the experimental processes and reproduce the findings of a previously published paper reporting on the development of SOAPdenovo2, a de novo genome assembler.

Results: Executable workflows were developed using Galaxy, which reproduced results that were consistent with the published findings. A structured representation of the information in the SOAPdenovo2 paper was produced by combining the use of ISA, NP, and RO models. By structuring the information in the published paper using these data and scientific workflow modelling frameworks, it was possible to explicitly declare elements of experimental design, variables, and findings. The models served as guides in the curation of scientific information and this led to the identification of inconsistencies in the original published paper, thereby allowing its authors to publish corrections in the form of an errata.

Availability: SOAPdenovo2 scripts, data, and results are available through the GigaScience Database: http://dx.doi.org/10.5524/100044; the workflows are available from GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk; and the representations using the ISA, NP, and RO models are available through the SOAPdenovo2 case study website http://isa-tools.github.io/soapdenovo2/.

Contact: philippe.rocca-serra@oerc.ox.ac.uk and susanna-assunta.sansone@oerc.ox.ac.uk.

Author and article information

Contributors

Amye Kenall: amye.kenall@biomedcentral.com

Scott Edmunds: scott@gigasciencejournal.com

Laurie Goodman: laurie@gigasciencejournal.com

Liz Bal: liz.bal@biomedcentral.com

Louisa Flintoft: Louisa.Flintoft@biomedcentral.com

Daniel R Shanahan: Daniel.Shanahan@biomedcentral.com

Tim Shipley: Tim.Shipley@biomedcentral.com

Journal

Journal ID (nlm-ta): BMC Neurosci

Journal ID (iso-abbrev): BMC Neurosci

Title: BMC Neuroscience

Publisher: BioMed Central (London)

ISSN (Electronic): 1471-2202

Publication date (Electronic): 23 July 2015

Publication date PMC-release: 23 July 2015

Publication date Collection: 2015

Volume: 16

Electronic Location Identifier: 44

Affiliations

BioMed Central, London, UK

BGI, Hong Kong, Hong Kong

Genome Biology, BioMed Central, London, UK

BMC Neuroscience, BioMed Central, London, UK

Article

Publisher ID: 177

DOI: 10.1186/s12868-015-0177-z

PMC ID: 4512017

SO-VID: 75b9dd54-1031-4623-9015-26352cc32cbb

License:

Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

History

Date received: 30 June 2015

Date accepted: 30 June 2015

Custom metadata

ScienceOpen disciplines: Neurosciences
