      Ten simple rules for making research software more robust

      editorial
      PLoS Computational Biology
      Public Library of Science

          Abstract

          Software produced for research, published and otherwise, suffers from a number of common problems that make it difficult or impossible to run outside the original institution or even off the primary developer’s computer. We present ten simple rules to make such software robust enough to be run by anyone, anywhere, and thereby delight your users and collaborators.

          Author summary

          Many researchers have found out the hard way that there’s a world of difference between “works for me on my machine” and “works for other people on theirs.” Many common challenges can be avoided by following a few simple rules; doing so not only improves reproducibility but can accelerate research.

                Author and article information

                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 13 April 2017
                Volume: 13
                Issue: 4
                Article number: e1005412
                Affiliations
                [1] Genome Sequence Informatics, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
                [2] Software Carpentry Foundation, Austin, Texas, United States of America
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                http://orcid.org/0000-0003-0677-6902
                http://orcid.org/0000-0001-8659-8979
                Article
                Manuscript number: PCOMPBIOL-D-16-01683
                DOI: 10.1371/journal.pcbi.1005412
                PMCID: 5390961
                PMID: 28407023
                © 2017 Taschuk, Wilson

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                Page count
                Figures: 0, Tables: 0, Pages: 10
                Funding
                This work was partially funded by the Ontario Institute for Cancer Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                • Editorial
                • Computer and Information Sciences / Computer Software
                • Computer and Information Sciences / Software Engineering / Software Development
                • Engineering and Technology / Software Engineering / Software Development
                • Computer and Information Sciences / Software Engineering / Software Tools
                • Engineering and Technology / Software Engineering / Software Tools
                • Computer and Information Sciences / Software Engineering
                • Engineering and Technology / Software Engineering
                • Research and Analysis Methods / Research Assessment / Reproducibility
                • Research and Analysis Methods / Database and Informatics Methods / Bioinformatics
                • Research and Analysis Methods / Database and Informatics Methods / Bioinformatics / Sequence Analysis / Sequence Alignment
                • Computer and Information Sciences / Operating Systems
                • Quantitative & Systems biology
