
      SPEAQeasy: a scalable pipeline for expression analysis and quantification for R/bioconductor-powered RNA-seq analyses

      research-article


          Abstract

          Background

          RNA sequencing (RNA-seq) is a widely used biological assay, and the volume of data it generates keeps growing. In practice, a researcher must perform many individual steps before raw RNA-seq reads yield directly valuable information, such as differential gene expression data. Existing software tools are typically specialized and perform only one step of a larger workflow, such as alignment of reads to a reference genome. The demand for a more comprehensive and reproducible workflow has led to the production of a number of publicly available RNA-seq pipelines. However, we have found that most require computational expertise to set up or share among several users, are not actively maintained, or lack features we have found to be important in our own analyses.

          Results

          In response to these concerns, we have developed a Scalable Pipeline for Expression Analysis and Quantification (SPEAQeasy), which is easy to install and share, and provides a bridge towards R/Bioconductor downstream analysis solutions. SPEAQeasy is portable across computational frameworks (SGE, SLURM, local, Docker integration) and different configuration files are provided (http://research.libd.org/SPEAQeasy/).

          Conclusions

          SPEAQeasy is user-friendly and lowers the computational entry barrier to RNA-seq data processing for biologists and clinicians, because the main input file is a table with sample names and their corresponding FASTQ files. The goal is to provide a flexible pipeline that is immediately usable by researchers, regardless of their technical background or computing environment.
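
          To illustrate what a table of sample names and their FASTQ files might look like, the following is a minimal sketch that builds one from a list of file names. The column names (`sample`, `fastq_r1`, `fastq_r2`), the `_R1`/`_R2` naming convention, and both helper functions are assumptions made for this illustration; they are not SPEAQeasy's documented input format, which should be taken from the project documentation.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed (hypothetical) naming convention: <sample>_R1.fastq.gz, <sample>_R2.fastq.gz
FASTQ_PATTERN = re.compile(r"^(?P<sample>.+?)_R(?P<mate>[12])\.f(?:ast)?q(?:\.gz)?$")


def build_sample_table(fastq_names):
    """Group FASTQ file names by sample; return sorted (sample, r1, r2) rows."""
    by_sample = defaultdict(dict)
    for name in fastq_names:
        match = FASTQ_PATTERN.match(name)
        if match is None:
            continue  # ignore files that do not follow the assumed convention
        by_sample[match["sample"]][match["mate"]] = name

    rows = []
    for sample in sorted(by_sample):
        mates = by_sample[sample]
        # A missing mate is left empty (e.g. single-end data)
        rows.append((sample, mates.get("1", ""), mates.get("2", "")))
    return rows


def to_tsv(rows):
    """Render the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample", "fastq_r1", "fastq_r2"])  # hypothetical column names
    writer.writerows(rows)
    return buffer.getvalue()
```

          Calling `to_tsv(build_sample_table(names))` on a directory listing gives one row per sample, which is the kind of table a user would prepare by hand or by script before starting a run.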

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s12859-021-04142-3.


                Author and article information

                Contributors
                lcolladotor@gmail.com
                Journal
                BMC Bioinformatics
                BioMed Central (London )
                1471-2105
                1 May 2021
                Year: 2021
                Volume: 22
                Article number: 224
                Affiliations
                [1] GRID grid.429552.d, Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD 21205 USA
                [2] Winter Genomics, Salaverry 874 int 100, Lindavista, CDMX 07300 Mexico
                [3] QuestBridge Scholar, Palo Alto, CA 94303 USA
                [4] GRID grid.21107.35, ISNI 0000 0001 2171 9311, Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD 21205 USA
                [5] GRID grid.147455.6, ISNI 0000 0001 2097 0344, Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213 USA
                [6] GRID grid.21925.3d, ISNI 0000 0004 1936 9000, Medical Scientist Training Program, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213 USA
                [7] GRID grid.418275.d, ISNI 0000 0001 2165 8782, Instituto Politécnico Nacional, Escuela Nacional de Ciencias Biológicas, Mexico City, CDMX 11340 Mexico
                [8] GRID grid.452651.1, ISNI 0000 0004 0627 7633, Department of Supercomputing, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, CDMX 14610 Mexico
                [9] GRID grid.21107.35, ISNI 0000 0001 2171 9311, Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21205 USA
                [10] GRID grid.21107.35, ISNI 0000 0001 2171 9311, Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205 USA
                [11] GRID grid.21107.35, ISNI 0000 0001 2171 9311, Department of Genetic Medicine, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
                [12] GRID grid.21107.35, ISNI 0000 0001 2171 9311, Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD 21205 USA
                [13] GRID grid.21107.35, ISNI 0000 0001 2171 9311, Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205 USA
                Author information
                http://orcid.org/0000-0003-2140-308X
                Article
                Publisher ID: 4142
                DOI: 10.1186/s12859-021-04142-3
                PMCID: PMC8088074
                PMID: 33932985
                02df56ae-11eb-45d1-94c4-5155b1c884ad
                © The Author(s) 2021

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 29 January 2021
                Accepted: 21 April 2021
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/100000025, National Institute of Mental Health;
                Award ID: R21MH120497-0
                Categories
                Software

                Bioinformatics & Computational biology
                rna-seq,pipeline,bioconductor
