      Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

      Genome Biology

      BioMed Central


          Abstract

          In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html.
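To make the idea of shrinkage concrete, here is a minimal Python sketch. It is not the DESeq2 implementation (DESeq2 is an R package and fits a negative binomial generalized linear model). It only illustrates two ingredients under simplifying assumptions: the negative binomial variance with dispersion alpha, Var = mu + alpha * mu^2, and a normal-normal approximation in which a noisy log fold change estimate is pulled toward zero by a zero-centered prior. The function names, the prior standard deviation, and the example numbers are hypothetical.

```python
def nb_variance(mean, dispersion):
    """Variance of a negative binomial count with the given mean and dispersion."""
    return mean + dispersion * mean ** 2


def shrink_lfc(lfc_hat, se, prior_sd):
    """Posterior mean of a log fold change under a zero-centered normal prior.

    Assumes lfc_hat ~ Normal(true_lfc, se^2) and true_lfc ~ Normal(0, prior_sd^2).
    This is a simplification chosen for illustration, not DESeq2's estimator.
    """
    weight = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return weight * lfc_hat


if __name__ == "__main__":
    # A gene with mean count 100 and dispersion 0.1 is far noisier than Poisson.
    print(nb_variance(100, 0.1))
    # The same raw estimate is shrunk strongly when its standard error is large
    # (few reads, high dispersion) and barely at all when it is precise.
    print(shrink_lfc(2.0, se=1.0, prior_sd=1.0))
    print(shrink_lfc(2.0, se=0.1, prior_sd=1.0))
```

In this sketch, the amount of shrinkage depends only on the ratio of the estimate's standard error to the prior width, so imprecise estimates are moderated most. This matches the emphasis in the abstract on the strength of differential expression rather than its mere presence.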

          Electronic supplementary material

          The online version of this article (doi:10.1186/s13059-014-0550-8) contains supplementary material, which is available to authorized users.

          Most cited references 36

          featureCounts: An efficient general-purpose program for assigning sequence reads to genomic features

          (2013)
          Next-generation sequencing technologies generate millions of short sequence reads, which are usually aligned to a reference genome. In many applications, the key information required for downstream analysis is the number of reads mapping to each genomic feature, for example to each exon or each gene. The process of counting reads is called read summarization. Read summarization is required for a great variety of genomic analyses but has so far received relatively little attention in the literature. We present featureCounts, a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. featureCounts implements highly efficient chromosome hashing and feature blocking techniques. It is considerably faster than existing methods (by an order of magnitude for gene-level summarization) and requires far less computer memory. It works with either single or paired-end reads and provides a wide range of options appropriate for different sequencing applications. featureCounts is available under GNU General Public License as part of the Subread (http://subread.sourceforge.net) or Rsubread (http://www.bioconductor.org) software packages.
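Read summarization as described above can be sketched in a few lines of Python. This is a toy illustration, not the featureCounts algorithm: it groups features by chromosome in a dictionary and checks interval overlap by brute force, whereas featureCounts relies on far more efficient hashing and blocking techniques. The tuple layouts, the inclusive coordinates, the rule of skipping reads that overlap more than one gene, and the function name are all assumptions made for this example.

```python
from collections import defaultdict


def count_reads(features, reads):
    """Count reads per gene.

    features: iterable of (chrom, start, end, gene), inclusive coordinates.
    reads:    iterable of (chrom, start, end), inclusive coordinates.
    A read is counted only if it overlaps exactly one gene (assumed policy).
    """
    # Group features by chromosome so each read is compared only with
    # features on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end, gene in features:
        by_chrom[chrom].append((start, end, gene))

    counts = {gene: 0 for _, _, _, gene in features}
    for chrom, r_start, r_end in reads:
        hits = {
            gene
            for f_start, f_end, gene in by_chrom.get(chrom, [])
            if r_start <= f_end and f_start <= r_end  # intervals overlap
        }
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return counts


if __name__ == "__main__":
    features = [
        ("chr1", 100, 200, "geneA"),
        ("chr1", 150, 300, "geneB"),
        ("chr2", 1, 50, "geneC"),
    ]
    reads = [
        ("chr1", 90, 110),   # overlaps geneA only
        ("chr1", 160, 180),  # overlaps geneA and geneB, so it is skipped
        ("chr1", 250, 260),  # overlaps geneB only
        ("chr2", 10, 20),    # overlaps geneC only
        ("chr3", 1, 10),     # no annotated feature on this chromosome
    ]
    print(count_reads(features, reads))
```

The resulting per-gene count table is the kind of input that count-based differential expression methods take.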

            Small-sample estimation of negative binomial dispersion, with applications to SAGE data.

            We derive a quantile-adjusted conditional maximum likelihood estimator for the dispersion parameter of the negative binomial distribution and compare its performance, in terms of bias, to various other methods. Our estimation scheme outperforms all other methods in very small samples, typical of those from serial analysis of gene expression studies, the motivating data for this study. The impact of dispersion estimation on hypothesis testing is studied. We derive an "exact" test that outperforms the standard approximate asymptotic tests.
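As a rough illustration of what a negative binomial dispersion estimate measures, the sketch below uses a simple method-of-moments estimator. It is not the quantile-adjusted conditional maximum likelihood estimator derived in the paper. It assumes the parametrization Var = mu + alpha * mu^2 and equal library sizes across replicates, and the function name and example counts are hypothetical.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate: (s^2 - m) / m^2, floored at 0.

    counts: replicate counts for one gene, assumed to share one library size.
    """
    if len(counts) < 2:
        raise ValueError("need at least two replicates")
    m = mean(counts)
    if m == 0:
        return 0.0  # no reads at all: dispersion is not identifiable, report 0
    s2 = variance(counts)  # unbiased sample variance
    # Variance at or below the mean gives no evidence of overdispersion.
    return max(0.0, (s2 - m) / m ** 2)


if __name__ == "__main__":
    print(moment_dispersion([10, 20, 30]))  # variance well above the mean
    print(moment_dispersion([10, 10, 10]))  # no extra-Poisson variation
```

With only two or three replicates, such per-gene estimates are highly variable, which is the small-sample problem that motivates the more careful estimators discussed in these references.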

              Moderated statistical tests for assessing differences in tag abundance.

              Digital gene expression (DGE) technologies measure gene expression by counting sequence tags. They are sensitive technologies for measuring gene expression on a genomic scale, without the need for prior knowledge of the genome sequence. As the cost of sequencing DNA decreases, the number of DGE datasets is expected to grow dramatically. Various tests of differential expression have been proposed for replicated DGE data using binomial, Poisson, negative binomial or pseudo-likelihood (PL) models for the counts, but none of these are usable when the number of replicates is very small. We develop tests using the negative binomial distribution to model overdispersion relative to the Poisson, and use conditional weighted likelihood to moderate the level of overdispersion across genes. Not only is our strategy applicable even with the smallest number of libraries, but it also proves to be more powerful than previous strategies when more libraries are available. The methodology is equally applicable to other counting technologies, such as proteomic spectral counts. An R package can be accessed from http://bioinf.wehi.edu.au/resources/

                Author and article information

                Contributors
                mlove@jimmy.harvard.edu
                whuber@embl.de
                sanders@fs.tum.de
                Journal
                Title: Genome Biology (Genome Biol)
                Publisher: BioMed Central (London)
                ISSN: 1465-6906, 1465-6914
                Publication date: 5 December 2014
                Volume: 15
                Issue: 12
                Affiliations
                Department of Biostatistics and Computational Biology, Dana Farber Cancer Institute and Department of Biostatistics, Harvard School of Public Health, 450 Brookline Avenue, Boston, MA 02215, USA
                Genome Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
                Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany
                Article
                Article number: 550
                DOI: 10.1186/s13059-014-0550-8
                PMC ID: 4302049
                PubMed ID: 25516281
                © Love et al.; licensee BioMed Central. 2014

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                Categories
                Method

                Genetics
