      Is Open Access

      edgeR: a Bioconductor package for differential expression analysis of digital gene expression data

      Bioinformatics
      Oxford University Press (OUP)

          Abstract

          Summary: It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data.

          Availability: The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).

          Contact: mrobinson@wehi.edu.au
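
To make the phrase "overdispersed Poisson model" concrete, the sketch below assumes the negative binomial parameterisation with mean mu and dispersion phi, under which the variance is mu + phi * mu^2 rather than the Poisson value mu. The function names and the example values of mu and phi are illustrative only and are not taken from the package.

```python
import math


def nb_log_pmf(y, mu, phi):
    """Log-probability of count y under a negative binomial distribution
    with mean mu and dispersion phi (assumed parameterisation)."""
    r = 1.0 / phi  # "size" parameter of the negative binomial
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def nb_moments(mu, phi, y_max=2000):
    """Mean and variance obtained by summing the pmf over 0..y_max."""
    probs = [math.exp(nb_log_pmf(y, mu, phi)) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


# Illustrative values: a Poisson model would give variance 50 here,
# whereas the overdispersed model gives mu + phi * mu**2.
mean, var = nb_moments(mu=50.0, phi=0.1)
print(mean, var)
```

As phi approaches zero the variance approaches the mean, so the Poisson model is recovered as a limiting case.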

          Most cited references


          Small-sample estimation of negative binomial dispersion, with applications to SAGE data.

          We derive a quantile-adjusted conditional maximum likelihood estimator for the dispersion parameter of the negative binomial distribution and compare its performance, in terms of bias, to various other methods. Our estimation scheme outperforms all other methods in very small samples, typical of those from serial analysis of gene expression studies, the motivating data for this study. The impact of dispersion estimation on hypothesis testing is studied. We derive an "exact" test that outperforms the standard approximate asymptotic tests.
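
The idea of an "exact" test for two groups of negative binomial counts can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes equal library sizes, a known dispersion phi, and a p-value defined as the total probability of all splits of the combined count that are no more likely than the observed one. Under those assumptions the distribution of one group's sum, given the overall total, does not depend on the mean. The function names are hypothetical.

```python
import math


def log_nb_coef(a, size):
    """log of Gamma(a + size) / (Gamma(size) * a!)."""
    return math.lgamma(a + size) - math.lgamma(size) - math.lgamma(a + 1)


def exact_test_pvalue(sum1, sum2, n1, n2, phi):
    """Two-sided conditional test for equal means in two groups.

    sum1, sum2: total counts in each group; n1, n2: replicates per group;
    phi: dispersion, treated as known (an assumption of this sketch).
    """
    r = 1.0 / phi
    total = sum1 + sum2
    # Unnormalised log-probability of each possible split (a, total - a).
    logw = [log_nb_coef(a, n1 * r) + log_nb_coef(total - a, n2 * r)
            for a in range(total + 1)]
    top = max(logw)
    weights = [math.exp(v - top) for v in logw]
    norm = sum(weights)
    probs = [w / norm for w in weights]
    p_obs = probs[sum1]
    # Sum the probabilities of outcomes no more likely than the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


p = exact_test_pvalue(sum1=3, sum2=27, n1=2, n2=2, phi=0.1)
print(p)
```

Because the calculation enumerates every split of the total, it avoids the large-sample approximations that asymptotic tests rely on, which matters when there are only a few replicates.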

            Moderated statistical tests for assessing differences in tag abundance.

            Digital gene expression (DGE) technologies measure gene expression by counting sequence tags. They are sensitive technologies for measuring gene expression on a genomic scale, without the need for prior knowledge of the genome sequence. As the cost of sequencing DNA decreases, the number of DGE datasets is expected to grow dramatically. Various tests of differential expression have been proposed for replicated DGE data using binomial, Poisson, negative binomial or pseudo-likelihood (PL) models for the counts, but none of these are usable when the number of replicates is very small. We develop tests using the negative binomial distribution to model overdispersion relative to the Poisson, and use conditional weighted likelihood to moderate the level of overdispersion across genes. Not only is our strategy applicable even with the smallest number of libraries, but it also proves to be more powerful than previous strategies when more libraries are available. The methodology is equally applicable to other counting technologies, such as proteomic spectral counts. An R package can be accessed from http://bioinf.wehi.edu.au/resources/
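
A rough sketch of how a weighted likelihood can moderate dispersion estimates: each gene's dispersion maximises its own conditional log-likelihood plus a weight alpha times the log-likelihood summed over all genes. The sketch assumes equal library sizes and a single group of replicates per gene, and it uses a simple grid search. The weight alpha, the grid, the toy counts, and the function names are all illustrative assumptions rather than details from the paper.

```python
import math


def cond_loglik(counts, phi):
    """Conditional log-likelihood of dispersion phi for one gene, given the
    sum of its counts (equal library sizes assumed; constants dropped)."""
    n, z, r = len(counts), sum(counts), 1.0 / phi
    return (sum(math.lgamma(y + r) for y in counts) + math.lgamma(n * r)
            - math.lgamma(z + n * r) - n * math.lgamma(r))


def moderated_dispersions(genes, alpha, grid):
    """Grid-search maximiser of l_g(phi) + alpha * l_common(phi) per gene."""
    # Common log-likelihood: sum of the per-gene conditional log-likelihoods.
    common = [sum(cond_loglik(g, phi) for g in genes) for phi in grid]
    estimates = []
    for g in genes:
        weighted = [cond_loglik(g, phi) + alpha * c
                    for phi, c in zip(grid, common)]
        estimates.append(grid[weighted.index(max(weighted))])
    return estimates


# Toy data: four genes with four replicate counts each (made up).
genes = [[10, 12, 9, 11], [5, 50, 20, 1], [100, 90, 120, 80], [30, 2, 60, 15]]
grid = [10 ** (-3 + 0.05 * k) for k in range(81)]  # 0.001 to 10, log-spaced

genewise = moderated_dispersions(genes, alpha=0.0, grid=grid)
moderated = moderated_dispersions(genes, alpha=1.0, grid=grid)
print(genewise)
print(moderated)
```

With alpha set to 0 each gene is estimated on its own, which is unstable with few replicates. Larger values of alpha pull every estimate toward the dispersion that best fits all genes together.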

              Comparative Analysis of Human Gut Microbiota by Barcoded Pyrosequencing

              Humans host complex microbial communities believed to contribute to health maintenance and, when in imbalance, to the development of diseases. Determining the microbial composition in patients and healthy controls may thus provide novel therapeutic targets. For this purpose, high-throughput, cost-effective methods for microbiota characterization are needed. We have employed 454-pyrosequencing of a hyper-variable region of the 16S rRNA gene in combination with sample-specific barcode sequences which enables parallel in-depth analysis of hundreds of samples with limited sample processing. In silico modeling demonstrated that the method correctly describes microbial communities down to phylotypes below the genus level. Here we applied the technique to analyze microbial communities in throat, stomach and fecal samples. Our results demonstrate the applicability of barcoded pyrosequencing as a high-throughput method for comparative microbial ecology.

                Author and article information

                Journal
                Bioinformatics
                Oxford University Press (OUP)
                1367-4803
                1460-2059
                December 22 2009
                January 01 2010
                November 11 2009
                January 01 2010
                Volume: 26
                Issue: 1
                Pages: 139-140
                Article
                10.1093/bioinformatics/btp616
                © 2010