      A scaling normalization method for differential expression analysis of RNA-seq data

      research-article
      Genome Biology
      BioMed Central

          Abstract

          A novel and empirical method for normalization of RNA-seq data is presented

          The fine detail provided by sequencing-based transcriptome surveys suggests that RNA-seq is likely to become the platform of choice for interrogating steady state RNA. In order to discover biologically important changes in expression, we show that normalization continues to be an essential step in the analysis. We outline a simple and effective method for performing normalization and show dramatically improved results for inferring differential expression in simulated and publicly available data sets.

          Most cited references (12)

          Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation.

          Y. H. Yang (2002)
          There are many sources of systematic variation in cDNA microarray experiments which affect the measured gene expression levels (e.g. differences in labeling efficiency between the two fluorescent dyes). The term normalization refers to the process of removing such variation. A constant adjustment is often used to force the distribution of the intensity log ratios to have a median of zero for each slide. However, such global normalization approaches are not adequate in situations where dye biases can depend on spot overall intensity and/or spatial location within the array. This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments. The selection of appropriate controls for normalization is discussed and a novel set of controls (microarray sample pool, MSP) is introduced to aid in intensity-dependent normalization. Lastly, to allow for comparisons of expression levels across slides, a robust method based on maximum likelihood estimation is proposed to adjust for scale differences among slides.

            Small-sample estimation of negative binomial dispersion, with applications to SAGE data.

            We derive a quantile-adjusted conditional maximum likelihood estimator for the dispersion parameter of the negative binomial distribution and compare its performance, in terms of bias, to various other methods. Our estimation scheme outperforms all other methods in very small samples, typical of those from serial analysis of gene expression studies, the motivating data for this study. The impact of dispersion estimation on hypothesis testing is studied. We derive an "exact" test that outperforms the standard approximate asymptotic tests.

              Moderated statistical tests for assessing differences in tag abundance.

              Digital gene expression (DGE) technologies measure gene expression by counting sequence tags. They are sensitive technologies for measuring gene expression on a genomic scale, without the need for prior knowledge of the genome sequence. As the cost of sequencing DNA decreases, the number of DGE datasets is expected to grow dramatically. Various tests of differential expression have been proposed for replicated DGE data using binomial, Poisson, negative binomial or pseudo-likelihood (PL) models for the counts, but none of these are usable when the number of replicates is very small. We develop tests using the negative binomial distribution to model overdispersion relative to the Poisson, and use conditional weighted likelihood to moderate the level of overdispersion across genes. Not only is our strategy applicable even with the smallest number of libraries, but it also proves to be more powerful than previous strategies when more libraries are available. The methodology is equally applicable to other counting technologies, such as proteomic spectral counts. An R package can be accessed from http://bioinf.wehi.edu.au/resources/

                Author and article information

                Journal
                Abbreviated title: Genome Biol
                Title: Genome Biology
                Publisher: BioMed Central
                ISSN: 1465-6906, 1465-6914
                Year: 2010
                Published: 2 March 2010
                Volume: 11
                Issue: 3
                Article number: R25
                Affiliations
                [1] Bioinformatics Division, Walter and Eliza Hall Institute, 1G Royal Parade, Parkville 3052, Australia
                [2] Epigenetics Laboratory, Cancer Program, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, NSW 2010, Australia
                Article
                Publisher ID: gb-2010-11-3-r25
                DOI: 10.1186/gb-2010-11-3-r25
                PMC ID: 2864565
                PubMed ID: 20196867
                65042c84-27f2-41e6-8d94-efa011d40fb5
                Copyright ©2010 Robinson and Oshlack; licensee BioMed Central Ltd.

                This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 19 November 2009
                Accepted: 28 January 2010
                Published: 2 March 2010
                Categories
                Method

                Genetics
