
      Bayesian model accounting for within-class biological variability in Serial Analysis of Gene Expression (SAGE)


          Abstract

          Background

          An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern", or Massively Parallel Signature Sequencing (MPSS) is to carry out statistical analyses that account for within-class variability, i.e., variability due to intrinsic biological differences among sampled individuals of the same class, and not only for variability due to technical sampling error.

          Results

          We introduce a Bayesian model that accounts for within-class variability by means of a mixture distribution. We show that two previously available approaches, aggregation in pools ("pseudo-libraries") and the Beta-Binomial model, are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags that are regarded as differentially expressed with high significance if the within-class variability is ignored, but that are clearly less significant once it is accounted for.
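
          The idea of a mixture distribution can be illustrated with a small numerical sketch. It is not the authors' implementation: the abstract does not specify the model, so the Beta mixing density, the parameter values, and all function names below are assumptions chosen for illustration. The sketch treats a tag count in one library as Binomial given an abundance p, and lets p vary between individuals according to a mixing density. With a Beta mixing density the mixture reproduces the Beta-Binomial; as the mixing density concentrates on a single value (one reading of what pooling into a pseudo-library assumes), it approaches the plain Binomial.

```python
import math


def log_binom_coeff(n, x):
    """Log of the binomial coefficient C(n, x)."""
    return math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p), with 0 < p < 1."""
    return math.exp(log_binom_coeff(n, x)
                    + x * math.log(p) + (n - x) * math.log1p(-p))


def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, with 0 < p < 1."""
    return math.exp((a - 1) * math.log(p) + (b - 1) * math.log1p(-p)
                    - log_beta(a, b))


def beta_binomial_pmf(x, n, a, b):
    """Closed-form Beta-Binomial probability of x tags among n."""
    return math.exp(log_binom_coeff(n, x)
                    + log_beta(x + a, n - x + b) - log_beta(a, b))


def mixture_pmf(x, n, mixing_pdf, steps=20000):
    """Integrate Binomial(x | n, p) * mixing_pdf(p) over p in (0, 1).

    Uses the midpoint rule, so p never hits 0 or 1 exactly.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        total += binomial_pmf(x, n, p) * mixing_pdf(p)
    return total * h


if __name__ == "__main__":
    # Hypothetical numbers: a tag seen 10 times in a library of 1000 tags,
    # with between-individual abundance following Beta(2, 198) (mean 0.01).
    x, n, a, b = 10, 1000, 2.0, 198.0

    numeric = mixture_pmf(x, n, lambda p: beta_pdf(p, a, b))
    closed = beta_binomial_pmf(x, n, a, b)
    pooled = binomial_pmf(x, n, a / (a + b))

    print("mixture (numerical integral):", numeric)
    print("Beta-Binomial (closed form): ", closed)
    print("Binomial, no within-class variability:", pooled)
```

          Under these assumed parameters the Beta-Binomial spreads its probability over a wider range of counts than the Binomial with the same mean, which is why ignoring within-class variability can overstate significance.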

          Conclusion

          Using available information about biological replicates, one can transform a list of candidate differentially expressed transcripts into a more reliable one. Our method is freely available, under the GNU GPL copyleft license, through a user-friendly web-based online tool or as R language scripts at the supplemental website.


                Author and article information

                Journal
                BMC Bioinformatics
                BioMed Central (London)
                ISSN: 1471-2105
                Published: 31 August 2004
                Volume: 5
                Article number: 119
                Affiliations
                [1] Statistics Department, Instituto de Matemática e Estatística – Universidade de São Paulo, Rua do Matão 1010, 05508-090 São Paulo, BRAZIL
                [2] BIOINFO-USP – Núcleo de Pesquisas em Bioinformática da Universidade de São Paulo, Rua do Matão 1010, 05508-090 São Paulo, BRAZIL
                [3] Ludwig Institute for Cancer Research – São Paulo Branch, Rua Prof. Antônio Prudente 109, 01519-010 São Paulo, BRAZIL
                [4] Hospital do Câncer A.C. Camargo, Rua Prof. Antônio Prudente 109, 01519-010 São Paulo, BRAZIL
                Article
                Publisher ID: 1471-2105-5-119
                DOI: 10.1186/1471-2105-5-119
                PMC ID: 517707
                PubMed ID: 15339345
                Copyright © 2004 Vêncio et al; licensee BioMed Central Ltd.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 14 May 2004
                Accepted: 31 August 2004
                Categories
                Methodology Article

                Bioinformatics & Computational biology
