
      Improved analysis of bacterial CGH data beyond the log-ratio paradigm

      research-article


          Abstract

          Background

          Existing methods for analyzing bacterial CGH data from two-color arrays are based on log-ratios only, a paradigm inherited from expression studies. We propose an alternative approach, where microarray signals are used in a different way and sequence identity is predicted using a supervised learning approach.

          Results

          A data set containing 32 hybridizations of sequenced versus sequenced genomes has been used to test and compare methods. A ROC analysis has been performed to illustrate the ability to rank probes with respect to Present/Absent calls. Classification into Present and Absent is compared with that of a Gaussian mixture model.

          Conclusion

          The results indicate that our proposed method improves on existing methods with respect to ranking and classification of probes, especially for multi-genome arrays.
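
The ROC analysis mentioned in the Results measures how well a per-probe score ranks Present probes above Absent ones. The following is a minimal sketch of that idea, not the authors' implementation: it computes the area under the ROC curve as the fraction of (Present, Absent) probe pairs that the score orders correctly, which is the Mann-Whitney (Wilcoxon) formulation. The function name `auc_from_scores` and the toy scores and labels are assumptions made for illustration; they do not come from the article.

```python
def auc_from_scores(scores, is_present):
    """Area under the ROC curve for a per-probe score.

    Equals the probability that a randomly chosen Present probe gets a
    higher score than a randomly chosen Absent probe (ties count one half).
    """
    pos = [s for s, p in zip(scores, is_present) if p]
    neg = [s for s, p in zip(scores, is_present) if not p]
    if not pos or not neg:
        raise ValueError("need at least one Present and one Absent probe")

    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0   # pair ranked correctly
            elif sp == sn:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six probes (higher = more likely Present)
# and their true status from the sequenced genomes.
scores = [2.1, 1.7, 0.4, 1.2, -0.3, 0.4]
present = [True, True, True, False, False, False]

print(auc_from_scores(scores, present))  # 7.5 of 9 pairs, about 0.83
```

An AUC of 1.0 means every Present probe outranks every Absent probe, while 0.5 corresponds to a ranking no better than chance. The pairwise loop is quadratic in the number of probes; for full arrays a rank-based computation would be preferable.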



                Author and article information

                Journal
                BMC Bioinformatics
                BioMed Central
                1471-2105
                2009
                19 March 2009
                Volume: 10
                Article number: 91
                Affiliations
                [1 ]Biostatistics, Department of Chemistry, Biotechnology and Food Sciences, Norwegian University of Life Sciences, Ås, Norway
                [2 ]Laboratory of Microbial Gene Technology, Department of Chemistry, Biotechnology and Food Sciences, Norwegian University of Life Sciences, Ås, Norway
                Article
                Publisher ID: 1471-2105-10-91
                DOI: 10.1186/1471-2105-10-91
                PMCID: PMC2679023
                PMID: 19298668
                516d4159-784e-4611-bfb4-25a93362ae7c
                Copyright © 2009 Snipen et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 12 November 2008
                : 19 March 2009
                Categories
                Methodology Article

                Bioinformatics & Computational biology
