      Detection of divergent genes in microbial aCGH experiments

      research-article

          Abstract

          Background

          Array-based comparative genome hybridization (aCGH) is a tool for rapid comparison of genomes from different bacterial strains. The purpose of such an analysis is to detect highly divergent or absent genes in a sample strain compared to an index strain. Development of methods for analyzing aCGH data has primarily focused on copy number aberrations in cancer research. In microbial aCGH analyses, genes are typically ranked by log-ratio and classified as divergent or present by choosing a cutoff log-ratio, either manually or from statistics calculated on the log-ratio distribution. Because experimental settings vary considerably, it is not possible to develop a classical discriminant or statistical learning approach.
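The cutoff-based classification described above can be illustrated with a minimal sketch. It assumes that log-ratios are computed as log(sample / index), so that low values suggest a divergent or absent gene. The function names, the example values and the "mean minus k standard deviations" rule are hypothetical illustrations, not taken from the article.

```python
from statistics import mean, stdev


def classify_by_cutoff(log_ratios, cutoff):
    """Call a gene 'divergent' when its log-ratio lies below the cutoff.

    Assumption: log-ratios are log(sample / index), so low values suggest a
    divergent or absent gene in the sample strain.
    """
    return ["divergent" if r < cutoff else "present" for r in log_ratios]


def cutoff_from_distribution(log_ratios, k=2.0):
    """Hypothetical data-driven cutoff: mean minus k standard deviations."""
    return mean(log_ratios) - k * stdev(log_ratios)


# Made-up log-ratios for eight genes; two of them are clearly low.
log_ratios = [0.1, -0.2, 0.05, -2.8, 0.0, -0.1, -3.1, 0.2]

manual_calls = classify_by_cutoff(log_ratios, cutoff=-1.0)
auto_cutoff = cutoff_from_distribution(log_ratios, k=1.0)
auto_calls = classify_by_cutoff(log_ratios, auto_cutoff)
```

Both variants depend on a single threshold, which is why the result is sensitive to how the log-ratio distribution shifts between experiments.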

          Methods

          We introduce a more efficient method for analyzing microbial aCGH data using a finite mixture model and a data rotation scheme. Using the average of the posterior probabilities from the model fitted to the log-ratios before and after rotation, we obtain a score for each gene, and demonstrate its advantages for ranking and detecting divergent genes with improved specificity and sensitivity.
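As a rough illustration of scoring genes by posterior probability, the sketch below fits a two-component Gaussian mixture to log-ratios with a plain EM loop and returns, for each gene, the posterior probability of the low ("divergent") component. The choice of two Gaussian components, the initialisation and all function names are assumptions made for this example. The abstract does not specify the rotation scheme, so it is not implemented here; `average_posteriors` only shows how two score vectors would be averaged.

```python
import math
from statistics import median, pstdev


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posterior_divergent(x, weight, mu, sigma):
    """Posterior probability that each value belongs to component 0."""
    out = []
    for v in x:
        w0 = weight[0] * normal_pdf(v, mu[0], sigma[0])
        w1 = weight[1] * normal_pdf(v, mu[1], sigma[1])
        total = w0 + w1
        out.append(w0 / total if total > 0 else 0.5)
    return out


def fit_two_component_mixture(x, n_iter=200, min_sigma=1e-3):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Component 0 starts at the smallest log-ratio and is treated as the
    'divergent' component (an assumption of this sketch).
    Returns (weights, means, sigmas).
    """
    mu = [min(x), median(x)]
    s = max(pstdev(x), min_sigma)
    sigma = [s, s]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibilities of component 0 for every value.
        resp = posterior_divergent(x, weight, mu, sigma)
        n0 = max(sum(resp), 1e-9)
        n1 = max(len(x) - sum(resp), 1e-9)
        # M-step: update means, standard deviations and mixing weights.
        mu = [sum(r * v for r, v in zip(resp, x)) / n0,
              sum((1 - r) * v for r, v in zip(resp, x)) / n1]
        var0 = sum(r * (v - mu[0]) ** 2 for r, v in zip(resp, x)) / n0
        var1 = sum((1 - r) * (v - mu[1]) ** 2 for r, v in zip(resp, x)) / n1
        sigma = [max(math.sqrt(var0), min_sigma),
                 max(math.sqrt(var1), min_sigma)]
        weight = [n0 / len(x), n1 / len(x)]
    return weight, mu, sigma


def average_posteriors(before, after):
    """Average two per-gene posterior vectors (e.g. before/after rotation)."""
    return [(a + b) / 2.0 for a, b in zip(before, after)]


# Made-up log-ratios: five clearly low values, ten values near zero.
log_ratios = [-3.2, -3.0, -2.9, -3.1, -2.8, -0.2, -0.1, 0.0, 0.1, 0.2,
              0.05, -0.05, 0.15, -0.15, 0.3]
weight, mu, sigma = fit_two_component_mixture(log_ratios)
scores = posterior_divergent(log_ratios, weight, mu, sigma)
```

Unlike a hard cutoff, the resulting score is continuous, so genes can be ranked by it before any threshold is chosen.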

          Results

          The procedure is tested and compared to other approaches on simulated data sets, as well as on four experimental validation data sets for aCGH analysis on fully sequenced strains of Staphylococcus aureus and Streptococcus pneumoniae.

          Conclusion

          When tested on simulated data as well as on four different experimental validation data sets from experiments with only fully sequenced strains, our procedure outperforms the standard procedure of applying a simple log-ratio cutoff to classify genes as present or divergent.

                Author and article information

                Journal
                BMC Bioinformatics
                BioMed Central (London)
                1471-2105
                2006
                30 March 2006
                Volume: 7
                Article number: 181
                Affiliations
                [1] Biostatistics, Department of Chemistry, Biotechnology and Food Sciences, Norwegian University of Life Sciences, N-1432 Ås, Norway
                [2] Department of Biology and Biochemistry/Bioinformatics, University of Potsdam, Germany
                [3] Microbial Gene Technology, Department of Chemistry, Biotechnology and Food Sciences, Norwegian University of Life Sciences, Ås, Norway
                [4] Institute of Medical Biometry and Statistics, University at Lübeck, Germany
                Article
                1471-2105-7-181
                10.1186/1471-2105-7-181
                1563484
                16573812
                f81bd412-ff46-4400-9933-8c0d886573c6
                Copyright © 2006 Snipen et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 12 October 2005
                : 30 March 2006
                Categories
                Research Article

                Bioinformatics & Computational biology