
      Comparative analysis of microbiome measurement platforms using latent variable structural equation modeling


          Abstract

          Background

          Culture-independent phylogenetic analysis of 16S ribosomal RNA (rRNA) gene sequences has emerged as an incisive method of profiling the bacteria present in a specimen. Multiple techniques are currently available to enumerate the abundance of bacterial taxa in specimens, including Sanger sequencing, ‘next generation’ pyrosequencing, microarrays, quantitative PCR, and the rapidly emerging third and fourth generation sequencing methods. An efficient statistical tool is urgently needed for the following tasks: (1) to assess the agreement between these measurement platforms, (2) to select the most reliable platform(s), and (3) to combine platforms of complementary strengths for a unified analysis.

          Results

          We present latent variable structural equation modeling (SEM) as a novel statistical approach to the comparative analysis of measurement platforms. The latent variable SEM treats the true (unknown) relative frequency of a given bacterial taxon in a specimen as the latent (unobserved) variable. It estimates the reliability of each measurement platform and the similarities between platforms, and then weights the measurements optimally for a unified analysis of the microbiome composition. The latent variable SEM contains repeated measures ANOVA (both the univariate and the multivariate models) as special cases. As a more general and realistic modeling approach, it yields superior goodness-of-fit and more reliable analysis results, as demonstrated in a microbiome study of human inflammatory bowel diseases.
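
To make the modeling idea concrete, here is a minimal sketch; it is not the authors' implementation. It assumes a one-factor measurement model with three hypothetical platforms, y_j = lambda_j * eta + e_j, where eta is the latent true abundance (variance fixed at 1) and the errors e_j are mutually independent. With three platforms this model is just-identified, so the loadings can be recovered from the pairwise covariances by the method of moments, instead of the maximum-likelihood fitting an SEM package would normally perform. The function names, loadings, and error variances are illustrative assumptions, and the data are simulated.

```python
import random


def covariance(x, y):
    """Unbiased sample covariance of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def simulate(n, loadings, error_vars, seed=0):
    """Simulate n specimens measured on len(loadings) hypothetical platforms.

    Each measurement is loading * eta + noise, with eta ~ N(0, 1) standing in
    for the latent true abundance (on an arbitrary standardized scale).
    """
    rng = random.Random(seed)
    platforms = [[] for _ in loadings]
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)
        for j, (lam, theta) in enumerate(zip(loadings, error_vars)):
            platforms[j].append(lam * eta + rng.gauss(0.0, theta ** 0.5))
    return platforms


def fit_one_factor(y1, y2, y3):
    """Method-of-moments fit of a one-factor model with three indicators.

    Returns (reliabilities, weights). Reliability is the share of a platform's
    variance explained by the latent variable. Weights are proportional to
    loading / error variance (the weighting used by Bartlett factor scores),
    normalized here to sum to 1.
    """
    s12 = covariance(y1, y2)
    s13 = covariance(y1, y3)
    s23 = covariance(y2, y3)
    # Under the model, cov(y_j, y_k) = lambda_j * lambda_k for j != k.
    lam_sq = [s12 * s13 / s23, s12 * s23 / s13, s13 * s23 / s12]
    variances = [covariance(y, y) for y in (y1, y2, y3)]
    error_vars = [v - l for v, l in zip(variances, lam_sq)]
    if min(lam_sq) <= 0 or min(error_vars) <= 0:
        raise ValueError("inadmissible estimates; the one-factor model may not fit")
    reliabilities = [l / v for l, v in zip(lam_sq, variances)]
    raw = [l ** 0.5 / e for l, e in zip(lam_sq, error_vars)]
    total = sum(raw)
    weights = [r / total for r in raw]
    return reliabilities, weights


if __name__ == "__main__":
    # Illustrative values only: platform 1 is the most precise, platform 3 the least.
    data = simulate(20000, loadings=[1.0, 0.9, 0.8], error_vars=[0.2, 0.5, 1.0])
    reliabilities, weights = fit_one_factor(*data)
    print("reliabilities:", [round(r, 3) for r in reliabilities])
    print("weights:", [round(w, 3) for w in weights])
```

In this sketch the more reliable platform receives the larger weight in the combined estimate. Constraining all loadings to be equal and all error variances to be equal would give a restricted model of the kind the abstract describes as a repeated measures ANOVA special case.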

          Conclusions

          Given the rapid evolution of modern biotechnologies, the tasks of comparing, selecting, and combining measurement platforms are here to stay and will only grow in importance. The latent variable SEM method is readily applicable to other biological settings beyond the microbiome study presented here.


                Author and article information

                Journal: BMC Bioinformatics
                Publisher: BioMed Central
                ISSN: 1471-2105
                Published: 5 March 2013
                Volume: 14
                Article number: 79
                Affiliations
                [1] Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
                [2] Division of Infectious Diseases, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
                [3] Department of Medicine, Stony Brook University, Stony Brook, NY, USA
                [4] Department of Medicine, Washington University, St. Louis, MO, USA
                [5] Department of Pediatrics, University of North Carolina, Chapel Hill, NC, USA
                Article
                Publisher ID: 1471-2105-14-79
                DOI: 10.1186/1471-2105-14-79
                PMC ID: 3608994
                PMID: 23497007
                23ba3906-fe7f-4a82-9769-6a056c5afb80
                Copyright ©2013 Wu et al.; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 10 July 2012
                Accepted: 3 February 2013
                Categories
                Research Article

                Bioinformatics & Computational biology
                bioinformatics, latent variable structural equation modeling, measurement model, reliability, repeated measures ANOVA
