
      ShrinkBayes: a versatile R-package for analysis of count-based sequencing data in complex study designs

      product-review


          Abstract

          Background

          Complex designs are common in (observational) clinical studies. Sequencing data for such studies are produced increasingly often, which poses challenges for the analysis, such as an excess of zeros, the presence of random effects and multi-parameter inference. Moreover, when sample sizes are small, inference is likely to be too liberal if, in a Bayesian setting, an inappropriate prior is applied, or to lack power if information is not carefully borrowed across features.

          Results

          We show on microRNA sequencing data from a clinical cancer study how our software ShrinkBayes tackles the aforementioned challenges. In addition, we illustrate its comparatively good performance on multi-parameter inference for groups using a data-based simulation. Finally, in the small sample size setting, we demonstrate its high power and improved FDR estimation by use of Gaussian mixture priors that include a point mass.

          Conclusion

          ShrinkBayes is a versatile software package for the analysis of count-based sequencing data, which is particularly useful for studies with small sample sizes or complex designs.


                Author and article information

                Journal
                BMC Bioinformatics
                BioMed Central
                ISSN: 1471-2105
                Published: 26 April 2014
                Volume: 15
                Article number: 116
                Affiliations
                [1] Department of Epidemiology and Biostatistics, VU University medical center, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands
                [2] Department of Mathematics, VU University, De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
                [3] Department of Medical Oncology, VU University medical center, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands
                [4] Department of Pathology, VU University medical center, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands
                Article
                Article ID: 1471-2105-15-116
                DOI: 10.1186/1471-2105-15-116
                PMCID: 4098777
                PMID: 24766777
                Copyright © 2014 van de Wiel et al.; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 24 September 2013
                Accepted: 11 April 2014
                Categories
                Software

                Bioinformatics & Computational biology
                differential expression, shrinkage, sequencing, Bayesian analysis
