      DRIMSeq: a Dirichlet-multinomial framework for multivariate count outcomes in genomics

      methods-article


          Abstract

          There are many instances in genomics data analyses where measurements are made on a multivariate response. For example, alternative splicing can lead to multiple expressed isoforms from the same primary transcript. There are situations where differences (e.g. between normal and disease state) in the relative ratio of expressed isoforms may have significant phenotypic consequences or lead to prognostic capabilities. Similarly, knowledge of single nucleotide polymorphisms (SNPs) that affect splicing, so-called splicing quantitative trait loci (sQTL), will help to characterize the effects of genetic variation on gene expression. RNA sequencing (RNA-seq) has provided an attractive toolbox to carefully unravel alternative splicing outcomes, and fast and accurate methods for transcript quantification have recently become available. We propose a statistical framework based on the Dirichlet-multinomial distribution that uses these quantifications to discover changes in isoform usage between conditions and SNPs that affect the relative expression of transcripts. The Dirichlet-multinomial model naturally accounts for differential gene expression without losing information about overall gene abundance, and by modeling isoform expression jointly, it can account for the correlated nature of the isoforms. The main challenge in this approach is to get robust estimates of model parameters with limited numbers of replicates. We approach this by sharing information and show that our method improves on existing approaches in terms of standard statistical performance metrics. The framework is applicable to other multivariate scenarios, such as Poly-A-seq or cases where beta-binomial models have been applied (e.g., differential DNA methylation). Our method is available as a Bioconductor R package called DRIMSeq.
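To make the model named in the abstract concrete, here is a minimal Python sketch of the Dirichlet-multinomial log-likelihood for the transcript counts of one gene in one sample. The parameterization by isoform proportions plus a single precision value, and the function name `dm_loglik`, are assumptions made for illustration. This is not the DRIMSeq implementation, which is an R package, and it does not include the parameter estimation or information sharing described in the abstract.

```python
from math import exp, lgamma, log


def dm_loglik(counts, proportions, precision):
    """Dirichlet-multinomial log-likelihood for one gene in one sample.

    counts      -- transcript counts for the gene's isoforms (hypothetical input)
    proportions -- expected isoform proportions, summing to 1
    precision   -- sum of the Dirichlet parameters; smaller values mean
                   more variability between replicates (overdispersion)
    """
    n = sum(counts)
    # Dirichlet parameters implied by the proportions and the precision.
    alphas = [precision * p for p in proportions]
    # Multinomial coefficient: n! / prod(y_j!)
    ll = lgamma(n + 1) - sum(lgamma(y + 1) for y in counts)
    # Gamma(alpha_+) / Gamma(n + alpha_+)
    ll += lgamma(precision) - lgamma(n + precision)
    # prod_j Gamma(y_j + alpha_j) / Gamma(alpha_j)
    ll += sum(lgamma(y + a) - lgamma(a) for y, a in zip(counts, alphas))
    return ll


# Sanity check: the probabilities of all outcomes for 3 reads over
# 2 isoforms should sum to 1.
total = sum(exp(dm_loglik([y, 3 - y], [0.3, 0.7], 5.0)) for y in range(4))

# The same counts are scored differently under low and high precision,
# which is how the model expresses overdispersion relative to a multinomial.
ll_low = dm_loglik([80, 20], [0.5, 0.5], 2.0)
ll_high = dm_loglik([80, 20], [0.5, 0.5], 200.0)
```

Because the likelihood depends on the counts only through their split across isoforms given the gene total, a gene whose overall expression changes while its isoform proportions stay fixed is still described by the same `proportions`.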



                Author and article information

                Journal
                F1000Research (London, UK)
                2046-1402
                6 December 2016
                2016; 5: 1356
                Affiliations
                [1] Institute for Molecular Life Sciences, University of Zurich, Zurich, 8057, Switzerland
                [2] SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland
                [1] Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona, Spain
                University of Zurich, Switzerland
                [1] Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
                Author notes

                MN drafted the manuscript, designed the analyses, analyzed the data and implemented the DRIMSeq R package. MDR drafted the manuscript and designed the overall study. All authors read and approved the final manuscript and have agreed to the content.

                Competing interests: No competing interests were disclosed.


                Article
                10.12688/f1000research.8900.2
                5200948
                28105305
                43f0a90c-bf50-45f0-93aa-344a65399721
                Copyright: © 2016 Nowicka M and Robinson MD

                This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 1 December 2016
                Funding
                Funded by: Swiss National Science Foundation
                Award ID: 143883
                MN acknowledges funding from a Swiss Institute of Bioinformatics (SIB) Fellowship. MDR acknowledges funding from a Swiss National Science Foundation (SNSF) Project Grant (143883).
                Categories
                Method Article
                Articles
                Bioinformatics
                Genomics
                Protein Chemistry & Proteomics
                Theory & Simulation

                DRIMSeq, genomics, single nucleotide polymorphism, RNA-seq, splicing, statistical framework
