      RNA-Seq Alignment to Individualized Genomes Improves Transcript Abundance Estimates in Multiparent Populations

      research-article


          Abstract

          Massively parallel RNA sequencing (RNA-seq) has yielded a wealth of new insights into transcriptional regulation. A first step in the analysis of RNA-seq data is the alignment of short sequence reads to a common reference genome or transcriptome. Genetic variants that distinguish individual genomes from the reference sequence can cause reads to be misaligned, resulting in biased estimates of transcript abundance. Fine-tuning of read alignment algorithms does not correct this problem. We have developed Seqnature software to construct individualized diploid genomes and transcriptomes for multiparent populations and have implemented a complete analysis pipeline that incorporates other existing software tools. We demonstrate in simulated and real data sets that alignment to individualized transcriptomes increases read mapping accuracy, improves estimation of transcript abundance, and enables the direct estimation of allele-specific expression. Moreover, when applied to expression QTL mapping we find that our individualized alignment strategy corrects false-positive linkage signals and unmasks hidden associations. We recommend the use of individualized diploid genomes over reference sequence alignment for all applications of high-throughput sequencing technology in genetically diverse populations.
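The idea of an individualized diploid sequence can be illustrated with a minimal Python sketch. It is not Seqnature's interface or implementation: the function names `individualize` and `mismatches`, the toy reference, and the SNP table are all hypothetical, and only single-nucleotide substitutions are handled (no indels). It shows why a read carrying a non-reference allele matches an individualized haplotype better than the reference.

```python
from typing import Dict, Tuple


def individualize(reference: str, snps: Dict[int, Tuple[str, str]]) -> Tuple[str, str]:
    """Build two haplotype sequences by substituting SNP alleles into a reference.

    `snps` maps a 0-based position to (allele on haplotype A, allele on haplotype B).
    Hypothetical helper for illustration only; indels are not handled.
    """
    hap_a = list(reference)
    hap_b = list(reference)
    for pos, (allele_a, allele_b) in snps.items():
        if not 0 <= pos < len(reference):
            raise ValueError(f"SNP position {pos} is outside the reference")
        hap_a[pos] = allele_a
        hap_b[pos] = allele_b
    return "".join(hap_a), "".join(hap_b)


def mismatches(read: str, sequence: str, start: int) -> int:
    """Count mismatching bases when `read` is placed at `start` on `sequence`."""
    window = sequence[start:start + len(read)]
    return sum(1 for r, s in zip(read, window) if r != s)


# Toy data (assumed, not from the article).
reference = "ACGTACGTAC"
snps = {2: ("T", "G"), 7: ("T", "C")}  # position -> (haplotype A, haplotype B)
hap_a, hap_b = individualize(reference, snps)

# A read drawn from haplotype B that spans the variant at position 7.
read = hap_b[4:9]
against_reference = mismatches(read, reference, 4)
against_haplotype = mismatches(read, hap_b, 4)
```

In this toy case the read has one mismatch against the reference and none against its haplotype of origin. With real short reads and many nearby variants, such mismatches can push a read to the wrong location or leave it unaligned.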


                Author and article information

                Journal
                Genetics
                Genetics Society of America
                0016-6731
                1943-2631
                01 September 2014
                Volume: 198
                Issue: 1
                Pages: 59-73
                Affiliations
                The Jackson Laboratory, Bar Harbor, Maine 04609
                University of Wisconsin, Madison, Wisconsin 53705
                Trinity University, San Antonio, Texas 78212
                Author notes
                [1] Corresponding author: The Jackson Laboratory, 600 Main St., Bar Harbor, ME 04609. E-mail: gary.churchill@jax.org
                Author information
                http://orcid.org/0000-0002-8458-1871
                Article
                Article ID: 165886
                DOI: 10.1534/genetics.114.165886
                PMCID: PMC4174954
                PMID: 25236449
                Copyright © 2014 by the Genetics Society of America

                Available freely online through the author-supported open access option.

                History
                : 05 May 2014
                : 04 June 2014
                Page count
                Pages: 15
                Categories
                Multiparental Populations

                Genetics
                rna-seq,expression qtl,diversity outbred mice,diversity outbred (do),qtl mapping,haplotype reconstruction,high-density genotyping,mixed models,multiparent advanced generation inter-cross (magic),multiparental populations,mpp
