      SeqOthello: querying RNA-seq experiments at scale

      research-article


          Abstract

          We present SeqOthello, an ultra-fast and memory-efficient indexing structure that supports arbitrary sequence queries against large collections of RNA-seq experiments. SeqOthello needs only 5 min and 19.1 GB of memory to conduct a global survey of 11,658 fusion events against 10,113 TCGA Pan-Cancer RNA-seq datasets. The query recovers 92.7% of tier-1 fusions curated by the TCGA Fusion Gene Database and reveals 270 novel occurrences, all of which are tumor-specific. By providing a reference-free, alignment-free, and parameter-free sequence search system, SeqOthello will enable large-scale integrative studies using sequence-level data, an undertaking not previously practicable for many individual labs.
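To make the idea of an alignment-free sequence query concrete, here is a minimal toy sketch. It is not SeqOthello's data structure: it assumes, for illustration only, that a query is answered by looking up the query's k-mers in a plain Python dictionary that maps each k-mer to the experiments containing it. The k-mer length, function names, and sample data are all hypothetical.

```python
from collections import defaultdict

K = 5  # toy k-mer length (assumption; chosen only to keep the example small)


def kmers(seq, k=K):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(experiments, k=K):
    """Map each k-mer to the set of experiment names whose reads contain it."""
    index = defaultdict(set)
    for name, reads in experiments.items():
        for read in reads:
            for kmer in kmers(read, k):
                index[kmer].add(name)
    return index


def query(index, seq, k=K):
    """Return, per experiment, the fraction of the query's k-mers it contains."""
    query_kmers = kmers(seq, k)
    if not query_kmers:
        return {}  # query shorter than k: nothing to look up
    hits = defaultdict(int)
    for kmer in query_kmers:
        for name in index.get(kmer, ()):
            hits[name] += 1
    return {name: count / len(query_kmers) for name, count in hits.items()}


# Hypothetical miniature "experiments", each a list of reads.
experiments = {
    "sample_A": ["ACGTACGTGG", "TTGACCA"],
    "sample_B": ["GGGGGCCCCC"],
}
index = build_index(experiments)
result = query(index, "ACGTACGT")
```

No read is aligned to a reference here: the query is answered purely by set membership of short substrings. A dictionary of sets like this would not scale to thousands of datasets, which is the gap a compact index is meant to close.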

          Electronic supplementary material

          The online version of this article (10.1186/s13059-018-1535-9) contains supplementary material, which is available to authorized users.

                Author and article information

                Contributors
                ye.yu@uky.edu
                jinpeng.liu@uky.edu
                xinan.liu@uky.edu
                yi.zhang@uky.edu
                eamonn.magner5@uky.edu
                erik.lehnert@sbgenomics.com
                cqian12@ucsc.edu
                liuj@cs.uky.edu
                Journal
                Genome Biol
                Genome Biology
                BioMed Central (London)
                1474-7596
                1474-760X
                19 October 2018
                2018
                Volume: 19
                Article number: 167
                Affiliations
                [1] Department of Computer Science, University of Kentucky, 301 Rose St, Lexington, KY 40508, USA (ISNI 0000 0004 1936 8438, GRID grid.266539.d)
                [2] Seven Bridges Genomics Inc, 1 Main St, 5th Floor, Suite 500, Cambridge, MA 02142, USA (GRID grid.492568.4)
                [3] Department of Computer Engineering, University of California Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA (ISNI 0000 0001 0740 6917, GRID grid.205975.c)
                Article
                1535
                10.1186/s13059-018-1535-9
                6194578
                30340508
                641dce01-6a6c-48f0-aa2a-722bb3587a91
                © The Author(s). 2018

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 18 February 2018
                Accepted: 11 September 2018
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/100000145, Division of Information and Intelligent Systems
                Award ID: 1701681
                Funded by: National Science Foundation (US)
                Award ID: 1054631
                Funded by: FundRef http://dx.doi.org/10.13039/100000144, Division of Computer and Network Systems
                Award ID: 1717948
                Funded by: National Institutes of Health (US)
                Award ID: P30CA177558
                Funded by: FundRef http://dx.doi.org/10.13039/100000002, National Institutes of Health
                Award ID: 1UL1TR001998-01
                Categories
                Method
                Genetics
                rna-seq, tcga, gene fusion, pan-cancer, query, compression, othello, seqothello
