1
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: not found
      • Book Chapter: not found

      The Input Side of the Equation

      other
      Humana Press

      Read this book at

      Buy book Bookmark
          There is no author summary for this book yet. Authors can add summaries to their books on ScienceOpen to make them more accessible to a non-specialist audience.

          Related collections

          Most cited references2

          • Record: found
          • Abstract: found
          • Article: not found

          A space-efficient construction of the Burrows-Wheeler transform for genomic data.

          Algorithms for exact string matching have substantial application in computational biology. Time-efficient data structures which support a variety of exact string matching queries, such as the suffix tree and the suffix array, have been applied to such problems. As sequence databases grow, more space-efficient approaches to exact matching are becoming more important. One such data structure, the compressed suffix array (CSA), based on the Burrows-Wheeler transform, has been shown to require memory which is nearly equal to the memory requirements of the original database, while supporting common sorts of query problems time efficiently. However, building a CSA from a sequence in efficient space and time is challenging. In 2002, the first space-efficient CSA construction algorithm was presented. That implementation used (1+2 log2 |summation|)(1+epsilon) bits per character (where epsilon is a small fraction). The construction algorithm ran in as much as twice that space, in O(| summation|n log(n)) time. We have created an implementation which can also achieve these asymptotic bounds, but for small alphabets, and only uses 1/2 (1+|summation|)(1+epsilon) bits per character, a factor of 2 less space for nucleotide alphabets. We present time and space results for the CSA construction and querying of our implementation on publicly available genome data which demonstrate the practicality of this approach.
            Bookmark
            • Record: found
            • Abstract: not found
            • Article: not found

            The problem of disguised missing data

              Bookmark

              Author and book information

              Book Chapter
              2012
              November 12 2011
              : 191-234
              10.1007/978-1-59745-290-8_5
              89590841-ceb7-4526-80b8-ba840197147f
              History

              Comments

              Comment on this book

              Book chapters

              Similar content1,020