Is Open Access

Improved ESP-index: a practical self-index for highly repetitive texts

Preprint

Author(s): Yoshimasa Takabatake, Yasuo Tabei, Hiroshi Sakamoto

Publication date Created: 2014-04-19

Abstract

While several self-indexes for highly repetitive texts exist, developing a practical self-index applicable to real-world repetitive texts remains a challenge. ESP-index is a grammar-based self-index built on the notion of edit-sensitive parsing (ESP), an efficient parsing algorithm that guarantees upper bounds on parsing discrepancies between different appearances of the same subtext in a text. Although ESP-index performs efficient top-down searches of query texts, it relies on binary searches to find appearances of variables for a query text, which slows down query searches. We present an improved ESP-index (ESP-index-I) that leverages the idea behind succinct data structures for large alphabets. While ESP-index-I retains the efficiency of ESP-index for top-down searches, it avoids the binary searches by using fast rank/select operations. We experimentally test the ability of ESP-index-I to search query texts and extract subtexts from large-scale real-world repetitive texts, and we show that ESP-index-I performs better than other possible approaches.
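
The rank/select operations named in the abstract can be illustrated with a minimal sketch. This is not the data structure used in ESP-index-I: the abstract refers to succinct structures over large alphabets, while the toy below works on a plain bit vector and stores full prefix arrays, so it is neither succinct nor the authors' method. The class name `BitVector` and the methods `rank1` and `select1` are hypothetical names chosen for illustration.

```python
class BitVector:
    """Toy bit vector with rank/select queries (illustrative, not succinct)."""

    def __init__(self, bits):
        self.bits = list(bits)
        # prefix[i] holds the number of 1s in bits[0:i].
        self.prefix = [0]
        for b in self.bits:
            self.prefix.append(self.prefix[-1] + b)
        # ones[k] holds the position of the (k+1)-th 1.
        self.ones = [i for i, b in enumerate(self.bits) if b]

    def rank1(self, i):
        """Return the number of 1s in bits[0:i] (position i excluded)."""
        return self.prefix[i]

    def select1(self, k):
        """Return the position of the k-th 1, counting k from 1."""
        return self.ones[k - 1]


bv = BitVector([1, 0, 1, 1, 0, 0, 1])
print(bv.rank1(4))    # number of 1s among the first four bits
print(bv.select1(3))  # position of the third 1
```

Both queries here are answered by a single array lookup, which is the general reason rank/select can replace a binary search over sorted positions; a real succinct structure would aim for similar query speed using far less auxiliary space.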

Author and article information

Publication date Updated: 2014-04-27

Article

arXiv ID: 1404.4972

SO-VID: b301e345-6bb4-4894-a00f-7fa648384872

License:

http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Custom metadata

Comments: This is the full version of a proceeding accepted to the 11th International Symposium on Experimental Algorithms (SEA2014)

Categories: cs.DS

ScienceOpen disciplines: Data structures & Algorithms
