      Is Open Access

      GutenTag: A Multi-Term Caching Optimized Tag Query Processor for Key-Value Based NoSQL Storage Systems

      Preprint

          Abstract

          NoSQL systems are increasingly deployed as back-end infrastructure for large-scale distributed online platforms like Google, Amazon or Facebook. Their applicability results from the fact that most services of online platforms access the stored data objects via their primary key. However, NoSQL systems do not efficiently support services that refer to more than one data object, e.g. the term-based search for data objects. To address this issue, we propose an architecture based on an inverted index on top of a NoSQL system. For queries comprising more than one term, distributed indices yield limited performance in large distributed systems. We propose two extensions to cope with this challenge. Firstly, we store index entries not only for single terms but also for a selected set of term combinations, chosen according to their popularity as derived from a query history. Secondly, we additionally cache popular keys on gateway nodes, which are a common concept in real-world systems and act as the interface for services accessing data objects in the back end. Our results show that we can significantly reduce the bandwidth consumption for processing queries, with an acceptable, marginal increase in the load on the gateway nodes.
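The two extensions described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class and function names (KeyValueStore, build_index, Gateway), the restriction to term pairs, the popularity threshold min_count, the greedy choice of index keys, and the LRU eviction policy are all assumptions made for illustration. An in-memory dictionary stands in for the NoSQL back end.

```python
from collections import Counter, OrderedDict
from itertools import combinations


class KeyValueStore:
    """Hypothetical stand-in for a NoSQL back end: access by primary key only."""

    def __init__(self):
        self._data = {}
        self.reads = 0  # number of back-end lookups, a rough proxy for traffic

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        self.reads += 1
        return self._data.get(key, frozenset())


def index_key(terms):
    """Canonical index key for a set of terms (assumes terms contain no '+')."""
    return "+".join(sorted(terms))


def build_index(store, docs, query_history, min_count=2):
    """Store single-term postings plus postings for popular term pairs."""
    postings = {}
    for doc_id, terms in docs.items():
        for term in terms:
            postings.setdefault(term, set()).add(doc_id)
    for term, ids in postings.items():
        store.put(index_key([term]), frozenset(ids))

    # Popularity is derived from the query history (assumption: pairs only).
    pair_counts = Counter()
    for query in query_history:
        for pair in combinations(sorted(set(query)), 2):
            pair_counts[pair] += 1

    popular = set()
    for (a, b), count in pair_counts.items():
        if count >= min_count:
            ids = postings.get(a, set()) & postings.get(b, set())
            store.put(index_key([a, b]), frozenset(ids))
            popular.add(index_key([a, b]))
    return popular


class Gateway:
    """Gateway node that caches recently used index keys (LRU, an assumption)."""

    def __init__(self, store, popular, cache_size=2):
        self.store = store
        self.popular = popular
        self.cache = OrderedDict()
        self.cache_size = cache_size

    def _fetch(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        value = self.store.get(key)
        self.cache[key] = value
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used entry
        return value

    def query(self, terms):
        terms = sorted(set(terms))
        remaining = set(terms)
        keys = []
        # Greedily cover the query with stored pair entries, then single terms.
        for pair in combinations(terms, 2):
            key = index_key(pair)
            if key in self.popular and set(pair) <= remaining:
                keys.append(key)
                remaining -= set(pair)
        keys.extend(index_key([t]) for t in sorted(remaining))

        result = None
        for key in keys:
            ids = self._fetch(key)
            result = ids if result is None else result & ids
        return set(result) if result is not None else set()
```

In this sketch, a two-term query whose pair has a stored entry needs one back-end lookup instead of two, and a repeat of that query is answered from the gateway cache without touching the back end. The price is extra index entries in the store and extra memory and work on the gateway, which mirrors the trade-off the abstract describes.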


                Author and article information

                Journal
                23 May 2011
                Article
                1105.4452
                e64a0917-c734-44c2-8498-6885b9ffcb7e

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                22 pages, 21 figures, 11 tables
                cs.IR cs.DB
