      Is Open Access

      Focused Retrieval Using Topical Language and Structure

      BCS IRSG Symposium: Future Directions in Information Access 2007 (FDIA)

      Future Directions in Information Access

      28-29 August 2007

      Focused Retrieval, Web Retrieval, Language Modeling, Relevance Feedback


          Abstract

          We investigate focused retrieval techniques that deal with the increasing amount of structure on the web. Our approach is to combine multiple representations of web information in a common framework based on statistical language models. In this framework, it will be possible to derive a topical language model of the actual language-use on web pages on a certain topic—such as arts, business, entertainment, education, etc.—using the unigrams and bigrams taken from the plain text of the web pages. Similarly, it will be possible to derive models of the structure of web pages to distinguish between blogs, FAQs, personal web pages, etc. Structural characteristics of a web page include, amongst others, tagname statistics and parent-child tags. We will build a multiple level language model to exploit the information contained in the topical language and structure models. The .GOV2 corpus will be used as a test collection on which queries will be run on different topical categories and on web pages with different structures. We plan to develop so-called parsimonious models to derive a compact representation and to handle dependencies between representations of the data.
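          The representations named in the abstract (unigrams and bigrams from the plain text, tag name statistics, and parent-child tag pairs) can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the author's method: the class and function names are hypothetical, the smoothing is simple additive smoothing, and the weighted sum of log-likelihoods is only one possible way to combine representations. Parsimonious models and the .GOV2 collection are not covered.

```python
from collections import Counter
from html.parser import HTMLParser
import math

REPRESENTATIONS = ("unigram", "bigram", "tag", "parent_child")
# Elements without a closing tag; they are counted but never pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PageCollector(HTMLParser):
    """Collects words, tag names, and parent-child tag pairs from one page."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.words = []
        self.tags = Counter()
        self.parent_child = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        if self.stack:
            self.parent_child[(self.stack[-1], tag)] += 1
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def page_features(html):
    """Return one Counter per representation for a single page."""
    collector = PageCollector()
    collector.feed(html)
    collector.close()
    words = collector.words
    return {
        "unigram": Counter(words),
        "bigram": Counter(zip(words, words[1:])),
        "tag": collector.tags,
        "parent_child": collector.parent_child,
    }


def train(pages):
    """Pool feature counts over all pages of one category."""
    model = {name: Counter() for name in REPRESENTATIONS}
    for html in pages:
        for name, counts in page_features(html).items():
            model[name].update(counts)
    return model


def vocab_sizes(models):
    """Distinct features per representation, plus one slot for unseen features."""
    return {
        name: len(set().union(*(m[name] for m in models.values()))) + 1
        for name in REPRESENTATIONS
    }


def log_likelihood(counts, model, vocab_size, alpha=1.0):
    """Log-likelihood of observed counts under an additively smoothed model."""
    total = sum(model.values())
    return sum(
        n * math.log((model[feature] + alpha) / (total + alpha * vocab_size))
        for feature, n in counts.items()
    )


def classify(html, models, weights):
    """Pick the category with the highest weighted sum of log-likelihoods."""
    sizes = vocab_sizes(models)
    features = page_features(html)

    def score(model):
        return sum(
            weights[name] * log_likelihood(features[name], model[name], sizes[name])
            for name in REPRESENTATIONS
        )

    return max(models, key=lambda label: score(models[label]))


# Toy training data: one FAQ-like page and one blog-like page (invented examples).
faq_page = "<html><body><dl><dt>How do I apply?</dt><dd>Use the form.</dd></dl></body></html>"
blog_page = "<html><body><article><p>Today I visited the museum.</p></article></body></html>"
models = {"faq": train([faq_page]), "blog": train([blog_page])}
weights = {name: 1.0 for name in REPRESENTATIONS}
```

          Calling `classify(page, models, weights)` on an unseen page returns the label of the better-matching category. Setting a weight to zero drops that representation, which gives a quick way to compare text-only and structure-only scoring on the same pages.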


              Author and article information

              Conference date: August 2007
              Pages: 1-6
              Affiliations
              Archives and Information Studies, University of Amsterdam

              Turfdraagsterpad 9, 1012 XT Amsterdam, The Netherlands
              Article
              DOI: 10.14236/ewic/FDIA2007.9
              © A.M. Kaptein. Published by BCS Learning and Development Ltd. BCS IRSG Symposium: Future Directions in Information Access 2007, Glasgow

              This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

              Electronic Workshops in Computing (eWiC)
              Product Information: 1477-9358, BCS Learning & Development
              Self URI (journal page): https://ewic.bcs.org/
              Categories
              Electronic Workshops in Computing
