      Estimating search engine index size variability: a 9-year longitudinal study

      research-article
      Scientometrics
      Springer Netherlands
      Search engine index, Webometrics, Longitudinal study

          Abstract

          One of the determining factors of the quality of Web search engines is the size of their index. In addition to its influence on search result quality, the size of the indexed Web can also tell us something about which parts of the WWW are directly accessible to the everyday user. We propose a novel method of estimating the size of a Web search engine’s index by extrapolating from document frequencies of words observed in a large static corpus of Web pages. In addition, we provide a unique longitudinal perspective on the size of Google’s and Bing’s indices over a nine-year period, from March 2006 until January 2015. We find that index size estimates of these two search engines tend to vary dramatically over time, with Google generally possessing a larger index than Bing. This result raises doubts about the reliability of previous one-off estimates of the size of the indexed Web. We find that much, if not all, of this variability can be explained by changes in the indexing and ranking infrastructure of Google and Bing. This casts further doubt on whether Web search engines can be used reliably for cross-sectional webometric studies.
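
          The extrapolation idea described in the abstract can be illustrated with a minimal sketch. It assumes that, for each probe word, we know its document frequency in a static corpus of known size and the hit count a search engine reports for it. Scaling the hit count by the inverse of the word's relative document frequency gives one index size estimate per word, and the sketch combines these with a median. The function name, the parameter names, the use of the median, and the toy numbers are all illustrative assumptions, not the authors' actual procedure or data.

```python
from statistics import median


def estimate_index_size(corpus_size, corpus_doc_freq, reported_hits):
    """Extrapolate a search engine index size from word document frequencies.

    corpus_size: number of documents in the static reference corpus.
    corpus_doc_freq: word -> number of corpus documents containing the word.
    reported_hits: word -> hit count reported by the search engine
                   (hypothetical input, gathered by whatever means).
    """
    per_word_estimates = []
    for word, hits in reported_hits.items():
        doc_freq = corpus_doc_freq.get(word, 0)
        if doc_freq <= 0:
            continue  # a word unseen in the corpus gives no usable ratio
        # If the word occurs in doc_freq / corpus_size of all documents,
        # the index should hold about hits / (doc_freq / corpus_size) documents.
        per_word_estimates.append(hits * corpus_size / doc_freq)
    if not per_word_estimates:
        raise ValueError("no probe word has a positive corpus document frequency")
    return median(per_word_estimates)


# Toy numbers, invented for illustration only.
toy_doc_freq = {"the": 500_000, "and": 400_000, "of": 250_000}
toy_hits = {"the": 5_000_000_000, "and": 4_400_000_000, "of": 2_000_000_000}
toy_estimate = estimate_index_size(1_000_000, toy_doc_freq, toy_hits)
```

          Taking the median rather than the mean is a design choice of this sketch: it keeps a single word with an unusual hit count from dominating the combined estimate.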

                Author and article information

                Contributors
                +31 24 3611649, a.vandenbosch@let.ru.nl
                toine@hum.aau.dk
                maurice@dekunder.nl
                Journal
                Scientometrics
                Springer Netherlands (Dordrecht)
                0138-9130
                9 February 2016
                2016
                Volume: 107
                Pages: 839-856
                Affiliations
                Centre for Language Studies, Radboud University, PO Box 9103, 6500 HD Nijmegen, The Netherlands
                Department of Communication, Aalborg University Copenhagen, A.C. Meyers Vænge 15, 2450 Copenhagen, Denmark
                De Kunder Internet Media, Toernooiveld 100, 6525 EC Nijmegen, The Netherlands
                Author information
                http://orcid.org/0000-0003-2493-656X
                Article
                1863
                10.1007/s11192-016-1863-z
                4833824
                27122648
                082fc1f8-4707-48f4-8452-3a5a562f80b4
                © The Author(s) 2016

                Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

                History
                : 27 July 2015
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/501100003246, Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NL);
                Award ID: 277-70-004
                Categories
                Article
                Custom metadata
                © Akadémiai Kiadó, Budapest, Hungary 2016

                Computer science