      Is Open Access

      The meta book and size-dependent properties of written language

      Preprint

          Abstract

          Evidence is given for a systematic text-length dependence of the power-law index gamma of a single book. The estimated gamma values are consistent with a monotonic decrease from 2 to 1 with increasing text length. A direct connection to an extended Heaps' law is explored. As a consequence, the infinite-book limit is proposed to be given by gamma = 1 instead of the value gamma = 2 expected if Zipf's law were ubiquitously applicable. In addition, we explore the idea that the systematic text-length dependence can be described by a meta book concept, which is an abstract representation reflecting the word-frequency structure of a text. According to this concept, the word-frequency distribution of a text of a certain length written by a single author has the same characteristics as a text of the same length pulled out of an imaginary complete infinite corpus written by the same author.
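
The quantities named in the abstract can be measured on any tokenized text. The following is a minimal sketch, not the authors' procedure: it counts how many distinct words occur k times, counts the distinct words in the first M tokens (the quantity Heaps' law describes), pulls a section of a given length out of a longer text, and fits a log-log slope by ordinary least squares. All function names and the toy token list are hypothetical, and a least-squares slope is only a rough stand-in for a careful power-law fit.

```python
import math
import random
from collections import Counter


def frequency_spectrum(tokens):
    """Map each occurrence count k to the number of distinct words seen k times."""
    word_counts = Counter(tokens)
    return Counter(word_counts.values())


def vocabulary_size(tokens, length):
    """Number of distinct words among the first `length` tokens."""
    return len(set(tokens[:length]))


def sample_section(tokens, length, rng):
    """Pull a contiguous section of `length` tokens out of a longer text."""
    start = rng.randrange(len(tokens) - length + 1)
    return tokens[start:start + length]


def loglog_slope(xs, ys):
    """Ordinary least-squares slope of log(y) against log(x)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    denominator = sum((a - mean_x) ** 2 for a in log_x)
    return numerator / denominator


if __name__ == "__main__":
    # Toy token list (an assumption for illustration; a real test needs a full book).
    text = "the cat sat on the mat and the dog sat on the log".split()
    rng = random.Random(0)
    section = sample_section(text, 8, rng)
    print(frequency_spectrum(section))
    lengths = [2, 4, 8, len(text)]
    sizes = [vocabulary_size(text, m) for m in lengths]
    print(loglog_slope(lengths, sizes))
```

Comparing `frequency_spectrum` for sections of different lengths drawn from one long text is the kind of check the meta book idea suggests; the slope of vocabulary size against text length gives a crude estimate of the growth exponent.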

          Author and article information

          Journal reference: New J. Phys. 11 (2009) 123015
          Date: 24 September 2009
          DOI: 10.1088/1367-2630/11/12/123015
          arXiv: 0909.4385
          Record ID: 0bab22dd-4d4b-4166-9025-59f199877b69
          License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
          Length: 7 pages, 6 figures, 1 table
          Subjects: physics.soc-ph cs.CL physics.data-an
