      Unsupervised text mining methods for literature analysis: a case study for Thomas Pynchon's V.

      Orbit: Writing Around Pynchon


          Abstract

          We demonstrate the use of unsupervised text mining methods for the analysis of prose literature works, using Thomas Pynchon's novel V. as an example. Our results suggest that such methods may be employed to reveal meaningful information regarding the novel’s structure. We report results using a wide variety of clustering algorithms, several distinct distance functions, and different visualization techniques. The application of a simple topic model is also demonstrated. We discuss the meaningfulness of our results along with the limitations of our approach, and we suggest some possible paths for further study.
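To make the idea of clustering with a distance function concrete, here is a minimal sketch. It is not the author's pipeline: the abstract does not say which algorithms or distances were used, so single-linkage agglomerative clustering and cosine distance over raw word counts are assumptions chosen for brevity. The "chapters" are invented toy strings, not text from the novel, and the names `term_counts`, `cosine_distance`, and `single_linkage` are hypothetical helpers.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def term_counts(text):
    """Lower-case the text and count whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def single_linkage(vectors, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [{i} for i in range(len(vectors))]

    def gap(pair):
        # Single linkage: distance between the closest members of two clusters.
        x, y = pair
        return min(
            cosine_distance(vectors[i], vectors[j])
            for i in clusters[x]
            for j in clusters[y]
        )

    while len(clusters) > n_clusters:
        x, y = min(combinations(range(len(clusters)), 2), key=gap)
        clusters[x] |= clusters[y]  # x < y, so deleting y keeps x valid
        del clusters[y]
    return sorted(sorted(c) for c in clusters)


# Invented toy "chapters" (assumption: one string per chapter).
chapters = [
    "harbor street tavern harbor crew",
    "street tavern harbor train",
    "embassy code agent embassy letter",
    "letter code embassy garden",
]
vectors = [term_counts(c) for c in chapters]
print(single_linkage(vectors, 2))
```

On real text one would normally add tokenization, stop-word removal, and a weighting scheme such as tf-idf before computing distances. The sketch omits these to keep the clustering step visible.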


                Author and article information

                Journal
                Orbit: Writing Around Pynchon
                ISSN: 2044-4095
                2013
                Volume: 1
                Issue: 2
                Affiliations
                Institute of Technology Blanchardstown, Dublin, Ireland and Supreme Joint War College, Thessaloniki, Greece
                Article
                10.7766/orbit.v1.2.44
                Copyright © 2013, Christos Iraklis Tsatsoulis

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                Literary studies, History
