      Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts

      Political Analysis
      Oxford University Press (OUP)

          Abstract

          Politics and political conflict often occur in the written and spoken word. Scholars have long recognized this, but the massive costs of analyzing even moderately sized collections of texts have hindered their use in political science research. Here lies the promise of automated text analysis: it substantially reduces the costs of analyzing large collections of text. We provide a guide to this exciting new area of research and show how, in many instances, the methods have already obtained part of their promise. But there are pitfalls to using automated methods—they are no substitute for careful thought and close reading and require extensive and problem-specific validation. We survey a wide range of new methods, provide guidance on how to validate the output of the models, and clarify misconceptions and errors in the literature. To conclude, we argue that for automated text methods to become a standard tool for political scientists, methodologists must contribute new methods and new methods of validation.

          Most cited references (10)

          Thumbs up?

            The Political Economy of Benefits and Costs: A Neoclassical Approach to Distributive Politics

              Fightin' Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict

              Entries in the burgeoning “text-as-data” movement are often accompanied by lists or visualizations of how word (or other lexical feature) usage differs across some pair or set of documents. These are intended either to establish some target semantic concept (like the content of partisan frames), to estimate word-specific measures that feed forward into another analysis (like locating parties in ideological space), or both. We discuss a variety of techniques for selecting words that capture partisan, or other, differences in political speech and for evaluating the relative importance of those words. We introduce and emphasize several new approaches based on Bayesian shrinkage and regularization. We illustrate the relative utility of these approaches with analyses of partisan, gender, and distributive speech in the U.S. Senate.
                Author and article information

                Journal
                Political Analysis (Polit. anal.)
                Oxford University Press (OUP)
                ISSN: 1047-1987, 1476-4989
                Dates: 2013; January 2017
                Volume 21, Issue 03, Pages 267-297
                Type: Article
                DOI: 10.1093/pan/mps028
                © 2013