An ERC-funded project, QUANTESS aimed to develop innovative methods and tools for the quantitative analysis of textual data in the social sciences. These methods are sharply distinguished from more traditional content analysis schemes, whether computer-assisted or not, by their explicit treatment of words as pure data, from which latent traits may be estimated using inductive statistical procedures. Besides unlocking features of the texts not accessible through interpretative methods, the 'text as data' approach also allows rapid analysis of huge volumes of text in any language, providing a means for researchers to deal with the ubiquitous textual data now available. QUANTESS advanced the state of this field through a combination of methodological innovations; applications to texts in applied fields, especially political science; the development of a large suite of computer programming tools to implement these methods and applications; and extensive dissemination of all of these through conferences, workshops, teaching, and online tutorials and websites. The largest of these tool sets, the quanteda R package, has been downloaded over 100,000 times and has empowered researchers in dozens of countries, at all career levels, and in fields spanning industry, government, and academia.
This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/