
Diffusion of Lexical Change in Social Media



      Computer-mediated communication is driving fundamental changes in the nature of written language. We investigate these changes by statistical analysis of a dataset comprising 107 million Twitter messages (authored by 2.7 million unique user accounts). Using a latent vector autoregressive model to aggregate across thousands of words, we identify high-level patterns in diffusion of linguistic change over the United States. Our model is robust to unpredictable changes in Twitter's sampling rate, and provides a probabilistic characterization of the relationship of macro-scale linguistic influence to a set of demographic and geographic predictors. The results of this analysis offer support for prior arguments that focus on geographical proximity and population size. However, demographic similarity – especially with regard to race – plays an even more central role, as cities with similar racial demographics are far more likely to share linguistic influence. Rather than moving towards a single unified “netspeak” dialect, language evolution in computer-mediated communication reproduces existing fault lines in spoken American English.
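The abstract gives no model details, so the following is only a minimal sketch of the general idea behind a vector autoregressive model of influence between cities. It assumes a plain, fully observed first-order model fitted by least squares. This is not the authors' latent model, which aggregates across thousands of words and accounts for changes in Twitter's sampling rate. The function names, the three-city setup, and the matrix `A_true` are all hypothetical.

```python
import numpy as np

def simulate_var1(A, n_steps, noise_scale, rng):
    """Simulate x_t = A @ x_{t-1} + noise (first-order vector autoregression)."""
    n = A.shape[0]
    x = np.zeros((n_steps, n))
    x[0] = rng.normal(size=n)
    for t in range(1, n_steps):
        x[t] = A @ x[t - 1] + noise_scale * rng.normal(size=n)
    return x

def fit_var1(x):
    """Least-squares estimate of A from an observed series x of shape (T, n)."""
    past, future = x[:-1], x[1:]
    # Row form of the model: future ≈ past @ A.T, so solve for B = A.T.
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    return B.T

rng = np.random.default_rng(0)

# Hypothetical influence matrix for three cities (an assumption, not data
# from the paper): entry [i, j] is the influence of city j on city i.
A_true = np.array([[0.5, 0.3, 0.0],
                   [0.0, 0.5, 0.0],
                   [0.0, 0.4, 0.5]])

x = simulate_var1(A_true, n_steps=5000, noise_scale=1.0, rng=rng)
A_hat = fit_var1(x)
print(np.round(A_hat, 2))
```

In this toy setting, a clearly nonzero off-diagonal entry of `A_hat` is read as one city's past activity helping to predict another's. The paper relates influence of this kind to demographic and geographic predictors.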


            Author and article information

            [1] School of Interactive Computing, Georgia Institute of Technology, Atlanta, Georgia, United States of America
            [2] School of Computer Science, University of Massachusetts, Amherst, Massachusetts, United States of America
            [3] School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
            Massachusetts Institute of Technology, United States of America
            Author notes

            Competing Interests: BO and NAS were supported by Google's support of the Reading is Believing project at Carnegie Mellon University. This study was also supported by a computing resources award from Amazon Web Services. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.

            Conceived and designed the experiments: JE BO NAS EPX. Performed the experiments: JE BO. Analyzed the data: JE BO. Wrote the paper: JE BO NAS.

            Role: Editor
            PLoS One
            Public Library of Science (San Francisco, USA)
            19 November 2014
            Volume: 9
            Issue: 11
            PMID: 25409166; PMCID: PMC4237389; Manuscript: PONE-D-14-28899; DOI: 10.1371/journal.pone.0113114

            This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

            Pages: 13
            This work was supported by National Science Foundation grants IIS-1111142 and IIS-1054319, by Google's support of the Reading is Believing project at Carnegie Mellon University, and by a computing resources award from Amazon Web Services. It was also supported by computing resources from the Open Source Data Cloud (OSDC), which is an Open Cloud Consortium (OCC)-sponsored project. OSDC usage was supported in part by grants from the Gordon and Betty Moore Foundation and the National Science Foundation, and by major contributions from OCC members such as the University of Chicago. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
            Research Article
            Biology and Life Sciences
            Cognitive Science
            Artificial Intelligence
            Machine Learning
            Computer and Information Sciences
            Physical sciences
            Statistics (mathematics)
            Statistical methods
            Monte Carlo method
            Research and Analysis Methods
            Mathematical and Statistical Techniques
            Social Sciences
            Computational Linguistics
            Linguistic Geography
            Custom metadata
            The authors confirm that, for approved reasons, some access restrictions apply to the data underlying the findings. The text data in this paper was acquired from Twitter's streaming API, and redistribution of the raw text is prohibited by their terms of service (TOS). A complete word list and the associated annotations are provided as a supporting document. Public dissemination of the Tweet IDs will enable other researchers to obtain this data from Twitter's API, except for messages which have been deleted by their authors. Tweet IDs can be obtained by emailing the corresponding author.


