      Is Open Access

      Crowdsourcing Dialect Characterization through Twitter

      research-article
      PLoS ONE
      Public Library of Science


          Abstract

          We perform a large-scale analysis of language diatopic variation using geotagged microblogging datasets. By collecting all Twitter messages written in Spanish over more than two years, we build a corpus from which a carefully selected list of concepts allows us to characterize Spanish varieties on a global scale. A cluster analysis proves the existence of well defined macroregions sharing common lexical properties. Remarkably enough, we find that the Spanish language is split into two superdialects, namely, an urban speech used across major American and Spanish cities and a diverse form that encompasses rural areas and small towns. The latter can be further clustered into smaller varieties with a stronger regional character.
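As a rough illustration of the kind of analysis the abstract describes, the sketch below turns the words observed in each geographic cell into a profile of lexical-variant frequencies for one concept and then groups the cells by profile. Everything in it is an assumption made for illustration: the cell names, the toy word counts, the three variants chosen for the concept "car", and the use of k-means (the abstract says only "a cluster analysis"). It is not the authors' pipeline or data.

```python
from collections import Counter
import math


def variant_profile(tokens, variants):
    """Relative frequency of each lexical variant among one cell's tokens."""
    counts = Counter(t for t in tokens if t in variants)
    total = sum(counts.values())
    return [counts[v] / total if total else 0.0 for v in variants]


def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical variants of the concept "car" and invented per-cell tokens.
VARIANTS = ["coche", "carro", "auto"]
cells = {
    "cell_A": ["coche"] * 8 + ["carro"] * 1 + ["auto"] * 1,
    "cell_B": ["coche"] * 7 + ["auto"] * 3,
    "cell_C": ["carro"] * 9 + ["coche"] * 1,
    "cell_D": ["carro"] * 8 + ["auto"] * 2,
}

profiles = [variant_profile(tokens, VARIANTS) for tokens in cells.values()]
labels = kmeans(profiles, k=2)
clusters = dict(zip(cells, labels))
print(clusters)
```

With these toy counts, the two cells dominated by "coche" fall into one cluster and the two dominated by "carro" into the other. A real study would combine many concepts per cell rather than a single one.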


          Most cited references (4)

          Is Open Access

          The Twitter of Babel: Mapping World Languages through Microblogging Platforms

          Large scale analysis and statistics of socio-technical systems that just a few short years ago would have required the use of consistent economic and human resources can nowadays be conveniently performed by mining the enormous amount of digital data produced by human activities. Although a characterization of several aspects of our societies is emerging from the data revolution, a number of questions concerning the reliability and the biases inherent to the big data “proxies” of social life are still open. Here, we survey worldwide linguistic indicators and trends through the analysis of a large-scale dataset of microblogging posts. We show that available data allow for the study of language geography at scales ranging from country-level aggregation to specific city neighborhoods. The high resolution and coverage of the data allow us to investigate different indicators such as the linguistic homogeneity of different countries, the touristic seasonal patterns within countries and the geographical distribution of different languages in multilingual regions. This work highlights the potential of geolocalized studies of open data sources to improve current analysis and develop indicators for major social phenomena in specific communities.
            Is Open Access

            Structural and Dynamical Patterns on Online Social Networks: The Spanish May 15th Movement as a Case Study

            The number of people using online social networks in their everyday life is continuously growing at a pace never seen before. This new kind of communication has an enormous impact on opinions, cultural trends, information spreading and even on the commercial success of new products. More importantly, online social networks have emerged as a fundamental organizing mechanism in recent country-wide social movements. In this paper, we provide a quantitative analysis of the structural and dynamical patterns emerging from the activity of an online social network around the ongoing May 15th (15M) movement in Spain. Our network is made up of users who exchanged tweets over a period of one month, which includes the birth and stabilization of the 15M movement. We characterize in depth the growth of this dynamical network and find that it is scale-free with communities at the mesoscale. We also find that its dynamics exhibit typical features of critical systems such as robustness and power-law distributions for several quantities. Remarkably, we report that the patterns characterizing the spreading dynamics are asymmetric, giving rise to a clear distinction between information sources and sinks. Our study represents a first step towards the use of data from online social media to comprehend modern societal dynamics.

                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS ONE
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 19 November 2014
                Volume: 9
                Issue: 11
                Article number: e112074
                Affiliations
                [1] Aix-Marseille Université, CNRS, CPT, UMR 7332, 13288 Marseille, France
                [2] Université de Toulon, CNRS, CPT, UMR 7332, 83957 La Garde, France
                [3] Instituto de Física Interdisciplinar y Sistemas Complejos IFISC (UIB–CSIC), E-07122 Palma de Mallorca, Spain
                University of Warwick, United Kingdom
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: BG DS. Performed the experiments: BG. Analyzed the data: BG. Wrote the paper: BG DS.

                Article
                Manuscript ID: PONE-D-14-33777
                DOI: 10.1371/journal.pone.0112074
                PMCID: PMC4237322
                PMID: 25409174
                Record ID: eb90549c-f6b4-48f6-871a-278dbeea7429
                Copyright © 2014

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 28 July 2014
                Accepted: 27 September 2014
                Page count
                Pages: 6
                Funding
                The authors have no support or funding to report.
                Categories
                Research Article
                Computer and Information Sciences
                Network Analysis
                Scale-Free Networks
                Social Networks
                Systems Science
                Complex Systems
                Physical Sciences
                Mathematics
                Social Sciences
                Linguistics
                Sociolinguistics
                Dialectology
                Computational Linguistics
                Linguistic Geography
                Custom metadata
                The authors confirm that all data underlying the findings are fully available without restriction. All relevant data are publicly available. Documentation on how to query the Twitter API can be found here: https://dev.twitter.com/overview/documentation.
