      Is Open Access

      Gender identity and lexical variation in social media

      Preprint

          Abstract

          We present a study of the relationship between gender, linguistic style, and social networks, using a novel corpus of 14,000 Twitter users. Prior quantitative work on gender often treats this social variable as a female/male binary; we argue for a more nuanced approach. By clustering Twitter users, we find a natural decomposition of the dataset into various styles and topical interests. Many clusters have strong gender orientations, but their use of linguistic resources sometimes directly conflicts with the population-level language statistics. We view these clusters as a more accurate reflection of the multifaceted nature of gendered language styles. Previous corpus-based work has also had little to say about individuals whose linguistic styles defy population-level gender patterns. To identify such individuals, we train a statistical classifier, and measure the classifier confidence for each individual in the dataset. Examining individuals whose language does not match the classifier's model for their gender, we find that they have social networks that include significantly fewer same-gender social connections and that, in general, social network homophily is correlated with the use of same-gender language markers. Pairing computational methods and social theory thus offers a new perspective on how gender emerges as individuals position themselves relative to audiences, topics, and mainstream gender norms.
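
The final analysis step described in the abstract, relating classifier confidence to social network homophily, can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the toy records are invented, the function names are hypothetical, and the choice of Pearson correlation is an assumption. Here "confidence" stands for the probability a classifier assigns to a user's own gender label, and "homophily" for the fraction of that user's connections with the same label.

```python
from math import sqrt


def same_gender_fraction(user_gender, friend_genders):
    """Fraction of a user's connections that share the user's gender label."""
    if not friend_genders:
        return 0.0
    matches = sum(1 for g in friend_genders if g == user_gender)
    return matches / len(friend_genders)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented toy records: (gender label, classifier confidence in that label,
# gender labels of the user's connections). Not data from the paper.
toy_users = [
    ("F", 0.95, ["F", "F", "F", "M"]),
    ("M", 0.90, ["M", "M", "F", "M"]),
    ("F", 0.40, ["M", "M", "F", "M"]),
    ("M", 0.30, ["F", "F", "M", "F"]),
    ("F", 0.70, ["F", "M"]),
]

confidences = [conf for _, conf, _ in toy_users]
homophily = [same_gender_fraction(g, friends) for g, _, friends in toy_users]

# A positive value means users whose language fits the classifier's model
# of their gender also tend to have more same-gender connections.
r = pearson(confidences, homophily)
print(f"correlation between confidence and homophily: {r:.3f}")
```

In this sketch, users with low confidence are the ones whose language does not match the classifier's model for their gender; the abstract reports that such individuals have fewer same-gender connections.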


                Author and article information

                Journal reference: Journal of Sociolinguistics 18 (2014) 135-160
                Dates: 16 October 2012; 2014-05-12
                Type: Article
                DOI: 10.1111/josl.12080
                arXiv: 1210.4567 (cs.CL), submission version
                Record ID: afa20416-1e1e-4e43-adbb-ee991cc4cda0
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
