The Hidden Shape of Stories Reveals Positivity Bias and Gender Bias

Abstract

Capturing the shape of stories is crucial for understanding the human mind. In this research, we use word embedding methods, a widely used tool in natural language processing and machine learning, to quantify and compare the emotional arcs of stories over time. Using pretrained Google News word2vec vectors and a corpus of film scripts (N = 1109), we construct the fundamental building blocks of story emotional trajectories. The results demonstrate that there is only one universal pattern of story shapes in movies. Furthermore, story narratives exhibit both a positivity bias and a gender bias. More interestingly, audiences show a completely different preference from content producers.
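The abstract does not spell out how an emotional arc is computed, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes a sentiment axis built from a few seed words, scores each token by cosine similarity to that axis, and averages the scores over equal-length segments of the script. The toy 2-D vectors, the seed words, and the function names (`sentiment_axis`, `emotional_arc`) are all hypothetical; a real analysis would substitute pretrained vectors such as the Google News word2vec model.

```python
import math

# Toy 2-D "embeddings" with made-up values, for illustration only.
# A real run would load pretrained vectors (e.g. Google News word2vec).
TOY_VECTORS = {
    "happy": (0.9, 0.1),
    "love": (0.8, 0.3),
    "win": (0.7, 0.2),
    "sad": (-0.9, 0.1),
    "death": (-0.8, 0.3),
    "lose": (-0.7, 0.2),
    "house": (0.0, 1.0),
}


def mean_vector(words, vectors):
    """Component-wise mean of the vectors for the given words."""
    vs = [vectors[w] for w in words]
    dim = len(vs[0])
    return tuple(sum(v[i] for v in vs) / len(vs) for i in range(dim))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sentiment_axis(positive_seeds, negative_seeds, vectors):
    """Assumed axis: mean of positive seeds minus mean of negative seeds."""
    pos = mean_vector(positive_seeds, vectors)
    neg = mean_vector(negative_seeds, vectors)
    return tuple(p - n for p, n in zip(pos, neg))


def emotional_arc(tokens, axis, vectors, n_segments=3):
    """Mean sentiment score per equal-length segment of the token stream."""
    known = [t for t in tokens if t in vectors]  # skip out-of-vocabulary words
    arc = []
    for i in range(n_segments):
        start = i * len(known) // n_segments
        end = (i + 1) * len(known) // n_segments
        scores = [cosine(vectors[t], axis) for t in known[start:end]]
        arc.append(sum(scores) / len(scores) if scores else 0.0)
    return arc


axis = sentiment_axis(["happy", "love"], ["sad", "death"], TOY_VECTORS)
script = "happy love house sad death lose win love happy".split()
arc = emotional_arc(script, axis, TOY_VECTORS, n_segments=3)
print(arc)
```

On this toy script the arc starts positive, dips negative in the middle segment, and recovers at the end. Comparing arcs across many scripts would additionally require a fixed number of segments per script, which is why `n_segments` is a parameter here.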

Author and article information

Publication date (created): 12 November 2018

Article

arXiv ID: 1811.04599

SO-VID: ca008e40-8dfe-4a56-a2ef-2a4f2bb3bf44

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Custom metadata

Categories: cs.CL

ScienceOpen disciplines: Theoretical computer science
