      Combining Character and Word Embeddings for Affect in Arabic Informal Social Media Microblogs

      chapter-article


          Abstract

          Word representation models have been applied successfully to many natural language processing tasks, including sentiment analysis. However, these models do not always work effectively in social media contexts. Arabic microblogs such as Twitter span a variety of linguistic domains, mainly because social media users write in various dialects. Training word-level models on such informal text can capture words that share the same meaning, but these models cannot represent every word encountered in the real world because of out-of-vocabulary (OOV) words. This inability to identify unseen words is one of the main limitations of word-level models. In contrast, character-level embeddings handle this problem effectively by learning vectors for character n-grams, or parts of words. We take advantage of both character- and word-level models to discover more effective methods to represent Arabic affect words in tweets. We evaluate our embeddings by incorporating them into a supervised learning framework for a range of affect tasks. Our models outperform the state-of-the-art Arabic pre-trained word embeddings in these tasks. Moreover, they improve on state-of-the-art results for the task of Arabic emotion intensity, outperforming the top-performing systems that employ a combination of deep neural networks and several other features.
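The contrast between word-level and character-level representations can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two models are combined, so concatenation is an assumption here, and the function names (`char_ngrams`, `ngram_vector`, `subword_vector`, `combined_vector`), the boundary markers `<` and `>`, and the toy hash-seeded vectors are all hypothetical stand-ins for learned embeddings.

```python
import random
import zlib


def char_ngrams(word, n_min=3, n_max=5):
    """Return the character n-grams of a word wrapped in boundary markers."""
    padded = f"<{word}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def ngram_vector(ngram, dim=8):
    """Toy deterministic vector for an n-gram (stand-in for a learned one)."""
    rng = random.Random(zlib.crc32(ngram.encode("utf-8")))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def subword_vector(word, dim=8):
    """Average the n-gram vectors, so any word gets a vector, even an OOV one."""
    grams = char_ngrams(word)
    if not grams:
        return [0.0] * dim
    vectors = [ngram_vector(g, dim) for g in grams]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def combined_vector(word, word_vectors, dim=8):
    """Concatenate the word-level vector (zeros if OOV) with the subword vector.

    Concatenation is an assumption made for illustration only.
    """
    word_part = word_vectors.get(word, [0.0] * dim)
    return list(word_part) + subword_vector(word, dim)


if __name__ == "__main__":
    # Hypothetical tiny word-level vocabulary; a dialectal spelling is OOV.
    vocab = {"سعيد": [0.5] * 8}
    for token in ["سعيد", "سعييييد"]:
        vec = combined_vector(token, vocab)
        print(token, len(vec), token in vocab)
```

In this sketch an elongated or dialectal spelling that is missing from the word-level vocabulary still receives a non-zero representation, because it shares character n-grams with the spellings seen in training.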

                Author and article information

                Contributors
                elisabeth.metais@cnam.fr
                f.meziane@salford.ac.uk
                helmut.horacek@dfki.de
                cimiano@cit-ec.uni-bielefeld.de
                aia784@cs.bham.ac.uk, aamalharbe@kau.edu.sa
                m.g.lee@cs.bham.ac.uk
                Journal
                Title: Natural Language Processing and Information Systems
                Subtitle: 25th International Conference on Applications of Natural Language to Information Systems, NLDB 2020, Saarbrücken, Germany, June 24–26, 2020, Proceedings
                ISBN: 978-3-030-51309-2, 978-3-030-51310-8
                DOI: 10.1007/978-3-030-51310-8
                Published: 26 May 2020
                Volume: 12089
                Pages: 213-224
                Affiliations
                [8] Laboratoire Cédric, Conservatoire National des Arts et Métiers, Paris, France (GRID grid.36823.3c; ISNI 0000 0001 2185 090X)
                [9] School of Science, Engineering and Environment, University of Salford, Salford, UK (GRID grid.8752.8; ISNI 0000 0004 0460 5971)
                [10] Language Technology, German Research Center for Artificial Intelligence, Saarbrücken, Germany (GRID grid.17272.31; ISNI 0000 0004 0621 750X)
                [11] Semantic Computing Group, Bielefeld University, Bielefeld, Germany (GRID grid.7491.b; ISNI 0000 0001 0944 9128)
                [12] School of Computer Science, University of Birmingham, Birmingham, UK (GRID grid.6572.6; ISNI 0000 0004 1936 7486)
                [13] Faculty of Computing and Information Technology, King Abdulaziz University, Rabigh, Kingdom of Saudi Arabia (GRID grid.412125.1; ISNI 0000 0001 0619 1117)
                Article
                Chapter: 20
                DOI: 10.1007/978-3-030-51310-8_20
                PMC ID: 7298193
                © Springer Nature Switzerland AG 2020

                This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.

                Keywords: word-level embeddings, character-level embeddings, Arabic affect tweets
