
      Improving the performance of lexicon-based review sentiment analysis method by reducing additional introduced sentiment bias

      research-article
      PLoS ONE
      Public Library of Science


          Abstract

          Sentiment analysis is widely studied as a way to extract opinions from user-generated content (UGC), and various methods have been proposed in recent literature. However, these methods are likely to introduce sentiment bias, so that classification results skew toward either the positive or the negative class; this is especially true of lexicon-based sentiment classification methods. Sentiment bias leads to poor sentiment analysis performance. To deal with this problem, we propose a novel sentiment bias processing strategy that can be applied to lexicon-based sentiment analysis methods. Weight and threshold parameters learned from a small training set are introduced into the lexicon-based sentiment scoring formula, and the formula is then used to classify the reviews. In this paper, a complete sentiment classification framework is proposed. SentiWordNet (SWN) is used as the experimental sentiment lexicon, and review data for four products collected from Amazon are used as the experimental datasets. Experimental results show that the bias processing strategy reduces the polarity bias rate (PBR) and improves the performance of the lexicon-based sentiment analysis method.
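
          The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, under stated assumptions: a review's score is taken to be its summed positive lexicon score minus a weight times its summed negative score, and the review is labeled positive when that score exceeds a threshold. The toy lexicon values, the names `score`, `classify`, `n_correct`, and `fit`, and the grid search over candidate parameters are all hypothetical stand-ins, not the authors' method or SentiWordNet data.

```python
from itertools import product

# Hypothetical toy lexicon: word -> (positive score, negative score).
# These values are illustrative only and are NOT taken from SentiWordNet.
LEXICON = {
    "good": (0.75, 0.0),
    "great": (0.875, 0.0),
    "fine": (0.5, 0.125),
    "bad": (0.0, 0.75),
    "poor": (0.0, 0.625),
    "broken": (0.0, 0.25),
}


def score(tokens, neg_weight):
    """Assumed scoring form: sum of positive scores minus weighted negatives."""
    pos = sum(LEXICON.get(tok, (0.0, 0.0))[0] for tok in tokens)
    neg = sum(LEXICON.get(tok, (0.0, 0.0))[1] for tok in tokens)
    return pos - neg_weight * neg


def classify(tokens, neg_weight, threshold):
    """Label a review positive when its score exceeds the threshold."""
    return "positive" if score(tokens, neg_weight) > threshold else "negative"


def n_correct(data, neg_weight, threshold):
    """Count reviews whose predicted label matches the gold label."""
    return sum(classify(toks, neg_weight, threshold) == label
               for toks, label in data)


def fit(train, weights, thresholds):
    """Grid search (an assumption) for the best (weight, threshold) pair."""
    return max(product(weights, thresholds),
               key=lambda wt: n_correct(train, wt[0], wt[1]))


# Small hypothetical training set of tokenized reviews with gold labels.
TRAIN = [
    ("good great".split(), "positive"),
    ("fine but broken".split(), "negative"),
    ("good but poor and broken".split(), "negative"),
    ("bad".split(), "negative"),
    ("great".split(), "positive"),
]

WEIGHTS = [0.5, 1.0, 1.5, 2.0]
THRESHOLDS = [-0.5, 0.0, 0.5]

w, t = fit(TRAIN, WEIGHTS, THRESHOLDS)
print("learned weight/threshold:", w, t)
print("baseline correct (w=1, t=0):", n_correct(TRAIN, 1.0, 0.0))
print("tuned correct:", n_correct(TRAIN, w, t))
```

          In this toy data, the unweighted baseline (weight 1, threshold 0) labels the negative review "fine but broken" as positive, which is the kind of skew toward one polarity the abstract calls sentiment bias; the tuned parameters remove that error on the training set.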


                Author and article information

                Contributors
                Role: Writing – original draft
                Role: Funding acquisition
                Role: Supervision
                Role: Funding acquisition
                Role: Writing – review & editing
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 24 August 2018
                Volume: 13
                Issue: 8
                Article number: e0202523
                Affiliations
                [001] College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang Province, China
                Tampere University of Technology, FINLAND
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Article
                PONE-D-18-07380
                DOI: 10.1371/journal.pone.0202523
                6108458
                30142154
                5eb10d69-5c50-4c5d-b4cf-33e8a6465eec
                © 2018 Han et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 9 March 2018
                Accepted: 3 August 2018
                Page count
                Figures: 2, Tables: 3, Pages: 11
                Funding
                Funded by: National Natural Science Foundation of China (funder ID: http://dx.doi.org/10.13039/501100001809); Award IDs: 61672179, 61370083, 61402126
                Funded by: Natural Science Foundation of Heilongjiang Province (funder ID: http://dx.doi.org/10.13039/501100005046); Award ID: F2015030
                Funded by: Youth Science Foundation of Heilongjiang Province of China; Award ID: QC2016083
                Funded by: Postdoctoral Foundation of Hei Long Jiang Province (funder ID: http://dx.doi.org/10.13039/501100004027); Award ID: LBH-Z14071
                This paper is supported by (1) the National Natural Science Foundation of China (http://www.nsfc.gov.cn/) under Grant nos. 61672179 to JY, 61370083 to JY and 61402126 to YZ; (2) the Natural Science Foundation of Heilongjiang Province (http://www.hljkjt.gov.cn) under Grant no. F2015030; (3) the Youth Science Foundation of Heilongjiang Province of China (http://www.hljkjt.gov.cn) under Grant no. QC2016083; and (4) the Heilongjiang Postdoctoral Fund (http://www.hljbsh.org/gywm.asp) under Grant no. LBH-Z14071 to YZ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Social Sciences > Linguistics > Lexicons
                Engineering and Technology > Management Engineering > Decision Analysis > Decision Trees > Decision Tree Learning
                Research and Analysis Methods > Decision Analysis > Decision Trees > Decision Tree Learning
                Computer and Information Sciences > Artificial Intelligence > Machine Learning > Decision Tree Learning
                Social Sciences > Linguistics > Grammar > Morphology (Linguistics)
                Computer and Information Sciences > Artificial Intelligence > Machine Learning
                Computer and Information Sciences > Data Acquisition
                Physical Sciences > Mathematics > Applied Mathematics > Algorithms > Machine Learning Algorithms
                Research and Analysis Methods > Simulation and Modeling > Algorithms > Machine Learning Algorithms
                Computer and Information Sciences > Artificial Intelligence > Machine Learning > Machine Learning Algorithms
                Social Sciences > Linguistics > Semantics
                Computer and Information Sciences > Software Engineering > Preprocessing
                Engineering and Technology > Software Engineering > Preprocessing
                Custom metadata
                Data cannot be made available by the authors in the submission files or in a public data repository because they are from a third party. The four datasets (Books, DVD, Electronics, and Kitchen) are available from the Multi-Domain Sentiment Dataset (version 2.0) (https://www.cs.jhu.edu/~mdredze/datasets/sentiment/unprocessed.tar.gz; https://www.cs.jhu.edu/~mdredze/datasets/sentiment/). The authors did not have special access privileges to the data. All interested researchers are able to access the data in the same manner as the authors.
