
      Using Topic Modeling Methods for Short-Text Data: A Comparative Analysis

      methods-article


          Abstract

          With the growth of online social network platforms and applications, large amounts of user-generated text are created daily in the form of comments, reviews, and short messages. As a result, users often find it difficult to extract useful information from such content or to learn more about the topics being discussed. Machine learning and natural language processing algorithms are used to analyze the massive amount of textual social media data available online, and topic modeling techniques in particular have gained popularity in recent years. This paper surveys topic modeling and its common application areas, methods, and tools. We also examine and compare five frequently used topic modeling methods, as applied to short textual social data, to show their practical benefits in detecting important topics: latent semantic analysis, latent Dirichlet allocation, non-negative matrix factorization, random projection, and principal component analysis. Two textual datasets were selected to evaluate the performance of these methods based on topic quality and standard statistical evaluation metrics such as recall, precision, F-score, and topic coherence. Latent Dirichlet allocation and non-negative matrix factorization delivered more meaningful extracted topics and obtained good results. The paper sheds light on some common topic modeling methods in a short-text context and provides direction for researchers who seek to apply these methods.
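          For readers unfamiliar with the statistical metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F-score can be computed for a single topic label. It is not taken from the paper: the function name `precision_recall_f1`, the toy labels, and the choice to score one topic as the positive class are all assumptions made for illustration.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Score one topic label, treated as the positive class.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: documents correctly assigned to the topic.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: documents assigned to the topic that belong elsewhere.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: documents of the topic that were assigned elsewhere.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-score (F1) is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels (assumed, not from the paper's datasets).
true_topics = ["sports", "sports", "politics", "sports", "politics"]
predicted_topics = ["sports", "politics", "politics", "sports", "sports"]

p, r, f = precision_recall_f1(true_topics, predicted_topics, "sports")
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

          Topic coherence, the fourth metric mentioned, is computed from word co-occurrence statistics rather than from labels, so it is not covered by this sketch.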


                Author and article information

                Contributors
                Journal: Frontiers in Artificial Intelligence (Front. Artif. Intell.)
                Publisher: Frontiers Media S.A.
                ISSN: 2624-8212
                Published: 14 July 2020
                Volume: 3
                Article: 42
                Affiliations
                [1] School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, Canada
                [2] Telfer School of Management, University of Ottawa, Ottawa, ON, Canada
                Author notes

                Edited by: Anis Yazidi, OsloMet—Oslo Metropolitan University, Norway

                Reviewed by: Lei Jiao, University of Agder, Norway; Ashish Rauniyar, University of Oslo, Norway, in Collaboration With Reviewer LJ; Imen Ben Sassi, Tallinn University of Technology, Estonia; Desta Haileselassie Hagos, Oslo Metropolitan University, Norway, in Collaboration With Reviewer IS

                *Correspondence: Rania Albalawi, ralba028@uottawa.ca

                This article was submitted to Machine Learning and Artificial Intelligence, a section of the journal Frontiers in Artificial Intelligence

                Article
                DOI: 10.3389/frai.2020.00042
                PMCID: PMC7861298
                PMID: 33733159
                Copyright © 2020 Albalawi, Yeap and Benyoucef.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 28 February 2020
                Accepted: 14 May 2020
                Page count
                Figures: 6, Tables: 6, Equations: 6, References: 71, Pages: 14, Words: 10254
                Categories
                Artificial Intelligence
                Methods

                Keywords: natural language processing, topic modeling, short text, user-generated content, online social networks
