      Is Open Access

      Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System

      Complexity
      Hindawi Limited


          Abstract

          Twitter is a virtual social network where people share posts and opinions about current events, such as the coronavirus pandemic. It is considered the most significant streaming data source for machine learning research in terms of analysis, prediction, knowledge extraction, and opinion mining. Sentiment analysis is a text analysis method that has gained further significance with the emergence of social networks. Therefore, this paper introduces a real-time system for sentiment prediction on Twitter streaming data for tweets about the coronavirus pandemic. The system aims to find the machine learning model that performs best at coronavirus sentiment prediction and then use it in real time. The system has two components: an offline sentiment analysis component and an online prediction pipeline. For the offline component, a dataset of historical tweets was collected between 23/01/2020 and 01/06/2020 and filtered by the #COVID-19 and #Coronavirus hashtags. Two feature extraction methods for textual data, n-gram and TF-IDF, were used to extract the essential features of the dataset. Then, five standard machine learning algorithms were applied and compared (decision tree, logistic regression, k-nearest neighbors, random forest, and support vector machine) to select the best model for the online prediction component. The online prediction pipeline was developed using the Twitter Streaming API, Apache Kafka, and Apache Spark. The experimental results indicate that the random forest (RF) model with unigram features achieved the best performance, and it is used for sentiment prediction on Twitter streaming data about the coronavirus.
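
The abstract names n-gram and TF-IDF feature extraction but does not spell out the computation. The following is a minimal sketch of both in plain Python, not the authors' implementation. The example tweets are invented, the helper names `ngrams` and `tfidf` are hypothetical, and the weighting shown (term frequency times `ln(N / df)`) is one common TF-IDF variant chosen here as an assumption.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Return one {n-gram: weight} dict per document.

    Assumed variant: tf = count / number of n-grams in the document,
    idf = ln(number of documents / documents containing the n-gram).
    """
    grams = [ngrams(doc.lower().split(), n) for doc in docs]
    # Document frequency: in how many documents each n-gram appears.
    df = Counter(g for doc in grams for g in set(doc))
    n_docs = len(docs)
    weights = []
    for doc in grams:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for very short texts
        weights.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return weights


# Invented example tweets, for illustration only.
tweets = [
    "stay home stay safe",
    "covid cases rise again",
    "stay positive covid will end",
]
unigram_weights = tfidf(tweets, n=1)
bigram_weights = tfidf(tweets, n=2)
```

In a setup like the one the abstract describes, per-tweet weight dictionaries such as these would be turned into feature vectors and passed to each candidate classifier for comparison.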


                Author and article information

                Journal: Complexity
                Publisher: Hindawi Limited
                ISSN: 1099-0526, 1076-2787
                Published: December 19 2020
                Volume: 2020
                Pages: 1-10
                Affiliations
                [1] School of Mathematics and Statistics, Yulin University, Yulin 719000, China
                [2] Faculty of Computers and Information, Minia University, Minya, Egypt
                [3] Faculty of Computers and Artificial Intelligence, South Valley University, Hurghada, Egypt
                [4] Faculty of Computer Science and Engineering, Hodeidah University, Al Hudaydah, Yemen
                Article
                DOI: 10.1155/2020/6688912
                © 2020

                https://creativecommons.org/licenses/by/4.0/
