
      Cyberbullying detection: advanced preprocessing techniques & deep learning architecture for Roman Urdu data

      research-article


          Abstract

          Social media have become a viable medium for communication, collaboration, and the exchange of information, knowledge, and ideas. However, because these platforms preserve anonymity, incidents of hate speech and cyberbullying have spread and diversified across the globe. This problem has recently drawn the attention of researchers and scholars worldwide, and studies have been undertaken to formulate strategies for automatic detection of cyberaggression and hate speech, ranging from machine learning models with large feature sets to more complex deep neural network models, across different social networking (SN) platforms. However, the existing research is directed towards mature languages and leaves a large gap for newly adopted, resource-poor languages. One such language, recently adopted worldwide and particularly in South Asian countries for communication on social media, is Roman Urdu, i.e., Urdu written in Roman script. To address this research gap, we performed extensive preprocessing on Roman Urdu microtext. This involved building a Roman Urdu slang-phrase dictionary and mapping slang terms after tokenization. We also eliminated cyberbullying domain-specific stop words to reduce the dimensionality of the corpus. The unstructured data were further processed to handle encoded text formats and metadata/non-linguistic features. Furthermore, we performed extensive experiments with RNN-LSTM, RNN-BiLSTM, and CNN models, varying the number of epochs and model layers and tuning hyperparameters, to analyze and uncover cyberbullying textual patterns in Roman Urdu. The efficiency and performance of the models were evaluated using different metrics to present a comparative analysis. Results show that RNN-LSTM and RNN-BiLSTM performed best, achieving validation accuracies of 85.5% and 85%, whereas the F1 scores were 0.7 and 0.67, respectively, over the aggression class.
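
          The preprocessing steps named in the abstract (tokenization, slang-phrase mapping, and stop-word removal) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary entries, the stop-word list, the tokenizer pattern, and the function name `preprocess` are all hypothetical placeholders, since the abstract does not give the actual resources.

```python
import re

# Hypothetical placeholder entries; the authors' actual slang-phrase
# dictionary and domain-specific stop-word list are not in the abstract.
SLANG_MAP = {
    "plz": "please",
    "gr8": "great",
    "gm": "good morning",  # a slang token may expand to a multi-word phrase
}
STOP_WORDS = {"hai", "ka", "ki", "ko"}

# Assumed tokenizer: lowercase runs of ASCII letters and digits.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def preprocess(text, slang_map=SLANG_MAP, stop_words=STOP_WORDS):
    """Tokenize, expand slang via dictionary lookup, then drop stop words."""
    tokens = TOKEN_RE.findall(text.lower())
    mapped = []
    for tok in tokens:
        # Replace a slang token by its mapped phrase, split back into tokens.
        mapped.extend(slang_map.get(tok, tok).split())
    # Removing stop words shrinks the vocabulary (dimensionality reduction).
    return [tok for tok in mapped if tok not in stop_words]


if __name__ == "__main__":
    print(preprocess("Plz ko gr8 hai!"))
```

          Mapping slang after tokenization, as the abstract describes, keeps the lookup a simple per-token dictionary access; stop words are filtered last so that words introduced by slang expansion are also checked.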


                Author and article information

                Contributors
                amirita@faculty.muet.edu.pk
                mohsin.memon@faculty.muet.edu.pk
                sania.bhatti@faculty.muet.edu.pk
                Journal
                Journal of Big Data (J Big Data)
                Springer International Publishing (Cham)
                ISSN: 2196-1115
                Published: 22 December 2021
                Volume 8, Issue 1, Article 160
                Affiliations
                Institute of Information and Communication Technologies, Department of Software Engineering, Mehran University of Engineering & Technology, Jamshoro, Sindh, Pakistan (GRID grid.444814.9; ISNI 0000 0001 0376 1014)
                Author information
                http://orcid.org/0000-0002-3816-3644
                http://orcid.org/0000-0003-2638-4252
                http://orcid.org/0000-0002-0887-8083
                Article
                550
                10.1186/s40537-021-00550-7
                8693595
                1325c304-8dee-4731-8c0c-eb3d23dbc1cb
                © The Author(s) 2021

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 3 September 2021
                Accepted: 10 December 2021
                Categories
                Research

                Keywords: advanced preprocessing, big data, deep learning, hate speech detection, cyberbullying
