
      Characterizing Twitter Discussions About HPV Vaccines Using Topic Modeling and Community Detection

      research-article


          Abstract

          Background

          In public health surveillance, measuring how information enters and spreads through online communities may help us understand geographical variation in decision making associated with poor health outcomes.

          Objective

          Our aim was to evaluate the use of community structure and topic modeling methods as a process for characterizing the clustering of opinions about human papillomavirus (HPV) vaccines on Twitter.

          Methods

          The study examined Twitter posts (tweets) about HPV vaccines collected between October 2013 and October 2015. We tested Latent Dirichlet Allocation (LDA) and Dirichlet Multinomial Mixture (DMM) models for inferring the topics associated with tweets, and two methods for detecting the community structure of users from their social connections: community agglomeration (Louvain) and the encoding of random walks (Infomap). We examined the alignment between community structure and topics using several common clustering alignment measures and introduced a statistical measure of alignment based on the concentration of specific topics within a small number of communities. Visualizations of the topics and of the alignment between topics and communities are presented to support the interpretation of the results in the context of public health communication and the identification of communities at risk of rejecting the safety and efficacy of HPV vaccines.
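One widely used clustering alignment measure is normalized mutual information (NMI). The abstract does not name the specific measures that were used, so the following is only a minimal sketch, assuming each user carries one community label and one dominant topic label. The function names and toy labels are hypothetical and are not the study's code.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (in nats) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two labelings of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    # Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed label pairs.
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def normalized_mutual_information(a, b):
    """NMI with geometric-mean normalization; 1.0 means identical partitions."""
    if len(a) != len(b):
        raise ValueError("both labelings must cover the same items")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 or h_b == 0.0:
        # Convention chosen for this sketch: a single-cluster labeling
        # carries no information, so report zero alignment.
        return 0.0
    return mutual_information(a, b) / sqrt(h_a * h_b)


# Hypothetical labels for six users: detected community and dominant topic.
communities = ["c1", "c1", "c1", "c2", "c2", "c2"]
topics_aligned = ["evidence", "evidence", "evidence",
                  "experience", "experience", "experience"]

# Hypothetical labels for four users where topic is independent of community.
communities_small = ["c1", "c1", "c2", "c2"]
topics_independent = ["evidence", "experience", "evidence", "experience"]

nmi_aligned = normalized_mutual_information(communities, topics_aligned)
nmi_independent = normalized_mutual_information(communities_small, topics_independent)
```

Under this convention, values close to 1 indicate that topics and communities partition the users in nearly the same way, and values close to 0 indicate little alignment.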

          Results

          We analyzed 285,417 tweets about HPV vaccines from 101,519 users connected by 4,387,524 social connections. When we examined the alignment between the community structure and the topics of tweets, the Louvain community detection algorithm together with DMM produced consistently higher alignment values, and alignments were generally higher when the number of topics was lower. After applying the Louvain method and DMM with 30 topics and grouping semantically similar topics in a hierarchy, we characterized 163,148 (57.16%) tweets as evidence and advocacy, and 6244 (2.19%) tweets as describing personal experiences. Among the 4548 users who posted experiential tweets, 3449 users (75.84%) were found in communities where the majority of tweets were about evidence and advocacy.
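The last figure in the Results can be read as a two-step count: find the majority topic group of each community, then count the users who posted experiential tweets and who sit in communities whose majority is evidence and advocacy. The following is a minimal sketch on made-up data, assuming each tweet is recorded as a (user, community, topic group) tuple and each user belongs to exactly one community. The names and records are hypothetical and are not the study's data or code.

```python
from collections import Counter, defaultdict


def majority_topic_by_community(tweets):
    """Map each community to the topic group with the most tweets in it."""
    counts = defaultdict(Counter)
    for _user, community, topic in tweets:
        counts[community][topic] += 1
    # Ties are not handled specially in this sketch.
    return {c: tc.most_common(1)[0][0] for c, tc in counts.items()}


def users_in_majority_communities(tweets, user_topic, majority_topic):
    """Count users who posted `user_topic` tweets and, among them, those whose
    community is dominated by `majority_topic`. Returns (inside, total)."""
    majority = majority_topic_by_community(tweets)
    # One entry per user who posted at least one tweet in `user_topic`.
    community_of = {
        user: community
        for user, community, topic in tweets
        if topic == user_topic
    }
    inside = sum(
        1 for community in community_of.values()
        if majority[community] == majority_topic
    )
    return inside, len(community_of)


# Hypothetical records: (user, community, topic group).
toy_tweets = [
    ("u1", "A", "evidence"),
    ("u1", "A", "evidence"),
    ("u2", "A", "experience"),
    ("u3", "A", "evidence"),
    ("u6", "A", "experience"),
    ("u4", "B", "experience"),
    ("u4", "B", "experience"),
    ("u5", "B", "evidence"),
]

inside, total = users_in_majority_communities(toy_tweets, "experience", "evidence")
print(f"{inside} of {total} experiential users ({100 * inside / total:.2f}%)")
```

In the toy data, community A is dominated by evidence tweets and community B by experiential tweets, so only the experiential users in A are counted as inside an evidence-majority community.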

          Conclusions

          The use of community detection in concert with topic modeling appears to be a useful way to characterize Twitter communities for the purpose of opinion surveillance in public health applications. Our approach may help identify online communities at risk of being influenced by negative opinions about public health interventions such as HPV vaccines.


                Author and article information

                Journal: Journal of Medical Internet Research (J Med Internet Res; JMIR)
                Publisher: JMIR Publications (Toronto, Canada)
                ISSN: 1439-4456, 1438-8871
                Published: 29 August 2016
                Volume 18, Issue 8, e232
                Affiliations
                [1] Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, North Ryde, New South Wales, Australia
                [2] Department of Computing, Faculty of Science and Engineering, Macquarie University, North Ryde, New South Wales, Australia
                Author notes
                Corresponding Author: Didi Surian, didi.surian@mq.edu.au
                Author information
                http://orcid.org/0000-0003-2299-2971
                http://orcid.org/0000-0001-8214-2878
                http://orcid.org/0000-0003-2331-0530
                http://orcid.org/0000-0003-4809-8441
                http://orcid.org/0000-0002-6444-6584
                http://orcid.org/0000-0002-1720-8209
                Article
                Article ID: v18i8e232
                DOI: 10.2196/jmir.6045
                PMCID: PMC5020315
                PMID: 27573910
                c1f5206b-cc1c-4579-ba20-4e6e352abf6b
                © Didi Surian, Dat Quoc Nguyen, Georgina Kennedy, Mark Johnson, Enrico Coiera, Adam G Dunn. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 29.08.2016.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.

                History
                : 30 May 2016
                : 20 July 2016
                : 3 August 2016
                Categories
                Original Paper

                Medicine
                topic modelling, graph algorithms analysis, social media, public health surveillance
