      Enhancing Seasonal Influenza Surveillance: Topic Analysis of Widely Used Medicinal Drugs Using Twitter Data

      research-article


          Abstract

          Background

          Uptake of medicinal drugs (preventive or treatment) is among the approaches used to control disease outbreaks. Successful implementation of control measures therefore depends on knowing how frequently the most commonly used drugs are consumed and which topics about these drugs are trending among consumers. Traditional survey methods could provide this information, but they are costly in the resources they require and are subject to social desirability bias when used for topic discovery. Hence, there is a need for efficient alternatives such as Twitter data and machine learning (ML) techniques.

          Objective

          Using Twitter data, this study aimed to (1) provide a methodological extension for efficiently identifying the drugs widely consumed during seasonal influenza and (2) extract topics from tweets about these drugs and infer how the insights these topics provide can enhance seasonal influenza surveillance.

          Methods

          From tweets collected during the 2012-13 flu season, we first identified tweets with mentions of drugs and then constructed an ML classifier using dependency words as features. The classifier was used to extract tweets that evidenced consumption of drugs, from which we identified the most widely consumed drugs. Finally, we extracted trending topics from the tweets about each of these widely used drugs using latent Dirichlet allocation (LDA).
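The topic-extraction step can be illustrated with a minimal sketch. The abstract does not say how LDA was fitted, so the collapsed Gibbs sampler below is an assumption chosen because it fits in a few lines of standard-library Python. The toy tweets, the number of topics, and the hyperparameters `alpha` and `beta` are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on a list of tokenized documents.

    Returns (topic_word, doc_topic): per-topic word counts and
    per-document topic counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word, doc_topic


def top_words(topic_word, n=3):
    """Return up to n highest-count words for each topic."""
    return [
        [w for w, c in counts.most_common(n) if c > 0] for counts in topic_word
    ]


# Hypothetical, hand-written tweets standing in for one drug's tweet set.
toy_tweets = [
    "got my flu shot today arm sore".split(),
    "flu shot made my arm sore all day".split(),
    "free flu shot at work today".split(),
    "work offers free flu shot this week".split(),
]
topic_word, doc_topic = lda_gibbs(toy_tweets, num_topics=2)
topics = top_words(topic_word)
```

In this sketch each drug would get its own call to `lda_gibbs` on that drug's tweets, mirroring the per-drug topic extraction the Methods describe. A real analysis would also need tokenization, stop-word removal, and a principled choice of the number of topics, none of which is shown here.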

          Results

          Our proposed classifier obtained an F1 score of 0.82, which significantly outperformed the two benchmark classifiers (P<.001 against the lexicon-based classifier and P=.048 against the 1-gram term frequency [TF] classifier). The classifier extracted 40,428 tweets that evidenced consumption of drugs out of 50,828 tweets with mentions of drugs. The most widely consumed drugs were influenza virus vaccines, which accounted for 76.95% (31,111/40,428) of the total; other notable drugs were Theraflu, DayQuil, NyQuil, vitamins, acetaminophen, and oseltamivir. The topics for each of these drugs exhibited common themes or experiences from people who had consumed them. Among these were the enabling and deterrent factors to influenza drug uptake, which are key to mitigating the severity of seasonal influenza outbreaks.
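The reported proportions can be checked with a few lines of arithmetic. The sketch below recomputes the vaccine share from the counts given above and also derives the fraction of drug-mention tweets that were classified as consumption, a figure the abstract does not state explicitly. The `f1_score` helper shows the standard definition of F1 as the harmonic mean of precision and recall. The abstract reports only the F1 value, so the precision and recall passed to it are purely hypothetical.

```python
def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Counts taken from the Results paragraph.
vaccine_share = round(share(31_111, 40_428), 2)      # 76.95
consumption_rate = round(share(40_428, 50_828), 2)   # 79.54

# Hypothetical precision and recall, for illustration only;
# the abstract does not report them.
example_f1 = f1_score(0.80, 0.80)
```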

          Conclusions

          The study results showed the feasibility of using tweets about widely consumed drugs to enhance seasonal influenza surveillance in lieu of conventional surveillance approaches. Public health officials and other stakeholders can benefit from the findings of this study, especially in enhancing strategies for mitigating the severity of seasonal influenza outbreaks. The proposed methods can be extended to outbreaks of other diseases.


                Author and article information

                Contributors
                Journal: Journal of Medical Internet Research (J Med Internet Res; JMIR)
                Publisher: JMIR Publications (Toronto, Canada)
                ISSN: 1439-4456, 1438-8871
                Published: 12 September 2017
                Volume 19, Issue 9, e315
                Affiliations
                [1] School of Management and Economics, Beijing Institute of Technology, Beijing, China
                [2] Sustainable Development Research Institute for Economy and Society of Beijing, Beijing, China
                [3] School of Life Science, Department of Biomedical Engineering, Beijing Institute of Technology, Beijing, China
                Author notes
                Corresponding Author: Zhijun Yan yanzhijun@ 123456bit.edu.cn
                Author information
                http://orcid.org/0000-0003-2990-6319
                http://orcid.org/0000-0003-1727-1176
                http://orcid.org/0000-0002-3278-6402
                Article
                Article ID: v19i9e315
                DOI: 10.2196/jmir.7393
                PMCID: PMC5617904
                PMID: 28899847
                b0326df6-03aa-461b-ac2b-0dcd9aadc1fd
                ©Ireneus Kagashe, Zhijun Yan, Imran Suheryani. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 12.09.2017.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.

                History
                : 25 January 2017
                : 18 February 2017
                : 9 June 2017
                : 26 July 2017
                Categories
                Original Paper

                Medicine
                Keywords: machine learning, Twitter messaging, social media, disease outbreaks, influenza, public health surveillance, natural language processing, influenza vaccines
