      Reassessing Google Flu Trends Data for Detection of Seasonal and Pandemic Influenza: A Comparative Epidemiological Study at Three Geographic Scales


          Abstract

          The goal of influenza-like illness (ILI) surveillance is to determine the timing, location and magnitude of outbreaks by monitoring the frequency and progression of clinical case incidence. Advances in computational and information technology have allowed for automated collection of higher volumes of electronic data and more timely analyses than previously possible. Novel surveillance systems, including those based on internet search query data like Google Flu Trends (GFT), are being used as surrogates for clinically based reporting of ILI. We investigated the reliability of GFT during the last decade (2003 to 2013), and compared weekly public health surveillance with search query data to characterize the timing and intensity of seasonal and pandemic influenza at the national (United States), regional (Mid-Atlantic) and local (New York City) levels. We identified substantial flaws in the original and updated GFT models at all three geographic scales, including completely missing the first wave of the 2009 influenza A/H1N1 pandemic, and greatly overestimating the intensity of the A/H3N2 epidemic during the 2012/2013 season. These results were obtained for both the original (2008) and the updated (2009) GFT algorithms. The performance of both models was problematic, perhaps because of changes in internet search behavior and differences in the seasonality, geographical heterogeneity and age-distribution of the epidemics between the periods of GFT model-fitting and prospective use. We conclude that GFT data may not provide reliable surveillance for seasonal or pandemic influenza and should be interpreted with caution until the algorithm can be improved and evaluated. Current internet search query data are no substitute for timely local clinical and laboratory surveillance, or national surveillance based on local data collection. New generation surveillance systems such as GFT should incorporate the use of near-real time electronic health data and computational methods for continued model-fitting and ongoing evaluation and improvement.
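
The kind of comparison the abstract describes, weekly surveillance ILI against a search-based estimate in terms of timing and intensity, can be illustrated with a minimal Python sketch. It is not the authors' method: the function names (`pearson`, `compare_season`), the choice of metrics (correlation, peak-week offset, peak-intensity ratio) and the numbers are all assumptions made for illustration, and the values are invented rather than taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length weekly series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def compare_season(surveillance, estimate):
    """Summarize timing and intensity agreement for one season.

    Both arguments are lists of weekly values on the same scale
    (for example, percent of visits for ILI), aligned week by week.
    """
    peak_s = max(range(len(surveillance)), key=surveillance.__getitem__)
    peak_e = max(range(len(estimate)), key=estimate.__getitem__)
    return {
        # Agreement in the overall shape of the two curves.
        "correlation": pearson(surveillance, estimate),
        # Positive: the estimate peaks later than surveillance.
        "peak_week_offset": peak_e - peak_s,
        # Above 1: the estimate overstates peak intensity.
        "peak_ratio": estimate[peak_e] / surveillance[peak_s],
    }


# Invented illustrative values (hypothetical weekly %ILI), NOT study data.
ili = [1.0, 1.5, 2.5, 4.0, 3.0, 2.0, 1.2]
gft = [1.2, 2.0, 4.0, 7.0, 8.0, 4.5, 2.0]
result = compare_season(ili, gft)
print(result)
```

With these invented series the estimate peaks one week after surveillance and at twice its peak value, while the two curves remain positively correlated. This is why peak timing and peak intensity are worth reporting alongside correlation: a high correlation alone can mask a large overestimate.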

          Author Summary

          In November 2008, Google Flu Trends was launched as an open tool for influenza surveillance in the United States. Engineered as a system for early detection and daily monitoring of the intensity of seasonal influenza epidemics, Google Flu Trends uses internet search data and a proprietary algorithm to provide a surrogate measure of influenza-like illness in the population. During its first season of operation, the novel A/H1N1-pdm influenza virus emerged, heterogeneously causing sporadic outbreaks in the spring and summer of 2009 across many parts of the United States. During the autumn 2009 pandemic wave, Google updated their model with a new algorithm and case definition; the updated model has run prospectively since. Our study asks whether Google Flu Trends provides accurate detection and monitoring of influenza at the national, regional and local geographic scales. Reliable local surveillance is important to reduce uncertainty and improve situational awareness during seasonal epidemics and pandemics. We found substantial flaws with the original and updated Google Flu Trends models, including missing the emergence of the 2009 pandemic and overestimating the 2012/2013 influenza season epidemic. Our work supports the development of local near-real time computerized syndromic surveillance systems, and collaborative regional, national and international networks.

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 17 October 2013
                Volume 9, Issue 10, e1003256
                Affiliations
                [1] New York City Department of Health and Mental Hygiene, New York, New York, United States of America
                [2] Fogarty International Center, National Institutes of Health, Bethesda, Maryland, United States of America
                [3] Department of Global Health, George Washington University, Washington, D.C., United States of America
                Imperial College London, United Kingdom
                Author notes

                LS worked as a contractor for SDI (now IMS) up to 2010 to research and promote the use of electronic claims data for public health surveillance. Otherwise, the authors have declared that no competing interests exist.

                Conceived and designed the experiments: DRO CV LS. Performed the experiments: DRO CV LS. Analyzed the data: DRO CV LS. Contributed reagents/materials/analysis tools: DRO CV LS. Wrote the paper: DRO KJK MP CV LS. Participated in data acquisition: DRO KJK MP. Wrote the first draft of the manuscript: DRO.

                Article
                Manuscript: PCOMPBIOL-D-13-00957
                DOI: 10.1371/journal.pcbi.1003256
                PMCID: 3798275
                PMID: 24146603
                f144babb-b8dd-4366-bbbf-ce173ae7413b
                Copyright © 2013

                This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

                History
                Received: 29 May 2013
                Accepted: 20 August 2013
                Page count
                Pages: 11
                Funding
                DRO acknowledges support from the Markle Foundation, through the Distributed Surveillance Taskforce for Real-time Influenza Burden Tracking and Evaluation (DiSTRIBuTE) Project (081005BP-Q and 101003BP-B), and the Alfred P. Sloan Foundation, Syndromic Surveillance Evaluation Project (NYC DOHMH, 2010-12-14). LS acknowledges support from the RAPIDD (Research and Policy for Infectious Disease Dynamics) program of the Science and Technology Directorate, Department of Homeland Security, and the Fogarty International Center. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article

                Quantitative & Systems biology
