
      FluBreaks: Early Epidemic Detection from Google Flu Trends

      Fahad Pervaiz, BS (1); Mansoor Pervaiz, BS (1); Nabeel Abdur Rehman, BS (current) (1); Umar Saif, BS, PhD, Postdoc (1)
      Journal of Medical Internet Research
      Gunther Eysenbach
      Influenza, public health, epidemics, statistical distributions, early response




          The Google Flu Trends service was launched in 2008 to track changes in the volume of online search queries related to flu-like symptoms. Over the last few years, the trend data produced by this service has shown a consistent relationship with the actual number of flu reports collected by the US Centers for Disease Control and Prevention (CDC), often identifying increases in flu cases weeks in advance of CDC records. However, contrary to popular belief, Google Flu Trends is not an early epidemic detection system. Instead, it is designed as a baseline indicator of the trend, or changes, in the number of disease cases.


          Our objective was to evaluate whether these trends can be used as a basis for an early warning system for epidemics.


          We present the first detailed algorithmic analysis of how Google Flu Trends can be used as a basis for building a fully automated system for early warning of epidemics in advance of methods used by the CDC. Based on our work, we present a novel early epidemic detection system, called FluBreaks (dritte.org/flubreaks), based on Google Flu Trends data. We compared the accuracy and practicality of three types of algorithms: normal distribution algorithms, Poisson distribution algorithms, and negative binomial distribution algorithms. We explored the relative merits of these methods, and related our findings to changes in Internet penetration and population size for the regions in Google Flu Trends providing data.
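The idea behind a Poisson distribution algorithm can be illustrated with a minimal sketch. This is not the FluBreaks implementation: the function names `poisson_sf` and `poisson_alarms`, the trailing baseline window of 8 observations, and the tail-probability threshold `alpha=0.01` are all assumptions chosen for illustration. The sketch raises an alarm when the current count is improbably high under a Poisson model whose rate is the mean of the preceding observations.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam); k must be a non-negative integer."""
    if k <= 0:
        return 1.0
    # Sum P(X = i) for i = 0..k-1 in log space so that large rates do not underflow.
    log_lam = math.log(lam)
    cdf = sum(math.exp(-lam + i * log_lam - math.lgamma(i + 1)) for i in range(k))
    return max(0.0, 1.0 - cdf)


def poisson_alarms(counts, window=8, alpha=0.01):
    """Return the indices t at which counts[t] is flagged as aberrant.

    Assumption (illustrative only): the baseline rate is the mean of the
    `window` observations immediately before t, and an alarm is raised when
    the upper-tail probability of counts[t] falls below `alpha`.
    """
    alarms = []
    for t in range(window, len(counts)):
        baseline = counts[t - window:t]
        lam = sum(baseline) / window
        if lam > 0 and poisson_sf(counts[t], lam) < alpha:
            alarms.append(t)
    return alarms


if __name__ == "__main__":
    # Hypothetical weekly counts with a single spike at index 9.
    weekly = [10, 12, 9, 11, 10, 13, 11, 10, 11, 30, 12]
    print(poisson_alarms(weekly))
```

A negative binomial variant would follow the same pattern but replace the tail probability with one that allows the variance to exceed the mean, which matters when the series is noisier than a Poisson model permits.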


          We measured performance by percentage of true-positives (RTP), percentage of false-positives (RFP), percentage overlap (OT), and percentage of early alarms (EA). Poisson- and negative binomial-based algorithms performed better than normal distribution-based algorithms on every metric except RFP. Poisson-based algorithms had average values of 99%, 28%, 71%, and 76% for RTP, RFP, OT, and EA, respectively, whereas negative binomial-based algorithms had average values of 97.8%, 17.8%, 60%, and 55% for RTP, RFP, OT, and EA, respectively. EA was also affected by the region's population size: for both negative binomial- and Poisson-based algorithms, regions with larger populations (regions 4 and 6) had higher EA values than region 10, which had the smallest population. The difference was 12.5% on average for negative binomial-based algorithms and 13.5% for Poisson-based algorithms.


          We present the first detailed comparative analysis of popular early epidemic detection algorithms on Google Flu Trends data. We note that using Google Flu Trends for early detection requires moving beyond the normal distribution approaches traditionally employed by the CDC, which are based on the cumulative sum and historical limits methods, to negative binomial- and Poisson-based algorithms that can deal with potentially noisy search query data from regions with varying population sizes and Internet penetration. Based on our work, we have developed FluBreaks, an early warning system for flu epidemics using Google Flu Trends.


                Author and article information

                Journal of Medical Internet Research (J Med Internet Res)
                Gunther Eysenbach (JMIR Publications Inc., Toronto, Canada)
                Sep-Oct 2012; published 04 October 2012
                Volume 14, Issue 5, e125
                [1] School of Science and Engineering, Computer Science Department, Lahore University of Management Sciences, Lahore, Pakistan
                ©Fahad Pervaiz, Mansoor Pervaiz, Nabeel Abdur Rehman, Umar Saif. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 04.10.2012.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.

                : 08 March 2012
                : 29 March 2012
                : 18 May 2012
                : 10 July 2012
                Original Paper


