      A Systematic Review of Re-Identification Attacks on Health Data

      research-article


          Abstract

          Background

          Privacy legislation in most jurisdictions allows the disclosure of health data for secondary purposes without patient consent if the data are de-identified. Some recent articles in the medical, legal, and computer science literature have argued that de-identification methods do not provide sufficient protection because they are easy to reverse. If so, this would have significant implications for how health information is disclosed, including: (a) potentially limiting its availability for secondary purposes such as research, and (b) resulting in more identifiable health information being disclosed. Our objectives in this systematic review were to: (a) characterize known re-identification attacks on health data and contrast them with re-identification attacks on other kinds of data, (b) compute the overall proportion of records that have been correctly re-identified in these attacks, and (c) assess whether these attacks demonstrate weaknesses in current de-identification methods.

          Methods and Findings

          Searches were conducted in IEEE Xplore, ACM Digital Library, and PubMed. After screening, fourteen eligible articles representing distinct attacks were identified. On average, approximately a quarter of the records were re-identified across all studies (mean proportion 0.26, 95% CI 0.046–0.478), and about a third in attacks on health data (mean proportion 0.34, 95% CI 0–0.744). The wide confidence intervals indicate considerable uncertainty around these proportions, and the mean proportion of records re-identified was sensitive to unpublished studies. Only two of the fourteen attacks were performed on data that had been de-identified using existing standards. Only one of these two attacks was on health data, and it had a success rate of 0.00013.
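          To make the reported summary statistics concrete, the following is a minimal sketch of how a mean proportion and an approximate 95% confidence interval could be computed from per-study re-identification rates. It is not the authors' method: the function name `mean_proportion_ci`, the unweighted mean, the normal approximation, and the clipping of the interval to [0, 1] are all assumptions made for illustration, and the input rates are hypothetical rather than taken from the review.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_proportion_ci(proportions, level=0.95):
    """Unweighted mean of per-study proportions with a normal-approximation CI.

    Assumption: every study counts equally and the interval is clipped to
    [0, 1], since a proportion cannot fall outside that range.
    """
    n = len(proportions)
    if n < 2:
        raise ValueError("need at least two studies to estimate a standard error")
    m = mean(proportions)
    se = stdev(proportions) / sqrt(n)          # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    lower = max(0.0, m - z * se)
    upper = min(1.0, m + z * se)
    return m, lower, upper


# Hypothetical per-study re-identification rates (NOT the review's data).
example_rates = [0.0, 0.1, 0.2, 0.3, 0.4]
m, lower, upper = mean_proportion_ci(example_rates)
print(f"mean={m:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

          With few studies and widely varying rates, the standard error is large and the interval becomes wide, which is the kind of uncertainty the abstract describes. A lower bound of exactly 0 can arise when the unclipped interval would extend below zero.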

          Conclusions

          The current evidence shows a high re-identification rate but is dominated by small-scale studies on data that was not de-identified according to existing standards. This evidence is insufficient to draw conclusions about the efficacy of de-identification methods.


                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS ONE
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 2 December 2011
                Volume 6, Issue 12, e28071
                Affiliations
                [1] Electronic Health Information Laboratory, CHEO Research Institute, Ottawa, Canada
                [2] Department of Paediatrics, University of Ottawa, Ottawa, Canada
                [3] Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, United States of America
                [4] Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America
                Johns Hopkins Bloomberg School of Public Health, United States of America
                Author notes

                Conceived and designed the experiments: KEE LA BM. Performed the experiments: KEE EJ LA BM. Analyzed the data: KEE EJ LA BM. Wrote the paper: KEE EJ LA BM.

                Article
                Manuscript number: PONE-D-11-14348
                DOI: 10.1371/journal.pone.0028071
                PMC ID: 3229505
                PubMed ID: 22164229
                1278d4c6-c710-4e9a-9246-70163370a9b2
                El Emam et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 26 July 2011
                Accepted: 31 October 2011
                Page count
                Pages: 12
                Categories
                Research Article
                Computer Science
                Computer Security
                Medicine
                Clinical Research Design
                Systematic Reviews
                Non-Clinical Medicine
                Communication in Health Care
                Health Care Policy
                Health Informatics
                Medical Ethics
                Science Policy
                Bioethics
                Research Integrity
                Publication Ethics
                Social and Behavioral Sciences
                Communications
