      Data harmonization and federated analysis of population-based studies: the BioSHaRE project

      research-article

          Abstract

          Background

          Individual-level data pooling of large population-based studies across research centres in international research projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these issues by building a collaborative group of investigators and developing tools for data harmonization, database integration and federated data analyses.

          Methods

          Eight population-based studies in six European countries were recruited to participate in the BioSHaRE project. Through workshops, teleconferences and electronic communications, participating investigators identified a set of 96 variables targeted for harmonization to answer research questions of interest. The harmonization potential of each variable was assessed using each study’s questionnaires, standard operating procedures, and data dictionaries. Whenever harmonization was deemed possible, processing algorithms were developed and implemented in an open-source software infrastructure to transform study-specific data into the target (i.e. harmonized) format. Harmonized datasets located on servers in each research centre across Europe were interconnected through a federated database system to perform statistical analysis.
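          As an illustration of what a processing algorithm of this kind might look like, the following minimal Python sketch maps two differently coded study variables onto one target variable. It is not taken from the BioSHaRE infrastructure: the study names, the source fields (`smoking_status`, `cigs_per_day`) and the target variable (`current_smoker`) are all hypothetical and chosen only to show the idea of one algorithm per study-variable pair.

```python
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]

# Hypothetical target variable: current_smoker, coded 1 (yes), 0 (no), None (missing).


def harmonize_study_a(rec: Record) -> Optional[int]:
    """Hypothetical study A stores a category: 'never', 'former' or 'current'."""
    value = rec.get("smoking_status")
    if value is None:
        return None
    return 1 if value == "current" else 0


def harmonize_study_b(rec: Record) -> Optional[int]:
    """Hypothetical study B stores the number of cigarettes smoked per day."""
    value = rec.get("cigs_per_day")
    if value is None:
        return None
    return 1 if value > 0 else 0


# One processing algorithm per study for the same target variable.
ALGORITHMS: Dict[str, Callable[[Record], Optional[int]]] = {
    "study_a": harmonize_study_a,
    "study_b": harmonize_study_b,
}


def harmonize(study: str, records: List[Record]) -> List[Record]:
    """Transform study-specific records into the common (harmonized) format."""
    algorithm = ALGORITHMS[study]
    return [{"current_smoker": algorithm(rec)} for rec in records]


if __name__ == "__main__":
    print(harmonize("study_a", [{"smoking_status": "current"}, {"smoking_status": "never"}]))
    print(harmonize("study_b", [{"cigs_per_day": 10}, {}]))
```

          Keeping each study's rule in its own function mirrors the assessment step described above: a study for which harmonization is judged impossible simply has no entry for that target variable.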

          Results

          Retrospective harmonization led to the generation of common format variables for 73% of matches considered (96 targeted variables across 8 studies). Using the DataSHIELD method, authenticated investigators can now perform complex statistical analyses of harmonized datasets stored on distributed servers without sharing individual-level data.
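          The general principle of analysing distributed data without moving individual-level records can be sketched as follows. This is a simplified Python illustration, not the DataSHIELD API (DataSHIELD itself is R-based): the `SiteSummary` type and both functions are hypothetical names. Each site computes only aggregate statistics locally, and the coordinating analyst combines those aggregates into a pooled mean and variance.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SiteSummary:
    """Aggregate statistics that leave a site; no individual values are included."""
    n: int
    total: float
    total_sq: float


def local_summary(values: Sequence[float]) -> SiteSummary:
    """Runs at each study site, on data that never leaves that site."""
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pooled_mean_variance(summaries: List[SiteSummary]) -> Tuple[float, float]:
    """Runs at the analyst's side, using only the per-site aggregates."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from pooled sums: (sum of squares - n * mean^2) / (n - 1)
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


if __name__ == "__main__":
    site_1 = local_summary([1.0, 2.0, 3.0])  # stays on server 1
    site_2 = local_summary([4.0, 5.0])       # stays on server 2
    print(pooled_mean_variance([site_1, site_2]))
```

          The pooled result equals what would be obtained from the combined individual-level data, although only three numbers per site are exchanged. A production system would additionally need authentication and disclosure checks, which this sketch omits.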

          Conclusion

          New Internet-based networking technologies and database management systems are providing the means to support collaborative, multi-centre research in an efficient and secure manner. The results from this pilot project show that, given a strong collaborative relationship between participating studies, it is possible to seamlessly co-analyse internationally harmonized research databases while allowing each study to retain full control over individual-level data. We encourage additional collaborative research networks in epidemiology, public health, and the social sciences to make use of the open-source tools presented herein.

                Author and article information

                Contributors
                Journal
                Emerg Themes Epidemiol
                Emerging Themes in Epidemiology
                BioMed Central
                1742-7622
                2013
                21 November 2013
                Volume: 10
                Article number: 12
                Affiliations
                [1] Research Institute of the McGill University Health Centre, 2155 Guy, office 458, Montreal, Quebec H3H 2R9, Canada
                [2] Public Population Project in Genomics and Society, Montreal, Canada
                [3] Ontario Institute for Cancer Research, MaRS Centre, Toronto, Canada
                [4] D2K Research Group, School of Social and Community Medicine, University of Bristol, Bristol, UK
                [5] Department of Endocrinology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
                [6] Department of Chronic Disease Prevention, Public Health Genomics Unit, National Institute for Health and Welfare, Helsinki, Finland
                [7] Institute for Molecular Medicine, University of Helsinki, Helsinki, Finland
                [8] Department of Epidemiology, University Medical Center Groningen, Groningen, The Netherlands
                [9] European Academy of Bolzano/Bozen (EURAC), Center for Biomedicine, Bolzano, Italy
                [10] Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
                [11] Department of Public Health and General Practice, HUNT Research Center, Norwegian University of Science and Technology, Trondheim, Norway
                [12] Department of Cardiology and Epidemiology, University Medical Centre Groningen, Groningen, The Netherlands
                [13] Respiratory Epidemiology, Occupational Medicine and Public Health, National Heart and Lung Institute, Imperial College, London, UK
                Article
                1742-7622-10-12
                10.1186/1742-7622-10-12
                4175511
                24257327
                Copyright © 2013 Doiron et al.; licensee BioMed Central Ltd.

                This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 3 July 2013
                Accepted: 11 November 2013
                Categories
                Analytic Perspective

                Public health