      Bringing Big Data to Bear in Environmental Public Health: Challenges and Recommendations

      brief-report

          Abstract

          Understanding the role that the environment plays in influencing public health often involves collecting and studying large, complex data sets. There have been a number of private and public efforts to gather sufficient information and confront significant unknowns in the field of environmental public health, yet there is a persistent and largely unmet need for findable, accessible, interoperable, and reusable (FAIR) data. Even when data are readily available, the ability to create, analyze, and draw conclusions from these data using emerging computational tools, such as augmented and artificial intelligence (AI) and machine learning, requires technical skills not currently implemented on a programmatic level across research hubs and academic institutions. We argue that collaborative efforts in data curation and storage, scientific computing, and training are of paramount importance to empower researchers within environmental sciences and the broader public health community to apply AI approaches and fully realize their potential. Leaders in the field were asked to prioritize challenges in incorporating big data in environmental public health research: inconsistent implementation of FAIR principles in data collection and sharing, a lack of skilled data scientists and appropriate cyber-infrastructures, and limited understanding of possibilities and communication of benefits were among those identified. These issues are discussed, and actionable recommendations are provided.

          Most cited references (34)

          The FAIR Guiding Principles for scientific data management and stewardship

          There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them and some exemplar implementations in the community.

            Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures.

            Because humans are invariably exposed to complex chemical mixtures, estimating the health effects of multi-pollutant exposures is of critical concern in environmental epidemiology, and to regulatory agencies such as the U.S. Environmental Protection Agency. However, most health effects studies focus on single agents or consider simple two-way interaction models, in part because we lack the statistical methodology to more realistically capture the complexity of mixed exposures. We introduce Bayesian kernel machine regression (BKMR) as a new approach to study mixtures, in which the health outcome is regressed on a flexible function of the mixture (e.g. air pollution or toxic waste) components that is specified using a kernel function. In high-dimensional settings, a novel hierarchical variable selection approach is incorporated to identify important mixture components and account for the correlated structure of the mixture. Simulation studies demonstrate the success of BKMR in estimating the exposure-response function and in identifying the individual components of the mixture responsible for health effects. We demonstrate the features of the method through epidemiology and toxicology applications.

              The CompTox Chemistry Dashboard: a community data resource for environmental chemistry

              Despite an abundance of online databases providing access to chemical data, there is increasing demand for high-quality, structure-curated, open data to meet the various needs of the environmental sciences and computational toxicology communities. The U.S. Environmental Protection Agency’s (EPA) web-based CompTox Chemistry Dashboard is addressing these needs by integrating diverse types of relevant domain data through a cheminformatics layer, built upon a database of curated substances linked to chemical structures. These data include physicochemical, environmental fate and transport, exposure, usage, in vivo toxicity, and in vitro bioassay data, surfaced through an integration hub with link-outs to additional EPA data and public domain online resources. Batch searching allows for direct chemical identifier (ID) mapping and downloading of multiple data streams in several different formats. This facilitates fast access to available structure, property, toxicity, and bioassay data for collections of chemicals (hundreds to thousands at a time). Advanced search capabilities are available to support, for example, non-targeted analysis and identification of chemicals using mass spectrometry. The contents of the chemistry database, presently containing ~760,000 substances, are available as public domain data for download. The chemistry content underpinning the Dashboard has been aggregated over the past 15 years by both manual and auto-curation techniques within EPA’s DSSTox project. DSSTox chemical content is subject to strict quality controls to enforce consistency among chemical substance-structure identifiers, as well as list curation review to ensure accurate linkages of DSSTox substances to chemical lists and associated data. The Dashboard, publicly launched in April 2016, has expanded considerably in content and user traffic over the past year. It is continuously evolving with the growth of DSSTox into high-interest or data-rich domains of interest to EPA, such as chemicals on the Toxic Substances Control Act listing, while providing the user community with a flexible and dynamic web-based platform for integration, processing, visualization and delivery of data and resources. The Dashboard provides support for a broad array of research and regulatory programs across the worldwide community of toxicologists and environmental scientists. Electronic supplementary material: The online version of this article (10.1186/s13321-017-0247-6) contains supplementary material, which is available to authorized users.

                Author and article information

                Contributors
                Journal
                Frontiers in Artificial Intelligence (Front. Artif. Intell.)
                Frontiers Media S.A.
                ISSN: 2624-8212
                15 May 2020
                2020; 3: 31
                Affiliations
                [1] Department of Environmental Health Sciences, Yale School of Public Health, New Haven, CT, United States
                [2] Department of Statistics and Data Science, Yale University, New Haven, CT, United States
                [3] Symbrosia Inc, Kailua-Kona, HI, United States
                [4] US Environmental Protection Agency, Center for Public Health and Environmental Assessment, Research Triangle Park, NC, United States
                [5] Microsoft Corporation, AI for Earth, Redmond, WA, United States
                [6] National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, United States
                Author notes

                Edited by: Frank Emmert-Streib, Tampere University, Finland

                Reviewed by: Kristina Hettne, Center for Digital Scholarship at the Leiden University Library, Netherlands; Thomas Luechtefeld, Toxtrack LLC, United States

                *Correspondence: Vasilis Vasiliou, vasilis.vasiliou@yale.edu

                This article was submitted to Medicine and Public Health, a section of the journal Frontiers in Artificial Intelligence

                Article
                DOI: 10.3389/frai.2020.00031
                PMCID: PMC7654840
                PMID: 33184612
                04f1d4e4-424c-42f0-a992-6edb0285b514
                Copyright © 2020 Comess, Akbay, Vasiliou, Hines, Joppa, Vasiliou and Kleinstreuer.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 14 June 2019
                Accepted: 06 April 2020
                Page count
                Figures: 0, Tables: 1, Equations: 0, References: 49, Pages: 7, Words: 5738
                Categories
                Artificial Intelligence
                Perspective

                artificial intelligence, public health, machine learning, open data, environmental health sciences, big data
