      Is Open Access

      Data dictionary cookbook for research data and software interoperability at global scale

      poster


          Abstract

          We are now facing profound changes (biodiversity, climate, pandemic, etc.). Human impacts and their mitigation will depend on our ability to mobilize research at the global level. The sustainable development of society will largely depend on the sustainable development of global science and of scientific research tools, outputs, and ecosystems. This globalization of research requires making our observation and experimentation systems interoperable in order to better understand these changes and to better simulate their effects. The Covid-19 pandemic is now raging around the world. Reproducibility of research and results across regions and contexts should accelerate human responses. Data sharing and the development of synthesis research, with data aggregation at large scale, are critical to enabling such processes. This requires the use of common knowledge, vocabularies, standards, and procedures at a large scale. The objective of this poster is to report on the challenges met while building data dictionaries in three global projects related to biodiversity and/or disease research: PARSEC, Kakila, and ERINHA-Advance. The Kakila database centralizes and harmonizes marine mammal observation data for the AGOA sanctuary around the French archipelago of Guadeloupe, French Antilles. The PARSEC project is building new tools for data sharing and reuse through a transnational investigation of the socioeconomic impact of protected areas. The ERINHA-Advance project aims to support the operations of the ERINHA research infrastructure, which is designed to generate data from transnational access research activities on highly pathogenic agents. In these three global case studies, similar challenges have arisen: to aggregate and interoperate pre-existing heterogeneous data at the global scale, and to share common tools to monitor, maintain quality, scale, and cope with uncertainty.
This poster proposes a draft common methodology, a data dictionary cookbook, which will provide a roadmap towards building large-scale data dictionaries. Topics proposed for such a cookbook include: how to search for existing and appropriate data dictionaries, controlled vocabularies, or other semantic resources (before building a new one); the first steps of data dictionary building; data dictionary literacy (and why it is mandatory work); how to define all scientific objects and aspects (or reuse existing ones) and agree on the definitions with the whole community; building and proposing variables and indicators with ontology models, schemas, variable naming rules, and context awareness; and finally, addressing dimension issues in each context. The common experience of our three projects showed that we need to proceed step by step, as simply as possible, and to ensure that each step is understandable for the whole community. It is necessary to improve access to and reuse of all existing semantic materials, rather than trying to build a cathedral with a little spoon.
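
To make the cookbook topics more concrete, the following is a minimal sketch of what a data dictionary entry and a naming-rule check might look like. It is not taken from the poster or from any of the three projects: the field names (`definition`, `unit`, `vocabulary_uri`), the snake_case naming rule, and the example variables are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
import re

# Hypothetical naming rule: lowercase snake_case variable names.
NAME_RULE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass(frozen=True)
class Variable:
    """One data dictionary entry (all fields are illustrative)."""
    name: str
    definition: str
    dtype: type
    unit: str = ""            # empty string: dimensionless or not applicable
    vocabulary_uri: str = ""  # link to an existing term, if one was found


def build_dictionary(variables):
    """Index entries by name, rejecting rule violations and duplicates."""
    dictionary = {}
    for var in variables:
        if not NAME_RULE.match(var.name):
            raise ValueError(f"name breaks the naming rule: {var.name!r}")
        if var.name in dictionary:
            raise ValueError(f"duplicate variable: {var.name!r}")
        dictionary[var.name] = var
    return dictionary


def check_record(record, dictionary):
    """Return a list of problems found in one observation record."""
    problems = []
    for key, value in record.items():
        entry = dictionary.get(key)
        if entry is None:
            problems.append(f"undefined variable: {key}")
        elif not isinstance(value, entry.dtype):
            problems.append(f"{key}: expected {entry.dtype.__name__}")
    return problems


# Example entries, invented for this sketch.
DICTIONARY = build_dictionary([
    Variable("species_name", "Scientific name of the observed taxon", str),
    Variable("group_size", "Number of individuals sighted together", int, "count"),
    Variable("sea_surface_temp", "Water temperature at the surface", float, "degC"),
])
```

Calling `check_record(record, DICTIONARY)` on an incoming record returns an empty list when every field is defined and correctly typed. Keeping definitions, units, and links to existing vocabularies in one shared structure is one simple way to let a whole community review and agree on each variable before data are aggregated.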

          Author and article information

          Journal
          Zenodo
          2021
          20 April 2021
          Affiliations
          [1 ] ERINHA AISBL (European Research Infrastructure on Highly Pathogenic Agents)
          [2 ] OMMAG
          [3 ] IUEM-Université de Bretagne Occidentale
          [4 ] University of São Paulo
          [5 ] World Data System
          [6 ] ORCID
          [7 ] Univ Brest, CNRS, Sorbonne Université, ISYEB
          [8 ] Research Institute for Humanity and Nature
          [9 ] LETG, Université de Bretagne Occidentale
          [10 ] PNDB, UMS 2006 PatriNat
          [11 ] CNRS, UMR GEODE, Université Toulouse 2
          [12 ] University of Toulouse
          [13 ] National Institute of Information and Communications Technology
          [14 ] University of California Santa Barbara
          [15 ] Tokyo Metropolitan University
          [16 ] Scientific Electronic Library Online
          [17 ] The University of Queensland
          [18 ] American Geophysical Union
          [19 ] National University
          Author information
          https://orcid.org/0000-0003-4073-7456
          https://orcid.org/0000-0003-3621-1005
          https://orcid.org/0000-0003-4909-4848
          https://orcid.org/0000-0002-8743-4244
          https://orcid.org/0000-0001-7862-8955
          https://orcid.org/0000-0002-8795-8056
          https://orcid.org/0000-0001-7670-4475
          https://orcid.org/0000-0002-8933-743X
          https://orcid.org/0000-0002-8504-068X
          https://orcid.org/0000-0002-0864-659X
          https://orcid.org/0000-0002-7724-1721
          https://orcid.org/0000-0002-1202-0194
          https://orcid.org/0000-0001-8608-3895
          https://orcid.org/0000-0003-1129-334X
          https://orcid.org/0000-0002-1693-8322
          https://orcid.org/0000-0002-2098-0902
          https://orcid.org/0000-0002-0207-0139
          https://orcid.org/0000-0002-2713-6202
          https://orcid.org/0000-0002-2623-0854
          https://orcid.org/0000-0003-2926-8353
          https://orcid.org/0000-0002-3223-6996
          https://orcid.org/0000-0001-5976-4943
          Article
          10.5281/zenodo.4683066
          1aeadd06-5284-4cf2-a2e0-5267b71adf1f

          Creative Commons Attribution 4.0 International

          Data dictionary, cookbook, Research Data Management, Interoperability, reproducibility, FAIR Data, Data Reuse, Data Aggregation
