
      Improving the Adoption and Evolution of Data Standards for Fossil Specimens

      Biodiversity Information Science and Standards
      Pensoft Publishers


          Abstract

          As we atomize and expand the digital representation of specimen information through data standards, it is critical to evaluate the implementation of these developments, including how well they serve discipline-specific needs. In particular, fossil specimens often present challenges because they require information to be captured that is seemingly parallel to, but not entirely aligned with, that of their extant counterparts. Previous work to evaluate data sharing practices of paleontology collections has shown an imbalance in the use of Darwin Core (DwC) (Wieczorek et al. 2012) terms and many instances of underutilized terms (Little 2018). To expand upon that broad assessment and encourage better adoption of evolving standards and data practices by fossil collections, a more in-depth review of term usage is necessary. Here we review specific DwC terms that are underutilized or that present challenges for fossil occurrence records, and we examine the subsequent impact on data discovery of paleo specimens. We conclude by sharing options for improving standards implementation within a paleo context.

          We see key patterns and challenges in the current implementation of DwC in paleo collections, as evidenced by evaluations of the typical mappings found in occurrence records for fossil specimens, data flags applied by aggregators, and discussions within the paleo collections community. These can be organized into three broad groupings.

          Group 1: Some DwC terms (or classes of terms) are clear to implement, but are underutilized due to issues that are also found within the neontological community. Example: Location. In the case of terms related to the Location class, paleontology needs a way to deal with sensitive locality information. The sensitivity here typically relates to laws restricting the sharing of locality information to protect fossil sites, whereas neontological requirements aim to protect threatened, rare, or endangered species. The end goal is the same: to fuzz locality information without making the specimen record undiscoverable or unusable. There is a need for better education at the paleo data provider level on standards for recording and sharing information in this category, which could be based on existing neontological community standards.

          Group 2: A second group of DwC terms often seem clear to implement, but the terminology used to describe and define them might be unfamiliar to paleontologists or read as unnecessary for fossil occurrences. This uncertainty about the applicability of a term to paleo data can often result in data not being mapped or fully shared. Example: recordedBy (= collector). In these cases, a simple translation of the definition into wording that is familiar to paleontologists, or the inclusion of paleo-oriented examples in the DwC documentation, can make implementation clear.

          Group 3: A third group of issues relates to DwC terms, classes, and/or extensions that are more complicated in the context of fossil vs. neontological data. In some cases use of these terms is complicated for neontological data as well, but perhaps for different reasons. The terms impacted by these challenges can sometimes have the same general use, but due to the nature of fossil preservation, or because a term has a different meaning within the discipline of paleontology, additional layers of uncertainty or ambiguity are present. Examples: Resource Relationship/Interactions, Individual count, Preparations, Taxon. Review of these terms and their related classes and/or the extensions they are part of has revealed that they might require qualification, further explanation, additional vocabulary terms, or even special handling instructions when data are ingested and normalized at the aggregator level. This group of issues is more complicated to resolve, but the problems are not intractable and can progress toward solutions through further discussion within the community, active participation in the standards development and review process, and development of clear guidelines.

          Strategically assessing these terms and generating discipline-specific guidelines to be used by the paleo community can improve the mobilization and discovery of fossil occurrence data. Documenting these paleo data practices not only helps data providers, it also increases the utility of these data within the broader research community by clearly outlining how the terms were used. Overall, this discipline-focused approach to understanding the implementation of data standards like DwC at the term level helps to increase knowledge sharing across the paleo community, improves data quality and standards adoption, and moves these datasets towards alignment with best practices like the FAIR (Findable, Accessible, Interoperable, Reusable) data principles.
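
          The Group 1 goal of fuzzing locality information while keeping a record discoverable could look roughly like the sketch below. It is an illustration only, not a method described in the abstract: the function name, the choice of which terms to withhold, the rounding precision, and the sample record are all assumptions. The keys are Darwin Core term names (decimalLatitude, decimalLongitude, locality, verbatimLocality, dataGeneralizations, informationWithheld).

```python
from typing import Dict

# Assumption: these are the terms a provider chooses to withhold entirely.
SENSITIVE_TERMS = ("locality", "verbatimLocality")


def generalize_locality(record: Dict[str, str], decimals: int = 1) -> Dict[str, str]:
    """Return a copy of a flat DwC-style record with coarsened locality data.

    Hypothetical helper: rounds coordinates, drops fine-grained locality
    text, and documents both actions so data users know what was changed.
    """
    out = dict(record)

    # Round coordinates to a coarser precision instead of deleting them,
    # so the record can still be found in regional searches.
    for term in ("decimalLatitude", "decimalLongitude"):
        value = out.get(term)
        if value not in (None, ""):
            out[term] = f"{round(float(value), decimals):.{decimals}f}"

    # Remove fine-grained locality text and remember what was removed.
    withheld = []
    for term in SENSITIVE_TERMS:
        if out.pop(term, None):
            withheld.append(term)

    # Record what was done, using the DwC terms intended for that purpose.
    out["dataGeneralizations"] = f"Coordinates rounded to {decimals} decimal place(s)"
    if withheld:
        out["informationWithheld"] = "Withheld to protect fossil site: " + ", ".join(withheld)
    return out


# Made-up example record, not real specimen data.
example = {
    "occurrenceID": "urn:example:1",
    "decimalLatitude": "44.66312",
    "decimalLongitude": "-110.46871",
    "locality": "hypothetical quarry",
    "stateProvince": "Wyoming",
}
shared = generalize_locality(example)
```

          Higher-level geography such as stateProvince is left untouched here so that the shared record remains searchable; a real policy would follow the applicable laws and community guidelines.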

          Most cited references


          Darwin Core: An Evolving Community-Developed Biodiversity Data Standard

          Biodiversity data derive from myriad sources stored in various formats on many distinct hardware and software platforms. An essential step towards understanding global patterns of biodiversity is to provide a standardized view of these heterogeneous data sources to improve interoperability. Fundamental to this advance are definitions of common terms. This paper describes the evolution and development of Darwin Core, a data standard for publishing and integrating biodiversity information. We focus on the categories of terms that define the standard, differences between simple and relational Darwin Core, how the standard has been implemented, and the community processes that are essential for maintenance and growth of the standard. We present case-study extensions of the Darwin Core into new research communities, including metagenomics and genetic resources. We close by showing how Darwin Core records are integrated to create new knowledge products documenting species distributions and changes due to environmental perturbations.

            Establishing a New Framework for Paleontological Data Through an Evaluation of Current Data Sharing Practices

            The long-term lifecycle management of natural history data requires careful planning. Elements that have a significant impact on this planning include data quality, domain-specific requirements, and data interoperability. Standards like Darwin Core (Wieczorek et al. 2012) are built to be flexible, allowing institutions to share data quickly without extensive modification of internal information management processes. However, there is often limited consensus on the exact meanings and use of key terms by various domains. If we want to increase the quality, interoperability, and long-term health of collections data, we must reassess how we record specimen data, paying special attention to the terms we use and how we use them. Here we share results from efforts to evaluate current data sharing practices for data from paleontology collections. By analysing the use of terms in Darwin Core, we are constructing a framework for how paleontological data is shared, how terms are used across many institutions, and where there are inconsistencies or lack of terms to support a fully robust record. We have also used data quality assessment and validation tools developed by organizations like the Global Biodiversity Information Facility (GBIF) to provide insight and testing for term-specific requirements addressing quality on a more global scale than might be the focus of any more locally driven data quality assessment. These assessments can guide the development of a new framework for sharing paleontological data, enabling the community to collaborate and find solutions to increase quality and interoperability. Additionally, individual institutions can utilize the framework to enhance long-term care of digital assets with global participation in mind.

              Author and article information

              Journal: Biodiversity Information Science and Standards (BISS)
              Publisher: Pensoft Publishers
              ISSN: 2535-0897
              Date: September 23 2021
              Volume: 5
              Type: Article
              DOI: 10.3897/biss.5.75646
              © 2021

              https://creativecommons.org/share-your-work/public-domain/cc0/
