      FAIR data in meta-omics research: Using the MOD-CO schema to describe structural and operational elements of workflows from field to publication


      Biodiversity Information Science and Standards

      Pensoft Publishers


          Abstract

          Nucleic acid and protein sequencing-based analyses are increasingly applied to determine the origin, identity and traits of environmental (biological) objects and organisms. In this context, the need for corresponding data structures has become evident. Because existing schemas and community standards in the domains of biodiversity and molecular biological research offer a comparatively limited number of generic and specific elements, previous schemas for describing physical and digital objects need to be replaced or expanded with new elements that cover the requirements of meta-omics techniques and operational details. Schemas and standards have so far focussed mostly on elements, descriptors, or concepts relevant for data exchange and publication, whereas detailed operational aspects of origin context and laboratory processing, as well as data management details such as the documentation of physical and digital object identifiers, have been rather neglected. The conceptual schema for Meta-omics Data and Collection Objects (MOD-CO; https://www.mod-co.net/) was set up recently (Rambold et al. 2019). It includes design elements (descriptors or concepts) that describe structural and operational details along the work- and dataflow, from gathering environmental samples through the various transformation, transaction, and measurement steps in the laboratory to sample and data publication and archiving. The concepts are named according to a multipartite naming structure that reflects internal hierarchies, and they are arranged in concept (sub-)collections. By supporting various kinds of data record relationships, the schema allows individual records of the operational segments to be concatenated along a workflow (Fig. 1). Thus, it may serve as a logical and structural backbone for laboratory information management systems.
The concept structure in version 1.0 comprises 653 descriptors (concepts) and 1,810 predefined descriptor states, organised in 37 concept (sub-)collections. The published version 1.0 is available in several schema representations of identical content (https://www.mod-co.net/wiki/Schema_Representations). A normative XSD (XML Schema Definition) for schema version 1.0 is available at http://schema.mod-o.net/MOD-CO_1.0.xsd. The MOD-CO concepts may be integrated as descriptor/element structures in the relational database DiversityDescriptions (DWB-DD), an open-source and freely available application of the Diversity Workbench (DWB; https://diversityworkbench.net/Portal/DiversityDescriptions; https://diversityworkbench.net). Currently, DWB-DD is installed at the Data Center of the Bavarian Natural History Collections (SNSB) to build a dedicated instance for storing and maintaining MOD-CO-structured meta-omics research data packages and enriching them with ‘metadata’ elements from the community standards Ecological Metadata Language (EML), Minimum Information about any (x) Sequence (MIxS), Darwin Core (DwC) and Access to Biological Collection Data (ABCD). These activities are carried out in the context of ongoing FAIR ('Findable, Accessible, Interoperable and Reusable') biodiversity research data publishing via the German Federation for Biological Data (GFBio) network (https://www.gfbio.org/). Version 1.1 of the schema, with extended collections of structural and operational design concepts, is scheduled for 2020.
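
The idea of concatenating records of operational segments through record relationships can be illustrated with a minimal Python sketch. It is not taken from the MOD-CO schema: the `Record` class, its field names (`record_id`, `segment`, `derived_from`) and the example segment labels are hypothetical, and a single "derived from" link stands in for the various relationship kinds the schema supports.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """One operational segment of a workflow (hypothetical structure, not MOD-CO)."""
    record_id: str
    segment: str                          # e.g. "field sampling", "DNA extraction"
    derived_from: Optional[str] = None    # identifier of the preceding record, if any


def workflow_chain(records: list[Record], last_id: str) -> list[str]:
    """Follow derived_from links backwards and return segments in workflow order."""
    by_id = {r.record_id: r for r in records}
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = last_id
    while current is not None:
        if current in seen:
            # Guard against a malformed, cyclic set of relationships.
            raise ValueError(f"cyclic record relationship at {current}")
        seen.add(current)
        record = by_id[current]
        chain.append(record.segment)
        current = record.derived_from
    return list(reversed(chain))


# Hypothetical records covering four segments from field to publication.
records = [
    Record("R1", "field sampling"),
    Record("R2", "DNA extraction", derived_from="R1"),
    Record("R3", "sequencing", derived_from="R2"),
    Record("R4", "data publication", derived_from="R3"),
]
print(workflow_chain(records, "R4"))
```

Starting from the last record and walking the links backwards reconstructs the full path of a sample from field to publication, which is the kind of traceability a laboratory information management system would need.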


          Most cited references 1


          Meta-omics data and collection objects (MOD-CO): a conceptual schema and data model for processing sample data in meta-omics research

          Abstract: With the advent of advanced molecular meta-omics techniques and methods, a new era commenced for analysing and characterizing historic collection specimens, as well as recently collected environmental samples. Nucleic acid and protein sequencing-based analyses are increasingly applied to determine the origin, identity and traits of environmental (biological) objects and organisms. In this context, the need for new data structures is evident and former approaches for data processing need to be expanded according to the new meta-omics techniques and operational standards. Existing schemas and community standards in the biodiversity and molecular domain concentrate on terms important for data exchange and publication. Detailed operational aspects of origin and laboratory as well as object and data management issues are frequently neglected. Meta-omics Data and Collection Objects (MOD-CO) has therefore been set up as a new schema for meta-omics research, with a hierarchical organization of the concepts describing collection samples, as well as products and data objects being generated during operational workflows. It is focussed on object trait descriptions as well as on operational aspects and thereby may serve as a backbone for R&D laboratory information management systems with functions of an electronic laboratory notebook. The schema in its current version 1.0 includes 653 concepts and 1810 predefined concept values, being equivalent to descriptors and descriptor states, respectively. It is published in several representations, like a Semantic MediaWiki publication with 2463 interlinked Wiki pages for concepts and concept values, being grouped in 37 concept collections and subcollections. The SQL database application DiversityDescriptions, a generic tool for maintaining descriptive data and schemas, has been applied for setting up and testing MOD-CO and for concept mapping on elements of corresponding schemas.

            Author and article information

            Journal: Biodiversity Information Science and Standards (BISS)
            Publisher: Pensoft Publishers
            ISSN: 2535-0897
            Date: July 02 2019
            Volume: 3
            Type: Article
            DOI: 10.3897/biss.3.37596
            Copyright: © 2019
            License: http://creativecommons.org/licenses/by/4.0/
