The Pensoft Data Publishing Workflow: The FAIRway from articles to Linked Open Data

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

As one of the first advocates of open access and open data in the field of biodiversity publishiing, Pensoft has adopted a multiple data publishing model, resulting in the ARPHA-BioDiv toolbox (Penev et al. 2017). ARPHA-BioDiv consists of several data publishing workflows and tools described in the Strategies and Guidelines for Publishing of Biodiversity Data and elsewhere: Data underlying research results are deposited in an external repository and/or published as supplementary file(s) to the article and then linked/cited in the article text; supplementary files are published under their own DOIs and bear their own citation details. Data deposited in trusted repositories and/or supplementary files and described in data papers; data papers may be submitted in text format or converted into manuscripts from Ecological Metadata Language (EML) metadata. Integrated narrative and data publishing realised by the Biodiversity Data Journal, where structured data are imported into the article text from tables or via web services and downloaded/distributed from the published article. Data published in structured, semanticaly enriched, full-text XMLs, so that several data elements can thereafter easily be harvested by machines. Linked Open Data (LOD) extracted from literature, converted into interoperable RDF triples in accordance with the OpenBiodiv-O ontology (Senderov et al. 2018) and stored in the OpenBiodiv Biodiversity Knowledge Graph. The above mentioned approaches are supported by a whole ecosystem of additional workflows and tools, for example: (1) pre-publication data auditing, involving both human and machine data quality checks (workflow 2); (2) web-service integration with data repositories and data centres, such as Global Biodiversity Information Facility (GBIF), Barcode of Life Data Systems (BOLD), Integrated Digitized Biocollections (iDigBio), Data Observation Network for Earth (DataONE), Long Term Ecological Research (LTER), PlutoF, Dryad, and others (workflows 1,2); (3) semantic markup of the article texts in the TaxPub format facilitating further extraction, distribution and re-use of sub-article elements and data (workflows 3,4); (4) server-to-server import of specimen data from GBIF, BOLD, iDigBio and PlutoR into manuscript text (workflow 3); (5) automated conversion of EML metadata into data paper manuscripts (workflow 2); (6) export of Darwin Core Archive and automated deposition in GBIF (workflow 3); (7) submission of individual images and supplementary data under own DOIs to the Biodiversity Literature Repository, BLR (workflows 1-3); (8) conversion of key data elements from TaxPub articles and taxonomic treatments extracted by Plazi into RDF handled by OpenBiodiv (workflow 5). These approaches represent different aspects of the prospective scholarly publishing of biodiversity data, which in a combination with text and data mining (TDM) technologies for legacy literature (PDF) developed by Plazi, lay the ground of an entire data publishing ecosystem for biodiversity, supplying FAIR (Findable, Accessible, Interoperable and Reusable data to several interoperable overarching infrastructures, such as GBIF, BLR, Plazi TreatmentBank, OpenBiodiv and various end users.

Related collections

Most cited references 2

Record: found
Abstract: found
Article: found

Is Open Access

OpenBiodiv-O: ontology of the OpenBiodiv knowledge management system

Viktor Senderov, Kiril Simov, Nico Franz … (2018)

Background The biodiversity domain, and in particular biological taxonomy, is moving in the direction of semantization of its research outputs. The present work introduces OpenBiodiv-O, the ontology that serves as the basis of the OpenBiodiv Knowledge Management System. Our intent is to provide an ontology that fills the gaps between ontologies for biodiversity resources, such as DarwinCore-based ontologies, and semantic publishing ontologies, such as the SPAR Ontologies. We bridge this gap by providing an ontology focusing on biological taxonomy. Results OpenBiodiv-O introduces classes, properties, and axioms in the domains of scholarly biodiversity publishing and biological taxonomy and aligns them with several important domain ontologies (FaBiO, DoCO, DwC, Darwin-SW, NOMEN, ENVO). By doing so, it bridges the ontological gap across scholarly biodiversity publishing and biological taxonomy and allows for the creation of a Linked Open Dataset (LOD) of biodiversity information (a biodiversity knowledge graph) and enables the creation of the OpenBiodiv Knowledge Management System. A key feature of the ontology is that it is an ontology of the scientific process of biological taxonomy and not of any particular state of knowledge. This feature allows it to express a multiplicity of scientific opinions. The resulting OpenBiodiv knowledge system may gain a high level of trust in the scientific community as it does not force a scientific opinion on its users (e.g. practicing taxonomists, library researchers, etc.), but rather provides the tools for experts to encode different views as science progresses. Conclusions OpenBiodiv-O provides a conceptual model of the structure of a biodiversity publication and the development of related taxonomic concepts. It also serves as the basis for the OpenBiodiv Knowledge Management System. Electronic supplementary material The online version of this article (doi:10.1186/s13326-017-0174-5) contains supplementary material, which is available to authorized users.

0 comments Cited 24 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: found

Is Open Access

ARPHA-BioDiv: A toolbox for scholarly publication and dissemination of biodiversity data based on the ARPHA Publishing Platform

Lyubomir Penev, Teodor Georgiev, Peter Geshev … (2017)

The ARPHA-BioDiv Тoolbox for Scholarly Publishing and Dissemination of Biodiversity Data is a set of standards, guidelines, recommendations, tools, workflows, journals and services, based on the ARPHA Publishing Platform of Pensoft, designed to ease scholarly publishing of biodiversity and biodiversity-related data that are of primary interest to EU BON and GEO BON networks. ARPHA-BioDiv is based on the infrastructure, knowledge and exeprience gathered in the years-long research, development and publishing activities of Pensoft, upgraded with novel tools and workflows that resulted from the FP7 project EU BON.

0 comments Cited 7 times – based on 0 reviews      Review now

Bookmark

All references

Author and article information

Journal

Title: Biodiversity Information Science and Standards

Abbreviated Title: BISS

Publisher: Pensoft Publishers

ISSN (Electronic): 2535-0897

Publication date Created: June 13 2019

Publication date (Electronic): June 13 2019

Volume: 3

Article

DOI: 10.3897/biss.3.35902

SO-VID: 8f014589-be93-4e9d-87c0-2d94d17096bd

License:

http://creativecommons.org/licenses/by/4.0/

History

Data availability:

Publish your biodiversity research with us!

Submit your article here.

The Pensoft Data Publishing Workflow: The FAIRway from articles to Linked Open Data

Read this article at

Abstract

Related collections

Pensoft Biodiversity

Most cited references 2

OpenBiodiv-O: ontology of the OpenBiodiv knowledge management system

ARPHA-BioDiv: A toolbox for scholarly publication and dissemination of biodiversity data based on the ARPHA Publishing Platform

Author and article information

Journal

Article

History

Comments

Comment on this article

Similar content 47

Most referenced authors 15