      Is Open Access

      A Set of Simple Tools For Assembling, Annotating, Versioning and Publishing Taxonomies

      Biodiversity Information Science and Standards

      Pensoft Publishers


          Abstract

          Biodiversity data publishers rely on virtually assembled taxonomic hierarchies to structure their data, with operational units involving scientific names, nomenclatural acts and taxonomic trees. The main goal for the majority of biodiversity aggregators, databases, and software developed specifically for managing scientific names, biological samples and other occurrences has been to establish a single, unified biological classification to serve as their structural "taxonomic backbone." Resources to produce and publish biological classifications digitally are thus typically restricted to those generating unified taxonomic backbones, leaving individual researchers and decentralized communities with few options to assemble, visualize, version and disseminate multiple taxonomies online.

          To aid the creation of a culture of assembling, annotating, versioning, and publishing taxonomies online, and to help users interested in taxonomic classifications that lack digital communities, the development of a set of modular and independent tools is proposed, based on the following complementary features:

          • A web application to serve as the taxonomy curator (referred to as the Curator)
          • A web application to serve as the optional taxonomic database and information provider (referred to as the Aggregator)

          These tools are being designed and built following modern software development standards, in a modular architecture consisting of front-end clients, databases, and back-end applications, with provision for a public Application Programming Interface (API) that will make data available to any interested parties and can potentially be integrated into large-scale projects like the Global Biodiversity Information Facility (GBIF), Integrated Digitized Biocollections (iDigBio), Symbiota (Gries et al. 2014), and Plazi (Agosti and Egloff 2009).

          Curator tool

          The Curator tool will be a publicly accessible front-end web application with which users can assemble, curate, and export taxonomies. The primary focus is to support generation of the user's preferred taxonomy, with manual inputs and optional annotations of the resulting product. Users can pick among three modes of taxonomy assembly:

          • manual mode with assisted taxon search,
          • automated generation from an online source, and
          • automated generation from a file upload.

          Taxonomies can be edited and annotated as necessary. Once users are satisfied with their taxonomy, they can save it in one or all of the available formats for exporting and external usage (common formats include, among others, JSON (JavaScript Object Notation), CSV (comma-separated values), and XML). Logged-in users can also opt to save the taxonomy in the Aggregator database, which will make the taxonomy publicly available. Ideally, all fields in the Curator forms should correspond to terms included in the Darwin Core standard (Wieczorek et al. 2012) or, for hierarchies available in published treatments, Plazi’s TaxonX schema (Agosti and Egloff 2009).

          Aggregator tool

          The Aggregator tool will communicate with the database and will provide users with a number of functionalities, such as:

          • Store and publish versioned taxonomies generated with the Curator
          • API endpoints for automation (JSON/XML formats/CSV download)
          • Optional unique identifier/DOI generation for published taxonomies
          • Search engine with user-friendly interface as well as API endpoint for querying the database

          The possibility of making taxonomies available as an API endpoint, as well as exporting taxonomies in different formats, will ensure that this tool behaves as a taxonomic source that can be used by virtually any interested party or application.
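As an illustration only, the export step described for the Curator and Aggregator might look like the following minimal Python sketch. It assumes a taxonomy is held as a flat list of records keyed by Darwin Core term names (`taxonID`, `parentNameUsageID`, `scientificName`, `taxonRank`). The function names, the XML element names (`taxonomy`, `taxon`) and the sample records are hypothetical and are not taken from the tools themselves.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Darwin Core term names used as column/field names in every export.
FIELDS = ["taxonID", "parentNameUsageID", "scientificName", "taxonRank"]

# Illustrative records only; a real taxonomy would come from the Curator.
TAXA = [
    {"taxonID": "1", "parentNameUsageID": "", "scientificName": "Curculionidae", "taxonRank": "family"},
    {"taxonID": "2", "parentNameUsageID": "1", "scientificName": "Entiminae", "taxonRank": "subfamily"},
    {"taxonID": "3", "parentNameUsageID": "2", "scientificName": "Naupactus", "taxonRank": "genus"},
]


def to_json(taxa):
    """Serialize the taxonomy as a JSON array of records."""
    return json.dumps(taxa, indent=2, ensure_ascii=False)


def to_csv(taxa):
    """Serialize the taxonomy as CSV with a Darwin Core header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(taxa)
    return buffer.getvalue()


def to_xml(taxa):
    """Serialize the taxonomy as XML (hypothetical element names)."""
    root = ET.Element("taxonomy")
    for record in taxa:
        taxon = ET.SubElement(root, "taxon")
        for field in FIELDS:
            ET.SubElement(taxon, field).text = record[field]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_csv(TAXA))
```

Keeping one field list for all three serializers is a design choice for the sketch: every export then exposes the same Darwin Core terms regardless of format.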
          The tools are being modelled as a decentralized community resource that can be used for any or all taxonomic groups and, as such, their scale and impact will be driven by bottom-up community use. The goal is not to provide extensive coverage of all biological organisms, but rather to provide an open digital toolkit and space for biodiversity researchers and projects that lack access to open, structured, online taxonomic publication venues and dedicated tools.

          Practical examples of usage for these tools include:

          • A user generates multiple taxonomic concepts for organisms they are studying, which can then be queried and analyzed by scripts that make taxonomic alignments to compare different scientific hypotheses through time;
          • An institution wants to publish a regional Symbiota portal to manage specimens in a particular collection, so they establish an annotated working taxonomic backbone with the Curator that Symbiota will then be able to ingest before samples can be imported into the portal;
          • A researcher wants to export a biodiversity portal taxonomy at a given moment and wants to annotate and publish this version in an upcoming paper to establish scientific baselines for proper taxonomic communication.
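The first use case, scripts that compare versions of a taxonomy, could start from something as simple as the sketch below. It is a plain name-based diff between two versions, not a full taxonomic concept alignment. It assumes each version is a mapping from a name to its parent name (`None` for a root), and the function name and placeholder taxon names are hypothetical.

```python
def diff_versions(old, new):
    """Compare two taxonomy versions given as {name: parent_name} mappings.

    Returns names that were added, removed, or moved to a different parent.
    This is a name-based comparison only; it does not reason about whether
    two names refer to the same taxonomic concept.
    """
    old_names, new_names = set(old), set(new)
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        "moved": sorted(n for n in old_names & new_names if old[n] != new[n]),
    }


# Placeholder versions of one classification (hypothetical names).
VERSION_1 = {
    "Family X": None,
    "Genus A": "Family X",
    "Genus B": "Family X",
    "Genus D": "Family X",
}
VERSION_2 = {
    "Family X": None,
    "Family Y": None,
    "Genus A": "Family X",
    "Genus B": "Family Y",  # reassigned to a new family
    "Genus C": "Family Y",  # newly recognized
}

if __name__ == "__main__":
    print(diff_versions(VERSION_1, VERSION_2))
```

A script like this would only flag candidate changes; deciding whether a moved or renamed taxon represents the same concept is the harder alignment problem the abstract alludes to.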

          Most cited references (3)

          Is Open Access

          Darwin Core: An Evolving Community-Developed Biodiversity Data Standard

          Biodiversity data derive from myriad sources stored in various formats on many distinct hardware and software platforms. An essential step towards understanding global patterns of biodiversity is to provide a standardized view of these heterogeneous data sources to improve interoperability. Fundamental to this advance are definitions of common terms. This paper describes the evolution and development of Darwin Core, a data standard for publishing and integrating biodiversity information. We focus on the categories of terms that define the standard, differences between simple and relational Darwin Core, how the standard has been implemented, and the community processes that are essential for maintenance and growth of the standard. We present case-study extensions of the Darwin Core into new research communities, including metagenomics and genetic resources. We close by showing how Darwin Core records are integrated to create new knowledge products documenting species distributions and changes due to environmental perturbations.
            Is Open Access

            Taxonomic information exchange and copyright: the Plazi approach

            Background: A large part of our knowledge on the world's species is recorded in the corpus of biodiversity literature with well over hundred million pages, and is represented in natural history collections estimated at 2 – 3 billion specimens. But this body of knowledge is almost entirely in paper-print form and is not directly accessible through the Internet. For the digitization of this literature, new territories have to be chartered in the fields of technical, legal and social issues that presently impede its advance. The taxonomic literature seems especially destined for such a transformation. Discussion: Plazi was founded as an association with the primary goal of transforming both the printed and, more recently, "born-digital" taxonomic literature into semantically enabled, enhanced documents. This includes the creation of a test body of literature, an XML schema modeling its logic content (TaxonX), the development of a mark-up editor (GoldenGATE) allowing also the enhancement of documents with links to external resources via Life Science Identifiers (LSID), a repository for publications and issuance of bibliographic identifiers, a dedicated server to serve the marked up content (the Plazi Search and Retrieval Server, SRS) and semantic tools to mine information. Plazi's workflow is designed to respect copyright protection and achieves extraction by observing exceptions and limitations existent in international copyright law. Conclusion: The information found in Plazi's databases – taxonomic treatments as well as the metadata of the publications – are in the public domain and can therefore be used for further scientific research without any restriction, whether or not contained in copyrighted publications.
              Is Open Access

              Symbiota – A virtual platform for creating voucher-based biodiversity information communities

              We review the Symbiota software platform for creating voucher-based biodiversity information portals and communities. Symbiota was originally conceived to promote small- to medium-sized, regionally and/or taxonomically themed collaborations of natural history collections. Over the past eight years the taxonomically diverse portals have grown into an important resource in North America and beyond for mobilizing, integrating, and using specimen- and observation-based occurrence records and derivative biodiversity information products. Designed to mirror the conceptual structure of traditional floras and faunas, Symbiota is exclusively web-based and employs a novel data model, information linking, and algorithms to provide highly dynamic customization. The themed portals enable meaningful access to biodiversity data for anyone from specialist to high school student. Symbiota emulates functionality of modern Content Management Systems, providing highly sophisticated yet intuitive user interfaces for data entry, batch processes, and editing. Each kind of content provision may be selectively accessed by authenticated information providers. Occupying a fairly specific niche in the biodiversity informatics arena, Symbiota provides extensive data exchange facilities and collaborates with other development projects to incorporate and not duplicate functionality as appropriate.

                Author and article information

                Journal: Biodiversity Information Science and Standards (BISS)
                Publisher: Pensoft Publishers
                ISSN: 2535-0897
                Published: September 16, 2021
                Volume: 5
                DOI: 10.3897/biss.5.75344
                © 2021
