      A workflow for standardising and integrating alien species distribution data

          Abstract

          Biodiversity data are being collected at unprecedented rates. Such data often have significant value for purposes beyond the initial reason for which they were collected, particularly when they are combined and collated with other data sources. In the field of invasion ecology, however, integrating data represents a major challenge due to the notorious lack of standardisation of terminologies and categorisations, and the application of deviating concepts of biological invasions. Here, we introduce the SInAS workflow, short for Standardising and Integrating Alien Species data. The SInAS workflow standardises terminologies following Darwin Core, location names using a proposed translation table, taxon names based on the GBIF backbone taxonomy, and dates of first records based on a set of predefined rules. The output of the SInAS workflow provides various entry points that can be used both to improve coherence among the databases and to check and correct the original data. The workflow is flexible and can be easily adapted and extended to the needs of different users. We illustrate the workflow using a case-study integrating five widely used global databases of information on biological invasions. The comparison of the standardised databases revealed a surprisingly low degree of overlap, which indicates that the amount of data may currently not be fully exploited in the original databases. We highly recommend the use and development of publicly available workflows to ensure that the integration of databases is reproducible and transparent. Workflows, such as SInAS, ultimately increase trust in data, study results, and conclusions.
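The standardisation steps named in the abstract can be illustrated with a small sketch. This is not the SInAS implementation: the column mapping, the location translation table, the placeholder taxon names, and the use of a Jaccard index as the overlap measure are all assumptions made for illustration. Taxon-name matching against the GBIF backbone and the rules for first-record dates are omitted.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical mapping from source column names to Darwin Core terms.
# Real databases use other column names; this table is an assumption.
COLUMN_TO_DWC: Dict[str, str] = {
    "species": "scientificName",
    "Taxon": "scientificName",
    "region": "locality",
    "Country": "locality",
    "first_record": "eventDate",
    "FirstRecord": "eventDate",
}

# Hypothetical location translation table (lower-cased variant -> standard name).
LOCATION_TRANSLATION: Dict[str, str] = {
    "usa": "United States",
    "united states of america": "United States",
    "uk": "United Kingdom",
}


def standardise_record(record: Dict[str, str]) -> Dict[str, str]:
    """Rename columns to Darwin Core terms and translate the location name."""
    out: Dict[str, str] = {}
    for column, value in record.items():
        term = COLUMN_TO_DWC.get(column, column)  # unknown columns pass through
        out[term] = value.strip()
    location = out.get("locality")
    if location is not None:
        # Fall back to the original spelling when no translation is known.
        out["locality"] = LOCATION_TRANSLATION.get(location.lower(), location)
    return out


def taxon_location_pairs(records: List[Dict[str, str]]) -> Set[Tuple[str, str]]:
    """Collect the standardised (taxon, location) pairs of one database."""
    standardised = [standardise_record(r) for r in records]
    return {(r["scientificName"], r["locality"]) for r in standardised}


def overlap(db_a: List[Dict[str, str]], db_b: List[Dict[str, str]]) -> float:
    """Jaccard index of the (taxon, location) pairs of two databases."""
    pairs_a, pairs_b = taxon_location_pairs(db_a), taxon_location_pairs(db_b)
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0


# Toy records with placeholder taxon names; not taken from any real database.
db_a = [
    {"species": "Aus bus", "region": "USA", "first_record": "1988"},
    {"species": "Cus dus", "region": "France", "first_record": "2004"},
]
db_b = [
    {"Taxon": "Aus bus ", "Country": "United States of America", "FirstRecord": "1988"},
    {"Taxon": "Cus dus", "Country": "UK", "FirstRecord": "2016"},
]

print(overlap(db_a, db_b))
```

Without the standardisation step, the two toy databases would share no (taxon, location) pair at all, because the column names, the location spellings, and the trailing whitespace differ. After standardisation, one of the three distinct pairs is shared.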

                Author and article information

                Journal: NeoBiota (NB)
                Publisher: Pensoft Publishers
                ISSN: 1314-2488, 1619-0033
                Date: July 28 2020
                Volume: 59
                Pages: 39-59
                DOI: 10.3897/neobiota.59.53578
                © 2020
