      Is Open Access

      Avenues into Integration: Communicating taxonomic intelligence from sender to recipient


      Biodiversity Information Science and Standards

      Pensoft Publishers


          Abstract

          “What is crucial for your ability to communicate with me… pivots on the recipient’s capacity to interpret—to make good inferential sense of the meanings that the declarer is able to send” (Rescher 2000, p. 148).

          Conventional approaches to reconciling taxonomic information in biodiversity databases have been based on string matching for unique taxonomic name combinations (Kindt 2020, Norman et al. 2020). However, in their original context, these names pertain to specific usages or taxonomic concepts, which can subsequently vary for the same name as applied by different authors. Name-based synonym matching is a helpful first step (Guala 2016, Correia et al. 2018), but may still leave considerable ambiguity regarding proper usage (Fig. 1). Therefore, developing "taxonomic intelligence" is the bioinformatic challenge of adequately representing, and subsequently propagating, this complex name/usage interaction across trusted biodiversity data networks. How do we ensure that senders and recipients of biodiversity data not only can share messages but do so with “good inferential sense” of their respective meanings?

          Key obstacles have involved dealing with the complexity of taxonomic name/usage modifications through time, both in terms of accounting for and digitally representing the long histories of taxonomic change in most lineages. An important critique of proposals to use name-to-usage relationships for data aggregation has been the difficulty of scaling them up to reach comprehensive coverage, in contrast to name-based global taxonomic hierarchies (Bisby 2011). The Linnaean system of nomenclature has some unfortunate design limitations in this regard: taxonomic names are not unique identifiers, their meanings may change over time, and the names as a string of characters do not encode their proper usage, i.e., the name “Genus species” does not specify a source defining how to use the name correctly (Remsen 2016, Sterner and Franz 2017). In practice, many people provide taxonomic names in their datasets or publications but not a source specifying a usage. The information needed to map the relationships between names and usages in taxonomic monographs or revisions is typically not presented in a machine-readable format.

          New approaches are making progress on these obstacles. Theoretical advances in the representation of taxonomic intelligence have made it increasingly possible to implement efficient querying and reasoning methods on name-usage relationships (Chen et al. 2014, Chawuthai et al. 2016, Franz et al. 2015). Perhaps most importantly, growing efforts to produce name-usage mappings on a medium scale by data providers and taxonomic authorities suggest an all-or-nothing approach is not required. Multiple high-profile biodiversity databases have implemented internal tools for explicitly tracking conflicting or dynamic taxonomic classifications, including eBird using concept relationships from AviBase (Lepage et al. 2014); NatureServe in its Biotics database; iNaturalist using its taxon framework (Loarie 2020); and the UNITE database for fungi (Nilsson et al. 2019). Other ongoing projects incorporating taxonomic intelligence include the Flora of Alaska (Flora of Alaska 2020), the Mammal Diversity Database (Mammal Diversity Database 2020) and PollardBase for butterfly population monitoring (Campbell et al. 2020).
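
          The difference between matching on a name string and matching on a name-usage pair can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the taxon names, source labels, and specimen identifiers are placeholders, and the `TaxonConcept` class and `articulation` function are illustrative names, not part of any database or toolkit cited above. The set comparison is only loosely modeled on the congruence, inclusion, overlap, and exclusion relations used in concept alignment work; it is an assumption made for illustration, not the method of any cited system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonConcept:
    """A name as used by one source ("name sec. source"). Hypothetical model."""
    name: str
    source: str          # placeholder label for the publication defining the usage
    members: frozenset   # placeholder specimen IDs that the source includes


def articulation(a: TaxonConcept, b: TaxonConcept) -> str:
    """Relate two concepts by comparing the members each source includes."""
    if a.members == b.members:
        return "congruent"
    if a.members < b.members:      # proper subset
        return "is included in"
    if a.members > b.members:      # proper superset
        return "includes"
    if a.members & b.members:      # shared members, neither contains the other
        return "overlaps"
    return "excludes"


# The same name string applied under two hypothetical usages.
broad = TaxonConcept("Aus bus", "Author A 1990", frozenset({"s1", "s2", "s3"}))
narrow = TaxonConcept("Aus bus", "Author B 2015", frozenset({"s1", "s2"}))
split = TaxonConcept("Aus cus", "Author B 2015", frozenset({"s3"}))

# String matching treats the two usages of "Aus bus" as identical.
print(broad.name == narrow.name)

# Comparing usages shows that the older concept is broader than the newer one
# and also contains the segregated concept that carries a different name.
print(articulation(broad, narrow))
print(articulation(broad, split))
print(articulation(narrow, split))
```

          In this sketch a recipient who matches only on the string "Aus bus" would merge records that the two sources circumscribe differently, whereas keeping the source alongside the name preserves enough information to state how the usages relate.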

          Most cited references 10


          Avibase – a database system for managing and organizing taxonomic concepts

          Scientific names of biological entities offer an imperfect resolution of the concepts that they are intended to represent. Often they are labels applied to entities ranging from entire populations to individual specimens representing those populations, even though such names only unambiguously identify the type specimen to which they were originally attached. Thus the real-life referents of names are constantly changing as biological circumscriptions are redefined and thereby alter the sets of individuals bearing those names. This problem is compounded by other characteristics of names that make them ambiguous identifiers of biological concepts, including emendations, homonymy and synonymy. Taxonomic concepts have been proposed as a way to address issues related to scientific names, but they have yet to receive broad recognition or implementation. Some efforts have been made towards building systems that address these issues by cataloguing and organizing taxonomic concepts, but most are still in conceptual or proof-of-concept stage. We present the on-line database Avibase as one possible approach to organizing taxonomic concepts. Avibase has been successfully used to describe and organize 844,000 species-level and 705,000 subspecies-level taxonomic concepts across every major bird taxonomic checklist of the last 125 years. The use of taxonomic concepts in place of scientific names, coupled with efficient resolution services, is a major step toward addressing some of the main deficiencies in the current practices of scientific name dissemination and use.

            The use and limits of scientific names in biological informatics

             David Remsen (2016)
            Scientific names serve to label biodiversity information: information related to species. Names, and their underlying taxonomic definitions, however, are unstable and ambiguous. This negatively impacts the utility of names as identifiers and as effective indexing tools in biological informatics where names are commonly utilized for searching, retrieving and integrating information about species. Semiotics provides a general model for describing the relationship between taxon names and taxon concepts. It distinguishes syntactics, which governs relationships among names, from semantics, which represents the relations between those labels and the taxa to which they refer. In the semiotic context, changes in semantics (i.e., taxonomic circumscription) do not consistently result in a corresponding and reflective change in syntax. Further, when syntactic changes do occur, they may be in response to semantic changes or in response to syntactic rules. This lack of consistency in the cardinal relationship between names and taxa places limits on how scientific names may be used in biological informatics in initially anchoring, and in the subsequent retrieval and integration, of relevant biodiversity information. Precision and recall are two measures of relevance. In biological taxonomy, recall is negatively impacted by changes or ambiguity in syntax while precision is negatively impacted when there are changes or ambiguity in semantics. Because changes in syntax are not correlated with changes in semantics, scientific names may be used, singly or conflated into synonymous sets, to improve recall in pattern recognition or search and retrieval. Names cannot be used, however, to improve precision. This is because changes in syntax do not uniquely identify changes in circumscription. These observations place limits on the utility of scientific names within biological informatics applications that rely on names as identifiers for taxa.

            Taxonomic systems and services used to organize and integrate information about taxa must accommodate the inherent semantic ambiguity of scientific names. The capture and articulation of circumscription differences (i.e., multiple taxon concepts) within such systems must be accompanied with distinct concept identifiers that can be employed in association with, or in replacement of, traditional scientific names.

              Reasoning over Taxonomic Change: Exploring Alignments for the Perelleschus Use Case

              Classifications and phylogenetic inferences of organismal groups change in light of new insights. Over time these changes can result in an imperfect tracking of taxonomic perspectives through the re-/use of Code-compliant or informal names. To mitigate these limitations, we introduce a novel approach for aligning taxonomies through the interaction of human experts and logic reasoners. We explore the performance of this approach with the Perelleschus use case of Franz & Cardona-Duque (2013). The use case includes six taxonomies published from 1936 to 2013, 54 taxonomic concepts (i.e., circumscriptions of names individuated according to their respective source publications), and 75 expert-asserted Region Connection Calculus articulations (e.g., congruence, proper inclusion, overlap, or exclusion). An Open Source reasoning toolkit is used to analyze 13 paired Perelleschus taxonomy alignments under heterogeneous constraints and interpretations. The reasoning workflow optimizes the logical consistency and expressiveness of the input and infers the set of maximally informative relations among the entailed taxonomic concepts. The latter are then used to produce merge visualizations that represent all congruent and non-congruent taxonomic elements among the aligned input trees. In this small use case with 6-53 input concepts per alignment, the information gained through the reasoning process is on average one order of magnitude greater than in the input. The approach offers scalable solutions for tracking provenance among succeeding taxonomic perspectives that may have differential biases in naming conventions, phylogenetic resolution, ingroup and outgroup sampling, or ostensive (member-referencing) versus intensional (property-referencing) concepts and articulations.

                Author and article information

                Journal: Biodiversity Information Science and Standards (BISS)
                Publisher: Pensoft Publishers
                ISSN: 2535-0897
                Published: October 09 2020
                Volume: 4
                DOI: 10.3897/biss.4.59006
                © 2020
