      The use and limits of scientific names in biological informatics

      ZooKeys
      Pensoft Publishers
      Taxonomic name services, taxon concepts, identifiers, relevance, search and retrieval

          Abstract

          Scientific names serve to label biodiversity information: information related to species. Names, and their underlying taxonomic definitions, however, are unstable and ambiguous. This negatively impacts the utility of names as identifiers and as effective indexing tools in biological informatics where names are commonly utilized for searching, retrieving and integrating information about species. Semiotics provides a general model for describing the relationship between taxon names and taxon concepts. It distinguishes syntactics, which governs relationships among names, from semantics, which represents the relations between those labels and the taxa to which they refer. In the semiotic context, changes in semantics (i.e., taxonomic circumscription) do not consistently result in a corresponding and reflective change in syntax. Further, when syntactic changes do occur, they may be in response to semantic changes or in response to syntactic rules. This lack of consistency in the cardinal relationship between names and taxa places limits on how scientific names may be used in biological informatics in initially anchoring, and in the subsequent retrieval and integration, of relevant biodiversity information. Precision and recall are two measures of relevance. In biological taxonomy, recall is negatively impacted by changes or ambiguity in syntax while precision is negatively impacted when there are changes or ambiguity in semantics. Because changes in syntax are not correlated with changes in semantics, scientific names may be used, singly or conflated into synonymous sets, to improve recall in pattern recognition or search and retrieval. Names cannot be used, however, to improve precision. This is because changes in syntax do not uniquely identify changes in circumscription.
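The asymmetry between recall and precision described above can be illustrated with a small sketch. Everything in it is invented for illustration: the placeholder names "Aus bus", "Xus bus" and "Aus cus", the "narrow" and "broad" circumscription labels, and the helper functions `search` and `precision_recall`. It assumes that each record carries a name string, while the circumscription the recorder had in mind is known only to the evaluator.

```python
from typing import Iterable, NamedTuple


class Record(NamedTuple):
    name: str     # name string attached to the record
    concept: str  # circumscription the recorder meant (invented label)


# Invented data: "Aus bus" and "Xus bus" are treated as synonyms, and each
# name has been applied in both a narrow and a broad sense.
RECORDS = [
    Record("Aus bus", "narrow"),
    Record("Aus bus", "broad"),
    Record("Xus bus", "narrow"),
    Record("Xus bus", "broad"),
    Record("Aus cus", "other"),
]


def search(names: Iterable[str]) -> list[Record]:
    """Retrieve every record whose name string is in the query set."""
    wanted = set(names)
    return [r for r in RECORDS if r.name in wanted]


def precision_recall(retrieved: list[Record], target: str) -> tuple[float, float]:
    """Score a result list against the circumscription the user wanted."""
    relevant = [r for r in RECORDS if r.concept == target]
    hits = [r for r in retrieved if r.concept == target]
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Query with one name, then with the synonymous set of names.
single = precision_recall(search(["Aus bus"]), "narrow")
expanded = precision_recall(search(["Aus bus", "Xus bus"]), "narrow")
print(single)    # (0.5, 0.5)
print(expanded)  # (0.5, 1.0)
```

In this toy data set, conflating the synonyms doubles recall, but precision does not move, because no name-based query can separate the narrow-sense records from the broad-sense ones.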

          These observations place limits on the utility of scientific names within biological informatics applications that rely on names as identifiers for taxa. Taxonomic systems and services used to organize and integrate information about taxa must accommodate the inherent semantic ambiguity of scientific names. The capture and articulation of circumscription differences (i.e., multiple taxon concepts) within such systems must be accompanied with distinct concept identifiers that can be employed in association with, or in replacement of, traditional scientific names.
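One way to picture distinct concept identifiers used alongside names is the minimal sketch below. It is an assumption-laden illustration, not a description of any existing taxonomic service: the `TaxonConcept` class, the `tc:` identifier scheme, the placeholder names, the "Author A 1900" style sources and the occurrence keys are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonConcept:
    concept_id: str    # hypothetical identifier, one per circumscription
    name: str          # scientific name used as the label
    according_to: str  # treatment that defines the circumscription


# Invented concepts: the same name labels two different circumscriptions.
CONCEPTS = [
    TaxonConcept("tc:1", "Aus bus", "Author A 1900"),
    TaxonConcept("tc:2", "Aus bus", "Author B 1950"),
    TaxonConcept("tc:3", "Xus bus", "Author B 1950"),
]

# Invented occurrence records, anchored to a concept identifier
# rather than to a bare name string.
OCCURRENCES = {"occ1": "tc:1", "occ2": "tc:2", "occ3": "tc:2"}


def concepts_for_name(name: str) -> list[TaxonConcept]:
    """A name alone may resolve to several concepts (semantic ambiguity)."""
    return [c for c in CONCEPTS if c.name == name]


def occurrences_for_concept(concept_id: str) -> list[str]:
    """A concept identifier selects exactly the records anchored to it."""
    return sorted(k for k, v in OCCURRENCES.items() if v == concept_id)


print([c.concept_id for c in concepts_for_name("Aus bus")])  # ['tc:1', 'tc:2']
print(occurrences_for_concept("tc:2"))                       # ['occ2', 'occ3']
```

The name lookup returns two candidate concepts, which is the ambiguity a system would have to expose to the user, while the identifier lookup returns only the records tied to one circumscription.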

                Author and article information

                Journal: ZooKeys
                Publisher: Pensoft Publishers
                ISSN: 1313-2989, 1313-2970
                Published: 7 January 2016
                Issue: 550
                Pages: 207-223
                Affiliations
                [1] Department of Marine Resources, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA 02543
                Author notes
                Corresponding author: David Remsen (dremsen@mbl.edu)

                Academic editor: E. Michel

                Article
                DOI: 10.3897/zookeys.550.9546
                PMCID: PMC4741222
                PMID: 26877660
                David Remsen

                This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                6 March 2015
                9 March 2015
                Categories
                Research Article

                Animal science & Zoology
