      Is Open Access

      Controlling the taxonomic variable: Taxonomic concept resolution for a southeastern United States herbarium portal

      Research Ideas and Outcomes
      Pensoft Publishers


          Abstract

          Overview. Taxonomic names are imperfect identifiers of specific and sometimes conflicting taxonomic perspectives in aggregated biodiversity data environments. The inherent ambiguities of names can be mitigated using syntactic and semantic conventions developed under the taxonomic concept approach. These include: (1) representation of taxonomic concept labels (TCLs: name sec. source) to precisely identify name usages and meanings, (2) use of parent/child relationships to assemble separate taxonomic perspectives, and (3) expert provision of Region Connection Calculus articulations (RCC–5: congruence, [inverse] inclusion, overlap, exclusion) that specify how data identified to different-sourced TCLs can be integrated. Application of these conventions greatly increases trust in biodiversity data networks, most of which promote unitary taxonomic 'syntheses' that obscure the actual diversity of expert-held views. Better design solutions allow users to control the taxonomic variable and thereby assess the robustness of their biological inferences under different perspectives. A unique constellation of prior efforts – including the powerful Symbiota collections software platform, the Euler/X multi-taxonomy alignment toolkit, and the "Weakley Flora", which entails 7,000 concepts and more than 75,000 RCC–5 articulations – provides the opportunity to build a first full-scale concept resolution service for SERNEC, the SouthEast Regional Network of Expertise and Collections, currently with 60 member herbaria and 2 million occurrence records.

          Intellectual merit. We have developed a multi-dimensional, step-wise plan to transition SERNEC's data culture from name- to concept-based practices. (1) We will engage SERNEC experts through annual, regional workshops and follow-up interactions that will foster buy-in and ultimately the completion of 12 community-identified use cases. (2) We will leverage RCC–5 data from the Weakley Flora and further development of the Euler/X logic reasoning toolkit to provide comprehensive genus- to variety-level concept alignments for at least 10 major flora treatments with highest relevance to SERNEC. The visualizations and an estimated more than 1 billion inferred concept-to-concept relations will effectively drive specimen data integration in the transformed portal. (3) We will expand Symbiota's taxonomy and occurrence schemas and related user interfaces to support the new concept data, including novel batch and map-based specimen determination modules, with easy output options in Darwin Core Archive format. (4) Through combinations of the new technology, enlisted taxonomic expertise, and SERNEC's large image resources, we will upgrade at least 80% of all SERNEC specimen identifications from names to the narrowest suitable TCLs, or add "uncertainty" flags to specimens needing further study. (5) We will utilize the novel tools and data to demonstrate how controlling for the taxonomic variable in 12 use cases variously drives the outcomes of evolutionary, ecological, and conservation-based research hypotheses.

          Broader impacts. Our project is focused on just one herbarium network, but the potential impact is as wide as Darwin Core or even comparative biology. We believe that trust in networked biodiversity data depends on open and dynamic system designs, allowing expert access and resolution of multiple conflicting views that reflect the complex realities of ongoing taxonomic research. Taking well over 1 million SERNEC records from name- to TCL-resolution will show that "big" specimen data can pass the credibility threshold needed to validate the substantive data mobilization investment. We will mentor one postdoctoral researcher (UNC), two Ph.D. students (ASU, UIUC), and at least 15 undergraduate students (ASU). Each of our workshops will train 10-15 SERNEC experts, who in turn can recruit colleagues and students at their home collections. We will incorporate the project theme and use cases into undergraduate courses taught at six institutions and reaching an estimated 300-500 students annually (10-40% minority students). At each institution, project members will make a systematic effort to recruit new students from underrepresented groups. Our group's leadership of Symbiota (with close ties to iDigBio), SERNEC, and local biodiversity projects and centers will further promote the new data culture. We will create a feature story "Where do plant species occur?" for ASU's popular "Ask A Biologist" website, and a series of undergraduate student-led "How-To" videos that illustrate the use case workflows, including the creation of multi-taxonomy alignments.
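
The conventions named in the Overview (TCLs written as "name sec. source", and the five RCC–5 articulations) can be illustrated with a small sketch. This is not the Euler/X or Symbiota implementation. It assumes, purely for illustration, that each concept can be described by the set of specimens it circumscribes, whereas in practice articulations are asserted by experts and extended by logic reasoning. All names here (`TCL`, `RCC5`, `rcc5`, `can_relabel`, the taxon "Aus bus", and the specimen IDs) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class RCC5(Enum):
    """The five RCC-5 articulations between a left and a right concept."""
    CONGRUENT = "=="
    INCLUDES = ">"        # left properly includes right
    INCLUDED_IN = "<"     # inverse inclusion: left is properly included in right
    OVERLAPS = "><"
    EXCLUDES = "!"


@dataclass(frozen=True)
class TCL:
    """Taxonomic concept label: a name 'sec.' (according to) a source."""
    name: str
    source: str

    def __str__(self) -> str:
        return f"{self.name} sec. {self.source}"


def rcc5(left: frozenset, right: frozenset) -> RCC5:
    """Derive an articulation from two member sets (illustrative assumption:
    a concept is fully described by the specimens it circumscribes)."""
    if not left or not right:
        raise ValueError("both concepts need at least one member")
    if left == right:
        return RCC5.CONGRUENT
    if left > right:
        return RCC5.INCLUDES
    if left < right:
        return RCC5.INCLUDED_IN
    if left & right:
        return RCC5.OVERLAPS
    return RCC5.EXCLUDES


def can_relabel(relation: RCC5) -> bool:
    """Records identified to the left TCL can safely carry the right TCL
    only if the left concept is congruent with, or included in, the right."""
    return relation in (RCC5.CONGRUENT, RCC5.INCLUDED_IN)


# Hypothetical example: the same name used in a narrow and a broad sense.
narrow = TCL("Aus bus", "Flora A")
broad = TCL("Aus bus", "Flora B")
members = {
    narrow: frozenset({"spec-1", "spec-2"}),
    broad: frozenset({"spec-1", "spec-2", "spec-3"}),
}

relation = rcc5(members[narrow], members[broad])
print(f"{narrow} {relation.value} {broad}")
print("relabel narrow -> broad:", can_relabel(relation))
```

The point of the sketch is that the name alone ("Aus bus") cannot tell the two usages apart, while the pair of TCLs plus an articulation determines in which direction records can be merged without losing or inventing meaning.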


                Author and article information

                Journal: Research Ideas and Outcomes (RIO)
                Publisher: Pensoft Publishers
                ISSN: 2367-7163
                Published: September 30, 2016
                Volume: 2
                Article: e10610
                DOI: 10.3897/rio.2.e10610
                Copyright: © 2016
                License: http://creativecommons.org/licenses/by/4.0/

