      Extraction of phenotypic traits from taxonomic descriptions for the tree of life using natural language processing

          Abstract

          Premise of the Study

          Phenotypic data sets are necessary to elucidate the genealogy of life, but assembling phenotypic data for taxa across the tree of life can be technically challenging and prohibitively time consuming. We describe a semi‐automated protocol to facilitate and expedite the assembly of phenotypic character matrices of plants from formal taxonomic descriptions. This pipeline uses new natural language processing (NLP) techniques and a glossary of over 9000 botanical terms.

          Methods and Results

          Our protocol includes the Explorer of Taxon Concepts (ETC), an online application that assembles taxon‐by‐character matrices from taxonomic descriptions, and MatrixConverter, a Java application that enables users to evaluate and discretize the characters extracted by ETC. We demonstrate this protocol using descriptions from Araucariaceae.

          Conclusions

          The NLP pipeline unlocks the phenotypic data found in taxonomic descriptions and makes them usable for evolutionary analyses.
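
The two ideas named in the abstract, a taxon-by-character matrix and the discretization of extracted characters, can be pictured with a minimal sketch. It is not the ETC or MatrixConverter implementation: the taxa, characters, values, breakpoint, and function names below are all hypothetical placeholders chosen only to illustrate the data shape.

```python
from typing import Dict, List, Optional, Union

Value = Union[float, str, None]

# Hypothetical taxon-by-character matrix (rows: taxa, columns: characters).
# Every name and value here is an invented placeholder, not data from the study.
raw_matrix: Dict[str, Dict[str, Value]] = {
    "Taxon A": {"leaf length (cm)": 2.0, "leaf shape": "ovate"},
    "Taxon B": {"leaf length (cm)": 7.5, "leaf shape": "lanceolate"},
    "Taxon C": {"leaf length (cm)": None, "leaf shape": "ovate"},
}


def discretize_numeric(value: Optional[float], breakpoints: List[float]) -> str:
    """Map a measurement to a state index by counting the breakpoints it reaches."""
    if value is None:
        return "?"  # "?" is used here to mark missing data
    return str(sum(value >= b for b in breakpoints))


def discretize_categorical(value: Optional[str], states: List[str]) -> str:
    """Map a categorical value to its position in a user-supplied state list."""
    if value is None:
        return "?"
    return str(states.index(value))


def discretize_matrix(
    matrix: Dict[str, Dict[str, Value]],
    numeric_breaks: Dict[str, List[float]],
    categorical_states: Dict[str, List[str]],
) -> Dict[str, Dict[str, str]]:
    """Return a new matrix in which every cell is a discrete state symbol."""
    coded: Dict[str, Dict[str, str]] = {}
    for taxon, characters in matrix.items():
        row: Dict[str, str] = {}
        for name, value in characters.items():
            if name in numeric_breaks:
                row[name] = discretize_numeric(value, numeric_breaks[name])
            else:
                row[name] = discretize_categorical(value, categorical_states[name])
        coded[taxon] = row
    return coded


# Assumed coding scheme: one breakpoint at 5.0 cm and two shape states.
coded_matrix = discretize_matrix(
    raw_matrix,
    numeric_breaks={"leaf length (cm)": [5.0]},
    categorical_states={"leaf shape": ["ovate", "lanceolate"]},
)
for taxon, row in coded_matrix.items():
    print(taxon, row)
```

In this sketch the user supplies the breakpoints and state lists, mirroring the abstract's point that the protocol is semi-automated and that users evaluate the extracted characters before discretizing them.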

                Author and article information

                Contributors
                clendara@gmail.com
                Journal
                Appl Plant Sci
                10.1002/(ISSN)2168-0450
                APS3
                Applications in Plant Sciences
                John Wiley and Sons Inc. (Hoboken)
                2168-0450
                31 March 2018
                March 2018
                Volume: 6
                Issue: 3 (doiID: 10.1002/aps3.2018.6.issue-3)
                Article number: e1035
                Affiliations
                [ 1 ] Department of Biology University of Florida Gainesville Florida 32611 USA
                [ 2 ] School of Information University of Arizona Tucson Arizona 85719 USA
                Author notes
                [*] Author for correspondence: clendara@gmail.com
                Author information
                http://orcid.org/0000-0003-2834-7412
                Article
                APS31035
                10.1002/aps3.1035
                5895189
                29732265
                72664174-cb6a-4b87-82a6-d64ab3b81365
                © 2018 Endara et al. Applications in Plant Sciences is published by Wiley Periodicals, Inc. on behalf of the Botanical Society of America.

                This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 05 October 2017
                Accepted: 31 January 2018
                Page count
                Figures: 2, Tables: 1, Pages: 12, Words: 6316
                Funding
                Funded by: AVAToL: Next Generation Phenomics Project
                Award ID: DEB‐1208256
                Funded by: Building a Comprehensive Evolutionary History of Flagellate Plants
                Award ID: DEB‐1541506
                Funded by: Analyzing Fine‐Grained Semantic Markup of Descriptive Literature
                Award ID: DBI‐1147266
                Categories
                Protocol Note
                Invited Special Article
                For the Special Issue: Methods for Exploring the Plant Tree of Life

                Keywords: morphological matrices, natural language processing, phenotypic traits, taxonomic descriptions
