
      Species complex delimitations in the genus Hedychium: A machine learning approach for cluster discovery

      research-article


          Abstract

          Premise

          The statistical methods most morphologists use to validate species boundaries, such as principal component analysis (PCA) and non‐metric multidimensional scaling (nMDS), are limited in two ways: they serve mostly as visualization tools, and the groups are identified by taxonomists (i.e., supervised), which introduces human bias. Here, we use a spectral clustering algorithm for the unsupervised discovery of species boundaries, followed by an analysis of the cluster‐defining characters.

          Methods

          We used spectral clustering, nMDS, and PCA on 16 morphological characters within the genus Hedychium to group 93 individuals from 10 taxa. A radial basis function kernel was used for the spectral clustering with user‐specified tuning values (gamma). The goodness of the discovered clusters using each gamma value was quantified using eigengap, a normalized mutual information score, and the Rand index. Finally, mutual information–based character selection and a t‐test were used to identify cluster‐defining characters.
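As a rough illustration of the workflow described in Methods, the following Python sketch runs spectral clustering with a radial basis function kernel at several gamma values and scores each partition with a normalized mutual information score and the Rand index against reference taxon labels. It is not the authors' code. The use of scikit-learn, the synthetic data (sized to 93 individuals, 16 characters, and 10 taxa), the gamma grid, the helper names `eigengap_k` and `evaluate_gamma`, and the particular eigengap heuristic (largest gap among the smallest eigenvalues of the normalized graph Laplacian) are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import normalized_mutual_info_score, rand_score
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.preprocessing import StandardScaler


def eigengap_k(affinity, max_k=15):
    """Suggest a cluster count from the largest gap between consecutive
    eigenvalues of the symmetric normalized graph Laplacian (an assumed
    form of the eigengap heuristic)."""
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(laplacian))[:max_k]
    gaps = np.diff(eigvals)
    # A large gap after the k-th smallest eigenvalue suggests k clusters.
    return int(np.argmax(gaps)) + 1, eigvals


def evaluate_gamma(X, taxon_labels, gamma):
    """Cluster with an RBF affinity at one gamma and score the result
    against the reference taxon labels."""
    affinity = rbf_kernel(X, gamma=gamma)
    k_hat, _ = eigengap_k(affinity)
    k = max(2, k_hat)  # spectral clustering needs at least two clusters
    model = SpectralClustering(n_clusters=k, affinity="precomputed", random_state=0)
    labels = model.fit_predict(affinity)
    return {
        "gamma": gamma,
        "k": k,
        "nmi": normalized_mutual_info_score(taxon_labels, labels),
        "rand": rand_score(taxon_labels, labels),
    }


# Synthetic stand-in for a morphological matrix: 93 individuals x 16 characters.
X_raw, taxa = make_blobs(n_samples=93, n_features=16, centers=10, random_state=0)
X = StandardScaler().fit_transform(X_raw)

# Hypothetical gamma grid; the source only says gamma was user-specified.
results = [evaluate_gamma(X, taxa, g) for g in (0.05, 0.5, 2.0)]
for row in results:
    print(row)
```

Comparing the rows shows how the choice of gamma changes both the suggested number of clusters and the agreement with the reference taxa, which is why the goodness of the clusters is assessed separately for each gamma value.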

          Results

          Spectral clustering revealed five, nine, and 12 clusters of taxa in the species complexes examined here. Character selection identified at least four characters that defined these clusters.
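The character analysis can be sketched in the same spirit: rank characters by their mutual information with the cluster labels, then test whether each top-ranked character differs between one cluster and the rest. This is a minimal sketch, not the authors' procedure. The function name `cluster_defining_characters`, the character names, the synthetic data, the one-cluster-versus-rest Welch t-test, and the 0.05 threshold are assumptions, and no multiple-testing correction is applied.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.feature_selection import mutual_info_classif


def cluster_defining_characters(X, labels, names, top_n=4, alpha=0.05):
    """Rank characters by mutual information with the cluster labels, then
    run a one-cluster-vs-rest Welch t-test on each top-ranked character."""
    mi = mutual_info_classif(X, labels, random_state=0)
    ranked = np.argsort(mi)[::-1][:top_n]
    report = []
    for j in ranked:
        for c in np.unique(labels):
            inside = X[labels == c, j]
            outside = X[labels != c, j]
            _, p_value = ttest_ind(inside, outside, equal_var=False)
            if p_value < alpha:
                report.append((names[j], int(c), float(mi[j]), float(p_value)))
    return ranked, report


# Synthetic example: 60 individuals, 6 hypothetical characters, 3 clusters.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 20)
X = rng.normal(size=(60, 6))
X[:, 0] += 5.0 * labels  # character 0 separates the clusters
X[:, 1] -= 4.0 * labels  # character 1 separates the clusters
names = [f"char_{j}" for j in range(X.shape[1])]  # placeholder names

ranked, report = cluster_defining_characters(X, labels, names, top_n=2)
for name, cluster, mi_value, p_value in report:
    print(f"{name}: cluster {cluster}, MI={mi_value:.2f}, p={p_value:.3g}")
```

In this toy data only the first two characters carry cluster signal, so they are the ones ranked highest by mutual information; the t-test step then indicates which clusters each character distinguishes.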

          Discussion

          Together with our proposed character analysis methods, spectral clustering enabled the unsupervised discovery of species boundaries along with an explanation of their biological significance. Our results suggest that spectral clustering combined with a character selection analysis can enhance morphometric analyses and is superior to current clustering methods for species delimitation.

                Author and article information

                Contributors
                preeti_s@iiserb.ac.in
                gowdav@iiserb.ac.in
                Journal
                Applications in Plant Sciences (Appl Plant Sci; APS3)
                10.1002/(ISSN)2168-0450
                John Wiley and Sons Inc. (Hoboken)
                ISSN: 2168-0450
                31 July 2020
                Volume 8, Issue 7 (doiID: 10.1002/aps3.v8.7), e11377
                Affiliations
                [ 1 ] Department of Biological Sciences Indian Institute of Science Education and Research Bhopal Bhopal Bypass Road Bhopal Madhya Pradesh 462066 India
                [ 2 ] Department of Computer Science and Automation Indian Institute of Science Bengaluru Karnataka 560012 India
                Author notes
                [*] Authors for correspondence: gowdav@iiserb.ac.in, preeti_s@iiserb.ac.in

                Author information
                https://orcid.org/0000-0003-0760-4432
                https://orcid.org/0000-0001-9692-4096
                https://orcid.org/0000-0001-8533-0014
                Article
                APS311377
                10.1002/aps3.11377
                7394710
                fd22d51d-286d-4bdc-aac0-fb8599c0424a
                © 2020 Saryan et al. Applications in Plant Sciences is published by Wiley Periodicals LLC on behalf of the Botanical Society of America

                This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

                History
                : 21 October 2019
                : 27 May 2020
                Page count
                Figures: 5, Tables: 2, Pages: 13, Words: 8861
                Funding
                Funded by: Indian Institute of Science , open-funder-registry 10.13039/100007780;
                Funded by: Science and Engineering Research Board , open-funder-registry 10.13039/501100001843;
                Funded by: Council of Scientific and Industrial Research
                Categories
                Application Article
                Invited Special Article
                For the Special Issue: Machine Learning in Plant Biology: From Genomics to Field Studies

                Keywords: cluster characterization, Hedychium, morphological analysis, spectral clustering
