      TrainSel: An R Package for Selection of Training Populations

      research-article

          Abstract

          A major barrier to the wider use of supervised learning in emerging applications, such as genomic selection, is the lack of sufficient and representative labeled data to train prediction models. The amount and quality of labeled training data are limited in many applications, so careful selection of the training examples to be labeled can improve accuracy in predictive learning tasks. In this paper, we present an R package, TrainSel, which provides flexible, efficient, and easy-to-use tools for the selection of training populations (STP). We illustrate its use, performance, and potential in four supervised learning applications within and outside the plant breeding area.

                Author and article information

                Contributors
                Journal
                Frontiers in Genetics (Front. Genet.)
                Publisher: Frontiers Media S.A.
                ISSN: 1664-8021
                Published: 07 May 2021
                Volume: 12
                Article number: 655287
                Affiliations
                [1] Agriculture & Food Science Centre, Animal and Crop Science Division, University College Dublin, Dublin, Ireland
                [2] Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA), Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA), Universidad Politécnica de Madrid (UPM), Madrid, Spain
                Author notes

                Edited by: Diego Jarquin, University of Nebraska-Lincoln, United States

                Reviewed by: Roberto Fritsche-Neto, International Rice Research Institute (IRRI), Philippines; Luc L. Janss, Aarhus University, Denmark

                *Correspondence: Deniz Akdemir, deniz.akdemir.work@gmail.com;
                Julio Isidro y Sánchez, j.isidro@upm.es

                This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics

                Article
                DOI: 10.3389/fgene.2021.655287
                PMC ID: 8138169
                PubMed ID: 34025720
                ScienceOpen record ID: 5afd925f-71fc-419c-bc10-a3e2ec715491
                Copyright © 2021 Akdemir, Rio and Isidro y Sánchez.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 18 January 2021
                Accepted: 31 March 2021
                Page count
                Figures: 5, Tables: 0, Equations: 4, References: 50, Pages: 12, Words: 8201
                Categories
                Genetics
                Original Research

                Keywords
                training optimization, machine learning, genomic selection, genomic prediction, image classification, multi-objective optimization, mixed models
