      Dealing with complexity of new phenotypes in modern dairy cattle breeding


          Abstract

Implications

Dairy cattle breeding companies and dairy farmers face several challenges, resulting in a widening spectrum of traits relevant to the breeding goal. Many of the emerging traits are difficult to measure, and their biological and genetic background, as well as their relationship with other traits of interest, is not yet well understood, which hinders proper implementation in breeding programs. Interdisciplinary and across-country data pooling and research, including the application of innovative new methods, help to adapt breeding goals faster and better to the new requirements.

Introduction

Worldwide, animal breeding has played and still plays an important role in increasing the production efficiency of animals, e.g., dairy cattle. The development of low-cost genotyping strategies such as single nucleotide polymorphism (SNP) genotyping and genotyping-by-sequencing (Elshire et al., 2011; Kumar et al., 2012) has made genomic evaluations indispensable for modern dairy cattle breeding methods (Meuwissen et al., 2001; de los Campos et al., 2013; Gianola, 2013) and programs (Schaeffer, 2006; Lillehammer et al., 2011; Pryce and Daetwyler, 2011) and represented a quantum leap, often compared to the successful implementation of artificial insemination. However, the quality of any genomic breeding value estimation strongly depends on the number of phenotyped animals and the observed heritability of the phenotypes used (Daetwyler et al., 2008). The success of animal breeding is still mainly based on phenotypic observations of animals, and the tremendous progress made is largely due to appropriate trait definitions and comprehensive performance tests.

Animal breeding companies as well as dairy farmers face several challenges concerning the sustainability of the entire dairy production system.
This includes the impact of livestock on the environment and climate, the concern about increasing scarcity of natural resources (including genetic diversity) and feed, and concerns about animal welfare and health and about antimicrobial resistance. In the era of phenomics, the availability of robust phenotypes for these new issues is important. The technical revolution and the availability and processing of large amounts of data play a key role in this context.

New phenotypes are based on large-scale or advanced measuring technologies. Sensor recordings play an increasingly important role for a wide range of traits (e.g., methane emissions, rumen microbiome characterization, mid-infrared spectra from milk samples, and behavioral traits). Especially in the initial phase of recording, when the use of novel phenotypes is often not yet, or only insufficiently, validated by research, pooling data across research partners within and across countries can be very helpful because it allows faster and sounder implementation in breeding programs. Nevertheless, data pooling can become complicated if data are measured using different protocols or sensor technologies, or if data processing is handled differently or not transparently.

All phenotypes have an inherent value that can be estimated as the contribution of an additional record to the genetic gain within a modern breeding goal (González-Recio et al., 2014). However, integrating a variety of new phenotypes into existing breeding programs is challenging due to the increasing complexity and unknown or potentially undesirable genetic correlations between different traits in the breeding goal. Our goal here is to give a brief overview of the development and use of new phenotypes in the era of phenomics and to show the constraints on implementing them in modern dairy cattle breeding programs.
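
The Introduction notes that the quality of genomic breeding values depends on the number of phenotyped animals and on heritability (Daetwyler et al., 2008). The following is a minimal sketch of that dependence, assuming the deterministic approximation r = sqrt(N·h² / (N·h² + Me)) that is commonly attributed to that reference. The function name, the number of independent chromosome segments Me, and all input values are illustrative assumptions and are not taken from this article.

```python
import math


def expected_accuracy(n_records: int, heritability: float, n_segments: float) -> float:
    """Approximate accuracy r = sqrt(N*h2 / (N*h2 + Me)) of genomic breeding values.

    n_records    -- N, phenotyped and genotyped animals in the reference population
    heritability -- h2, heritability of the recorded phenotype (between 0 and 1)
    n_segments   -- Me, assumed number of independent chromosome segments
    """
    if n_records < 0 or not 0.0 <= heritability <= 1.0 or n_segments <= 0:
        raise ValueError("need N >= 0, 0 <= h2 <= 1 and Me > 0")
    signal = n_records * heritability
    return math.sqrt(signal / (signal + n_segments))


if __name__ == "__main__":
    ME = 1_000  # hypothetical value, chosen only for illustration
    for h2 in (0.05, 0.30):
        for n in (1_000, 10_000, 100_000):
            r = expected_accuracy(n, h2, ME)
            print(f"h2={h2:.2f}  N={n:>7}  expected accuracy={r:.2f}")
```

Under this approximation, accuracy depends on the product N·h², so a trait with low heritability needs several times more records to reach the same accuracy. This is one way to see why pooling scarce phenotypes across partners can pay off.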

Evolving new phenotypes in the era of phenomics

The definition of the phenotype of an organism can be broad; in general, it refers to a set of traits of an organism and includes morphological and physiological characteristics as well as behavioral patterns. Traits are identifiable characteristics in which animals differ from each other and which can be measured and analyzed as statistical quantities. In the context of animal breeding, important traits are those that have a considerable genetic determination and an immediate economic, social, or environmental value. Mike Coffey’s often quoted statement “In the age of the genotype [genomics], phenotype is king” points out that measuring and recording appropriate phenotypes is critical for genomic selection to function accurately.

In the era of phenomics, the phenotype is even more in the spotlight of research. Difficult-to-measure phenotypes and complex interactions between old and novel breeding goal traits have become increasingly important. Currently, three main trait complexes are considered meaningful for the future: the efficiency of energy, nutrient, and environmental resource use; health and resistance characteristics; and animal well-being (Boichard and Brochard, 2012). This results in the challenge of obtaining precise and comprehensive information on these traits. Recent engineering advances and the decreasing cost of electronic technologies have allowed the development of sensing solutions that support precision farming by automatically collecting data, such as physiological parameters, new production measures, and behavioral traits. One current target is sensor-derived activity patterns (e.g., from pedometers, transponders, boluses, and camera systems), from which characteristics of specific animal behavior can be derived.
In addition, conclusions regarding health, fertility, or well-being can be drawn from individual deviations from such animal-specific patterns. Furthermore, animal interactions and social behavioral characteristics (aggressive vs. tolerant animals) as well as social networks within a herd can be derived (Foris et al., 2019; Salau et al., 2019).

Moreover, in dairy science, mid-infrared spectroscopy has been identified as a potential tool for collecting data at the population level for phenotypic and genetic purposes and is thus an evolving research topic. Commonly, mid-infrared spectroscopy is used to predict quality traits in milk samples. In addition to traditional traits (e.g., protein, fat, lactose, and urea contents), milk characteristics such as fatty acid, protein, and mineral composition, milk coagulation, milk acidity, melamine content, and ketone bodies can be predicted and used to estimate, e.g., body energy status and methane emissions (de Marchi et al., 2014).

Beyond this, research in the world of “omics” has led to different levels of phenotypes. The study of the omics cascade includes investigations based on metabolomes, proteomes, transcriptomes, and genomes (Figure 1). Metabolomics applied to animal breeding might become a cornerstone of the next generation of phenotyping approaches that are needed to refine and improve trait description and, in turn, to set up innovative breeding value estimations (Fontanesi, 2016). Knowledge of the biological background and genetic architecture of new and conventional traits can be expanded using metabolomic information, thereby opening opportunities for novel applications in animal breeding. For example, biomarkers for particular physiological states or predispositions of animals can be used to breed more robust animals, as pointed out by Klein et al. (2012), who revealed that the level of glycerophosphocholine in milk samples is a suitable biomarker for the risk of ketosis and, furthermore, allows selection for metabolically stable cows. Based on these findings, Ehret et al. (2015) combined SNP information, routine milk recording data, and, among other metabolites, the concentration of glycerophosphocholine in individual milk samples to predict a cow’s individual ketosis risk using machine learning techniques (Figure 2), thereby providing a first demonstration of the potential of these approaches.

Figure 1. The omics cascade in a systems biology approach links several levels of biological information on a given phenotype. Adapted from Schwerin, unpublished.

Figure 2. Due to their universal learning ability and flexibility in integrating various sorts of data, machine learning methods, like artificial neural networks, offer great advantages for constructing reliable predictive models for traits like multifactorial diseases.

Recently, effects of animal production on climate (e.g., emission of methane) have become an important topic, at least in the scientific community, although no concrete efforts to include greenhouse gas emissions in breeding goals are currently in progress. However, given that greenhouse gas emissions are a much-debated political topic, studies on including this trait in breeding goals may be conducted in the near future. A series of studies revealed a moderate heritability of methane emissions, showing that selective breeding for lower-emitting animals is possible (de Haas et al., 2011; Hayes et al., 2013; Bell et al., 2014). However, many of the direct phenotyping methods currently available are expensive and time-consuming, so measurements are limited to a few animals. In addition, the gold standard method (respiration chambers) has the disadvantage that animals are measured in an artificial environment.
Other methods that can be used in production situations (pasture, feedlot, or dairy feeding station) allow collection of methane samples for only a part of a day and require repeated measurements (Pickering et al., 2015). Given that direct phenotyping techniques are difficult and expensive, it can be assumed that recording on a large scale is only feasible using a proxy or, most likely, a combination of different proxies (i.e., indicators or indirect traits) which are sufficiently correlated to methane output, easily accessible, inexpensive to record, and, if more than one proxy is used, reflect independent sources of variation in methane emission. Currently, methane emission is measured or estimated using a large number of different methods (rarely on the same individuals), and knowledge is lacking about how these data can be combined to enable genomic selection of cows with lower methane emissions (de Haas et al., 2017). Furthermore, there is no consensus on which phenotype to use for selection purposes: methane in liters per day or grams per day, methane in liters per kilogram of energy-corrected milk or dry matter intake, or a residual methane phenotype, where methane production is corrected for milk production and live weight (de Haas et al., 2017).

Feed intake, a major determinant of methane production (Knapp et al., 2014), is currently discussed as an important new breeding goal trait. In contrast to methane, implementation of this trait in modern breeding goals is underway, although this is not trivial. Selection for dry matter intake has to be seen in the context of conflicting requirements regarding animal fitness and efficiency (Tetens et al., 2014). Simultaneous selection for low dry matter intake and high milk yield might improve feed efficiency but bears the risk of aggravating the energy deficit postpartum and related health problems (Tetens et al., 2014).
Based on longitudinal and multivariate analyses of energy balance, dry matter intake, and energy-corrected milk yield across days in milk, Krattenmacher et al. (2019) demonstrated a clearly lactation stage-specific genetic architecture of energy homeostasis, with heritability estimates and genetic correlations that varied over the course of lactation and with lactation stage-dependent association signals. They concluded that it seems possible to optimize the lactation trajectory of dry matter intake in order to improve animal health in early lactation and feed efficiency in later lactation. This example illustrates that repeatedly recording phenotypes at different production phases, as well as knowledge of genetic correlations among all traits of interest across days in milk, is an important prerequisite for designing balanced breeding goals that fine-tune dairy cattle appropriately. With more traits, especially more complex traits, setting up reasonable breeding goals is much more demanding and often requires innovative approaches.

Need and prerequisites for data pooling and joint research

Breeding programs are often similar across countries, at least with respect to the traits included in the breeding goal. Even for novel traits with predominantly environmental or societal (rather than economic) relevance, efforts to implement these new traits in breeding goals are usually not limited to a single country. When dealing with traits that are difficult or costly to measure (e.g., feed intake/efficiency), phenotypes are in most cases scarce. In such situations, interdisciplinary and across-country data pooling and research are often the best way to ensure fast and adequate implementation in breeding programs. However, such initiatives can be hindered by different production systems, the use of different measurement protocols or methods, intellectual property (IP) issues, and finally, if breeding companies are involved, by competition between countries.
Likewise, setting up suitable agreements for data sharing and for the use of information derived from the analysis of pooled data is often a complicated and time-consuming task. Shortly after the successful implementation of genomic selection for routinely measured traits, the world’s largest collection of feed intake data on genotyped dairy cattle was created within the framework of the global Dry Matter Initiative (gDMI). de Haas et al. (2015) demonstrated for the first time that, provided a multi-trait approach is used, combining similar phenotypes across populations can increase the accuracy of genomic breeding values for important but rare traits, such as dry matter intake. In the meantime, similar projects combining feed intake data have been set up, e.g., the German project optiKuh, which has been described in detail by Harder et al. (2019). The optiKuh data set consisted of data from different research farms that agreed to record data as homogeneously as possible over a 2-yr period. Using these data for genomic breeding value estimation, Harder et al. (accepted) observed comparably high reliabilities. This highlights the importance of standardized protocols for data recording, which is also considered relevant for other novel traits such as greenhouse gas emissions. Thus, the development of universal guidelines for recording difficult-to-measure traits is a crucial step toward implementation in breeding programs.

Need for collaborations of different scientific fields

New phenotypes from different sources, the technical revolution, and the need for detailed data on individual animals for precise dairy farming management have led to a dramatic increase in data volume (Figure 3). In the past, the rapidly growing number of genotyped and sequenced animals already prompted geneticists to strengthen scientific cooperation with experts from several other disciplines, such as computer science, bioinformatics, mathematics, and statistics.
This newly evolved field of interdisciplinary research focuses on estimating more accurate predictive values of phenotypes by using predictive modeling methods such as machine learning (González-Camacho et al., 2018). The field of machine learning offers many flexible algorithms that are suitable for the analysis of large, mainly complex data sets. Conventional statistical methods, such as regression, require the assumption of a specific parametric function (e.g., linear or quadratic), and large quantities of data must be discarded if one or more explanatory variables are missing. Machine learning algorithms, on the other hand, can accommodate complex dependencies among explanatory variables and can function effectively in the presence of missing values for some variables (Caraviello et al., 2006). In addition, network reconstruction methodologies based on systems biology concepts have been applied to disentangle the complexity of different levels of phenotypic information and to link metabolomics with other omics data (Fontanesi, 2016).

Figure 3. Data sources and volumes are steadily increasing, and, as a result, analysis techniques are also getting more complex.

Challenges in defining modern breeding goals in dairy cattle

The essence of achieving a breeding goal through elaborate genetic improvement programs is the collection of accurate and comprehensive phenotypic data. The main factors determining the immediate merit of a phenotype are the number of phenotypic records available, the heritability, and the economic value of the trait. Furthermore, the usefulness of a phenotype is affected by several other factors, including the costs of establishing an adapted breeding program as well as the costs of phenotyping and genotyping (González-Recio et al., 2014).
In this context, it is especially challenging to include traits that are related to public goods and are therefore of social relevance rather than of direct economic impact for farmers, as well as hard-to-measure traits (e.g., those addressing efficiency). In some instances, contingent valuation could serve as a tool to incorporate nonmarketed goods in the breeding goal. With respect to feed efficiency, breeding goals have to be treated with some care. It is intuitive to propose saving feed costs by selecting on residual feed intake (Pryce et al., 2015); however, this may well be counterproductive at the sensitive early stage of lactation, when cows experience a negative energy balance and are prone to production diseases. Genetic correlations for feed intake and energy balance along the trajectory of days in milk now allow selection for these lactation stage-specific traits, but the corresponding economic weights have to be derived to make full use of these characteristics (Harder et al., 2019; Krattenmacher et al., 2019). To achieve a broader view beyond the monetary outcome at the farm level, the impact at the sector level should also be considered and incorporated. Further unsolved problems are interdependencies and causality between traits. For example, on the one hand, high yield in dairy cows may increase susceptibility to certain diseases and, on the other hand, the incidence of a disease may affect yield negatively (Rosa et al., 2011). The use of structural equation models can be extremely useful in this context (Wu et al., 2010). Genomic selection enables efficient selection for hard-to-measure traits, something that was previously a limitation.
Apart from the increased rate of genetic progress for production and quality traits, which allows faster reaction to changes in production circumstances, the huge benefit of this methodology lies in the improvement of expensive-to-measure traits (e.g., methane emission) by transferring genomic knowledge from estimates within comparatively small reference populations to the population level.

Conclusion

Modern dairy cow breeding programs aim to achieve an efficiency optimum in production under several constraints such as the best possible standards of animal health and welfare, together with minimal environmental impact (Figure 4). In the era of phenomics, both research and practical developments are focused on new phenotypes for animal breeding purposes that address these new challenges. It should be noted that there are still large gaps in understanding the biological background and genetic architecture of novel traits. Particularly for poorly defined phenotypes that are difficult or expensive to measure, the relationship between genome and phenome is far from being understood. Therefore, strong interdisciplinary collaboration is necessary, both in the development of suitable measuring technologies, operating protocols, and evaluation methods and in the analysis of interactions between relevant (possibly unfavorably correlated) traits. Some of the traits currently studied might turn out to be unsuitable for breeding but can still be useful for management purposes. With the increasing number and complexity of breeding goal traits, the design of balanced breeding goals has become more complicated than in the past. However, problems and target directions are similar across different countries, and, thus, pooling of data (e.g., to create sufficiently large reference populations for genomic selection) still enables rapid progress.

Figure 4. In order to balance the genetic progress for all traits of interest, breeding goals need to be widened and appropriate weight has to be given to traits in the selection index.
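
The weighting idea in the Figure 4 caption can be illustrated with a linear selection index, in which each animal's index value is the weighted sum of its estimated breeding values (EBVs). The following is a minimal sketch only: the trait names, weights, and EBVs are hypothetical, the EBVs are assumed to be standardized, and a negative weight is assumed for methane so that lower emissions are favored. None of these values come from the article.

```python
from typing import Dict, List


def index_value(ebvs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear selection index: sum over traits of weight * EBV."""
    missing = set(weights) - set(ebvs)
    if missing:
        raise KeyError(f"no EBV for traits: {sorted(missing)}")
    return sum(weights[trait] * ebvs[trait] for trait in weights)


def rank_animals(animals: Dict[str, Dict[str, float]],
                 weights: Dict[str, float]) -> List[str]:
    """Return animal IDs sorted from highest to lowest index value."""
    return sorted(animals, key=lambda a: index_value(animals[a], weights), reverse=True)


# Hypothetical standardized EBVs for two animals.
ANIMALS = {
    "cow_A": {"milk_yield": 1.5, "health": -1.0, "feed_efficiency": 0.0, "methane": 0.5},
    "cow_B": {"milk_yield": 0.8, "health": 0.9, "feed_efficiency": 0.5, "methane": -0.4},
}

# A narrow index (production only) versus a widened index (hypothetical weights).
NARROW = {"milk_yield": 1.0}
BROAD = {"milk_yield": 0.4, "health": 0.3, "feed_efficiency": 0.2, "methane": -0.1}

if __name__ == "__main__":
    for name, weights in (("narrow", NARROW), ("broad", BROAD)):
        print(name, rank_animals(ANIMALS, weights))
```

With these made-up numbers, the production-only index ranks cow_A first, whereas the widened index ranks cow_B first, which shows how the choice of weights shifts selection decisions. In practice, the weights would have to be derived economically or by other valuation methods, as discussed in the text.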

          Most cited references (23)


          Accuracy of Predicting the Genetic Risk of Disease Using a Genome-Wide Approach

          Background The prediction of the genetic disease risk of an individual is a powerful public health tool. While predicting risk has been successful in diseases which follow simple Mendelian inheritance, it has proven challenging in complex diseases for which a large number of loci contribute to the genetic variance. The large numbers of single nucleotide polymorphisms now available provide new opportunities for predicting genetic risk of complex diseases with high accuracy. Methodology/Principal Findings We have derived simple deterministic formulae to predict the accuracy of predicted genetic risk from population or case control studies using a genome-wide approach and assuming a dichotomous disease phenotype with an underlying continuous liability. We show that the prediction equations are special cases of the more general problem of predicting the accuracy of estimates of genetic values of a continuous phenotype. Our predictive equations are responsive to all parameters that affect accuracy and they are independent of allele frequency and effect distributions. Deterministic prediction errors when tested by simulation were generally small. The common link among the expressions for accuracy is that they are best summarized as the product of the ratio of number of phenotypic records per number of risk loci and the observed heritability. Conclusions/Significance This study advances the understanding of the relative power of case control and population studies of disease. The predictions represent an upper bound of accuracy which may be achievable with improved effect estimation methods. The formulae derived will help researchers determine an appropriate sample size to attain a certain accuracy when predicting genetic risk.

            Priors in whole-genome regression: the bayesian alphabet returns.

            Whole-genome enabled prediction of complex traits has received enormous attention in animal and plant breeding and is making inroads into human and even Drosophila genetics. The term "Bayesian alphabet" denotes a growing number of letters of the alphabet used to denote various Bayesian linear regressions that differ in the priors adopted, while sharing the same sampling model. We explore the role of the prior distribution in whole-genome regression models for dissecting complex traits in what is now a standard situation with genomic data where the number of unknown parameters (p) typically exceeds sample size (n). Members of the alphabet aim to confront this overparameterization in various manners, but it is shown here that the prior is always influential, unless n ≫ p. This happens because parameters are not likelihood identified, so Bayesian learning is imperfect. Since inferences are not devoid of the influence of the prior, claims about genetic architecture from these methods should be taken with caution. However, all such procedures may deliver reasonable predictions of complex traits, provided that some parameters ("tuning knobs") are assessed via a properly conducted cross-validation. It is concluded that members of the alphabet have a room in whole-genome prediction of phenotypes, but have somewhat doubtful inferential value, at least when sample size is such that n ≪ p.

              SNP Discovery through Next-Generation Sequencing and Its Applications

              The decreasing cost along with rapid progress in next-generation sequencing and related bioinformatics computing resources has facilitated large-scale discovery of SNPs in various model and nonmodel plant species. Large numbers and genome-wide availability of SNPs make them the marker of choice in partially or completely sequenced genomes. Although excellent reviews have been published on next-generation sequencing, its associated bioinformatics challenges, and the applications of SNPs in genetic studies, a comprehensive review connecting these three intertwined research areas is needed. This paper touches upon various aspects of SNP discovery, highlighting key points in availability and selection of appropriate sequencing platforms, bioinformatics pipelines, SNP filtering criteria, and applications of SNPs in genetic analyses. The use of next-generation sequencing methodologies in many non-model crops leading to discovery and implementation of SNPs in various genetic studies is discussed. Development and improvement of bioinformatics software that are open source and freely available have accelerated the SNP discovery while reducing the associated cost. Key considerations for SNP filtering and associated pipelines are discussed in specific topics. A list of commonly used software and their sources is compiled for easy access and reference.

                Author and article information

                Journal
                Anim Front
                af
                Animal Frontiers: The Review Magazine of Animal Agriculture
                Oxford University Press (US)
                2160-6056
                2160-6064
                April 2020
                01 April 2020
                Volume: 10
                Issue: 2
                Pages: 23-28
                Affiliations
                Institute of Animal Breeding and Husbandry, Christian-Albrechts-University, Kiel, Germany
                Author notes

                These authors contributed equally to this work.

                Article
                vfaa005
                10.1093/af/vfaa005
                7111594
                32257600
                a1664c64-6602-44d8-8b1d-ac64319a0845
                © Seidel, Krattenmacher, and Thaller

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                History
                Page count
                Pages: 6
                Categories
                Feature Articles
                AcademicSubjects/SCI00960

                breeding goal, dairy cattle, phenomics, phenotype
