47
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Let's Face It—Complex Traits Are Just Not That Simple

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          The idea that we can reconstruct a human face from a DNA sample has great appeal: DNA from a crime scene could be used to create a facial image of a suspect; the faces of prehistoric peoples could be reconstructed from their remains; the face of a child could be predicted in utero from amniocentesis. This is the promise implicit in the study of Claes et al. [1]. In their own words: “…our methods provide the means of identifying the genes that affect facial shape and for modeling the effects of these genes to generate a predicted face.” (pg. 10) Unfortunately, this promise greatly overreaches the data and analyses represented by the study, and it misrepresents our current understanding of the genetics of complex morphological traits. Worse, this claim, and the fairly sensational media reports that have stemmed from it, detract significantly from what is otherwise an important paper that highlights the potential of an interesting new technique to investigate the genetic basis for variation in the shape of the human face. Claes et al. [1] base their analysis on a mixed ancestry sample of 592 people genotyped with a 540,000 single nucleotide polymorphism (SNP) array. The primary analyses conducted deal with the effects of sex and ancestry on facial shape. Sexual dimorphism (when taking ancestry into account) explains 13% of the total shape variation in the face, while ancestry (when taking sex into account) explains 10%. These results are interesting, particularly as the contribution of ancestry is unexpectedly high. This surprising result has tremendous implications for the microevolution of facial shape and, potentially, the role of sexual selection. Could the amount of facial shape variation attributed to ancestry have accumulated through genetic drift? If not, what might explain the surprising contribution of ancestry to facial shape? These questions are intriguing and important regardless of the extent facial shape, or even facial shape characteristics, can be reliably predicted from DNA. But why is the claim that facial shape can be predicted from DNA so troubling? The first reason is that it isn't actually supported by the work done in this study. Claes et al. [1] examine the effects of candidate genes for normal craniofacial variation. They select 46 genes based on evidence of accelerated evolution and evidence from animal models that these genes are expressed in the head. Here, they argue that, because the genes in question are already known to contribute to head development, no adjustment for multiple comparisons is necessary. This assumption is highly problematic. Lots of genes are involved in the development of the head, but that does not mean that those genes contribute to normal variation in the face. Large-scale genome wide association studies (GWAS) in humans often fail to demonstrate significant roles for genes that are known to produce relevant phenotypes in experimental models. In fact, it is also quite possible that many genes not known to play important roles in craniofacial development contribute to normal variation in the face. Many genes that have known developmental roles may be so important that variation in their function or expression is highly selected against. For example, Shh plays crucial roles in forming the face [2], and yet this gene does not appear in GWAS studies of human facial shape variation [3]. Alternatively, variation in a characteristic like facial shape may come from unexpected sources such as genetic variation in gap junction proteins that influence the ability of bone to respond to mechanical load during chewing [4]. Examination of Table S2 reveals that only one of the 46 candidate genes tested would have survived Bonferroni adjustment for multiple testing. Even if one accepts the rationale for avoiding adjustment for multiple testing, the authors could have compared the p values obtained for the 46 genes from random samples of SNPs drawn from the 55,000 tested. In the absence of such a test, the study contributes nothing new to our understanding of how genes influence the shape of the face since the genes tested may or may not actually contribute anything to normal variation in the shape of the face. The second, and deeper, issue, though, is whether genomic prediction of complex morphologies is even feasible. Obviously, variation in genes causes variation at the phenotypic level. This does not mean, though, that a complex phenotype can be accurately predicted from genetic data. For a trait such as coronary heart disease (CHD), prediction of risk is highly problematic, even though quite a lot is known about the underlying genetics [5]. Much less is known about the genetics of complex morphological traits like the shape of the face. The problem is that the genotype–phenotype map for morphological traits is incredibly complicated (Figure 1). It is not just that variation in genes exhibit many relationships with phenotypic outcomes (Figure 1B). It is, rather, that phenotypic variation in morphological traits is structured by developmental processes at multiple levels and times in development. These processes and their interactions are complex but modular in their organization [6], [7]. This architecture of development is responsible for the modulation of phenotypic variance [8] and covariation structure [6], [7] (Figure 1C). Changes to developmental processes that influence the shape of the head tend to produce highly integrated and often unexpected effects on global shape [7], [9]–[11]. Even subtle effects, such as those produced by enhancers to craniofacial genes acting in spatiotemporally specific ways during development, produce global rather than localized shape transformations of the head [12]. 10.1371/journal.pgen.1004724.g001 Figure 1 Two complementary depictions of the developmental architecture underlying the genotype–phenotype map for complex traits. A captures the idea that large amounts of genetic variation funnel to smaller numbers of pathways and processes. These processes then interact to produce structured and modulated phenotypic variation in a complex trait. B, which derives from Wagner [20], shows the many-to-many mapping of genes to traits; while both Cs show the modular pattern of gene effects on processes and the effects of processes on sets of phenotypic traits. These depictions illustrate some of the complexity involved in constructing models of genotype–phenotype relations in complex traits. Complex patterns of interactions among developmental processes can also generate unexpected patterns of heritability. Although genetic variance may be predominantly additive for complex traits such as oil content in maize or body mass in mice [13], [14], this may not be the case for multidimensional and modularly organized morphological traits like facial shape [15], [16]. If a significant proportion of the genetic variance for facial shape is non-additive, which remains an empirical question, prediction of facial shape from genotype is greatly complicated. The genetic basis for facial shape variation may be as much, if not more, about epistatic interactions and context-specific developmental interactions than about the additive effects of individual genes. The fact that a large, recent GWAS study of facial shape revealed few causative loci is suggestive of a very complex genetic architecture for this trait [3]. The overselling of these results is unfortunate and unnecessary because it detracts from what is otherwise a very interesting study. The bootstrapped response-based imputation modeling (BRIM) technique, for example, is an intriguing extension of the shape analysis toolkit. Like other geometric morphometric methods, it is based on Procrustes superimposition and multivariate data reduction of variation in landmark position. The method allows for estimation of a single quantitative axis of variation that would correspond to a multidimensional factor such as ancestry, or a discrete variable such as sex. The method relies on machine learning algorithms to define shape axes that best correspond to such variables. As such, the method has great potential for furthering quantitative analyses of the genetics of complex morphologies. The use of dense landmark representation and machine learning algorithms similarly has potential in the analysis of complex morphologies, and this study points the way towards future applications of such techniques. This aspect of the paper would be stronger had they validated BRIM by comparison to existing methods. Estimation of the effects of single genes, ancestry, sex or any other factor of interest on total shape variation and local shape variation can already be done using the current GM toolkit [17] or dense correspondence analysis [18], [19]. Although, how much better BRIM performs than existing methods is hard to tell without validation, despite the seemingly encouraging results presented here. Developing a mechanistic understanding of the genotype–phenotype map is undoubtedly one of the greatest challenges of modern biology. Claes et al. offer us a new and valuable tool to apply to this grand challenge. We should not be fooled, however, into thinking that we are anywhere close to understanding developmental genetics at the level where prediction of complex morphological traits is feasible. Overselling and overpromising in science is dangerous because it creates unreasonable expectations both at the public and policymaker levels. Ultimately, this runs the risk of diverting valuable scientific resources away from the important task of understanding how variation in genes plays through developmental processes to produce the amazing diversity of organismal form.

          Related collections

          Most cited references11

          • Record: found
          • Abstract: found
          • Article: not found

          Data and Theory Point to Mainly Additive Genetic Variance for Complex Traits

          Introduction Complex phenotypes, including quantitative traits and common diseases, are controlled by many genes and by environmental factors. How do these genes combine to determine the phenotype of an individual? The simplest model is to assume that genes act additively with each other both within and between loci, but of course they may interact to show dominance or epistasis, respectively. A long standing controversy has existed concerning the importance of these non-additive effects, involving both Fisher [1] and Wright [2]. Estimates of genetic variance components within populations have indicated that most of the variance is additive [3],[4]. Increasing knowledge about biological pathways and gene networks implies, however, that gene-gene interactions (epistasis) are important, and some have argued recently that much genetic variance in populations is due to such interactions [5],[6],[7],[8]. It is important to distinguish between the observations of dominance or epistasis at the level of gene action at individual loci, exemplified by a table of genotypic values, and the observations of variance due to these components in analysis of data from a population. For example, at a completely dominant locus almost all the variance contributed is additive if the recessive gene is at high frequency [3],[4]. An understanding of the nature of complex trait variation is important in evolutionary biology, medicine and agriculture and has gained new relevance with the ability to map genes for complex traits, as demonstrated by the recent burst of papers that report genome-wide association studies between complex traits and thousands of single nucleotide polymorphisms (SNPs) [9],[10],[11],[12],[13]. Here we attempt to resolve the alternative sources of evidence on the importance of non-additive genetic variation. We evaluate the evidence from empirical studies of genetic variance components and indeed find that additive variance typically accounts for over half and often close to 100% of the total genetic variance. We then present new theory and results that show why this is the case even if there are non-additive effects at the level of gene action. Empirical Evidence for Additive and Non-Additive Genetic Variance Estimation of Genetic Variance The genetic variance V G can be partitioned into additive (V A), dominance (V D), and a combined epistatic component (V I), which itself can be partitioned into two locus (V AA, V AD, and V DD) and multiple locus components (V AAA, etc.) [3],[4],[14],[15],[16],[17]. Estimation of additive and non-additive variance components utilises the observed phenotypic similarity of relatives and the expected contribution of additive and non-additive effects to that similarity [3],[4]. In addition to resemblance due to additive or non-additive genetic factors, relatives may resemble each other due to common environmental effects. In an extremely large data set with very many different kinds of relationships present, it is possible in principle to partition variation into many components using modern statistical methods such as residual maximum likelihood [18] (REML) with the animal model [4],[19],[20]. In practice it is never possible to estimate many variance components with useful precision, however, not least because there is a high degree of confounding: for example, full sibs have a higher covariance for all single and multi-locus genetic components than do half sibs. The coefficients of epistatic components are small (e.g., V AA/16 for half-sibs), so estimates have high sampling error and there is little power to distinguish V A from, say, V AA. Selection, assortative mating, and non-genetic covariances also confound estimates. Consequently, there are few accurate estimates of non-additive variance components but there is indirect evidence. For instance, a narrow sense heritability value (h 2 = V A/V P) of one-half, typical for many traits, implies that dominance, all the vast number of epistatic components, and the environmental component, collectively contribute no more than V A. Similarly if the heritability is only a little less than the repeatability (the phenotypic correlation of repeated measures), all non-additive genetic variances and the permanent environmental variance together comprise this small difference. With these caveats we summarise data of various types. Laboratory Animals and Livestock The extensive data on experimental organisms show a range of heritability, higher for morphological than fitness associated traits, averaging as follows [21]: morphology - 0.46, physiology (e.g., oxygen consumption, resistance to heat stress) - 0.33, behaviour - 0.30, and life history - 0.26. There have been extensive estimates of heritability for traits of livestock. For example, for beef cattle, these averaged: post-weaning weight gain 0.31, market weight for age 0.41, backfat thickness 0.44 [22]. In general for morphological traits, such as carcass fatness, egg weight in poultry or fat and protein content of cow's milk, a heritability of 0.5 or so is the norm, whereas for growth traits or milk yield 0.25–0.35 is more typical [23]. These estimates of heritability from half-sib correlations could be biased upwards by additive epistatic terms, but they can not account for estimates of heritability over 25%. Furthermore, estimates of realised heritability from response to selection [3] are not biased in that way, because epistatic components do not contribute to long term selection response [24], and estimates of realised heritability range up to 0.5 for fat content of mice, for example [25]. There are a number of cases where it can be shown directly that V A contributes almost all of V G and indeed almost all of V P. For bristle number in Drosophila melanogaster, the phenotypic correlation between abdominal segments, which, assuming they are influenced by the same genes, estimates V G/V P, is only a little higher than the heritability, indicating that V A/V G∼0.8 [26]. For finger ridge count (in humans), estimates of heritability are close to one and consistent from different sorts of relatives [27]. Even for lowly heritable traits such as litter size in pigs, the repeatability is little higher than the heritability, implying that most genetic variance is additive [28]. Whilst there is a clear relationship between heritability and type of trait, it should be noted that low heritability does not imply low genetic variance: the evolvability (√V A/mean) is higher for fitness than morphological traits [29], and even for estimates of fitness itself or traits closely related to it, additive genetic variance is present [30],[31]. There are rarely good direct estimates of epistatic or dominance variance because these variance components are usually estimated from full-sibs and therefore confounded with the common environment shared by full sibs. However, if the heritability is high, the space for them is limited. Experiments on inbreeding depression provide some evidence on the importance of non-additive effects. Inbreeding depression implies directional dominance in gene effects but, for a given rate of inbreeding depression, as the number of loci increases and the gene frequencies move toward 0 or 1.0, the dominance variance decreases towards zero. Consequently, the importance of inbreeding depression for traits related to fitness is not evidence that the dominance variance is large. The observed linearity of inbreeding depression with inbreeding co-efficient is easiest to explain with directional dominance but not with DD or higher order epistatic effects because these would cause non-linearity unless they happened to exactly cancel each other out. Twin Studies in Humans In contrast to studies of sibs and more distant relatives, identical twins can provide estimates of V G. The classical twin design of samples of monozygotic (MZ) and dizygotic (DZ) twin pairs has been used extensively to estimate variance components for a wide range of phenotypes in human populations. The primary statistics from these studies are the correlations between MZ pairs (r MZ) and between DZ pairs (r DZ). If twin resemblance due to common environmental factors is the same for MZ and DZ twins then r MZ>r DZ implies that part of the resemblance is due to genetic factors and r MZ>2r DZ implies the importance of non-additive genetic effects. Conversely, r MZ<2r DZ implies that common environmental factors cause some of the observed twin resemblance. Sophisticated variance component partitioning methods to estimate components of additive, non-additive and common environmental effects are used widely [32], but all rely on the strong assumptions that resemblance due to common environmental effects is the same for MZ and DZ twins. Attempts to test this hypothesis have not found any evidence to reject it [33],[34]. Nevertheless, even accepting this assumption about common environmental variance, in the classical twin design there are only two primary statistics and three or more variance components cannot be estimated without making additional assumptions. We summarised the MZ and DZ correlations for a wide variety of phenotypes from published twin studies from a single productive laboratory in Australia (genepi.qimr.edu.au). The criteria were that each study must have more than 100 MZ and more than 100 DZ pairs and that the study subjects were Australian twins. For non-continuous traits, studies were included only if they reported polychoric or tetrachoric correlations. In total, 86 phenotypes qualified of which 42 were clinical measures of quantitative traits (including, for example, blood pressure, biochemical measures in blood, body-mass-index, height, tooth dimensions; a full list of phenotypes is available upon request). The MZ and DZ correlations are summarised in Table 1. The correlations were not separated according to the sex of the individuals in all studies; but for those that did separate the sexes, the overall MZ and DZ correlations were calculated as an average, weighted by the total number of pairs. The distribution of r MZ−2r DZ across all 86 phenotypes is shown in Figure 1. On average the MZ correlation is about twice the DZ correlation across a wide range of phenotypes. If we consider only clinically measured phenotypes and ignore opposite-sex twins then the MZ correlation is clearly less than twice the DZ correlation (Table 1). It is possible but unlikely that the variance due to common environmental factors, assortative mating and non-additive genetic factors exactly cancel each other out by chance. Thus the simplest explanation of the results is that additive variance explains most of the observed similarity of twins and non-additive variance is generally of small magnitude and cannot explain a large proportion of the genetic variance. 10.1371/journal.pgen.1000008.g001 Figure 1 Distribution of r MZ−2r DZ for all traits on human twins. Data are from published papers by N.G. Martin and colleagues of the Queensland Institute of Medical Research, Brisbane (www.genepi.edu.au). Across a wide variety of traits the mean difference between the monozygotic twin correlation and twice the dizygotic twin correlation is close to zero, which is consistent with predominantly additive genetic variance and the absence of a large component of variance due to common environmental effects. 10.1371/journal.pgen.1000008.t001 Table 1 Meta-analysis of MZ and DZ correlations in humansa. Group All phenotypes Clinically measured phenotypes No. traits r No. traits r MZ females 58 0.61 24 0.76 MZ males 48 0.65 24 0.75 DZ females 58 0.34 24 0.45 DZ males 48 0.36 24 0.43 OS pairsb 46 0.29 23 0.36 All MZ 86 0.58 42 0.67 All DZ 86 0.29 42 0.35 MZ−2DZ 86 0.00 42 −0.04 These show the correlations (r) of phenotypes of twins, averaged over ranges of traits estimated in large data sets a Data from published papers by N.G. Martin and colleagues of the Queensland Institute of Medical Research, Brisbane (www.genepi.edu.au) b Opposite sex Model Gene Frequency Distributions In view of the apparent conflict between the observations of high proportions of additive genetic variance (often half or more of the phenotypic variance, and even more of the total genetic variance) and the recent reports of epistasis at quantitative trait loci (QTL) [8], we consider explanations beyond that of simple sampling errors and bias of estimates. We focus particularly on the role that the distribution of gene frequencies may play in the relation between the genetic model and the observed genetic variance components. Genetic variance components depend on the mean value of each genotype and the allele frequencies at the genes affecting the trait [3],[4],[17]. Unfortunately the allele frequencies at most genes affecting complex traits are not known, but the distribution of allele frequencies can be predicted under a range of assumptions. This distribution depends on the magnitude of the evolutionary forces that create and maintain variance, including mutation, selection, drift and migration. As the effects on fitness of genes at many of the loci influencing most quantitative traits are likely to be small, we can invoke theory for neutral alleles to serve as a reference point. An important such reference is the frequency distribution under a balance between mutation and random genetic drift due to finite population size in the absence of selection. If mutations are rare, the distribution of the frequency (p) of the mutant allele is f(p)∝1/p, i.e. approximately L-shaped [2],[35],[36], with the high frequency at the tail being due to mutations arising recently. The allele which increases the value of the trait may be the mutant or ancestral allele, so its frequency has a U-shaped distribution f(p)∝1/p+1/(1−p) = 1/[p(1−p)]. As we shall use it often, we define the ‘U’ distribution explicitly by this formula. For loci at which the mutants are generally deleterious, the frequency distribution will tend to be more concentrated near p = 0 or 1 than for this neutral reference point. As another simple reference point we use the uniform distribution, f(p)∝1, 1/(2N) ≤ p ≤ 1−1/(2N), with N the population size. This approximates the steady state distribution of a neutral mutant gene which has been segregating for a very long time [2], and also has much more density at intermediate gene frequencies than the ‘U’ distribution. Our third reference point is at p = 0.5, as in populations derived from inbred crosses, and is the extreme case of central tendency of gene frequency. These analyses assume a gene frequency distribution which is relevant to no selection. For a more limited range of examples we consider the impact of selection on the partition of variance. We consider a limited range of genetic models, some simple classical ones and others based on published models of metabolic pathways or results of QTL mapping experiments. Uniform: f(p) = 1, assuming N is sufficiently large that the discreteness of the distribution and any non-uniformity as p approaches 1 or 0 can be ignored, i.e. integrated over 0 to 1. This and the ‘U’ gene frequency distributions are, for simplicity, assumed to be continuous. Neutral mutation model (‘U’): f(p)∝1/[p(1−p)]. To standardise the distribution, with population size N assumed to be large, note that Thus , where K∼ln(2N). Genetic variance components are obtained by integration of expressions for the variance as a function of p for a specific model of the gene frequency distribution. For multiple locus models the distribution of all loci is assumed to be identical and there is no linkage disequilibrium. We focus on the contribution of additive genetic variance (V A) to genotypic variance (V G). Genotypic Values Single Locus with Arbitrary Dominance Consider a single biallelic locus with genotypic values for CC, Cc and cc of +a, d and −a, respectively (notation of [3]). Then, from [3] For the uniform distribution of p Hence E(V A) = a 2/3 +d 2/15 and E(V D) = 2d 2/15, giving E(VA)/E(VG) = 1−2d 2/(5a 2+3d 2). For the ‘U’-distribution, assuming N is large, and ignoring terms of O(1/N), the integrals simplify to E(V A)∼(a 2+d 2/3)/K, E(V D)∼d 2/(3K) and E(V A)/E(V G) = 1−d 2/(3a 2+2d 2). Additive × Additive Model without Dominance or Interactions Including Dominance A simple additive × additive epistatic model has these genotypic values: Assuming the frequency of B is p and of C is q, with linkage equilibrium: Mean = M = 2a[pq+(1−p)(1−q)] The average effect of substitution of allele B is given by [37] and hence V A = 2a 2[p(1−p)(1−2q)2+q(1−q)(1−2p)2] = a 2(Hp +Hq −4HpHq ), where H is heterozygosity The AA epistatic effect is given by (αα)BC = ¼ d2 M/dpdq = a. Hence V AA = 4a 2 p(1−p)q(1−q)a 2 = a 2 HpHq and VG  = a 2(Hp +Hq −3HpHq ), Uniform: simple integration gives E(V A) = 2a 2/9, E(V AA) = a 2/9, E(V G) = a 2/3 ‘U’: Similarly E(V AA) = a 2/(4K). Hence E(V A)/E(V G) = (2−4/K)/(2−3/K) = 1−1/(2K−3), which → 1 for large K. The residue, if any, is V AA. Duplicate Factor Model A simple epistatic model involving all epistatic components for two loci is the following: For an arbitrary number (L) of loci (i), the genotypic value is a except for the multiple ‘recessive’ homozygote, and for one locus it is complete dominance. For pi  = 0.5 at all loci: V G = a 2[(½)2L −(¼)2L ], V A = a 2 L(½)4L−1 and V A/V G = 2L/(22L −1). For two loci, V A/V G = 4/15. Uniform: For two loci, E(V A)/E(V G) = 9/16 and declines to 0 as L increases. ‘U’: For two loci For large N, with two loci E(V A) /E(V G) → 4/5 and for very many loci E(V A) /E(V G) → 0 Complementary Model Another simple epistatic model involving all components is the following: which can also be defined for multiple loci. For two loci, for example, using similar methods it can be shown that: for pi  = 0.5, V A/V G = 0.56; with the uniform distribution, E(V A)/E(V G) = 2/3; and with the ‘U’ distribution . Analyses of General Models For two-locus models in which the genotypic values were not functions of simple parameters, the genotypic values were entered as data, and V A and V G calculated as functions of the gene frequencies p and q. Bivariate numerical integration was undertaken using Simpson's rule by computing e.g. V A(p,q)f(p,q) over an (m+1) × (m+1) grid of equally spaced p and q values, taking m = 210 or higher power of 2 as necessary for adequate convergence. Results were computed for some models of metabolic pathways [38],[39] and for some published models obtained from QTL analysis [8]. Results/Discussion Single Locus Model Many general points are illustrated by two simple examples, the single locus model with dominance and the two locus model with AA interaction, so we consider these in more detail. For the single locus model with genotypic values for CC, Cc and cc of +a, d and −a, respectively, V A = 2p(1−p)[a+d(1−2p)]2 and V D = 4p 2(1−p)2 d 2. For d = a, i.e. complete dominance of C, V A = 8p(1−p)3 a 2 and V D = 4p 2(1−p)2 a 2 and thus: at p = 0.5, V A = (2/3)V G; if the dominant allele is rare (i.e. p → 0), V G → 8p and V A/V G → 1, and if it is common, V G → 4p 2 and V A/V G → 0. Note, however, that V G and V A are much higher when the dominant allele is at low frequency, e.g. 0.1, than are V G and V D when the recessive is at low frequency, e.g. p = 0.9. Even for an overdominant locus (a = 0), all genetic variance becomes additive at extreme gene frequencies. Considering now expectations (E) over the frequency distributions, let η 2 = E(V A)/E(V G), an equivalent to narrow sense heritability if V E = 0. For the ‘U’ distribution, η 2 = 1−d 2/(3a 2+2d 2) and for the uniform distribution, η 2 = 1−2d 2/(5a 2+3d 2). Hence, for a completely dominant locus, η 2 = 0.8 and η 2 = 0.75 respectively; whereas V A/V G = 0.67 for p = 0.5. In summary, the fraction of the genetic variance that is additive genetic decreases as the proportion of genes at extreme frequencies decreases (Table 2). 10.1371/journal.pgen.1000008.t002 Table 2 Summary of expected proportion of V G that is V A for different modelsa. Genetic model Distribution of allele frequencies p = ½ Uniform ‘U’ (N  = 100)b ‘U’ (N = 1000) Dominance without epistasis d = ½a 0.89 0.91 0.93 0.93 Dominance without epistasis d = a 0.67 0.75 0.80 0.80 Dominance without epistasis a = 0 0.00 0.33 0.50 0.50 A × A without dominance 0.00 0.67 0.87 0.92 Duplicate factor 2 loci 0.27 0.56 0.71 0.75 Duplicate factor 100 loci 0.00 0.00 0.00 0.00 Complementary 2 loci 0.57 0.67 0.74 0.76 a Models defined in Methods section b Population size Two Locus Additive × Additive Model The genotypic values (see Theory section) for the simple AA model for double homozygotes BBCC and bbcc are +2a and for bbCC and BBcc are 0, and all single or double heterozygotes are intermediate (+a). With linkage equilibrium, V A/V G = 1−HpHq /[Hp +Hq −3HpHq ], where the heterozygosities are Hp and Hq at loci B and C. Thus V A/V G → 1 if either locus is at extreme frequency (i.e. p or q → 0 or 1), and equals 0 when p = q = 0.5. If p = q, for gene frequencies 0.1, 0.2, 0.3 and 0.4, V A/V G = 0.88, 0.69, 0.43 and 0.14. For the uniform distribution η 2 = 2/3, and for the ‘U’ distribution, the variances are a function of the population size, because more extreme frequencies are possible at larger population sizes. Thus η 2 = (2−4/K)/(2−3/K), where K = ln(2N), so η 2 → 1 for large K. Any residue is V AA. These two examples, the single locus and A × A model, illustrate what turns out to be the fundamental point in considering the impact of the gene frequency distribution. When an allele (say C) is rare, so most individuals have genotype Cc or cc, the allelic substitution or average effect of C vs. c accounts for essentially all the differences found in genotypic values; or in other words the linear regression of genotypic value on number of C genes accounts for the genotypic differences (see [3], p 117). Hence almost all V G is accounted for by V A. Other Epistatic Models With the ‘U’ distribution, most genes have one rare allele and so most variance is additive. Further examples (Table 2) illustrate this point, including the duplicate factor and complementary models where there is substantial dominance and epistasis. These models show mostly V A for the ‘U’ distribution for a few loci but the proportion of the variance which is additive genetic declines as the number increases. With many loci, however, such extreme models do not explain the covariance of sibs (i.e. any heritability) or the approximate linearity of inbreeding depression with inbreeding coefficient, F, found in experiments [3],[4],[40],[41],[42], or the linearity in response to artificial selection [43]. We also analysed a well-studied systems biology model of flux in metabolic pathways [38],[39],[44] and found again that the expected proportion of V G that is accounted for by V A is large (Table 3). 10.1371/journal.pgen.1000008.t003 Table 3 Examples of expected proportion of V G that is VA in models of flux in linear metabolic pathways with a model flux J∝[Σ i (1/Ei )]−1 for a system with 10 loci in which 8 are invariant wild type and two (B, C) are mutants. Activities Flux relative to wildtype, J BBCC = 1 E(V A)/E(V G) E bb Ecc J BbCc J bbCC J BBcc J bbcc Distribution of allele frequencies 0.5 Unia U100b U1000c 1 0.1 0.92 1 0.53 0.53 0.81 0.86 0.88 0.88 0.5 0.1 0.90 0.91 0.53 0.50 0.81 0.85 0.88 0.88 0.1 0.1 0.86 0.53 0.53 0.36 0.77 0.82 0.86 0.87 0.1 0.01 0.85 0.53 0.09 0.09 0.72 0.79 0.83 0.84 Enzyme activities are Ei  = 1 for loci 3 to 8, E BB = E CC = 1, values of E bb and E cc are listed, and heterozygotes are intermediate, e.g. E Cc = ½(1+E cc), assuming gene frequency distributions as in Table 2. Flux modelled as [39]. a Uniform b U-shaped with population size of 100 c U-shaped with population size of 1000 Examples of Models from Highly Epistatic Published QTL Analyses A number of QTL analyses using crosses between populations (some inbred, some selected) have been published in which particular pairs (or more) of loci have been identified to have substantial epistatic effects [8]. We consider examples of the more extreme cases of epistasis found, obtaining variance components by numerical integration. Results are shown in Table 4, for examples from [8] deliberately chosen as extreme. Even so, the proportion of the genetic variance that is additive is high with the ‘U’ distribution, except in the dominance × dominance example. Further, as these examples were selected by Carlborg and Haley and us as cases of extreme epistasis, it is not unreasonable to assume that the real epistatic effects are smaller than their estimates. 10.1371/journal.pgen.1000008.t004 Table 4 Examples of expected proportion of V G that is V A in highly epistatic published QTL analyses assuming gene frequency distributions as in Table 2. Modela Genotypic values E(V A)/E(V G) BBCC BbCC bbCC BBCc BbCc bbCc BBcc Bbcc bbcc Distribution of allele frequencies 0.5 Unib U100c U1000d DomEp 4 10 15 11 8 7 10 8 7 0.05 0.52 0.73 0.78 Co-ad 39.0 38.7 35.7 37.6 38.9 37.7 36.8 39.6 40.4 0.11 0.62 0.81 0.85 D × D 4 13 6 13 7 11 5 13 6 0.00 0.15 0.37 0.42 a Values obtained from tables or by interpolation from Box 1c–e of Carlborg and Haley [8]: key to their nomenclature: DomEp: Dominant epistasis (complex); Co-ad: Co-adaptive epistasis; D × D: dominance × dominance epistasis. b Uniform. c U-shaped with population size of 100. d U-shaped with population size of 1000. Relaxation of Assumptions Expectation of a Ratio of Variance Components The formulae we have given have been for the quantities E(V A), E(V G) and the ratio E(V A)/E(V G). The quantity actually observed is V A/V G = Σ iV Ai /ΣV Gi where the expression denotes the sums over loci (i) of the additive and total genetic variance contributed by each in the absence of epistasis or linkage disequilibrium, or in the presence of these, sums over relevant sets of loci. As, for any locus, or for their sum, in general E(V A/V G) ≠ E(V A)/E(V G), we need to consider the relevance of the quantities calculated. Whilst it would be possible to obtain approximations using statistical differentiation [4], formulae are complicated and invoke an assumption of small coefficients of variation of the quantities which does not always hold. Hence we used Monte Carlo simulation and some examples are given in Table 5. It is seen that, except with very few loci, the bias is not great in using the ratio of expectations. In real situations where many loci of differing effects and frequencies are likely to be involved, the bias is likely to be trivial unless a single locus contributes almost all the variance. 10.1371/journal.pgen.1000008.t005 Table 5 Bias in use of E(V A)/E(V G) rather than E(V A/V G) for some models in Table 2 as a function of Numbers of Loci. Uniform distribution E(V A)/E(V G) E(V A/V G) from simulation Locia 64 16 4 1 a = 1, d = 1 0.750 0.749 0.747 0.734 0.609 a = 0, d = 1 0.333 0.335 0.337 0.348 0.430 A × A 0.667 0.667 0.666 0.660 0.646 Dupl. factor 0.562 0.559 0.549 b b ‘U’ distribution with N = 1000 E(V A)/E(V G) E(V A/V G) from simulation Loci* 64 16 4 1 a = 1, d = 1 0.800 0.798 0.796 0.773 0.561 a = 0, d = 1 0.500 0.502 0.516 0.585 0.800 A × A 0.918 0.918 0.919 0.925 0.945 Dupl. factor 0.746 0.743 0.733 b b a Number of loci for non-epistatic cases (complete dominance a = 1, d = 1, and overdominance a = 0, d = 1), numbers of pairs of loci for two-locus epistatic models (A × A and duplicate factor. b Not computed as V G = 0 in some replicates. Influence of Linkage Disequilibrium (LD) In this analysis we have assumed there is Hardy-Weinberg equilibrium (HWE) and linkage equilibrium among the loci. As departures from HWE are transient with random mating, they can be ignored, but LD can persist, and hence the estimated effects at locus C depend on those fitted at B and vice versa. The effect of LD is to reduce the number of haplotypes that segregate in the population so what would be epistatic variance becomes additive or dominance variance. For example, consider the A × A model and complete LD, i.e. equal frequencies at B and C loci and both loci segregating but with only two haplotypes present. Then only Bc and bC haplotypes are present, and genotypic values are 0 for homozygous classes and a for heterozygotes (‘pure’ overdominance), or only BC and bc haplotypes, with genotypic values 2a for homozygotes and a for heterozygotes (‘pure’ underdominance). In either case variances are the same as for the dominance case with a = 0. Thus LD would lead to attribution of real epistatic variance to additive or dominance variance, and would exacerbate the results obtained from discussions of gene frequency distribution. Consequences of Multiple Alleles In these models we have considered solely biallelic loci, appropriate for low mutation rates. Multiallelic loci, in terms of their effects on the trait, can arise from mutations at different structural or control sites. Predictions are complicated by the need to consider k(k−1)/2 genotypic values at a k allelic locus, and many further epistatic terms, so we consider two extreme cases. If the alleles all have similar effects, for example due to a knock-out, the effective mutation rate is increased, but it would require very many such sites for the distribution of frequencies of the trait alleles to differ greatly from proportionality to 1/[p(1−p)]. Such segregation of multiple alleles will be more common in large populations, where in any case the frequency distribution is most extreme, and so the impact is unlikely to be large. A second case is where all alleles have different effects and dominance interactions. Any allelic substitution then produces a change in the mean and so additive variance is present and for example, contributes more V A than does the overdominance model at p = 0.5. Alternative Models The analysis we have given for estimating effects of dominance and epistasis is for the classical method using simple averages over genotypes weighted by their frequencies, which are the least squares estimates in the balanced case and the basis for the analysis of variance [14],[15],[16]. There are alternative parameterisations aimed at exemplifying more clearly the nature of the interactions, including that of ‘physiological epistasis’ [45]. Whilst such alternatives may be of use in the analysis and interpretation of gene or QTL mapping experiments where individual genotypes can be identified or predicted from linked markers, such alternative parameterizations are not feasible in analysis of populations using data solely on the quantitative traits, from which the estimates of genetic variance components and heritability are obtained. Further, as has been pointed out [46], although the estimated effects may differ, the variances explained by different models are generally the same in segregating populations. Effects of Selection on Gene Frequency Distributions and Partition of Variance The ‘U’ and indeed uniform gene frequency distributions are limiting cases applying in the absence of selection on loci affecting the quantitative trait. The results for a wide range of models can be summarised as follows: gene frequencies that cause V A/V G to be small also cause V G to be small. Consequently, when V A and V G are summed over a full range of frequencies, V A/V G is large. This conclusion is dependent on the distribution of gene frequencies being symmetrical, so that cases with large V G and large V A/V G are as common as cases with small V G and small V A/V G. The impact of selection will depend on how it acts on the trait or traits analysed and also on other aspects of fitness, so we need to consider whether the findings are robust to selection. Stabilising selection on the trait, such that individuals with phenotype closest to an optimum are most fit, leads to maintenance of the population mean at or close to the optimum, so that mutants are at a disadvantage if they increase or decrease trait values. Consequently the gene frequency distribution is still broadly U-shaped, but with much more concentration near 0 or 1 [47]. Hence such selection is likely to increase proportions of additive variance. This conclusion would be wrong if there was widespread overdominance at the level of individual genes because this would push gene frequencies to intermediate values. However, the observed inbreeding depression is incompatible with widespread overdominance [48]. Under the neutral mutation or stabilising selection models where gene frequency distributions have extreme U shape, subsequent directional selection will lead to either rapid fixation or increase to intermediate frequency of genes affecting the trait. Even if the distribution of allele frequencies is initially symmetric, a net increase in variance over generations might thus be expected [49] (Chapter 6). Accelerated responses to artificial selection have not been seen, however, in lines founded from natural populations [50]. Calculations show that if genes are analysed independently such an increase in variance with artificial selection can in theory occur following the neutral model only if most gene effects are large (unpublished) or with more extreme frequency distributions following stabilising selection [51]. These ignore the build up of negative gametic disequilibrium through the Bulmer effect [52], however, whereas in simulated multi-locus models of Drosophila no increase in variance was found [51]. Linkage effects would be weaker in species with more chromosomes, but selection lines in these have typically not been founded directly from natural populations. Other types of selection do lead to an asymmetrical distribution of allele frequencies because the unfavourable allele will typically be at a low frequency. We have considered the case of genes whose effect on both the trait measured and on fitness shows complete dominance. Thus recessive and dominant favourable and unfavourable mutants were considered, and their expected contribution to variance computed during their lifetime to fixation or loss, using transition matrix methods. Results are given in Table 6 for population size (N) 100 and selective values (s) of the homozygote of 0.05 (Ns = 5), but the qualitative result is not affected by using weaker or stronger selection. Deleterious, recessive mutations show the lowest V A/V G but even here it is 0.44 and these cases also show the lowest total variance. Consequently, in a trait affected by a mix of genes with varying types of gene action, V A/V G is likely to be well above 0.5. 10.1371/journal.pgen.1000008.t006 Table 6 Expected variance contributed by mutant genes before fixation for population size 100, specified dominance on the quantitative trait (a vs d) and selective (dis)advantage (s in heterozygote and homozygote)a. Model s(het) s(hom) a d E(V G) E(V A)/E(V G) Neutral dominant 0 0 1 1 0.388 0.86 Neutral recessive 0 0 1 −1 0.166 0.66 Neutral randomb 0 0 1 1 or −1 0.277 0.80 Deleterious dominant −0.05 −0.05 1 1 0.145 0.97 Deleterious recessive 0 −0.05 1 −1 0.052 0.44 Advantageous dominant 0.05 0.05 1 1 0.375 0.74 Advantageous recessive 0 0.05 1 −1 0.151 0.71 a e.g., if the mutant gene is completely recessive for the trait and for fitness, d = −a and s(hom) = 0. b Equally likely to be completely dominant or recessive mutants, hence values as in Table 2. Thus if the highest and lowest genotypic values correspond to multiple homozygous classes, it is clear that a high proportion of the variance is expected to be additive genetic even with selection. The potential exceptions occur when there is a maximum at intermediate frequencies, such as with an overdominant locus or some of the cases shown in Table 4. Nevertheless, few confirmed cases of clear overdominance/heterozygote superiority have been found (other than sickle cell anaemia) and the patterns in Table 4 are somewhat erratic. Effect of Population Size and Bottlenecks The theoretical analysis has been undertaken for large populations but much of the experimental data comes from livestock, laboratory animals and humans, all of which have experienced bottlenecks of reduced effective population size. As has been much explored, bottlenecks of population size are likely to change the proportion of variation that is additive, and for example to increase levels of V A for recessives at low frequency [53] and to ‘convert’ epistatic into additive variation [54],[55],[56],[57],[58], thereby increasing the ratio V A/V G. For example, for the additive × additive two locus model, the ratio of variances at inbreeding level F in terms of values at F = 0 is V A(F)/V G(F) = (V A+4FV AA)/(V A+V AA+3FV AA) for any gene frequency (using results of [54], but for loci with dominance or dominance interactions, V A(F)/V G(F) depends on gene frequency. This occurs because the bottleneck leads to the dispersal of gene frequencies and the reduction in mean heterozygosity, so for the AA model, if frequencies are initially intermediate (e.g. 0.5) there is a substantial increase in V A/V G, whereas if frequencies initially follow the ‘U’ distribution, there is little V AA initially, total variance falls and the level of dispersion and V A/V G do not increase appreciably. Indeed, for a population that starts with the gene frequency distribution U-shaped, the loss of heterozygosity is due to fixation. Among the genes that remain segregating the distribution of gene frequencies flattens considerably, and in the absence of new mutation approaches the uniform distribution which has a lower ratio of V A/V G than the ‘U’ distribution. However, despite this, V AA declines faster than V A because, as loci become fixed, the number of pairs of segregating loci declines faster than the number of segregating loci. Thus it is not obvious what effect the bottlenecks in livestock, laboratory or human populations have had on the ratio V A/V G. We suspect it has not been large because, if a large reduction in heterozygosity had occurred, these populations would show low genetic variance and there is no indication that this is the case. In any case, the results show that the conclusion that most genetic variance is additive is fairly robust to assumptions about the distribution of gene frequencies, for instance the ‘U’ and uniform distributions both lead to qualitatively the same conclusion. Evidence for the Effect of Gene Frequency on Variance Components A test of the hypothesis that the lack of non-additive variance observed in populations of humans or animals is because gene frequencies near 0.5 are much less common than those more extreme, not because non-additive effects are absent, is to compare variance components among populations with different gene frequency profiles. For crops such as maize and for laboratory animals, estimates can be got both from outbreds and from populations with gene frequencies of one-half derived from crosses of inbred lines. There are a limited number of possible contrasts and linkage confounds comparisons of variation in F2 and later inter se generations, however, so it is difficult to partition variation between single locus and epistatic components (e.g. [17] ch. 7). The most extensive data are on yield traits in maize. The magnitudes of heritability and of dominance relative to additive variance estimated for different kinds of populations in a substantial number of studies (including 24 on F2 and 27 on open-pollinated, i.e. outbreds) have been summarised [59]. Average estimates of h 2 were 0.19 for open-pollinated populations, 0.23 for synthetics from recombination of many lines, 0.24 for F2 populations, 0.13 for variety crosses and 0.14 for composites. Estimates of V A/V G (from tabulated values of V D/V A [59]) were 0.57, 0.55, 0.50, 0.42 and 0.43, respectively, which are inconclusive but indicate relatively more dominance variance at frequencies of 0.5. Analyses of the magnitude of epistasis at the level of effects, rather than variance, do not provide consistent patterns. For example, in two recent analyses of substantial data sets of F2 populations of maize, one found substantial epistasis [60] and the other almost none [61]. In an analysis of a range of traits in recombinant inbred lines, F2 and triple test crosses [62] in Arabidopsis thaliana, there was substantial additive genetic and dominance variance for all traits, with most estimates of V D/V A in the range 0.3 to 0.5, essentially no significant additive × additive epistatic effects, but several cases of epistasis involving dominance [63]. Although there does appear to be more dominance variance in populations with gene frequencies of one-half than with dispersed frequencies, from these results we cannot reject or accept the hypothesis that there is relatively much more epistatic variance in such populations. One explanation is indeed that there is not a vast amount of epistatic variance in populations at whatever frequency, although another is that maize has unusually small amounts of epistasis. Many additive QTL were identified in an analysis of a line derived from the F2 of highly divergent high and low oil content lines from the long term Illinois maize selection experiment, but with almost no evidence of epistasis or indeed dominance effects [64]. In contrast, an F2 of divergent lines of long-term selected poultry and an F2 from inbred lines of mice showed evidence of highly epistatic QTL effects for body weight [65],[66]. We do not claim to understand these different results, but as has been pointed out [67],[68], QTL with significant epistatic interaction effects might not represent the majority of QTL with small effects contributing to gene networks. Conclusions and Consequences We have summarised empirical evidence for the existence of non-additive genetic variation across a range of species, including that presented here from twin data in humans, and shown that most genetic variance appears to be additive genetic. There are two primary explanations, first that there is indeed little real dominant or epistatic gene action, or second that it is mainly because allele frequencies are distributed towards extreme values, as for example in the neutral mutation model. Complete or partial dominance of genes is common, at least for those of large effect; and epistatic gene action has been reported in some QTL experiments [8],[69]. Detailed analyses in Drosophila melanogaster, using molecular and genetic tools available for it, identify substantial amounts of epistasis, including behavioural traits [70] and abdominal bristle number [71], yet most genetic variation in segregating populations for bristle number appears to be additive (as noted above). But many QTL studies of epistatic gene action suffer from a high degree of multiple testing, increasingly so the more loci and orders of interaction are included, such that they may be exaggerating the amount of epistasis reported. On the assumption that many of the effects are indeed real, we have turned our attention to the second explanation. The theoretical models we have investigated predict high proportions of additive genetic variance even in the presence of non-additive gene action, basically because most alleles are likely to be at extreme frequencies. If the spectrum of allele frequencies is independent of which are the dominant or epistatic alleles, V A/V G is large for almost any pattern of dominance and epistasis because V A/V G is low only at allele frequencies where V G is low, and so contributes little to the total VG . The distribution of allele frequencies is expected to be independent of which are the dominant or epistatic alleles for neutral polymorphisms; but under natural selection the favourable allele is expected to be common and lead to high or low V A/V G depending on whether it is dominant (low V A) or recessive (high V A). The equivalent case for epistasis is that all genotype combinations except one is favourable (low V A) vs. only one genotype combination is favourable (high V A). If genetic variation in traits associated with fitness is due almost entirely to low frequency, deleterious recessive genes which are unresponsive to natural selection, these traits would show low V A/V G. However, neither the empirical evidence nor the theory supports this expectation. There seems to be substantial additive genetic variance for fitness associated traits [21] and fitness itself [30],[31],[72]. Although heritabilities for such traits may be low, they show high additive genetic coefficient of variation (evolvability) [29], and the correlation of repeat records is typically little higher than the heritability (e.g., litter size in pigs), indicating that V A/V G is one-half or more. In agreement with this, when the life history of deleterious, recessive mutants was modelled, V A/V G was found to be 0.44 (Table 6), basically because rare recessives contribute so little variance, albeit most is V D, in non-inbred populations. We believe we have a plausible gene frequency model to explain the minimal amounts of non-additive genetic and particularly epistatic variance. What consequences do our findings have? For animal and plant breeding, maintaining emphasis on utilising additive variation by straightforward selection remains the best strategy. For gene mapping, our results imply that V A is important so we should be able to detect and identify alleles with a significant gene substitution effect within a population. Such variants have been reported from genome-wide association studies in human population [9],[10],[11],[12],[13]. Although there may well be large non-additive gene effects, the power to detect gene-gene interactions in outbred populations is a function of the proportion of variance they explain, so it will be difficult to detect such interactions unless the effects are large and the genes have intermediate frequency. Thus we expect that the success in replicating reported epistatic effects will be even lower than it is for additive or dominance effects, both because multi-locus interactions will be estimated less accurately than main effects and because they explain a lower proportion of the variance. Finally, if epistatic effects are real, gene substitution effects may vary widely between populations which differ in allele frequency, so that significant effects in one population may not replicate in others.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            Deciphering the Palimpsest: Studying the Relationship Between Morphological Integration and Phenotypic Covariation.

            Organisms represent a complex arrangement of anatomical structures and individuated parts that must maintain functional associations through development. This integration of variation between functionally related body parts and the modular organization of development are fundamental determinants of their evolvability. This is because integration results in the expression of coordinated variation that can create preferred directions for evolutionary change, while modularity enables variation in a group of traits or regions to accumulate without deleterious effects on other aspects of the organism. Using our own work on both model systems (e.g., lab mice, avians) and natural populations of rodents and primates, we explore in this paper the relationship between patterns of phenotypic covariation and the developmental determinants of integration that those patterns are assumed to reflect. We show that integration cannot be reliably studied through phenotypic covariance patterns alone and argue that the relationship between phenotypic covariation and integration is obscured in two ways. One is the superimposition of multiple determinants of covariance in complex systems and the other is the dependence of covariation structure on variances in covariance-generating processes. As a consequence, we argue that the direct study of the developmental determinants of integration in model systems is necessary to fully interpret patterns of covariation in natural populations, to link covariation patterns to the processes that generate them, and to understand their significance for evolutionary explanation.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              A Genome-Wide Association Study Identifies Five Loci Influencing Facial Morphology in Europeans

              Introduction The morphogenesis and patterning of the face is one of the most complex events in mammalian embryogenesis. Signaling cascades initiated from both facial and neighboring tissues mediate transcriptional networks that act to direct fundamental cellular processes such as migration, proliferation, differentiation and controlled cell death. The complexity of human facial development is reflected in the high incidence of congenital craniofacial anomalies, and almost certainly underlies the vast spectrum of subtle variation that characterizes facial appearance in the human population. Facial appearance has a strong genetic component; monozygotic (MZ) twins look more similar than dizygotic (DZ) twins or unrelated individuals. The heritability of craniofacial morphology is as high as 0.8 in twins and families [1], [2], [3]. Some craniofacial traits, such as facial height and position of the lower jaw, appear to be more heritable than others [1], [2], [3]. The general morphology of craniofacial bones is largely genetically determined and partly attributable to environmental factors [4]–[11]. Although genes have been mapped for various rare craniofacial syndromes largely inherited in Mendelian form [12], the genetic basis of normal variation in human facial shape is still poorly understood. An appreciation of the genetic basis of facial shape variation has far reaching implications for understanding the etiology of facial pathologies, the origin of major sensory organ systems, and even the evolution of vertebrates [13], [14]. In addition, it is feasible to speculate that once the majority of genetic determinants of facial morphology are understood, predicting facial appearance from DNA found at a crime scene will become useful as investigative tool in forensic case work [15]. Some externally visible human characteristics, such as eye color [16]–[18] and hair color [19], can already be inferred from a DNA sample with practically useful accuracies. In a recent candidate gene study carried out in two independent European population samples, we investigated a potential association between risk alleles for non-syndromic cleft lip with or without cleft palate (NSCL/P) and nose width and facial width in the normal population [20]. Two NSCL/P associated single nucleotide polymorphisms (SNPs) showed association with different facial phenotypes in different populations. However, facial landmarks derived from 3-Dimensional (3D) magnetic resonance images (MRI) in one population and 2-Dimensional (2D) portrait images in the other population were not completely comparable, posing a challenge for combining phenotype data. In the present study, we focus on the MRI-based approach for capturing facial morphology since previous facial imaging studies by some of us have demonstrated that MRI-derived soft tissue landmarks represent a reliable data source [21], [22]. In geometric morphometrics, there are different ways to deal with the confounders of position and orientation of the landmark configurations, such as (1) superimposition [23], [24] that places the landmarks into a consensus reference frame; (2) deformation [25]–[27], where shape differences are described in terms of deformation fields of one object onto another; and (3) linear distances [28], [29], where Euclidean distances between landmarks instead of their coordinates are measured. Rationality and efficacy of these approaches have been reviewed and compared elsewhere [30]–[32]. We briefly compared these methods in the context of our genome-wide association study (GWAS) (see Methods section) and applied them when appropriate. We extracted facial landmarks from 3D head MRI in 5,388 individuals of European origin from Netherlands, Australia, and Germany, and used partial Procrustes superimposition (PS) [24], [30], [33] to superimpose different sets of facial landmarks onto a consensus 3D Euclidean space. We derived 48 facial shape features from the superimposed landmarks and estimated their heritability in 79 MZ and 90 DZ Australian twin pairs. Subsequently, we conducted a series of GWAS separately for these facial shape dimensions, and attempted to replicate the identified associations in 568 Canadians of European (French) ancestry with similar 3D head MRI phenotypes and additionally sought supporting evidence in further 1,530 individuals from the UK and 2,337 from Australia for whom facial phenotypes were derived from 2D portrait images. Results Characteristics of the study cohorts from The Netherlands (RS1, RS2), Australia (QTIMS, BLTS), Germany (SHIP, SHIP-TREND), Canada (SYS) and the United Kingdom (TwinsUK) are provided in Table 1. All participants included in this study were of European ancestry. Facial landmarks in the discovery cohorts RS1, RS2, QTIMS, SHIP, and SHIP-TREND were derived from directly comparable 3D head MRIs analyzed using the very same method (Figure 1). Similar 3D MRIs were available in SYS but the phenotyping method used here was slightly different (see Method). For the BLTS and TwinsUK cohorts, facial landmarks were derived from 2D portrait photos. The SYS (mean age 15 years), QTIMS (mean age 23 years), and BLTS (mean age 23 years) cohorts were on average much younger than other cohorts considered (mean age over 50 years, Table 1). The majority of the TwinsUK cohort was female (95.5%). 10.1371/journal.pgen.1002932.g001 Figure 1 Nine facial landmarks extracted via image registration tools from 3D MRIs. An MRI of one of the authors (MK) is used for illustration. A, with the landmark for left zygion (ZygL) highlighted, where a clipping plane was used to uncover the bone; B, with the landmarks for left (EyeL) and right pupils (EyeR) highlighted, where a clipping plane was used to uncover the vitreous humor; C, with the four nasal landmarks highlighted, including the left alare, nasion (Nsn), pronasale (Prn), and subnasale (Sbn). 10.1371/journal.pgen.1002932.t001 Table 1 Characteristics of the study subjects (N = 9,823). Cohort Country Individual For Image N Male% Age ± sd RS1 Netherlands Unrelated discovery 3D head MRI 2,470 46.4 59.7 ± 8.0 RS2 Netherlands Unrelated discovery 3D head MRI 745 43.1 59.0 ± 7.9 QTIMS Australia Twins discovery 3D head MRI 545 39.6 23.7 ± 2.3 SHIP Germany Unrelated discovery 3D head MRI 797 47.3 46.0 ± 12.8 SHIP-TREND Germany Unrelated discovery 3D head MRI 831 44.8 50.4 ± 13.6 SYS Canada Siblings replication 3D head MRI 568 48.1 15.1 ± 1.9 TwinsUK UK Twins replication 2D portrait photo 1,530 9.5 58.4 ± 12.9 BLTS Australia Twins replication 2D portrait photo 2,337 47.8 23.6 ± 4.6 For 3D MRI based phenotyping, we focused on the nine most well-defined landmarks from the upper part of the face, including Zygion Right (ZygR), Zygion Left (ZygL), Eyeball Right (EyeR), Eyeball Left (EyeL), Alare Right (AlrR), Alare Left (AlrL), Nasion (Nsn), Pronasale (Prn), and Subnasale (Sbn) (Figure 1). The lower part of the face i.e., from underneath the nose further down was not available due to brain-focussed MRI scanning. Raw landmark coordinates from 5,388 individuals in the five discovery cohorts (RS1, RS2, QTIMS, SHIP, and SHIP-TREND) showed systematic differences in position and orientation (Figure 2A). They were superimposed onto a consensus 3D Euclidean space based on partial PS (Figure 2B). A total of 27 principal components (PCs), and the centroid size parameter, were derived from the superimposed landmarks. Eleven PCs, each explaining >1% and all together explaining 96.0% of the total shape variance, were selected for further genetic association analysis (Table S1). Furthermore, we derived 36 Euclidean distances between each pair of landmarks. The partial PS had no effect on inter-landmark distances i.e., the distances remain the same after the superimposition. We derived the phenotypic correlations in discovery cohorts containing only adults or young adults. The SYS cohort was excluded from this correlation analysis because the changes through adolescence may confound the effect of age. Centroid size was highly correlated with the first PC (r = 0.96, Table S1) as well as with all 36 inter-landmark distances (mean r = 0.76; minimal r = 0.56 for Prn-Sbn; maximal r = 0.94 for ZygR-AlrL; Table S2). Inter-landmark distances were also correlated with each other (mean r = 0.56; minimal r = 0.10 between EyeL-AlrL and ZygL-EyeL; maximal r = 0.96 between AlrR-Nsn and AlrL-Nsn; Table S2). The distances between symmetric landmarks all showed the highest correlations (Table S2), consistent with general knowledge about facial symmetry. Compared with females, males on average had greater centroid size (P 3%); further investigations of less common and/or quite rare variants as for instance available from next generation sequencing data may provide a more complete figure on the genetic basis of facial morphology. In spite of these limitations we have been able to demonstrate that a phenotype as complex as human facial morphology can be successfully investigated via the GWAS approach with a moderately large sample size. Three of the five loci highlighted here map to known developmental genes with a previously demonstrated role in craniofacial patterning, one of which has been unequivocally associated with nasion position in a recent independent GWAS [34]. The remaining two loci map to or close to C5orf50 and COL17A1, neither of which have previously been implicated in facial development. The associated DNA variants may either affect neighboring genes, or alternatively identify C5orf50, and COL17A1 as potential new players in the molecular regulation of facial patterning. Overall, we have uncovered five genetic loci that contribute to normal differences in facial shape, representing a significant advance in our knowledge of the genetic determination of facial morphology. Our findings may serve as a starting point for future studies, which may test for allele specific expression of these candidate genes and re-sequence their coding regions to identify possible functional variants. Moreover, our data also highlight that the high heritability of facial shape phenotypes (as estimated here and elsewhere), similar to that of adult body height [53], involves many common DNA variants with relatively small phenotypic effects. Future GWAS on the facial phenotype should therefore employ increased sample sizes as this has helped to identify more genes for many other complex human phenotypes such as height [53] and various human diseases. Combined with the emerging advances in 3D imaging techniques, this offers the poteintial to advance our understanding of the complex molecular interactions governing both normal and pathological variations in facial shape. Materials and Methods Rotterdam Study (RS), The Netherlands The RS is an ongoing population-based prospective study including a main cohort RS-I [54] and two extensions RS-II and RS-III [55], [56], including 15,000 participants altogether, of whom 12,000 have GWAS data. Collection and purification of DNA have been described in detail previously [57]. A subset of participants were scanned on a 1.5 T General Electric MRI unit (GE Healthcare, Milwaukee, WI, USA), using an imaging protocol including a 3D T1-weighted fast RF gradient recalled acquisition in steady state with an inversion recovery prepulse. The following parameters were used: 192 slices, a resolution of 0.49×0.49×0.8 mm3 (up sampled from 0.6×0.7×0.8 mm3 using zero padding in the frequency domain), a repetition time (TR) of 13.8 ms, an echo time (TE) of 2.8 ms, an inversion time (TI) of 400 ms, and a flip angle of 20°. More details on image acquisition can be found elsewhere [58]. The Medical Ethics Committee of the Erasmus MC, University Medical Center Rotterdam, approved the study protocol, and all participants provided written informed consent. Principal components analysis of SNP microarray data was used to identify ancestry outliers. These were removed before further analyses, and the present sample is of exclusively northern/western European origin. The current study included 3,215 RS participants who had both SNP microarray data and 3D MRI. These participants were considered here as two cohorts (RS1 N = 2,470 and RS2 N = 745) as they were scanned and genotyped at different times. Brisbane Longitudinal Twin Study (BLTS) and Queensland Twin Imaging Study (QTIMS), Australia Adolescent twins and their siblings were recruited over a period of sixteen years into the BLTS at 12, 14 and 16 years, as detailed elsewhere [59] and as young adults into the QTIMS [60], [61]. The present study includes a sub-sample of 545 young adults (aged 20–30 years, M = 23.7±2.3 years; 79 MZ and 90 DZ pairs, 110 unpaired twins, and 97 singletons, from a total of 332 families) from QTIMS with 3D MRIs, and 2,137 adolescents (aged 10–22 years, M = 15.6±1.5 years; 311 MZ and 530 DZ pairs, 44 unpaired twins and 411 singletons, from a total of 1,038 families) from BLTS with 2D portrait photos. 3D T1-weighted MR images were collected at the Centre for Advanced Imaging, University of Queensland, using a 4T Bruker Medspec whole body scanner (Bruker Medical, Ettingen, Germany) [61]. 2D portrait photos were taken from a distance of 1–2 meters for identification, with no specific instruction given to smile. Those who had both 3D MRI scans and 2D photos were included in discovery GWAS and excluded from the replication analysis in 2D photos. Over 70% were digital with the remainder being scanned from high quality film. The study was approved by the Human Research Ethics Committee, Queensland Institute of Medical Research. Informed consent was obtained from all participants (or parent/guardian for those aged less than 18 years). Study of Health In Pomerania (SHIP), Germany The SHIP is a longitudinal population based cohort study assessing the prevalence and incidence of common, population relevant diseases and their risk factors with examinations at baseline (SHIP-0, 1997–2001), 5-year-follow-Up (SHIP-1, 2002–2006) and an ongoing 10-year-follow-Up (SHIP-2, 2008–2012) [62], [63]. Data collection from the baseline sample included 4,308 subjects. A new cohort targeted 5,000 participants (SHIP-TREND) has been started parallel to the SHIP-2-Follow-Up. In addition to the baseline examinations, participants of SHIP-2 and SHIP-TREND also had a whole-body MRI scan [64]. MRI examinations were performed on a 1.5T MR imager (Magnetom Avanto; Siemens Medical Systems, Erlangen, Germany). Head scans were taken with an axial ultra-fast gradient echo sequence (T1 MPRage, TE 1900.0, TR 3.4, Flip angle 15°, 1.0×1.0×1.0 mm voxel size). The present study includes 797 SHIP as well as 831 SHIP-TREND participants which had both SNP genotyping data and 3D MRI. The medical ethics committee of University of Greifswald approved the study protocol, and oral and written informed consents were obtained from each of the study participants. Saguenay Youth Study (SYS), Canada Adolescent sibpairs (12 to 18 years of age) were recruited from a French-Canadian population with a known founder effect living in the Saguenay-Lac-Saint-Jean region of Quebec, Canada in the context of the ongoing Saguenay Youth Study [65]. Local ethics committee approved the study; the parents and adolescent participants gave informed consent and assent, respectively. MRI was acquired on a Phillips 1.0-T magnet using the following parameters: 3D radio frequency spoiled gradient-echo scan with 140–160 slices, an isotropic resolution of 1 mm, TR 25 ms, TE 5 ms, and flip angle of 30°. Outlying individuals including those with putative Indigenous American admixture were excluded based on genetic outlier analysis [66]. The current study contains 568 adolescents with MRI and SNP data. St. Thomas' UK Adult Twin Registry (TwinsUK), United Kingdom The TwinsUK cohort is unselected for any disease and is representative of the general UK population [67]. All were volunteers, recruited through national media campaigns. Written informed consent was obtained from every participant. Population substructure and admixture was excluded using eigenvector analysis on SNP microarray data. The current study included 1,366 individuals with 2D portrait photos and SNP microarray data. Facial landmarks from 3D head MRI In our discovery cohorts, since the lower part of the face was not available from the MRIs, we focused on nine landmarks of the upper face (Figure 1). These included Right (ZygR) and left (ZygL) zygion: the most lateral point located on the cortex of the zygomatic arches; right (EyeR) and left (EyeL) eyeball: the middle point of the eyeball; right (AlrR) and left (AlrL) alare: the most lateral point on the surface of the ala nasi; nasion (Nsn): the skin point where the bridge of the nose meets the forehead; pronasale (Prn): the most anterior tip of the nose; subnasale (Sbn): the point where the base of the nasal septum meets the philtrum. Although these landmarks provide only a very sparse representation of the facial shape, they cover most prominent facial features and are easy to interpret and compare to other studies [22], [34], [51]. Furthermore, these landmarks could be measured with higher accuracy from images than most semilandmarks [22]. The coordinates of these nine landmarks were derived with an automated technique as described previously [20], which uses image registration to transfer predefined landmarks from a limited set of annotated images to an unmarked image. The manual annotation was based on landmark definitions from the anthropology literature [68], which were adapted for application to T1-weighted MR images of the head. None of the MRI showed distortions in a visual inspection. Furthermore, the automatically localized landmark positions were robust against the number of samples included. The test-retest correlations based on a subset of 40 subjects from QTIMS who were scanned twice were high (r>0.99). In the SYS cohort, in total 56 facial landmarks were available from a previous quantitative analysis of craniofacial morphology using 3D MRI [22]. In brief, an average MRI was constructed using non-rigid image registration. The surface of this average image represents the mean facial features and was then annotated with 56 landmarks and semi-landmarks. These landmarks were then warped using the nonlinear transformation that maps each subject to the average. This allows for automatic identification of the different craniofacial landmarks. Facial landmarks from 2D portrait photos We defined eight landmarks in 2D portrait photos that approximately correspond to the respective landmarks ascertained from our 3D MRIs. These include EyeL, EyeR, Prn, AlrL, AlrR, Nsn, earlobe left (EarL), and earlobe right (EarR) (Figure 5A). Note the Sbn, ZygL and ZygR landmarks available in 3D MRIs could not derived in 2D photos. We developed an algorithm to locate these landmarks in 2D portrait images and implemented it in an in-house C++ program. Briefly, the algorithm first recognizes the face layout within an image by matching a face template. It then recognizes eyes, nose, and ears by matching corresponding templates. The template matching routines were based on external open source computer vision library, OpenCV 2.3.1 (http://sourceforge.net/projects/opencvlibrary/). The automatically identified landmarks were then manually adjusted by 5 research assistants on a standard computer screen. Facial shape phenotypes We used un-scaled PS, or partial PS [24], [69], to superimpose the landmarks from the 3D MRIs in the discovery cohorts onto a consensus 3D Euclidean space. Unlike full PS, partial PS only re-positions and re-orientates but does not rescale the landmark configurations; thus, it has no effect on the Euclidean distances between landmarks as measured in terms of millimeters from MRIs. Keeping the absolute inter-landmark distances allows us to interpret the association results more directly. Furthermore, the full PS has been criticized for introducing artificial correlations between landmarks [70]. We considered the centroid size as a measurement of face size, and it was significantly correlated with absolute head volume (r = 0.95). We derived 11 principal components (PCs) from the superimposed landmarks, each explaining at least 1% of the total phenotypic variance. We also derived 36 Euclidean distances between all pairs of landmarks. Thus 48 phenotypes were included in our GWAS, including centroid size, 11 PCs, and 36 inter-landmark distances. All phenotypes were approximately normally distributed and outliers (>3sd) were removed. Deformation approaches including the use of transformational grids provide an alternate way to study shape difference. Thin plate splines (TPS) [27] depicts the deformation geometrically, where the total deformation is decomposed into several orthogonal components to localize and illustrate the shape differences. We used TPS to illustrate the facial shape differences between males and females using the tpsgrid function in R library shapes. For 3D MRI data in SYS, we used the 56 landmarks derived in a previous study and calculated 1,540 Euclidean distances between all pairs of landmarks. These distances were considered as phenotypes in our replication analysis of GWAS findings. We also chose a subset of nine landmarks most closely resembled those used for the current study for exact replication. Since the size of the face vary substantially between 2D portrait images, we used the full PS [24] to also remove the scaling differences between landmark configurations. Note that the inter-landmark distances from 2D photos do not represent the absolute distances in terms of millimeters regardless of whether full or partial PS was used. After superimposition, we calculated 28 Euclidean distances between all pairs of the 8 landmarks, which were considered as phenotypes in the replication analysis. The PS analyses were performed with CRAN package shapes developed by Ian Dryden [30]. Heritability analysis By clarifying which facial features are under strong genetic control, we should be better able to identify specific genes that influence facial variation. Heritability estimates are also important indicators of the phenotype quality. Using QTIMS (79 MZ pairs, 90 DZ pairs) heritability analysis was carried out in Mx [71] using full information maximum likelihood estimation of additive genetic variance (i.e. heritability), common environmental variance, and unique environmental variance. Sex and age were included as covariates. Phenotypic correlations were estimated in BLTS (311 MZ pairs, 90 DZ pairs) and in TwinsUK (93 MZ pairs, 352 DZ pairs) where the facial shape phenotypes were derived from 2D photos. Genotyping, imputation, quality control Details of SNP microarray genotyping, quality control and genotype imputation are described in prior GWAS conducted in RS [16], QTIMS and BLTS [72], SHIP [62], SYS [73], and TwinsUK [74]. In brief, DNA samples from the RS, BTNS, SYS and TwinsUK cohorts were genotyped using the Human 500–610 Quad Arrays of Illumina and samples from SHIP were genotyped using the Genome-Wide Human SNP Array 6.0 of Affymetrix and HumanOmni2.5 of Illumina, respectively. Genotyping of the SHIP-TREND probands (n = 986) was performed using the Illumina HumanOmni2.5-Quad, which has not been reported previously and described here as follows. DNA from whole blood was prepared using the Gentra Puregene Blood Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. Purity and concentration of DNA was determined using a NanoDrop ND-1000 UV-Vis Spectrophotometer (Thermo Scientific). The integrity of all DNA preparations was validated by electrophoresis using 0.8% agarose-1x TBE gels stained with ethidium bromide. Subsequent sample processing and array hybridization was performed as described by the manufacturer (Illumina) at the Helmholtz Zentrum München. Genotypes were called with the GenCall algorithm of GenomeStudio Genotyping Module v1.0. Arrays with a call rate below 94%, duplicate samples as identified by estimated IBD as well as individuals with reported and genotyped gender mismatch were excluded. The final sample call rate was 99.51%. Imputation of genotypes in the SHIP-TREND cohort was performed with the software IMPUTE v2.1.2.3 against the HapMap II (CEU v22, Build 36) reference panel. 667,024 SNPs were excluded before imputation (HWE p-value≤0.0001, call rate ≤0.95, monomorphic SNPs) and 366 SNPs were removed after imputation due to duplicate RSID but different positions. The total number of SNPs after imputation and quality control was 3,437,411. The genetic data analysis workflow was created using the Software InforSense. Genetic data were stored using the database Caché (InterSystems). After SNP imputation to the HapMap Phase II CEU reference panel (Build 36) and quality control, 2,558,979 autosomal SNPs were common in all discovery cohorts and used for analyses. GWAS and replication We conducted discovery phase GWAS in a combined set of all discovery cohorts (RS1, RS2, QTIM, SHIP, SHIP-Trend) for 48 facial shape phenotypes. Imputed GWAS data in all discovery cohorts were merged according to the positive strand. We tested 2,558,979 autosomal SNPs with linear regression (adjusted for sex, age, EIGENSTRAT-derived ancestry informative covariates [75], plus any additional ancestry informative covariates as appropriate) in GenABEL [76]. The centroid size was adjusted in the analysis of inter-landmark distances. SNPs with MAF 3% (results not shown). All SNPs with P values<5×10−8 in our discovery phase GWAS were sought for replication in SYS, TwinsUK, and BLTS. Promising SNPs were tested for association with 1,054 inter-landmark distances in SYS and 28 inter-landmark distances in a combined sample of TwinsUK and BLTS assuming additive allelic effect adjusted for sex and age using MERLIN [79], which also takes into account family relationships. We report the association results for the same phenotypes as discovered in GWAS as exact replication. In addition, for each SNP we report the percentage of significantly (P<0.05) associated phenotypes, which is expected to be lower than 5% under the null hypothesis of no association. For the analysis of 11 NSCL/P associated SNPs in our discovery cohorts, we additionally Bonferroni corrected the P values for 48 correlated phenotypes since no specific facial phenotypes were selected for replication. Supporting Information Figure S1 Thin plate spline deformation illustrating facial shape differences in males compared to females in discovery cohorts (N = 5388). The pixel information obtained from the mean shape of males was mapped to that of females. The deformed images illustrate the difference between the mean shpae of males (the curved plates) compared to that of females (imaginary flat plates). A. a 3D view of the mean facial shape of all individuals in the discovery cohorts before deformation; B. side projection of the deformed grid; C. front projection of the deformed grid; D. up-down projection of the deformed grid. (TIF) Click here for additional data file. Table S1 Correlation between PC and Euclidean distances in discovery cohorts. The color shade is independent between columns. (XLSX) Click here for additional data file. Table S2 Correlation matrix between pair-wise Euclidean distances and size in discovery cohorts. (XLSX) Click here for additional data file. Table S3 Effect of sex and age on facial shape in discovery cohorts. Distances are presented in millimeters. P values were adjusted for centroid size. (XLSX) Click here for additional data file. Table S4 Heritability of facial shape phenotypes derived from 3D MRI in QTIMS. Twin correlations and proportions of variance due to additive genetic (A), common environmental (C), and unique environmental (E) influences, shown with 95% confidence intervals (age and sex adjusted). (XLSX) Click here for additional data file. Table S5 SNPs (n = 102) associated (P<5e-7) with facial shape phenotypes in discovery phase GWAS. Trait, the phenotype for which the minimal P value was obtained. MinP, the minimal P value. (XLSX) Click here for additional data file. Table S6 Raw genotype and respective phenotype data for all SNPs that revealed genome-wide significant and suggestive evidence. (XLSX) Click here for additional data file.
                Bookmark

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Genet
                PLoS Genet
                plos
                plosgen
                PLoS Genetics
                Public Library of Science (San Francisco, USA )
                1553-7390
                1553-7404
                November 2014
                6 November 2014
                : 10
                : 11
                : e1004724
                Affiliations
                [1 ]Department of Cell Biology and Anatomy and the Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
                [2 ]Department of Mathematics, Florida State University, Tallahassee, Florida, United States of America
                [3 ]Department of Orthopedics, University of California, San Francisco, San Francisco, California, United States of America
                [4 ]Department of Pediatrics and Human Medical Genetics and Genomics Program, University of Colorado School of Medicine, Denver, Colorado, United States of America
                Seattle Children's Research Institute, United States of America
                Author notes

                The authors have declared that no competing interests exist.

                Wrote the paper: BH RS WM RSM. Other: BH WM RS.

                Article
                PGENETICS-D-14-00945
                10.1371/journal.pgen.1004724
                4222688
                25375250
                a481167a-57c8-47ba-9bb5-704f1111d17d
                Copyright @ 2014

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Page count
                Pages: 3
                Funding
                The authors received support for research related to the topic of this perspective from NSERC Grant #238992-12 to BH, NIH 1R01DE021708 to BH and RM and NIH 1U01DE020054 to RS. The funders had no role in the preparation of the manuscript.
                Categories
                Formal Comment

                Genetics
                Genetics

                Comments

                Comment on this article