<p class="first" id="d360977e183">The genetic analysis of complex traits does
not escape the current excitement around
artificial intelligence, including a renewed interest in “deep learning” (DL) techniques
such as Multilayer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs). However,
the performance of DL for genomic prediction of complex human traits has not been
comprehensively tested. To provide an evaluation of MLPs and CNNs, we used data from
distantly related white Caucasian individuals (
<i>n</i> ∼100k individuals,
<i>m</i> ∼500k SNPs, and
<i>k</i> = 1000) of the interim release of the UK Biobank. We analyzed a total of
five phenotypes:
height, bone heel mineral density, body mass index, systolic blood pressure, and waist–hip
ratio, with genomic heritabilities ranging from ∼0.20 to 0.70. After hyperparameter
optimization using a genetic algorithm, we considered several configurations, from
shallow to deep learners, and compared the predictive performance of MLPs and CNNs
with that of Bayesian linear regressions across sets of SNPs (from 10k to 50k) that
were preselected using single-marker regression analyses. For height, a highly heritable
phenotype, all methods performed similarly, although CNNs were slightly but consistently
worse. For the rest of the phenotypes, the performance of some CNNs was comparable
to or slightly better than that of linear methods. Performance of MLPs was highly dependent on
SNP set and phenotype. In all, over the range of traits evaluated in this study, CNN
performance was competitive with linear models, but we did not find any case where DL
outperformed the linear model by a sizable margin. We suggest that more research is
needed to adapt CNN methodology, originally motivated by image analysis, to genetic-based
problems in order for CNNs to be competitive with linear models.
</p>
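<p>The preselection-then-prediction workflow described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotypes are simulated rather than taken from the UK Biobank, ridge regression stands in for the Bayesian linear regressions, and all function names, sample sizes, and the penalty <code>lam</code> are assumptions chosen for illustration. SNPs are ranked by single-marker regression on the training set only, and the selected set is then used to fit a linear predictor that is evaluated on held-out individuals.</p>

```python
import numpy as np

def single_marker_scores(X, y):
    """Squared correlation between each SNP column and the phenotype.

    In a one-SNP regression of y on a marker, ranking by r^2 gives the
    same order as ranking by the single-marker p-value.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = (Xc.T @ yc) ** 2
    den = (Xc ** 2).sum(axis=0) * (yc ** 2).sum()
    # Monomorphic SNPs have zero variance; give them a score of 0.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

def preselect_snps(X, y, n_keep):
    """Indices of the n_keep SNPs with the strongest marginal association."""
    scores = single_marker_scores(X, y)
    return np.argsort(scores)[::-1][:n_keep]

def ridge_fit(X, y, lam):
    """Ridge regression on centered data (stand-in for a Bayesian linear model)."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc = X - mu_x
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    beta = np.linalg.solve(A, Xc.T @ (y - mu_y))
    return mu_x, mu_y, beta

def ridge_predict(model, X):
    mu_x, mu_y, beta = model
    return mu_y + (X - mu_x) @ beta

def simulate(n=1000, m=1000, n_causal=20, h2=0.5, seed=0):
    """Toy additive trait: genotypes coded 0/1/2, heritability h2 (hypothetical values)."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.5, size=m)
    X = rng.binomial(2, freqs, size=(n, m)).astype(float)
    causal = rng.choice(m, size=n_causal, replace=False)
    g = X[:, causal] @ rng.normal(size=n_causal)
    g = (g - g.mean()) / g.std()
    e = rng.normal(scale=np.sqrt((1 - h2) / h2), size=n)
    return X, g + e

X, y = simulate()
train, test = np.arange(0, 800), np.arange(800, 1000)

# Preselection uses training individuals only, to avoid leaking test data.
keep = preselect_snps(X[train], y[train], n_keep=100)
model = ridge_fit(X[train][:, keep], y[train], lam=100.0)
y_hat = ridge_predict(model, X[test][:, keep])
r = np.corrcoef(y_hat, y[test])[0, 1]
print(f"kept {keep.size} SNPs, test-set correlation = {r:.2f}")
```

<p>Running the preselection inside the training split matters: ranking SNPs on the full sample before splitting would inflate the apparent accuracy of any downstream model, linear or deep.</p>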