      Is Open Access

      Opportunities and obstacles for deep learning in biology and medicine

      Journal of the Royal Society Interface

      The Royal Society

      deep learning, genomics, precision medicine, machine learning

          Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems—patient classification, fundamental biological processes and treatment of patients—and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.

                Author and article information

                Journal: Journal of the Royal Society Interface (J R Soc Interface)
                Publisher: The Royal Society
                Date: 4 April 2018
                Volume: 15
                Issue: 141
                [1] Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, HI, USA
                [2] Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
                [3] Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
                [4] Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
                [5] Harvard Medical School, Boston, MA, USA
                [6] Computational Biology and Stats, Target Sciences, GlaxoSmithKline, Stevenage, UK
                [7] Data Science Institute, Imperial College London, London, UK
                [8] Princess Margaret Cancer Centre, Toronto, Ontario, Canada
                [9] Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
                [10] Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
                [11] Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA
                [12] Ecological and Evolutionary Signal-processing and Informatics Laboratory, Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, USA
                [13] Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
                [14] Biophysics Program, Stanford University, Stanford, CA, USA
                [15] Department of Computer Science, Stanford University, Stanford, CA, USA
                [16] Department of Genetics, Stanford University, Stanford, CA, USA
                [17] Department of Computer Science, University of Virginia, Charlottesville, VA, USA
                [18] Imaging Platform, Broad Institute of Harvard and MIT, Cambridge, MA, USA
                [19] Toyota Technological Institute at Chicago, Chicago, IL, USA
                [20] Department of Computer Science, Trinity University, San Antonio, TX, USA
                [21] Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
                [22] Integrative Bioinformatics, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, USA
                [23] Howard Hughes Medical Institute, Janelia Research Campus, Ashburn, VA, USA
                [24] National Center for Biotechnology Information and National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
                [25] Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, FL, USA
                [26] Austin, TX, USA
                [27] Division of Biomedical Informatics and Personalized Medicine, University of Colorado School of Medicine, Aurora, CO, USA
                [28] Institute of Organic Chemistry, Westfälische Wilhelms-Universität Münster, Münster, Germany
                [29] Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC, USA
                [30] Department of Pathology and Immunology, Washington University in Saint Louis, St Louis, MO, USA
                [31] Department of Medicine, Brown University, Providence, RI, USA
                [32] Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
                [33] Morgridge Institute for Research, Madison, WI, USA
                Author notes

                Author order was determined with a randomized algorithm.

                © 2018 The Authors.

                Published by the Royal Society under the terms of the Creative Commons Attribution License, which permits unrestricted use, provided the original author and source are credited.

                Funded by: Gordon and Betty Moore Foundation
                Award ID: GBMF 4552
                Award ID: GBMF 4563
                Funded by: National Institutes of Health
                Award ID: DP2GM123485
                Award ID: P30CA051008
                Award ID: R01AI116794
                Award ID: R01GM089652
                Award ID: R01GM089753
                Award ID: R01LM012222
                Award ID: R01LM012482
                Award ID: R21CA220398
                Award ID: T32GM007753
                Award ID: T32HG000046
                Award ID: U54AI117924
                Funded by: Roy and Diana Vagelos Scholars Program in the Molecular Life Sciences
                Funded by: U.S. National Library of Medicine
                Award ID: Intramural Research Program
                Funded by: National Science Foundation
                Award ID: 1245632
                Award ID: 1531594
                Award ID: 1564955
                Funded by: Natural Sciences and Engineering Research Council of Canada
                Award ID: RGPIN-2015-3948
                Funded by: Howard Hughes Medical Institute
                Review Articles
                Headline Review

                Life sciences
