
      Eleven grand challenges in single-cell data science

      review-article
      Genome Biology
      BioMed Central


          Abstract

          The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands—or even millions—of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.

          Most cited references (221)


          mixOmics: An R package for ‘omics feature selection and multiple data integration

          The advent of high throughput technologies has led to a wealth of publicly available ‘omics data coming from different sources, such as transcriptomics, proteomics, metabolomics. Combining such large-scale biological data sets can lead to the discovery of important biological insights, provided that relevant information can be extracted in a holistic manner. Current statistical approaches have been focusing on identifying small subsets of molecules (a ‘molecular signature’) to explain or predict biological conditions, but mainly for a single type of ‘omics. In addition, commonly used methods are univariate and consider each biological feature independently. We introduce mixOmics, an R package dedicated to the multivariate analysis of biological data sets with a specific focus on data exploration, dimension reduction and visualisation. By adopting a systems biology approach, the toolkit provides a wide range of methods that statistically integrate several data sets at once to probe relationships between heterogeneous ‘omics data sets. Our recent methods extend Projection to Latent Structure (PLS) models for discriminant analysis, for data integration across multiple ‘omics data or across independent studies, and for the identification of molecular signatures. We illustrate our latest mixOmics integrative frameworks for the multivariate analyses of ‘omics data available from the package.

            Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo

            High-throughput mapping of cellular differentiation hierarchies from single-cell data promises to empower systematic interrogations of vertebrate development and disease. Here, we applied single-cell RNA sequencing to >92,000 cells from zebrafish embryos during the first day of development. Using a graph-based approach, we mapped a cell state landscape that describes axis patterning, germ layer formation, and organogenesis. We tested how clonally related cells traverse this landscape by developing a transposon-based barcoding approach ("TracerSeq") for reconstructing single-cell lineage histories. Clonally related cells were often restricted by the state landscape, including a case in which two independent lineages converge on similar fates. Cell fates remained restricted to this landscape in chordin-deficient embryos. We provide web-based resources for further analysis of the single-cell data.

              Bias, robustness and scalability in single-cell differential expression analysis

              Many methods have been used to determine differential gene expression from single-cell RNA (scRNA)-seq data. We evaluated 36 approaches using experimental and synthetic data and found considerable differences in the number and characteristics of the genes that are called differentially expressed. Prefiltering of lowly expressed genes has important effects, particularly for some of the methods developed for bulk RNA-seq data analysis. However, we found that bulk RNA-seq analysis methods do not generally perform worse than those developed specifically for scRNA-seq. We also present conquer, a repository of consistently processed, analysis-ready public scRNA-seq data sets that is aimed at simplifying method evaluation and reanalysis of published results. Each data set provides abundance estimates for both genes and transcripts, as well as quality control and exploratory analysis reports.

                Author and article information

                Contributors
                mark.robinson@imls.uzh.ch
                as@cwi.nl
                Journal
                Genome Biology (Genome Biol)
                BioMed Central (London)
                ISSN: 1474-7596, 1474-760X
                Published: 7 February 2020
                Volume: 21
                Article number: 31
                Affiliations
                [1] GRID grid.5718.b, ISNI 0000 0001 2187 5445, Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
                [2] Department of Paediatric Oncology, Haematology and Immunology, Medical Faculty, Heinrich Heine University, University Hospital, Düsseldorf, Germany
                [3] GRID grid.7490.a, ISNI 0000 0001 2238 295X, Computational Biology of Infection Research Group, Helmholtz Centre for Infection Research, Braunschweig, Germany
                [4] GRID grid.38142.3c, ISNI 000000041936754X, Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
                [5] GRID grid.12847.38, ISNI 0000 0004 1937 1290, Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
                [6] GRID grid.1073.5, ISNI 0000 0004 0626 201X, Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, Fitzroy, Australia
                [7] GRID grid.1008.9, ISNI 0000 0001 2179 088X, Melbourne Integrative Genomics, School of BioSciences–School of Mathematics & Statistics, Faculty of Science, University of Melbourne, Melbourne, Australia
                [8] GRID grid.21107.35, ISNI 0000 0001 2171 9311, Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
                [9] GRID grid.7400.3, ISNI 0000 0004 1937 0650, Institute of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zürich, Zürich, Switzerland
                [10] MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, UK
                [11] The Alan Turing Institute, British Library, London, UK
                [12] GRID grid.17091.3e, ISNI 0000 0001 2288 9830, Department of Statistics, University of British Columbia, Vancouver, Canada
                [13] GRID grid.248762.d, ISNI 0000 0001 0702 3000, Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
                [14] GRID grid.17091.3e, ISNI 0000 0001 2288 9830, Data Science Institute, University of British Columbia, Vancouver, Canada
                [15] GRID grid.5801.c, ISNI 0000 0001 2156 2780, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
                [16] GRID grid.419765.8, ISNI 0000 0001 2223 3006, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
                [17] GRID grid.10419.3d, ISNI 0000000089452978, Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
                [18] GRID grid.5292.c, ISNI 0000 0001 2097 4740, Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
                [19] GRID grid.32224.35, ISNI 0000 0004 0386 9924, Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital Research Institute, Charlestown, USA
                [20] GRID grid.38142.3c, ISNI 000000041936754X, Department of Pathology, Harvard Medical School, Boston, USA
                [21] GRID grid.66859.34, Broad Institute of Harvard and MIT, Cambridge, MA, USA
                [22] GRID grid.256304.6, ISNI 0000 0004 1936 7400, Department of Computer Science, Georgia State University, Atlanta, USA
                [23] GRID grid.424699.4, ISNI 0000 0001 2275 2842, Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
                [24] GRID grid.7892.4, ISNI 0000 0001 0075 5874, Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
                [25] GRID grid.7722.0, ISNI 0000 0001 1811 6966, Institute for Research in Biomedicine, The Barcelona Institute of Science and Technology, Barcelona, Spain
                [26] GRID grid.17091.3e, ISNI 0000 0001 2288 9830, Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
                [27] GRID grid.6054.7, ISNI 0000 0004 0369 4183, Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
                [28] GRID grid.5477.1, ISNI 0000000120346234, Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, The Netherlands
                [29] GRID grid.7692.a, ISNI 0000000090126352, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
                [30] GRID grid.499559.d, Oncode Institute, Utrecht, The Netherlands
                [31] GRID grid.419927.0, ISNI 0000 0000 9471 3191, Quantitative biology, Hubrecht Institute, Utrecht, The Netherlands
                [32] GRID grid.7177.6, ISNI 0000000084992262, Institute for Advanced Study, University of Amsterdam, Amsterdam, The Netherlands
                [33] GRID grid.7445.2, ISNI 0000 0001 2113 8111, Department of Surgery and Cancer, The Imperial Centre for Translational and Experimental Medicine, Imperial College London, London, UK
                [34] GRID grid.10417.33, ISNI 0000 0004 0444 9382, Centre for Molecular and Biomolecular Informatics, Radboud University Medical Center, Nijmegen, The Netherlands
                [35] European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
                [36] GRID grid.4818.5, ISNI 0000 0001 0791 5666, Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
                [37] GRID grid.4818.5, ISNI 0000 0001 0791 5666, Biometris, Wageningen University & Research, Wageningen, The Netherlands
                [38] GRID grid.10419.3d, ISNI 0000000089452978, Department of Immunohematology and Blood Transfusion, Leiden University Medical Center, Leiden, The Netherlands
                [39] GRID grid.10419.3d, ISNI 0000000089452978, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
                [40] GRID grid.4709.a, ISNI 0000 0004 0495 846X, Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
                [41] GRID grid.5292.c, ISNI 0000 0001 2097 4740, PRB lab, Delft University of Technology, Delft, The Netherlands
                [42] GRID grid.10419.3d, ISNI 0000000089452978, Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
                [43] GRID grid.63054.34, ISNI 0000 0001 0860 4915, Computer Science & Engineering Department, University of Connecticut, Storrs, USA
                [44] GRID grid.470869.4, ISNI 0000 0004 0634 2060, Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
                [45] GRID grid.10306.34, ISNI 0000 0004 0606 5382, Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
                [46] GRID grid.225360.0, ISNI 0000 0000 9709 7726, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
                [47] GRID grid.11749.3a, ISNI 0000 0001 2167 7588, Center for Bioinformatics, Saarland University, Saarbrücken, Germany
                [48] GRID grid.419528.3, ISNI 0000 0004 0491 9823, Max Planck Institute for Informatics, Saarbrücken, Germany
                [49] Institute of Pathology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
                [50] GRID grid.425649.8, ISNI 0000 0001 1010 926X, Computation molecular design, Zuse Institute Berlin, Berlin, Germany
                [51] Mathematics Department, Mount Saint Vincent, New York, USA
                [52] GRID grid.498164.6, Helmholtz Institute for RNA-based Infection Research, Helmholtz-Center for Infection Research, Würzburg, Germany
                [53] GRID grid.7497.d, ISNI 0000 0004 0492 0584, Division of Computational Genomics and Systems Genetics, German Cancer Research Center–DKFZ, Heidelberg, Germany
                [54] GRID grid.4567.0, ISNI 0000 0004 0483 2525, Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
                [55] GRID grid.5132.5, ISNI 0000 0001 2312 1970, Division of Drug Discovery and Safety, Leiden Academic Center for Drug Research–LACDR–Leiden University, Leiden, The Netherlands
                [56] GRID grid.256304.6, ISNI 0000 0004 1936 7400, Department of Computer Science, Georgia State University, Atlanta, USA
                [57] GRID grid.448878.f, ISNI 0000 0001 2288 8774, The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
                [58] GRID grid.16750.35, ISNI 0000 0001 2097 5006, Department of Computer Science, Princeton University, Princeton, USA
                [59] GRID grid.51462.34, ISNI 0000 0001 2171 9952, Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA
                Author information
                http://orcid.org/0000-0002-3048-5518
                Article
                1926
                DOI: 10.1186/s13059-020-1926-6
                PMCID: PMC7007675
                PMID: 32033589
                © The Author(s) 2020

                Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 2 August 2019
                Accepted: 2 January 2020
                Categories
                Review

                Genetics
