      Is Open Access

      From sequence to information

      review-article


          Abstract

          Today massive amounts of sequenced metagenomic and metatranscriptomic data from different ecological niches and environmental locations are available. Scientific progress depends critically on methods that allow extracting useful information from the various types of sequence data. Here, we will first discuss types of information contained in the various flavours of biological sequence data, and how this information can be interpreted to increase our scientific knowledge and understanding. We argue that a mechanistic understanding of biological systems analysed from different perspectives is required to consistently interpret experimental observations, and that this understanding is greatly facilitated by the generation and analysis of dynamic mathematical models. We conclude that, in order to construct mathematical models and to test mechanistic hypotheses, time-series data are of critical importance. We review diverse techniques to analyse time-series data and discuss various approaches by which time-series of biological sequence data have been successfully used to derive and test mechanistic hypotheses. Analysing the bottlenecks of current strategies in the extraction of knowledge and understanding from data, we conclude that combined experimental and theoretical efforts should be implemented as early as possible during the planning phase of individual experiments and scientific research projects.

          This article is part of the theme issue ‘Integrative research perspectives on marine conservation’.


                Author and article information

                Journal
                Philos Trans R Soc Lond B Biol Sci
                Philosophical Transactions of the Royal Society B: Biological Sciences
                The Royal Society
                0962-8436
                1471-2970
                21 December 2020
                2 November 2020
                Volume: 375
                Issue: 1814, Theme issue ‘Integrative research perspectives on marine conservation’ compiled and edited by Helmut Hillebrand, Heather Leslie and Ute Jacob
                Article number: 20190448
                Affiliations
                [1] Institute of Quantitative and Theoretical Biology, CEPLAS, Heinrich-Heine University Düsseldorf, Germany
                [2] Cluster of Excellence on Plant Sciences, CEPLAS, Heinrich-Heine University Düsseldorf, Germany
                Author information
                http://orcid.org/0000-0003-4470-0378
                http://orcid.org/0000-0002-0993-9247
                http://orcid.org/0000-0002-7229-7398
                Article
                Publisher ID: rstb20190448
                DOI: 10.1098/rstb.2019.0448
                PMCID: PMC7662195
                PMID: 33131436
                e998aa25-9e8a-41de-9585-e8b288d1d722
                © 2020 The Authors.

                Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

                History
                : 17 March 2020
                Funding
                Funded by: Deutsche Forschungsgemeinschaft, http://dx.doi.org/10.13039/501100001659;
                Award ID: EXC 2048/1, Project ID 390686111
                Funded by: Strategischer Forschungsfond der Heinrich-Heine Universität Düsseldorf;
                Award ID: SFF-F 2019/1571-1 Popa
                Categories
                Articles
                Review Article

                Philosophy of science
                data, sequence, information, entropy, genome, time-series, modelling
