      The Gaggle: An open-source software system for integrating bioinformatics software and data sources

      product-review


          Abstract

          Background

          Systems biologists work with many kinds of data, from many different sources, using a variety of software tools. Each of these tools typically excels at one type of analysis, such as the analysis of microarrays, metabolic networks or predicted protein structures. A crucial challenge is to combine the capabilities of these (and other forthcoming) data resources and tools to create a data exploration and analysis environment that does justice to the variety and complexity of systems biology data sets. A solution to this problem should recognize that data types, formats and software in this high-throughput age of biology are constantly changing.

          Results

          In this paper we describe the Gaggle, a simple, open-source Java software environment that helps to solve the problem of software and database integration. Guided by the classic software engineering strategy of separation of concerns and a policy of semantic flexibility, it integrates existing popular programs and web resources into a user-friendly, easily extended environment.

          We demonstrate that four simple data types (names, matrices, networks, and associative arrays) are sufficient to bring together diverse databases and software. We highlight some capabilities of the Gaggle with an exploration of Helicobacter pylori pathogenesis genes, in which we identify a putative ricin-like protein, a discovery made possible by simultaneous data exploration using a wide range of publicly available data and a variety of popular bioinformatics software tools.
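
To make the four data types more concrete, here is a minimal sketch in Python. It is an illustration only, not the Gaggle's Java API: the class names (`NameList`, `Matrix`, `Network`), their fields, and the placeholder gene names are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class NameList:
    """A list of identifiers (e.g. gene names) tagged with a species."""
    species: str
    names: List[str]


@dataclass
class Matrix:
    """A numeric table with named rows and columns (e.g. expression data)."""
    row_names: List[str]
    column_names: List[str]
    values: List[List[float]]

    def row(self, name: str) -> Dict[str, float]:
        """Return one row as a column-name -> value mapping."""
        index = self.row_names.index(name)
        return dict(zip(self.column_names, self.values[index]))


@dataclass
class Network:
    """A set of typed edges: (source, interaction type, target)."""
    edges: List[Tuple[str, str, str]]

    def nodes(self) -> Set[str]:
        """Return every node that appears at either end of an edge."""
        found: Set[str] = set()
        for source, _interaction, target in self.edges:
            found.update((source, target))
        return found


# The fourth type, an associative array, maps directly onto a plain dict.
annotation: Dict[str, str] = {"geneA": "hypothetical annotation text"}

# Placeholder data, invented purely for illustration.
matrix = Matrix(["geneA", "geneB"], ["cond1", "cond2"], [[1.0, 2.0], [3.0, 4.0]])
network = Network([("geneA", "interacts-with", "geneB")])
selection = NameList("Helicobacter pylori", sorted(network.nodes()))
```

In this sketch the names act as the common key: a selection made in a network can be looked up in a matrix or an associative array without any tool needing to know how the others store their data.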

          Conclusion

          We have integrated diverse databases (for example, KEGG, BioCyc, STRING) and software (Cytoscape, DataMatrixViewer, R statistical environment, and TIGR Microarray Expression Viewer). Through this loose coupling of diverse software and databases the Gaggle enables simultaneous exploration of experimental data (mRNA and protein abundance, protein-protein and protein-DNA interactions), functional associations (operon, chromosomal proximity, phylogenetic pattern), metabolic pathways (KEGG) and PubMed abstracts (STRING web resource), creating an exploratory environment useful to 'web browser and spreadsheet biologists', to statistically savvy computational biologists, and to those in between. The Gaggle uses Java RMI and Java Web Start technologies and can be found at http://gaggle.systemsbiology.net.
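
The loose coupling described above can be sketched as a small message hub. This is a simplified, in-process Python illustration under stated assumptions: the `Hub` class, its `register` and `broadcast` methods, and the tool names are hypothetical, and plain function calls stand in for the Java RMI communication that the Gaggle itself uses.

```python
from typing import Callable, Dict, List

# A handler is whatever a tool does when it receives a list of names.
NameHandler = Callable[[List[str]], None]


class Hub:
    """Hypothetical relay: tools know the hub, never each other."""

    def __init__(self) -> None:
        self._tools: Dict[str, NameHandler] = {}

    def register(self, tool_name: str, handler: NameHandler) -> None:
        """Add a tool; adding one requires no change to existing tools."""
        self._tools[tool_name] = handler

    def broadcast(self, sender: str, names: List[str]) -> List[str]:
        """Pass names to every tool except the sender; return recipients."""
        recipients = []
        for tool_name, handler in self._tools.items():
            if tool_name != sender:
                handler(list(names))  # copy so tools cannot affect each other
                recipients.append(tool_name)
        return recipients


# Usage: two stand-in tools that simply record what they receive.
received: Dict[str, List[str]] = {}
hub = Hub()
hub.register("network_viewer", lambda names: received.update(network_viewer=names))
hub.register("matrix_viewer", lambda names: received.update(matrix_viewer=names))

# A selection made in the network viewer reaches the matrix viewer only.
sent_to = hub.broadcast("network_viewer", ["geneA", "geneB"])
```

Because each tool depends only on the hub and on the shared data types, a new program or database can be added by registering one more handler.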


                Author and article information

                Journal
                BMC Bioinformatics
                BioMed Central (London)
                1471-2105
                2006
                28 March 2006
                Volume: 7
                Article number: 176
                Affiliations
                [1] Institute for Systems Biology, 1441 N 34th Street, Seattle, WA 98103, USA
                [2] Department of Biology, New York University, 100 Washington Square E, New York, NY 10003, USA
                Article
                Publisher ID: 1471-2105-7-176
                DOI: 10.1186/1471-2105-7-176
                PMCID: PMC1464137
                PMID: 16569235
                9888a678-d606-4632-bdd3-87eee3f53fb5
                Copyright © 2006 Shannon et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 27 October 2005
                : 28 March 2006
                Categories
                Software

                Bioinformatics & Computational biology
