      NeXML: Rich, Extensible, and Verifiable Representation of Comparative Data and Metadata

          Abstract

          In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input–output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.
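The three core constructs named in the abstract (operational taxonomic units, trees, and cross-references between them) can be illustrated with a minimal sketch. The toy document below is hypothetical and has not been validated against the official NeXML schema; the namespace URI and the element and attribute names (`otus`, `otu`, `trees`, `tree`, `node`, `edge`) are assumed from published NeXML examples, and attributes a real document would require may be missing. The reference check at the end is only a stand-in for schema validation, which a real workflow would perform against the XML Schema Definition.

```python
import xml.etree.ElementTree as ET

# Namespace URI as recalled from published NeXML examples (an assumption).
NEXML_NS = "http://www.nexml.org/2009"
NS = {"nex": NEXML_NS}

# Hypothetical toy document: two OTUs and one tree joining them at a root.
TOY_DOC = """<nexml xmlns="http://www.nexml.org/2009" version="0.9">
  <otus id="taxa1">
    <otu id="t1" label="Homo sapiens"/>
    <otu id="t2" label="Pan troglodytes"/>
  </otus>
  <trees id="trees1" otus="taxa1">
    <tree id="tree1">
      <node id="n0" root="true"/>
      <node id="n1" otu="t1"/>
      <node id="n2" otu="t2"/>
      <edge id="e1" source="n0" target="n1" length="1.0"/>
      <edge id="e2" source="n0" target="n2" length="1.0"/>
    </tree>
  </trees>
</nexml>"""


def read_otus(root):
    """Map each OTU id to its label."""
    return {otu.get("id"): otu.get("label")
            for otu in root.findall("nex:otus/nex:otu", NS)}


def read_edges(root):
    """List (source, target, length) for every edge of every tree."""
    return [(e.get("source"), e.get("target"), float(e.get("length")))
            for e in root.findall("nex:trees/nex:tree/nex:edge", NS)]


def dangling_otu_refs(root):
    """Return ids of tree nodes whose otu attribute names no declared OTU."""
    known = set(read_otus(root))
    return [n.get("id")
            for n in root.findall("nex:trees/nex:tree/nex:node", NS)
            if n.get("otu") is not None and n.get("otu") not in known]


root = ET.fromstring(TOY_DOC)
otus = read_otus(root)
edges = read_edges(root)
dangling = dangling_otu_refs(root)
```

Because tips point at OTUs by id rather than repeating taxon names, the same OTU list could in principle be shared by a character matrix and a tree in one document, which is the kind of explicit linkage the abstract contrasts with older formats.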

                Author and article information

                Journal
                Systematic Biology (Syst. Biol.)
                Oxford University Press
                ISSN: 1063-5157 (print), 1076-836X (online)
                July 2012
                22 February 2012
                Volume: 61
                Issue: 4
                Pages: 675-689
                Affiliations
                [1] NCB Naturalis, Leiden, the Netherlands
                [2] National Evolutionary Synthesis Center, Durham, NC, USA
                [3] Department of Biology, University of North Carolina, Chapel Hill, NC, USA
                [4] Department of Biological Sciences, Wayne State University, USA
                [5] Department of Ecology and Evolutionary Biology, University of Kansas, USA
                [6] Departments of Zoology and Botany, and Beaty Biodiversity Museum, University of British Columbia, Canada
                [7] Department of Mechanical Engineering, Indian Institute of Technology Kharagpur, India
                [8] Department of Biology, University of Ottawa, Canada
                [9] Biochemical Science Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
                Author notes
                [*] Correspondence to be sent to: NCB Naturalis, Postbus 9517, 2300 RA, Leiden, the Netherlands; E-mail: rutger.vos@ncbnaturalis.nl.

                Associate Editor: Peter Foster

                Article
                Article ID: sys025
                DOI: 10.1093/sysbio/sys025
                PMCID: PMC3376374
                PMID: 22357728
                © The Author(s) 2012. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                16 May 2011
                29 July 2011
                7 February 2012
                Page count
                Pages: 15
                Categories
                Regular Articles

                Animal science & Zoology
                Keywords: interoperability, phyloinformatics, semantic web, evolutionary informatics, syntax format, data standards
