
      Identifying and relating biological concepts in the Catalogue of Life


      Journal of Biomedical Semantics

      BioMed Central


          Abstract

          Background

          In this paper we describe our experience of adding globally unique identifiers to the Species 2000 and ITIS Catalogue of Life, an on-line index of organisms which is intended, ultimately, to cover all the world's known species. The scientific species names held in the Catalogue are names that already play an extensive role as terms in the organisation of information about living organisms in bioinformatics and other domains, but the effectiveness of their use is hindered by variation in individuals' opinions and understanding of these terms; indeed, in some cases more than one name will have been used to refer to the same organism. This means that it is desirable to be able to give unique labels to each of these differing concepts within the catalogue and to be able to determine which concepts are being used in other systems, in order that they can be associated with the concepts in the catalogue. Not only is this needed, but it is also necessary to know the relationships between alternative concepts that scientists might have employed, as these determine what can be inferred when data associated with related concepts is being processed. A further complication is that the catalogue itself is evolving as scientific opinion changes due to an increasing understanding of life.

          Results

          We describe how we are using Life Science Identifiers (LSIDs) as globally unique identifiers in the Catalogue of Life, explaining how the mapping to species concepts is performed, how concepts are associated with specific editions of the catalogue, and how the Taxon Concept Schema has been adopted in order to express information about concepts and their relationships. We explore the implications of using globally unique identifiers in order to refer to abstract concepts such as species, which incorporate at least a measure of subjectivity in their definition, in contrast with the more traditional use of such identifiers to refer to more tangible entities, events, documents, observations, etc.
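          As a rough illustration of the identifiers discussed above, the sketch below splits an LSID into its parts, following the general LSID form `urn:lsid:authority:namespace:object[:revision]`. The example identifiers, the `example.org` authority, the `taxon` namespace and the use of the revision field to carry a catalogue edition are assumptions made for illustration only; they are not taken from the Catalogue of Life itself.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Components of a Life Science Identifier."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]  # optional final component


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not a well-formed LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not a well-formed LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"empty LSID component in: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


def same_object(a: LSID, b: LSID) -> bool:
    """True if two LSIDs differ at most in their revision component."""
    return a[:3] == b[:3]


# Hypothetical identifiers: same object, two assumed catalogue editions.
first = parse_lsid("urn:lsid:example.org:taxon:abc-123:edition2010")
second = parse_lsid("urn:lsid:example.org:taxon:abc-123:edition2011")
print(first.revision, second.revision, same_object(first, second))
```

          Under these assumptions, two identifiers that share an authority, namespace and object part but differ in revision can be recognised as belonging to different editions while still being related to one another.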

          Conclusions

          A major reason for adopting identifiers such as LSIDs is to facilitate data integration. We have demonstrated the incorporation of LSIDs into the Catalogue of Life, in a manner consistent with the biodiversity informatics community's conventions for LSID use. The Catalogue of Life is therefore available as a taxonomy of organisms for use within various disciplines, including biomedical research, by software written with an awareness of these conventions.

          Most cited references 9


          Biodiversity informatics: the challenge of linking data and the role of shared identifiers.

           Roderic Page (2008)
          A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifier linking records in different databases. However, taxonomic names have limitations as identifiers, being neither stable nor globally unique, and the pace of molecular taxonomic and phylogenetic research means that a lot of information in public sequence databases is not linked to formal taxonomic names. This review explores the use of other identifiers, such as specimen codes and GenBank accession numbers, to link otherwise disconnected facts in different databases. The structure of these links can also be exploited using the PageRank algorithm to rank the results of searches on biodiversity databases. The key to rich integration is a commitment to deploy and reuse globally unique, shared identifiers [such as Digital Object Identifiers (DOIs) and Life Science Identifiers (LSIDs)], and the implementation of services that link those identifiers.

            MIRIAM Resources: tools to generate and resolve robust cross-references in Systems Biology

            Background: The Minimal Information Requested In the Annotation of biochemical Models (MIRIAM) is a set of guidelines for the annotation and curation of computational models, intended to facilitate their exchange and reuse. An important part of the standard consists of the controlled annotation of model components, based on Uniform Resource Identifiers. To make this annotation interoperable, the community has to agree on a set of standard URIs corresponding to recognised data types. MIRIAM Resources are being developed to support the use of those URIs.

            Results: MIRIAM Resources are a set of on-line services created to catalogue data types, their URIs and the corresponding physical URLs (or resources), whether the data types are controlled vocabularies or primary data resources. MIRIAM Resources are composed of several components: MIRIAM Database stores the information, MIRIAM Web Services allow programmatic access to the database, MIRIAM Library provides access to the Web Services, and MIRIAM Web Application offers a way to browse the data and to edit or add entries.

            Conclusions: The MIRIAM Resources project allows easy access to MIRIAM URIs and the associated information, and is therefore crucial to fostering general use of MIRIAM annotations in computational models of biological processes.

              Globally distributed object identification for biological knowledgebases.

              The World-Wide Web provides a globally distributed communication framework that is essential for almost all scientific collaboration, including bioinformatics. However, several limits and inadequacies have become apparent, one of which is the inability to programmatically identify locally named objects that may be widely distributed over the network. This shortcoming limits our ability to integrate multiple knowledgebases, each of which gives partial information of a shared domain, as is commonly seen in bioinformatics. The Life Science Identifier (LSID) and LSID Resolution System (LSRS) provide simple and elegant solutions to this problem, based on the extension of existing internet technologies. LSID and LSRS are consistent with next-generation semantic web and semantic grid approaches. This article describes the syntax, operations, infrastructure compatibility considerations, use cases and potential future applications of LSID and LSRS. We see the adoption of these methods as important steps toward simpler, more elegant and more reliable integration of the world's biological knowledgebases, and as facilitating stronger global collaboration in biology.

                Author and article information

                Journal: Journal of Biomedical Semantics (J Biomed Semantics)
                Publisher: BioMed Central
                ISSN: 2041-1480
                Published: 17 October 2011
                Citation: J Biomed Semantics 2011; 2: 7
                Affiliations
                [1] Cardiff School of Computer Science & Informatics, Cardiff University, Queen's Buildings, 5 The Parade, Cardiff CF24 3AA, UK
                [2] Covalent Software Ltd, 3 Hammet Street, Taunton, Somerset TA1 1RZ, UK
                Article
                Article ID: 2041-1480-2-7
                DOI: 10.1186/2041-1480-2-7
                PMCID: 3245425
                PMID: 22004596
                Copyright ©2011 Jones et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                Categories
                Research

                Bioinformatics & Computational biology
