      Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011

      BMC Bioinformatics

      BioMed Central

      BioNLP Shared Task 2011

      23-24 June 2011

          Abstract

          We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks extend the event extraction model introduced in the BioNLP Shared Task 2009 (ST'09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. In contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general by separating challenges relating to part-of relations into a subproblem that can be addressed by independent systems. Seven groups participated in each of the two main tasks and four groups in the supporting task. The participating systems indicated advances in the capability of event extraction methods and demonstrated generalization in many aspects: from abstracts to full texts, from previously considered subdomains to new ones, and from the ST'09 extraction targets to other entities and events. The highest performance achieved in the supporting task REL, 58% F-score, is broadly comparable with levels reported for other relation extraction tasks. For the ID task, the highest-performing system achieved 56% F-score, comparable to the state-of-the-art performance at the established ST'09 task. In the EPI task, the best result was 53% F-score for the full set of extraction targets and 69% F-score for a reduced set of core extraction targets, approaching a level of performance sufficient for user-facing applications. In this study, we extend previously reported results and perform further analyses of the outputs of the participating systems. We place specific emphasis on aspects of system performance relating to real-world applicability, considering alternative evaluation metrics and performing additional manual analysis of system outputs. We further demonstrate that the strengths of extraction systems can be combined to improve on the performance achieved by any system in isolation. The manually annotated corpora, supporting resources, and evaluation tools for all tasks are available from http://www.bionlp-st.org and the tasks continue as open challenges for all interested parties.
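
          The abstract reports results as F-scores without stating the formula. The sketch below assumes the usual balanced F-score (F1), the harmonic mean of precision and recall; the function name and the counts in the usage note are hypothetical and are not taken from the task evaluation tools or results.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score (F1) from true positive, false positive and false negative counts.

    Assumption: the "F-score" in the abstract is the balanced F1 measure.
    """
    if tp == 0:
        # No correct predictions: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # fraction of predicted events that are correct
    recall = tp / (tp + fn)     # fraction of gold-standard events that were found
    return 2 * precision * recall / (precision + recall)
```

          With hypothetical counts, `f_score(50, 25, 50)` corresponds to a precision of about 0.67 and a recall of 0.50, giving an F-score of roughly 0.57.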

                Author and article information

                Journal: BMC Bioinformatics (BioMed Central), ISSN 1471-2105
                Published: 26 June 2012
                Volume 13, Suppl 11, S2
                Affiliations
                [1] School of Computer Science, University of Manchester, Manchester, UK
                [2] National Centre for Text Mining, University of Manchester, Manchester, UK
                [3] Department of Computer Science, University of Tokyo, Tokyo, Japan
                [4] Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, USA
                [5] Microsoft Research Asia, Beijing, China
                Article identifiers
                Publisher ID: 1471-2105-13-S11-S2
                DOI: 10.1186/1471-2105-13-S11-S2
                PMC ID: 3384257
                PMID: 22759456
                Copyright ©2012 Pyysalo et al.; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                BioNLP Shared Task 2011
                Portland, OR, USA
                23-24 June 2011
                Categories
                Proceedings

                Bioinformatics & Computational biology
