Discovering and visualizing indirect associations between biomedical concepts

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

Motivation: Discovering useful associations between biomedical concepts has been one of the main goals in biomedical text-mining, and understanding their biomedical contexts is crucial in the discovery process. Hence, we need a text-mining system that helps users explore various types of (possibly hidden) associations in an easy and comprehensible manner.

Results: This article describes FACTA+, a real-time text-mining system for finding and visualizing indirect associations between biomedical concepts from MEDLINE abstracts. The system can be used as a text search engine like PubMed with additional features to help users discover and visualize indirect associations between important biomedical concepts such as genes, diseases and chemical compounds. FACTA+ inherits all functionality from its predecessor, FACTA, and extends it by incorporating three new features: (i) detecting biomolecular events in text using a machine learning model, (ii) discovering hidden associations using co-occurrence statistics between concepts, and (iii) visualizing associations to improve the interpretability of the output. To the best of our knowledge, FACTA+ is the first real-time web application that offers the functionality of finding concepts involving biomolecular events and visualizing indirect associations of concepts with both their categories and importance.

Availability: FACTA+ is available as a web application at http://refine1-nactem.mc.man.ac.uk/facta/, and its visualizer is available at http://refine1-nactem.mc.man.ac.uk/facta-visualizer/.

Contact: tsuruoka@ 123456jaist.ac.jp

Related collections

Most cited references 66

Record: found
Abstract: found
Article: found

Is Open Access

Content-rich biological network constructed by mining PubMed abstracts

Hao Chen, Burt Sharp (2004)

Background The integration of the rapidly expanding corpus of information about the genome, transcriptome, and proteome, engendered by powerful technological advances, such as microarrays, and the availability of genomic sequence from multiple species, challenges the grasp and comprehension of the scientific community. Despite the existence of text-mining methods that identify biological relationships based on the textual co-occurrence of gene/protein terms or similarities in abstract texts, knowledge of the underlying molecular connections on a large scale, which is prerequisite to understanding novel biological processes, lags far behind the accumulation of data. While computationally efficient, the co-occurrence-based approaches fail to characterize (e.g., inhibition or stimulation, directionality) biological interactions. Programs with natural language processing (NLP) capability have been created to address these limitations, however, they are in general not readily accessible to the public. Results We present a NLP-based text-mining approach, Chilibot, which constructs content-rich relationship networks among biological concepts, genes, proteins, or drugs. Amongst its features, suggestions for new hypotheses can be generated. Lastly, we provide evidence that the connectivity of molecular networks extracted from the biological literature follows the power-law distribution, indicating scale-free topologies consistent with the results of previous experimental analyses. Conclusions Chilibot distills scientific relationships from knowledge available throughout a wide range of biological domains and presents these in a content-rich graphical format, thus integrating general biomedical knowledge with the specialized knowledge and interests of the user. Chilibot can be accessed free of charge to academic users.

0 comments Cited 96 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: found

Is Open Access

Corpus annotation for mining biomedical events from literature

Jin-Dong Kim, Tomoko Ohta, Jun'ichi Tsujii (2008)

Background Advanced Text Mining (TM) such as semantic enrichment of papers, event or relation extraction, and intelligent Question Answering have increasingly attracted attention in the bio-medical domain. For such attempts to succeed, text annotation from the biological point of view is indispensable. However, due to the complexity of the task, semantic annotation has never been tried on a large scale, apart from relatively simple term annotation. Results We have completed a new type of semantic annotation, event annotation, which is an addition to the existing annotations in the GENIA corpus. The corpus has already been annotated with POS (Parts of Speech), syntactic trees, terms, etc. The new annotation was made on half of the GENIA corpus, consisting of 1,000 Medline abstracts. It contains 9,372 sentences in which 36,114 events are identified. The major challenges during event annotation were (1) to design a scheme of annotation which meets specific requirements of text annotation, (2) to achieve biology-oriented annotation which reflect biologists' interpretation of text, and (3) to ensure the homogeneity of annotation quality across annotators. To meet these challenges, we introduced new concepts such as Single-facet Annotation and Semantic Typing, which have collectively contributed to successful completion of a large scale annotation. Conclusion The resulting event-annotated corpus is the largest and one of the best in quality among similar annotation efforts. We expect it to become a valuable resource for NLP (Natural Language Processing)-based TM in the bio-medical domain.

0 comments Cited 87 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: not found
Article: not found

Fish oil, Raynaud's syndrome, and undiscovered public knowledge.

K Swanson (1985)

0 comments Cited 87 times – based on 0 reviews      Review now

Bookmark

All references

Author and article information

Journal

Journal ID (nlm-ta): Bioinformatics

Journal ID (publisher-id): bioinformatics

Journal ID (hwp): bioinfo

Title: Bioinformatics

Publisher: Oxford University Press

ISSN (Print): 1367-4803

ISSN (Electronic): 1367-4811

Publication date (Print): 1 July 2011

Publication date (Electronic): 14 June 2011

Publication date PMC-release: 14 June 2011

Volume: 27

Issue: 13

Pages: i111-i119

Affiliations

¹School of Information Science, Japan Advanced Institute of Science and Technology (JAIST), Nomi, ²Department of Computer Science, The University of Tokyo, Tokyo, Japan, ³The National Centre for Text Mining (NaCTeM), ⁴School of Computer Science, The University of Manchester, Manchester, UK and ⁵Microsoft Research Asia, Beijing, China

Author notes

* To whom correspondence should be addressed.

Article

Publisher ID: btr214

DOI: 10.1093/bioinformatics/btr214

PMC ID: 3117364

PubMed ID: 21685059

SO-VID: 9ddc7aa4-e2fe-482a-a4b1-1fae1ddae7c2

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Discovering and visualizing indirect associations between biomedical concepts

Read this article at

Abstract

Related collections

Genetoberfest

Most cited references 66

Content-rich biological network constructed by mining PubMed abstracts

Corpus annotation for mining biomedical events from literature

Fish oil, Raynaud's syndrome, and undiscovered public knowledge.

Author and article information

Journal

Affiliations

Author notes

Article

History

Page count

Categories

Comments

Comment on this article

Similar content 111

Cited by 32

Most referenced authors 359