An ensemble of neural models for nested adverse drug events and medication extraction with subwords

Abstract

Objective

This article describes an ensemble system, developed for the 2018 n2c2 Shared Task Track 2, that automatically extracts adverse drug events and drug-related entities from clinical narratives.

Materials and Methods

We designed a neural model, based on MIMIC III discharge summaries, to tackle both nested entities (entities embedded in other entities) and polysemous entities (entities annotated with multiple semantic types). To better represent rare and unknown words in entities, we further tokenized the MIMIC III data set by splitting the words into finer-grained subwords. We then combined all the models to boost performance. Additionally, we implemented a feature-based conditional random field model and created an ensemble to combine its predictions with those of the neural model.
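As a rough illustration of how predictions from several models can be combined when entities may be nested or carry more than one type, the sketch below represents each prediction as a (start, end, label) span and keeps the spans that a majority of models agree on. The abstract does not state the combination rule, so majority voting is an assumption here, and the function name `majority_vote`, the character offsets, and the example labels are all hypothetical.

```python
from collections import Counter

def majority_vote(predictions, min_votes=None):
    """Combine span predictions from several models by voting.

    predictions: list of sets of (start, end, label) tuples, one set per model.
    min_votes:   votes needed to keep a span; defaults to a strict majority.
    Voting rule is an assumption, not taken from the article.
    """
    if min_votes is None:
        min_votes = len(predictions) // 2 + 1
    # Count how many models predicted each exact (start, end, label) span.
    counts = Counter(span for spans in predictions for span in set(spans))
    return {span for span, votes in counts.items() if votes >= min_votes}

# Hypothetical outputs of three models for one sentence.
# (0, 4) lies inside (0, 10): a nested entity.
# (0, 10) carries two labels in model_c: a polysemous entity.
model_a = {(0, 4, "Drug"), (0, 10, "ADE")}
model_b = {(0, 4, "Drug"), (12, 20, "Dosage")}
model_c = {(0, 4, "Drug"), (0, 10, "ADE"), (0, 10, "Reason")}

ensemble = majority_vote([model_a, model_b, model_c])
```

Because each span is voted on independently, overlapping spans and multiple labels on the same span can survive together, which a single tag-per-token sequence would not allow.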

Results

Our method achieved a lenient micro F1-score of 92.78%, with a lenient precision of 95.99% and a lenient recall of 89.79%. Experimental results showed that combining the predictions of either multiple models or a single model with different settings can improve performance.
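The reported scores can be sanity-checked with the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below recomputes F1 from the rounded precision and recall given above; the helper name `f1_score` is only illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# Rounded lenient precision and recall from the Results paragraph, in percent.
f1 = f1_score(95.99, 89.79)
print(f"{f1:.2f}")
```

The recomputed value is about 92.79, within 0.01 of the reported 92.78%. A gap of that size is plausibly due to the published precision and recall being rounded, since a micro-averaged score would normally be computed from raw counts.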

Discussion

Analysis of the development set showed that our neural models can detect more informative text regions than feature-based conditional random field models. Furthermore, most entity types significantly benefit from subword representation, which also allows us to extract sparse entities, especially nested entities.

Conclusion

The overall results have demonstrated that the ensemble method can accurately recognize entities, including nested and polysemous entities. Additionally, our method can recognize sparse entities by reconsidering the clinical narratives at a finer-grained subword level, rather than at the word level.

Author and article information

Journal

Journal ID (nlm-ta): J Am Med Inform Assoc

Journal ID (iso-abbrev): J Am Med Inform Assoc

Journal ID (publisher-id): jamia

Title: Journal of the American Medical Informatics Association: JAMIA

Publisher: Oxford University Press

ISSN (Print): 1067-5027

ISSN (Electronic): 1527-974X

Publication date Collection: January 2020

Publication date (Electronic): 14 June 2019

Publication date PMC-release: 14 June 2019

Volume: 27

Issue: 1

Pages: 22-30

Affiliations

[1] National Centre for Text Mining, School of Computer Science, The University of Manchester, Manchester, UK

[2] Toyota Technological Institute, Nagoya, Japan

[3] Artificial Intelligence Research Centre (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan

Author notes

Corresponding Author: Sophia Ananiadou, PhD, National Centre for Text Mining, School of Computer Science, The University of Manchester, Manchester Interdisciplinary Biocentre, 131 Princess Street, Manchester M1 7DN, UK (sophia.ananiadou@manchester.ac.uk)

Article

Publisher ID: ocz075

DOI: 10.1093/jamia/ocz075

PMC ID: 6913208

PubMed ID: 31197355

SO-VID: ee12cbff-58c6-481f-861d-5fa05cca45f6

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 30 January 2019

Date revision received: 22 March 2019

Date accepted: 07 May 2019

Page count

Pages: 9

Funding

Funded by: EMPATHY

Award ID: BB/M006891/1

Funded by: MMPathIC

Award ID: MR/N00583X/1
