Overview of the First Natural Language Processing Challenge for Extracting Medication, Indication, and Adverse Drug Events from Electronic Health Record Notes (MADE 1.0)

Abstract

This work describes the MADE 1.0 corpus and provides an overview of the MADE 2018 challenge for Extracting Medication, Indication, and Adverse Drug Events from Electronic Health Record Notes. The goal of MADE is to provide a set of common evaluation tasks to assess the state of the art for NLP systems applied to electronic health records (EHRs) in support of drug safety surveillance and pharmacovigilance. We also provide benchmarks on the MADE dataset using the system submissions received in the MADE 2018 challenge. The MADE 1.0 challenge released an expert-annotated cohort of medication and adverse drug event information, comprising 1,089 fully de-identified longitudinal EHR notes from 21 randomly selected cancer patients at the University of Massachusetts Memorial Hospital. Using this cohort as a benchmark, the MADE 1.0 challenge designed three shared NLP tasks. The named entity recognition (NER) task identifies medications and their attributes (dosage, route, duration, and frequency), indications, adverse drug events (ADEs), and severity. The relation identification (RI) task identifies relations between the named entities: medication-indication, medication-ADE, and attribute relations. The third shared task (NER-RI) evaluates NLP models that perform the NER and RI tasks jointly. Eleven teams from four countries participated in at least one of the three shared tasks, and forty-one system submissions were received in total. The best systems' F-scores for NER, RI, and NER-RI are 0.82, 0.86, and 0.61, respectively. Ensemble classifiers built from the team submissions improved performance further, with F-scores of 0.85, 0.87, and 0.66 for the three tasks, respectively. The MADE results show that recent progress in NLP has led to remarkable improvements in NER and RI tasks for the clinical domain. However, there is still room for improvement, particularly in the NER-RI task.
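To make the reported F-scores concrete, the following is a minimal sketch of how a micro-averaged F-score over predicted entity spans could be computed. It is illustrative only: the abstract does not specify the matching criterion or the annotation format, so the exact-match rule, the `Span` tuple layout, the `f_score` function, and the label names used here are all assumptions rather than the challenge's official scorer.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span representation (an assumption, not the MADE file format):
# (note_id, start_offset, end_offset, entity_type)
Span = Tuple[str, int, int, str]


def f_score(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span-and-type matching."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(predicted)

    # A prediction counts as a true positive only if its note, offsets,
    # and entity type all match a gold annotation exactly.
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data with made-up offsets and illustrative label names.
gold = [
    ("note1", 0, 9, "Medication"),
    ("note1", 20, 26, "ADE"),
    ("note1", 30, 35, "Dosage"),
]
pred = [
    ("note1", 0, 9, "Medication"),
    ("note1", 20, 26, "Indication"),  # right span, wrong type: not counted
    ("note1", 30, 35, "Dosage"),
    ("note1", 40, 44, "Route"),       # spurious prediction
]

precision, recall, f1 = f_score(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under the same assumptions, relation predictions could be scored in the same way by replacing each span tuple with a tuple that identifies the two linked entities and the relation type.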