
      MedLinker: Medical Entity Linking with Neural Representations and Dictionary Matching



          Abstract

          Progress in the field of Natural Language Processing (NLP) has been closely followed by applications in the medical domain. Recent advancements in Neural Language Models (NLMs) have transformed the field and are currently motivating numerous works exploring their application in different domains. In this paper, we explore how NLMs can be used for Medical Entity Linking with the recently introduced MedMentions dataset, which presents two major challenges: (1) a large target ontology of over 2M concepts, and (2) low overlap between concepts in train, validation and test sets. We introduce a solution, MedLinker, that addresses these issues by leveraging specialized NLMs with Approximate Dictionary Matching, and show that it performs competitively on semantic type linking, while improving the state-of-the-art on the more fine-grained task of concept linking (+4 F1 on MedMentions main task).
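The approximate dictionary matching that MedLinker combines with neural representations can be illustrated with a minimal sketch: mentions are linked to a concept dictionary by character trigram overlap rather than exact string match, so inflected or slightly varied surface forms still resolve. The toy dictionary, CUIs, similarity measure, and threshold below are illustrative assumptions, not MedLinker's actual configuration.

```python
def char_ngrams(text, n=3):
    """Character n-grams of a lowercased string, with boundary padding."""
    s = f"##{text.lower()}##"
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def link_mention(mention, dictionary, threshold=0.5, n=3):
    """Return (concept_id, score) for the best approximate match, or None."""
    grams = char_ngrams(mention, n)
    best = None
    for surface, cui in dictionary.items():
        score = jaccard(grams, char_ngrams(surface, n))
        if score >= threshold and (best is None or score > best[1]):
            best = (cui, score)
    return best

# Toy UMLS-style dictionary: surface form -> concept ID (illustrative CUIs).
toy_dict = {
    "myocardial infarction": "C0027051",
    "diabetes mellitus": "C0011849",
    "hypertension": "C0020538",
}

# The plural variant still links to the right concept despite no exact match.
print(link_mention("myocardial infarctions", toy_dict))
```

A production system would use an indexed approximate-match structure instead of a linear scan, since the target ontology here has over 2M concepts.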


Most cited references (2)

Open Access

          BioBERT: a pre-trained biomedical language representation model for biomedical text mining

Abstract

Motivation: Biomedical text mining is becoming increasingly important as the number of biomedical documents rapidly grows. With the progress in natural language processing (NLP), extracting valuable information from biomedical literature has gained popularity among researchers, and deep learning has boosted the development of effective biomedical text mining models. However, directly applying the advancements in NLP to biomedical text mining often yields unsatisfactory results due to a word distribution shift from general domain corpora to biomedical corpora. In this article, we investigate how the recently introduced pre-trained language model BERT can be adapted for biomedical corpora.

Results: We introduce BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), which is a domain-specific language representation model pre-trained on large-scale biomedical corpora. With almost the same architecture across tasks, BioBERT largely outperforms BERT and previous state-of-the-art models in a variety of biomedical text mining tasks when pre-trained on biomedical corpora. While BERT obtains performance comparable to that of previous state-of-the-art models, BioBERT significantly outperforms them on the following three representative biomedical text mining tasks: biomedical named entity recognition (0.62% F1 score improvement), biomedical relation extraction (2.80% F1 score improvement) and biomedical question answering (12.24% MRR improvement). Our analysis results show that pre-training BERT on biomedical corpora helps it to understand complex biomedical texts.

Availability and implementation: We make the pre-trained weights of BioBERT freely available at https://github.com/naver/biobert-pretrained, and the source code for fine-tuning BioBERT available at https://github.com/dmis-lab/biobert.
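The "word distribution shift" the BioBERT authors describe shows up concretely at the subword level: under a general-domain vocabulary, a biomedical term shatters into many fragments, while a domain vocabulary keeps it whole. A minimal WordPiece-style greedy tokenizer makes this visible; the toy vocabularies below are invented for illustration and BERT's real tokenizer differs in details.

```python
def wordpiece(word, vocab):
    """Greedy longest-match-first subword tokenization (WordPiece-style)."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:
            sub = word[start:end]
            cand = sub if start == 0 else "##" + sub  # continuation prefix
            if cand in vocab:
                piece = cand
                break
            end -= 1
        if piece is None:          # no subword covers this position
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces

# Toy general-domain vocabulary: the medical term only exists as fragments.
general_vocab = {"th", "##rom", "##bo", "##cy", "##to", "##pen", "##ia"}
# Toy biomedical vocabulary: the term is a single known token.
bio_vocab = general_vocab | {"thrombocytopenia"}

print(wordpiece("thrombocytopenia", general_vocab))  # many fragments
print(wordpiece("thrombocytopenia", bio_vocab))      # one token
```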

            TaggerOne: joint named entity recognition and normalization with semi-Markov Models.

            Text mining is increasingly used to manage the accelerating pace of the biomedical literature. Many text mining applications depend on accurate named entity recognition (NER) and normalization (grounding). While high performing machine learning methods trainable for many entity types exist for NER, normalization methods are usually specialized to a single entity type. NER and normalization systems are also typically used in a serial pipeline, causing cascading errors and limiting the ability of the NER system to directly exploit the lexical information provided by the normalization.
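A semi-Markov decoder scores multi-token segments directly rather than tagging tokens one at a time, which is what lets recognition and normalization happen in a single joint step instead of a cascading pipeline. A minimal dynamic-programming sketch follows; the lexicon, segment scores, and the restriction of non-entity segments to single tokens are illustrative assumptions, not TaggerOne's actual learned model.

```python
def semi_markov_decode(tokens, lexicon, max_len=4):
    """Viterbi over segmentations: each segment is either a lexicon entity
    (jointly recognized and normalized to its concept ID) or an O token."""
    n = len(tokens)
    best = [(-float("inf"), [])] * (n + 1)  # best[i]: (score, segments) for tokens[:i]
    best[0] = (0.0, [])
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            span = " ".join(tokens[j:i]).lower()
            if span in lexicon:
                # Entity segment: reward it and attach its concept ID.
                score, seg = best[j][0] + 2.0, best[j][1] + [(span, lexicon[span])]
            elif i - j == 1:
                # Non-entity segments are single O-labeled tokens.
                score, seg = best[j][0], best[j][1] + [(span, "O")]
            else:
                continue
            if score > best[i][0]:
                best[i] = (score, seg)
    return best[n][1]

# Illustrative lexicon mapping surface forms to concept IDs.
lexicon = {"heart attack": "C0027051", "aspirin": "C0004057"}
tokens = "patient had a heart attack treated with aspirin".split()
print(semi_markov_decode(tokens, lexicon))
```

Because the multi-token span "heart attack" is scored as one unit, its normalization evidence directly informs the segmentation decision, which is exactly the coupling a serial NER-then-normalize pipeline loses.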

              Author and article information

              Contributors
              joemon.jose@glasgow.ac.uk
              emine.yilmaz@ucl.ac.uk
              jm.magalhaes@fct.unl.pt
              pablo.castells@uam.es
              ferro@dei.unipd.it
              mjs@inesc-id.pt
              flaviomartins@acm.org
              dloureiro@fc.up.pt
              amjorge@fc.up.pt
Journal
Advances in Information Retrieval
42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14–17, 2020, Proceedings, Part II
ISBN: 978-3-030-45441-8, 978-3-030-45442-5
DOI: 10.1007/978-3-030-45442-5
Published: 24 March 2020
Volume: 12036
Pages: 230–237
Affiliations
[8] GRID grid.8756.c, ISNI 0000 0001 2193 314X, University of Glasgow, Glasgow, UK
[9] GRID grid.83440.3b, ISNI 0000000121901201, University College London, London, UK
[10] GRID grid.10772.33, ISNI 0000000121511713, Universidade NOVA de Lisboa, Lisbon, Portugal
[11] GRID grid.5515.4, ISNI 0000000119578126, Universidad Autónoma de Madrid, Madrid, Spain
[12] GRID grid.5608.b, ISNI 0000 0004 1757 3470, University of Padua, Padua, Italy
[13] GRID grid.9983.b, ISNI 0000 0001 2181 4263, Universidade de Lisboa, Lisbon, Portugal
[14] GRID grid.10772.33, ISNI 0000000121511713, Universidade NOVA de Lisboa, Lisbon, Portugal
LIAAD - INESC TEC, Porto, Portugal
Article
Chapter: 29
DOI: 10.1007/978-3-030-45442-5_29
PMCID: PMC7148021
              © Springer Nature Switzerland AG 2020

              This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.


Keywords: entity linking, bioinformatics, neural language models
