      Is Open Access

      Recent advances in Swedish and Spanish medical entity recognition in clinical texts using deep neural approaches

      research-article
      BMC Medical Informatics and Decision Making
      BioMed Central
      2018 International Workshop on Biomedical and Health Informatics (BHI) (BHI 2018)
      3-6 December 2018
      Clinical text mining, Unstructured electronic health records, Medical named entity recognition, Recurrent neural network


          Abstract

          Background

          Text mining and natural language processing of clinical text, such as notes from electronic health records, require specific consideration of the specialized characteristics of these texts. Deep learning methods could potentially mitigate domain-specific challenges such as limited access to in-domain tools and data sets.

          Methods

          A bi-directional Long Short-Term Memory network is applied to clinical notes in Spanish and Swedish for the task of medical named entity recognition. Several types of embeddings, generated from both in-domain and out-of-domain text corpora, and a number of generation and combination strategies for embeddings have been evaluated in order to investigate different input representations and the influence of domain on the final results.
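          As an illustration of what combining embeddings from different corpora can look like, the following is a minimal sketch that concatenates an in-domain and an out-of-domain vector for each token. Concatenation with a zero-vector fallback is an assumption chosen for illustration; the abstract does not say which combination strategies were evaluated. The function name `combine_embeddings`, the toy tables, and the dimensions are all hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def combine_embeddings(
    token: str,
    in_domain: Dict[str, Vector],
    out_domain: Dict[str, Vector],
    dim_in: int,
    dim_out: int,
) -> Vector:
    """Concatenate the in-domain and out-of-domain vectors for one token.

    A token missing from a table falls back to a zero vector of that
    table's dimension (an illustrative choice, not taken from the article).
    """
    v_in = in_domain.get(token, [0.0] * dim_in)
    v_out = out_domain.get(token, [0.0] * dim_out)
    return v_in + v_out


# Toy lookup tables (hypothetical values, not from the article).
in_domain = {"fever": [0.9, 0.1], "aspirin": [0.2, 0.8]}
out_domain = {"fever": [0.5, 0.4, 0.1], "patient": [0.3, 0.3, 0.3]}

sentence = ["patient", "fever", "aspirin"]
features = [combine_embeddings(t, in_domain, out_domain, 2, 3) for t in sentence]

for token, vec in zip(sentence, features):
    print(token, vec)
```

          Each token ends up with a 5-dimensional input vector, which a sequence tagger such as a bi-directional LSTM could then consume one token at a time.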

          Results

          For Spanish, a micro-averaged F1-score of 75.25 was obtained and for Swedish, the corresponding score was 76.04. The best results for both languages were achieved using embeddings generated from in-domain corpora extracted from electronic health records, but embeddings generated from related domains were also found to be beneficial.
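          For readers unfamiliar with the metric, the sketch below shows how a micro-averaged F1-score is computed: true positives, false positives and false negatives are summed over all entity types before precision and recall are calculated. The entity labels and counts are hypothetical toy values, not data from the article, and `micro_f1` is an illustrative helper name.

```python
from typing import Dict


def micro_f1(counts: Dict[str, Dict[str, int]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all entity types first."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-entity-type counts (toy values for illustration only).
counts = {
    "Disease": {"tp": 8, "fp": 2, "fn": 2},
    "Drug": {"tp": 2, "fp": 0, "fn": 3},
}

score = micro_f1(counts)
print(f"micro-averaged F1: {100 * score:.2f}")
```

          Because the counts are pooled, frequent entity types weigh more heavily in the micro average than rare ones.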

          Conclusions

          A recurrent neural network with in-domain embeddings improved medical named entity recognition compared to shallow learning methods, showing this combination to be suitable for entity recognition in clinical text for both languages.


                Author and article information

                Contributors
                rebeckaw@dsv.su.se
                alicia.perez@ehu.eus
                arantza.casillas@ehu.eus
                maite.oronoz@ehu.eus
                Conference
                BMC Med Inform Decis Mak
                BMC Medical Informatics and Decision Making
                BioMed Central (London)
                1472-6947
                23 December 2019
                2019
                Volume: 19
                Issue: Suppl 7
                Issue sponsor: Publication of this supplement has not been supported by sponsorship. Information about the source of funding for publication charges can be found in the individual articles. The articles have undergone the journal's standard peer review process for supplements. The Supplement Editors declare that they have no competing interests.
                Article number: 274
                Affiliations
                [1] ISNI 0000 0004 1936 9377, GRID grid.10548.38, Department of Computer and Systems Sciences, DSV, Stockholm University; Borgarfjordsgatan 12, Kista, Sweden
                [2] ISNI 0000 0001 2167 1098, GRID grid.11480.3c, IXA (UPV/EHU), University of the Basque Country; M. Lardizabal 1, Donostia, 20080 Spain
                Article
                981
                10.1186/s12911-019-0981-y
                6927099
                31865900
                © The Author(s) 2019

                Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                2018 International Workshop on Biomedical and Health Informatics (BHI)
                BHI 2018
                Madrid, Spain
                3-6 December 2018
                Categories
                Research

                Bioinformatics & Computational biology
