      Entity recognition from clinical texts via recurrent neural network

      research-article
      BMC Medical Informatics and Decision Making
      BioMed Central
      The International Conference on Intelligent Biology and Medicine (ICIBM) 2016 (ICIBM 2016)
      08-10 December 2016
      Entity recognition, Recurrent neural network, Clinical notes, Deep learning, Sequence labeling


          Abstract

          Background

          Entity recognition is one of the most fundamental steps in text analysis and has long attracted considerable attention from researchers. In the clinical domain, clinical texts contain many types of entities, such as clinical entities and protected health information (PHI). Recognizing these entities has become a hot topic in clinical natural language processing (NLP), and over the past few years many traditional machine learning methods, such as support vector machines and conditional random fields, have been applied to the task. More recently, the recurrent neural network (RNN), a deep learning method that has shown great potential on many problems including named entity recognition, has also gradually been adopted for entity recognition from clinical texts.

          Methods

          In this paper, we comprehensively investigate the performance of LSTM (long short-term memory), a representative variant of RNN, on clinical entity recognition and protected health information recognition. The LSTM model consists of three layers: an input layer, which generates a representation of each word in a sentence; an LSTM layer, which outputs another sequence of word representations that captures the context of each word in the sentence; and an inference layer, which makes tagging decisions based on the output of the LSTM layer, that is, outputs a label sequence.
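
The three-layer structure described above can be illustrated with a minimal forward-pass sketch in pure Python. This is not the authors' implementation. The abstract does not say whether the LSTM is unidirectional or bidirectional, how words are represented, or how the inference layer decides, so the sketch assumes a unidirectional LSTM without bias terms, a plain embedding lookup, and an independent per-token argmax. The class name `TinyLSTMTagger`, the vocabulary, and the BIO-style label set are hypothetical, and the weights are random and untrained, so the predicted tags are arbitrary.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMTagger:
    """Hypothetical three-layer tagger: input -> LSTM -> inference."""

    def __init__(self, vocab, labels, emb_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.vocab = {w: i for i, w in enumerate(vocab)}
        self.labels = list(labels)
        self.hidden_dim = hidden_dim
        # Input layer: one embedding per known word, plus one for unknown words.
        self.emb = rand_matrix(len(vocab) + 1, emb_dim, rng)
        # LSTM layer: one weight matrix per gate, applied to [x_t ; h_{t-1}].
        # Keys: i = input gate, f = forget gate, o = output gate, c = candidate.
        self.gates = {g: rand_matrix(hidden_dim, emb_dim + hidden_dim, rng)
                      for g in "ifoc"}
        # Inference layer (assumption): a linear scorer over the hidden state.
        self.out = rand_matrix(len(self.labels), hidden_dim, rng)

    def embed(self, words):
        unk = len(self.vocab)
        return [self.emb[self.vocab.get(w.lower(), unk)] for w in words]

    def lstm(self, xs):
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state
        outputs = []
        for x in xs:
            z = x + h  # concatenate current input and previous hidden state
            i = [sigmoid(v) for v in matvec(self.gates["i"], z)]
            f = [sigmoid(v) for v in matvec(self.gates["f"], z)]
            o = [sigmoid(v) for v in matvec(self.gates["o"], z)]
            g = [math.tanh(v) for v in matvec(self.gates["c"], z)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            outputs.append(h)
        return outputs

    def tag(self, words):
        tags = []
        for h in self.lstm(self.embed(words)):
            scores = matvec(self.out, h)
            tags.append(self.labels[scores.index(max(scores))])
        return tags


tagger = TinyLSTMTagger(
    vocab=["patient", "denies", "chest", "pain"],
    labels=["O", "B-problem", "I-problem"],
)
words = "Patient denies chest pain".split()
tags = tagger.tag(words)  # one label per word; arbitrary until trained
```

A per-token argmax is the simplest possible inference layer; a layer that scores whole label sequences jointly would be a drop-in replacement for `tag` without changing the two layers below it.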

          Results

          Experiments conducted on the corpora of the 2010, 2012 and 2014 i2b2 NLP challenges show that LSTM achieves its highest micro-averaged F1-scores of 85.81% on the 2010 i2b2 medical concept extraction task, 92.29% on the 2012 i2b2 clinical event detection task, and 94.37% on the 2014 i2b2 de-identification task, which is competitive with other state-of-the-art systems.
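
For readers unfamiliar with the metric, the sketch below shows one common way to compute a micro-averaged F1-score for entity recognition: true positives, false positives, and false negatives are pooled over all documents and entity types before precision and recall are computed. It assumes exact matching on hypothetical `(start, end, type)` tuples; the official i2b2 evaluation scripts may apply different matching criteria, and the toy data here is invented purely for illustration.

```python
def micro_f1(gold_docs, pred_docs):
    """Micro-averaged precision, recall and F1 over entity annotations.

    Each document is a collection of (start, end, type) tuples. An entity
    counts as correct only if all three fields match exactly (assumption).
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        gold, pred = set(gold), set(pred)
        tp += len(gold & pred)   # predicted and present in gold
        fp += len(pred - gold)   # predicted but not in gold
        fn += len(gold - pred)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy example: two documents, one entity labeled with the wrong type.
gold_docs = [
    [(0, 2, "problem"), (5, 6, "test")],
    [(1, 3, "treatment")],
]
pred_docs = [
    [(0, 2, "problem"), (5, 6, "treatment")],
    [(1, 3, "treatment")],
]
precision, recall, f1 = micro_f1(gold_docs, pred_docs)
# Pooled counts: tp = 2, fp = 1, fn = 1, so precision = recall = F1 = 2/3.
```

Because the counts are pooled before dividing, frequent entity types weigh more heavily in a micro-average than rare ones.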

          Conclusions

          LSTM, which requires no hand-crafted features, shows great potential for entity recognition from clinical texts. It outperforms traditional machine learning methods that depend on laborious feature engineering. One possible future direction, which we plan to pursue, is integrating the knowledge bases that are widely available in the clinical domain into LSTM. Another is using LSTM to recognize entities in specific formats.


                Author and article information

                Contributors
                liuzengjian.hit@gmail.com
                yangmingbright@163.com
                wangxl@insun.hit.edu.cn
                qingcai.chen@gmail.com
                tangbuzhou@gmail.com
                wz2000@jlu.edu.cn
                hua.xu@uth.tmc.edu
                Conference
                Journal: BMC Medical Informatics and Decision Making (BMC Med Inform Decis Mak)
                Publisher: BioMed Central (London)
                ISSN: 1472-6947
                Published: 5 July 2017
                Volume: 17
                Issue: Suppl 2
                Issue sponsor: Publication of this supplement has not been supported by sponsorship. Information about the source of funding for publication charges can be found in the individual articles. The articles have undergone the journal's standard peer review process for supplements. The Supplement Editors declare that they have no competing interests.
                Article number: 67
                Affiliations
                [1] Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, 518055 China (GRID grid.452527.3)
                [2] Pharmacy Department, Shenzhen Second People’s Hospital, First Affiliated Hospital of Shenzhen University, Shenzhen, 518035 China (ISNI 0000 0001 0472 9649, GRID grid.263488.3)
                [3] Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012 China (ISNI 0000 0004 1760 5735, GRID grid.64924.3d)
                [4] School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA (ISNI 0000 0000 9206 2401, GRID grid.267308.8)
                Article
                468
                DOI: 10.1186/s12911-017-0468-7
                PMC ID: 5506598
                PubMed ID: 28699566
                150a6bea-c145-4de2-8b21-fd2815afb18a
                © The Author(s). 2017

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                The International Conference on Intelligent Biology and Medicine (ICIBM) 2016
                ICIBM 2016
                Houston, Texas, USA
                08-10 December 2016
                Categories
                Research

                Bioinformatics & Computational biology