      Improving the Named Entity Recognition of Chinese Electronic Medical Records by Combining Domain Dictionary and Rules

      research-article

          Abstract

          Electronic medical records are an integral part of medical texts, and entity recognition in these records has prompted many studies and many entity extraction methods. In this paper, we propose a model for extracting entities from Chinese Electronic Medical Records (CEMR). In the input layer, the model uses a word embedding and a dictionary feature embedding as input vectors, where the word embedding consists of a character representation and a word representation. The input vectors are then fed to a bidirectional long short-term memory (Bi-LSTM) network to capture contextual features. Finally, a conditional random field (CRF) is employed to capture dependencies between neighboring tags. On the body classification task, the model reached an F1 value of 90.65%; on the anatomic region recognition task, it reached an F1 value of 93.89%. On both tasks, our model outperformed state-of-the-art models such as Bi-LSTM-CRF, Bi-LSTM-Attention, and Vote. The experiments also show that our model handles low-frequency entities and unknown entities well: with a small training dataset, our method showed a 2–4% improvement in F1 value over the basic Bi-LSTM-CRF models. Additionally, on the anatomic region recognition task, we combined the proposed entity extraction model with 12 rules we designed and a domain dictionary. In this setting, the weighted F1 value for the extraction of the three specific entities reached 84.36%.
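
          The abstract does not say how the dictionary features are derived before they are embedded. One common approach, shown here only as a minimal sketch and not as the authors' method, is to tag each character by forward maximum matching against the domain dictionary. The function name `dictionary_features`, the B/I/E/S/O tag scheme, the `max_len` parameter, and the sample lexicon entries and labels are all assumptions made for illustration.

```python
from typing import Dict, List


def dictionary_features(text: str, lexicon: Dict[str, str], max_len: int = 6) -> List[str]:
    """Tag each character with a dictionary feature (hypothetical scheme).

    Uses forward maximum matching: at each position, the longest lexicon
    entry starting there wins. Matched characters get B-/I-/E- tags (or S-
    for single-character entries); unmatched characters get "O".
    """
    tags = ["O"] * len(text)
    i = 0
    while i < len(text):
        matched = False
        # Try the longest candidate span first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            span = text[i:i + length]
            if span in lexicon:
                label = lexicon[span]
                if length == 1:
                    tags[i] = f"S-{label}"
                else:
                    tags[i] = f"B-{label}"
                    for j in range(i + 1, i + length - 1):
                        tags[j] = f"I-{label}"
                    tags[i + length - 1] = f"E-{label}"
                i += length
                matched = True
                break
        if not matched:
            i += 1  # No entry starts here; leave the character as "O".
    return tags


# Illustrative lexicon; entries and labels are made up for this example.
lexicon = {"左肺": "BODY", "肺": "BODY", "上叶": "REGION"}
features = dictionary_features("左肺上叶结节", lexicon)
print(features)
```

          In a model like the one described, each tag would be mapped to a trainable embedding and concatenated with the character and word representations before the Bi-LSTM layer; that wiring is likewise an assumption, since the abstract gives no further detail.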

                Author and article information

                Journal
                Title: International Journal of Environmental Research and Public Health
                Abbreviated title: Int J Environ Res Public Health
                Journal ID: ijerph
                Publisher: MDPI
                ISSN (print): 1661-7827
                ISSN (electronic): 1660-4601
                Publication date: 14 April 2020
                Volume: 17
                Issue: 8
                Article number: 2687
                Affiliations
                [1] School of Computer, University of South China, Hengyang 421001, China; dragonc.cxl@gmail.com (X.C.); yongbinliu03@gmail.com (Y.L.)
                [2] Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA; buyi@umail.iu.edu
                Author notes
                [*] Correspondence: ouyangcp@gmail.com
                Author information
                https://orcid.org/0000-0003-2549-4580
                Article
                Publisher ID: ijerph-17-02687
                DOI: 10.3390/ijerph17082687
                PMCID: PMC7215438
                PMID: 32295174
                489e3914-c76d-417d-91f2-840b82932961
                © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

                History
                Received: 04 March 2020
                Accepted: 09 April 2020
                Categories
                Article

                Subject: Public health
                Keywords: entity recognition, electronic medical records, bi-lstm-crf, rules, domain dictionary
