      The Unified Medical Language System SPECIALIST Lexicon and Lexical Tools: Development and applications

      case-report


          Abstract

          Natural language processing (NLP) plays a vital role in modern medical informatics. It converts narrative text or unstructured data into knowledge by analyzing and extracting concepts. A comprehensive lexical system is the foundation for the success of NLP applications and an essential component at the beginning of the NLP pipeline. The SPECIALIST Lexicon and Lexical Tools, distributed by the National Library of Medicine as one of the Unified Medical Language System Knowledge Sources, provide an underlying resource for many NLP applications. This article reports recent developments in 3 key components of the Lexicon. The core NLP operation of Unified Medical Language System concept mapping is used to illustrate the importance of these developments. Our objective is to provide a generic, broad-coverage, and robust lexical system for NLP applications. A novel multiword approach and other planned developments are proposed.
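To make the role of a lexicon in concept mapping concrete, here is a minimal Python sketch of lexical normalization followed by a dictionary lookup. It is only an illustration under stated assumptions: the tiny `BASE_FORMS` and `CONCEPTS` tables, the placeholder concept identifiers, and the function names are all hypothetical, and the code does not use or reproduce the SPECIALIST Lexical Tools or any UMLS data.

```python
import re

# Hypothetical toy lexicon: inflected form -> base form (not SPECIALIST Lexicon data).
BASE_FORMS = {"kidneys": "kidney", "diseases": "disease", "stones": "stone"}

# Hypothetical concept table keyed by normalized term; the identifiers are placeholders.
CONCEPTS = {"disease kidney": "CONCEPT-0001", "kidney stone": "CONCEPT-0002"}


def normalize(term):
    """Lowercase, drop punctuation, map words to base forms, and sort the words."""
    tokens = re.findall(r"[a-z0-9]+", term.lower())
    bases = [BASE_FORMS.get(token, token) for token in tokens]
    return " ".join(sorted(bases))


def map_to_concept(term):
    """Return the placeholder concept identifier for a term, or None if unmapped."""
    return CONCEPTS.get(normalize(term))


if __name__ == "__main__":
    for text in ["Kidney Diseases", "diseases, kidney", "kidney stones", "heart attack"]:
        print(text, "->", map_to_concept(text))
```

Because both the input text and the concept table keys pass through the same normalization, surface variants such as "Kidney Diseases" and "diseases, kidney" resolve to the same entry. The quality of the mapping therefore depends directly on how complete the underlying lexicon is.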

          Most cited references (34)

          A simple algorithm for identifying negated findings and diseases in discharge summaries.

          Narrative reports in medical records contain a wealth of information that may augment structured data for managing patient information and predicting trends in diseases. Pertinent negatives are evident in text but are not usually indexed in structured databases. The objective of the study reported here was to test a simple algorithm for determining whether a finding or disease mentioned within narrative medical reports is present or absent. We developed a simple regular expression algorithm called NegEx that implements several phrases indicating negation, filters out sentences containing phrases that falsely appear to be negation phrases, and limits the scope of the negation phrases. We compared NegEx against a baseline algorithm that has a limited set of negation phrases and a simpler notion of scope. In a test of 1235 findings and diseases in 1000 sentences taken from discharge summaries indexed by physicians, NegEx had a specificity of 94.5% (versus 85.3% for the baseline), a positive predictive value of 84.5% (versus 68.4% for the baseline) while maintaining a reasonable sensitivity of 77.8% (versus 88.3% for the baseline). We conclude that with little implementation effort a simple regular expression algorithm for determining whether a finding or disease is absent can identify a large portion of the pertinent negatives from discharge summaries.
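The three ingredients named in this abstract (negation phrases, a filter for phrases that only look like negations, and a limited scope) can be sketched in a few lines of Python. This is a simplified illustration, not the published NegEx implementation: the phrase lists are short hypothetical samples, the five-token scope window is an assumed value, and only negation phrases that precede the finding are handled.

```python
import re

# Hypothetical, abbreviated phrase lists; the published lists are longer.
PSEUDO_NEGATIONS = ["no change", "no increase", "not cause"]
NEGATION_TRIGGERS = ["no evidence of", "ruled out", "without", "denies", "no"]

# Assumed scope: at most this many tokens between the trigger and the finding.
SCOPE_WINDOW = 5


def is_negated(sentence, finding):
    """Return True if a negation trigger precedes the finding within the scope window."""
    text = sentence.lower()
    # Step 1: blank out phrases that falsely appear to be negation phrases.
    for phrase in PSEUDO_NEGATIONS:
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", " ", text)
    # Step 2: look for trigger + up to SCOPE_WINDOW intervening tokens + finding.
    triggers = "|".join(re.escape(t) for t in NEGATION_TRIGGERS)
    pattern = re.compile(
        r"\b(?:%s)\b(?:\W+\w+){0,%d}?\W+%s\b"
        % (triggers, SCOPE_WINDOW, re.escape(finding.lower()))
    )
    return pattern.search(text) is not None


if __name__ == "__main__":
    print(is_negated("The patient denies chest pain.", "chest pain"))
    print(is_negated("No change in chest pain since admission.", "chest pain"))
    print(is_negated("Chest pain present, no fever.", "chest pain"))
```

The scope limit is what keeps a trigger from negating a finding much later in the sentence, at the cost of missing negations that span long lists of findings.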

            Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications.

            We aim to build and evaluate an open-source natural language processing system for information extraction from electronic medical record clinical free-text. We describe and evaluate our system, the clinical Text Analysis and Knowledge Extraction System (cTAKES), released open-source at http://www.ohnlp.org. cTAKES builds on existing open-source technologies: the Unstructured Information Management Architecture framework and the OpenNLP natural language processing toolkit. Its components, specifically trained for the clinical domain, create rich linguistic and semantic annotations. Performance of individual components: sentence boundary detector accuracy=0.949; tokenizer accuracy=0.949; part-of-speech tagger accuracy=0.936; shallow parser F-score=0.924; named entity recognizer and system-level evaluation F-score=0.715 for exact and 0.824 for overlapping spans; and accuracy for concept mapping, negation, and status attributes of 0.957, 0.943, and 0.859 for exact spans and 0.580, 0.939, and 0.839 for overlapping spans, respectively. Overall performance is discussed against five applications. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free-text.

              Exploring and developing consumer health vocabularies.

              Laypersons ("consumers") often have difficulty finding, understanding, and acting on health information due to gaps in their domain knowledge. Ideally, consumer health vocabularies (CHVs) would reflect the different ways consumers express and think about health topics, helping to bridge this vocabulary gap. However, despite the recent research on mismatches between consumer and professional language (e.g., lexical, semantic, and explanatory), there have been few systematic efforts to develop and evaluate CHVs. This paper presents the point of view that CHV development is practical and necessary for extending research on informatics-based tools to facilitate consumer health information seeking, retrieval, and understanding. In support of the view, we briefly describe a distributed, bottom-up approach for (1) exploring the relationship between common consumer health expressions and professional concepts and (2) developing an open-access, preliminary (draft) "first-generation" CHV. While recognizing the limitations of the approach (e.g., not addressing psychosocial and cultural factors), we suggest that such exploratory research and development will yield insights into the nature of consumer health expressions and assist developers in creating tools and applications to support consumer health information seeking.

                Author and article information

                Journal
                J Am Med Inform Assoc
                jamia
                Journal of the American Medical Informatics Association : JAMIA
                Oxford University Press
                1067-5027
                1527-974X
                October 2020
                29 May 2020
                Volume: 27
                Issue: 10
                Pages: 1600-1605
                Affiliations
                Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health , Bethesda, Maryland, USA
                Author notes
                Corresponding Author: Chris J. Lu, PhD, Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bldg. 38A, Room 9S-911, 8600 Rockville Pike, Bethesda, MD 20894, USA; chlu@mail.nih.gov
                Article
                ocaa056
                10.1093/jamia/ocaa056
                7580801
                32472120
                c2ea47ff-9c47-42bf-bd82-5457a4c10047
                © The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                History
                3 February 2020
                10 March 2020
                9 April 2020
                Page count
                Pages: 6
                Funding
                Funded by: Intramural Research Program of the National Library of Medicine, National Institutes of Health;
                Categories
                Case Report
                AcademicSubjects/MED00580
                AcademicSubjects/SCI01060
                AcademicSubjects/SCI01530

                Bioinformatics & Computational biology
                unified medical language system, natural language processing, lexicon, lexical tools, nlp tools
