
      COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model

      Preprint
      research-article


          Abstract

          The COVID-19 pandemic swept across the world rapidly, infecting millions of people. An efficient tool that can accurately recognize important clinical concepts of COVID-19 from free text in electronic health records (EHRs) will be valuable to accelerate COVID-19 clinical research. To this end, this study aims at adapting the existing CLAMP natural language processing tool to quickly build COVID-19 SignSym, which can extract COVID-19 signs/symptoms and their 8 attributes (body location, severity, temporal expression, subject, condition, uncertainty, negation, and course) from clinical text. The extracted information is also mapped to standard concepts in the Observational Medical Outcomes Partnership common data model. A hybrid approach of combining deep learning-based models, curated lexicons, and pattern-based rules was applied to quickly build the COVID-19 SignSym from CLAMP, with optimized performance. Our extensive evaluation using 3 external sites with clinical notes of COVID-19 patients, as well as the online medical dialogues of COVID-19, shows COVID-19 SignSym can achieve high performance across data sources. The workflow used for this study can be generalized to other use cases, where existing clinical natural language processing tools need to be customized for specific information needs within a short time. COVID-19 SignSym is freely accessible to the research community as a downloadable package (https://clamp.uth.edu/covid/nlp.php) and has been used by 16 healthcare organizations to support clinical research of COVID-19.
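To make the lexicon-plus-rules part of the hybrid approach concrete, here is a minimal Python sketch. It is not CLAMP's or COVID-19 SignSym's actual code or API: the lexicon entries, the cue patterns, the `Mention` class, the `extract_mentions` function, and the concept IDs are all hypothetical placeholders chosen for illustration. The deep learning component is omitted, and only two of the eight attributes (negation and severity) are handled.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical curated lexicon: surface form -> normalized symptom name.
SYMPTOM_LEXICON = {
    "fever": "Fever",
    "febrile": "Fever",
    "cough": "Cough",
    "shortness of breath": "Dyspnea",
    "sob": "Dyspnea",
}

# Placeholder IDs for illustration only; these are NOT real OMOP concept IDs.
PLACEHOLDER_CONCEPT_IDS = {"Fever": 900001, "Cough": 900002, "Dyspnea": 900003}

# Assumed pattern-based rules for two attributes.
NEGATION_CUES = re.compile(r"\b(no|denies|denied|without|negative for)\b", re.IGNORECASE)
SEVERITY_BEFORE = re.compile(r"\b(mild|moderate|severe)\s+$", re.IGNORECASE)


@dataclass
class Mention:
    text: str                # surface form found in the note
    normalized: str          # normalized symptom name
    concept_id: int          # placeholder concept identifier
    negated: bool            # True if a negation cue precedes the mention
    severity: Optional[str]  # severity word directly before the mention, if any


def extract_mentions(note: str) -> List[Mention]:
    """Find lexicon symptoms sentence by sentence and attach simple attributes."""
    mentions: List[Mention] = []
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        for surface, normalized in SYMPTOM_LEXICON.items():
            pattern = r"\b" + re.escape(surface) + r"\b"
            match = re.search(pattern, sentence, re.IGNORECASE)
            if match is None:
                continue
            before = sentence[: match.start()]
            severity = SEVERITY_BEFORE.search(before)
            mentions.append(
                Mention(
                    text=match.group(0),
                    normalized=normalized,
                    concept_id=PLACEHOLDER_CONCEPT_IDS[normalized],
                    # Rule: any negation cue earlier in the same sentence negates the mention.
                    negated=bool(NEGATION_CUES.search(before)),
                    severity=severity.group(1).lower() if severity else None,
                )
            )
    return mentions


if __name__ == "__main__":
    example = "Patient reports severe cough. Denies fever or shortness of breath."
    for mention in extract_mentions(example):
        print(mention)
```

A sentence-scoped negation rule like this one is deliberately crude. It illustrates why the abstract describes combining rules with learned models and curated lexicons rather than relying on any single technique.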


                Author and article information

                Journal
                ArXiv
                Cornell University
                2331-8422
                13 July 2020
                : arXiv:2007.10286v4
                Affiliations
                [1] Melax Technologies, Inc, Houston, Texas, USA
                [2] Division of Medical Informatics, University of Kansas Medical Center, Kansas City, Kansas, USA
                [3] Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
                [4] School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
                [5] University of Missouri School of Medicine, Columbia, Missouri, USA
                Author notes

                AUTHOR CONTRIBUTIONS

                The work presented here was carried out in collaboration among all authors. YZ, MR, and HX designed methods and experiments. NA, MR, JG, HP, YZ, XS, ML, and JW annotated the datasets. HP, JW, NA, and MR analyzed the data and interpreted the results. YZ and FM drafted the article. All authors have contributed to, edited, reviewed, and approved the manuscript.

                [*] Contributed equally as first authors.

                [^] Contributed equally as corresponding authors.

                Author information
                http://orcid.org/0000-0002-3388-5867
                http://orcid.org/0000-0003-0889-2261
                http://orcid.org/0000-0002-8036-2110
                http://orcid.org/0000-0002-5274-4672
                http://orcid.org/0000-0001-9220-3101
                Article
                2007.10286
                10.1073/pnas.2026805118
                7480086
                32908948
                8b372102-2d40-4c0e-9a11-456c3c5caa73

                This work is licensed under a Creative Commons Attribution 4.0 International License, which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.

                For permissions, please e-mail: journals.permissions@oup.com
