      Contribution of Natural Language Processing in Predicting Rehospitalization Risk

      letter
      Christopher Norman, MSc*†; Thu Van Nguyen, PharmD, MPH‡§; Aurélie Névéol, PhD*
      Medical Care
      Lippincott Williams & Wilkins


          Abstract

          To the Editor:

          Greenwald et al1 propose using free text in patient records to estimate hospital readmission risk. They use expert knowledge to identify 35 groups of phrases indicative of 30-day rehospitalization, and use 16 of these in logistic regression. We believe the use of natural language processing (NLP) for predicting rehospitalization is an interesting approach, and provide suggestions to improve the model.

          NLP METHODS

          The proposed terms are all n-grams (n≤4) and are therefore a subset of the features of a simpler bag-of-words representation,2 which can be extracted with a lighter expert workload. Grouping terms to create variables can be done automatically using topic modeling.3 Taking context into account and normalizing abbreviations and word variants, as discussed by the authors, can be done using off-the-shelf software such as cTAKES.4 Graph modeling is another document representation for classification that has been shown to have good interpretability by experts.5

          COLLINEARITY

          The distortion of the coefficients in Table 3 and the modest improvements over the baseline suggest that the variables may share the same information. The pairwise Pearson correlation coefficients of all variables would help determine whether this is the case.

          MODEL EVALUATION

          Another concern is that the proposed method is only compared with a baseline of prior hospitalizations. To evaluate the added value of the proposed variables, a stronger baseline could use all available structured data in the patient records that have been shown to have predictive value, that is, age, sex, and comorbidity index.6 This also contributes to measuring the true effect of the proposed variables when adjusting for potential confounders.

          CONCLUSIONS

          The study of rehospitalization risks presents an excellent opportunity to assess the contribution of NLP to predicting important clinical outcomes. With this letter we want to encourage a more thorough evaluation of NLP methods toward this goal.

          Christopher Norman, MSc*†
          Thu Van Nguyen, PharmD, MPH‡§
          Aurélie Névéol, PhD*

          *LIMSI, CNRS, Université Paris Saclay, Orsay, France
          †Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
          ‡Equipe METHODS, Sorbonne Paris Cité Epidemiology and Statistics Research Center, University Paris Descartes, Paris, France
          §University of Liverpool, Liverpool, UK
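
          The collinearity check suggested in the letter could be sketched roughly as follows. This is only an illustration under stated assumptions: the variable names, the synthetic binary indicators, and the 0.7 cutoff are all hypothetical and are not taken from the study or from the letter. It computes the pairwise Pearson correlation matrix of the candidate variables and lists the pairs whose absolute correlation is high enough to suggest shared information.

```python
import numpy as np


def pearson_matrix(X):
    """Pairwise Pearson correlations between the columns of X (rows = patients)."""
    return np.corrcoef(np.asarray(X, dtype=float), rowvar=False)


def correlated_pairs(X, names, threshold=0.7):
    """Return (name_a, name_b, r) for every variable pair with |r| >= threshold.

    The threshold is an arbitrary illustrative cutoff, not a recommended value.
    """
    r = pearson_matrix(X)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs


# Synthetic 0/1 indicators (hypothetical, not data from the study):
# phrase_group_b is a near copy of phrase_group_a, phrase_group_c is independent.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=500)
flip = rng.random(500) < 0.05          # flip about 5% of the entries
b = np.where(flip, 1 - a, a)
c = rng.integers(0, 2, size=500)

X = np.column_stack([a, b, c])
names = ["phrase_group_a", "phrase_group_b", "phrase_group_c"]

pairs = correlated_pairs(X, names)
for name_a, name_b, r in pairs:
    print(f"{name_a} ~ {name_b}: r = {r:.2f}")
```

          Pairs flagged this way would be candidates for merging or dropping before fitting the regression. A variable that is constant across all patients has an undefined correlation and would need to be removed first.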

          Most cited references (4)

          Predicting early psychiatric readmission with natural language processing of narrative discharge summaries

          The ability to predict psychiatric readmission would facilitate the development of interventions to reduce this risk, a major driver of psychiatric health-care costs. The symptoms or characteristics of illness course necessary to develop reliable predictors are not available in coded billing data, but may be present in narrative electronic health record (EHR) discharge summaries. We identified a cohort of individuals admitted to a psychiatric inpatient unit between 1994 and 2012 with a principal diagnosis of major depressive disorder, and extracted inpatient psychiatric discharge narrative notes. Using these data, we trained a 75-topic Latent Dirichlet Allocation (LDA) model, a form of natural language processing, which identifies groups of words associated with topics discussed in a document collection. The cohort was randomly split to derive a training (70%) and testing (30%) data set, and we trained separate support vector machine models for baseline clinical features alone, baseline features plus common individual words and the above plus topics identified from the 75-topic LDA model. Of 4687 patients with inpatient discharge summaries, 470 were readmitted within 30 days. The 75-topic LDA model included topics linked to psychiatric symptoms (suicide, severe depression, anxiety, trauma, eating/weight and panic) and major depressive disorder comorbidities (infection, postpartum, brain tumor, diarrhea and pulmonary disease). By including LDA topics, prediction of readmission, as measured by area under receiver-operating characteristic curves in the testing data set, was improved from baseline (area under the curve 0.618) to baseline+1000 words (0.682) to baseline+75 topics (0.784). Inclusion of topics derived from narrative notes allows more accurate discrimination of individuals at high risk for psychiatric readmission in this cohort. 
Topic modeling and related approaches offer the potential to improve prediction using EHRs, if generalizability can be established in other clinical cohorts.
            A Novel Model for Predicting Rehospitalization Risk Incorporating Physical Function, Cognitive Status, and Psychosocial Support Using Natural Language Processing.

            With the increasing focus on reducing hospital readmissions in the United States, numerous readmissions risk prediction models have been proposed, mostly developed through analyses of structured data fields in electronic medical records and administrative databases. Three areas that may have an impact on readmission but are poorly captured using structured data sources are patients' physical function, cognitive status, and psychosocial environment and support.
              Subgraph augmented non-negative tensor factorization (SANTF) for modeling clinical narrative text

              Extracting medical knowledge from electronic medical records requires automated approaches to combat scalability limitations and selection biases. However, existing machine learning approaches are often regarded by clinicians as black boxes. Moreover, training data for these automated approaches are often sparsely annotated at best. The authors target unsupervised learning for modeling clinical narrative text, aiming at improving both accuracy and interpretability.

                Author and article information

                Journal
                Medical Care (Med Care; MLR)
                Lippincott Williams & Wilkins
                ISSN: 0025-7079, 1537-1948
                August 2017
                13 July 2017
                Volume: 55
                Issue: 8
                Page: 781
                Affiliations
                [*]LIMSI, CNRS, Université Paris Saclay, Orsay, France
                [†]Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
                [‡]Equipe METHODS, Sorbonne Paris Cité Epidemiology and Statistics Research Center, University Paris Descartes, Paris, France
                [§]University of Liverpool, Liverpool, UK
                Article
                00007
                DOI: 10.1097/MLR.0000000000000750
                PMCID: 5510702
                PMID: 28549001
                09a10c8d-432d-416f-88a9-8ea790996ea9
                Copyright © 2017 The Author(s). Published by Wolters Kluwer Health, Inc.

                This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CC BY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. http://creativecommons.org/licenses/by-nc-nd/4.0/

                Categories
                Letter to the Editor