
      Comparing Pre-trained and Feature-Based Models for Prediction of Alzheimer's Disease Based on Speech


          Abstract

          Introduction: Research related to the automatic detection of Alzheimer's disease (AD) is important, given the high prevalence of AD and the high cost of traditional diagnostic methods. Since AD significantly affects the content and acoustics of spontaneous speech, natural language processing and machine learning provide promising techniques for reliably detecting AD. There has been a recent proliferation of classification models for AD, but these vary in the datasets used, model types, and training and testing paradigms. In this study, we compare and contrast the performance of two common approaches for automatic AD detection from speech on the same, well-matched dataset, to determine the advantages of using domain knowledge vs. pre-trained transfer models.

          Methods: Audio recordings and corresponding manually transcribed speech transcripts of a picture description task administered to 156 demographically matched older adults, 78 with AD and 78 cognitively intact (healthy), were classified using machine learning and natural language processing as “AD” or “non-AD.” The audio was acoustically enhanced and post-processed to improve the quality of the speech recording as well as to control for variation caused by recording conditions. Two approaches were used for classification of these speech samples: (1) using domain knowledge: extracting an extensive set of clinically relevant linguistic and acoustic features derived from speech and transcripts, based on prior literature; and (2) using transfer learning and leveraging large pre-trained machine learning models: using transcript representations that are automatically derived from state-of-the-art pre-trained language models, by fine-tuning Bidirectional Encoder Representations from Transformers (BERT)-based sequence classification models.
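          The domain-knowledge approach can be illustrated with a minimal sketch (not the authors' actual pipeline; the feature set and filler list here are assumed for illustration): a few clinically motivated lexical features commonly used in the AD-detection literature, computed from a raw transcript string.

```python
# Illustrative sketch only: a small subset of the kinds of lexical
# features used in feature-based AD classification. The filler set and
# feature choices are assumptions, not the paper's exact feature list.
from collections import Counter

FILLERS = {"uh", "um", "er", "ah"}  # filled-pause markers (assumed set)

def lexical_features(transcript: str) -> dict:
    tokens = transcript.lower().split()
    n = len(tokens)
    counts = Counter(tokens)
    return {
        "n_tokens": n,
        # Type-token ratio: lexical diversity, often reduced in AD.
        "type_token_ratio": len(set(tokens)) / n if n else 0.0,
        # Proportion of filled pauses ("uh", "um", ...).
        "filler_rate": sum(counts[f] for f in FILLERS) / n if n else 0.0,
        # Mean word length, a crude proxy for lexical sophistication.
        "mean_word_length": sum(len(t) for t in tokens) / n if n else 0.0,
    }

feats = lexical_features("uh the boy is um taking the cookie the cookie jar")
```

          In a full pipeline, feature vectors like this (extended with syntactic and acoustic measures) would feed a conventional classifier such as logistic regression or an SVM.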

          Results: We compared the utility of speech transcript representations obtained from recent natural language processing models (i.e., BERT) to more clinically interpretable language feature-based methods. Both the feature-based approaches and the fine-tuned BERT models significantly outperformed the baseline linguistic model using a small set of linguistic features, demonstrating the importance of extensive linguistic information for detecting cognitive impairments relating to AD. We observed that fine-tuned BERT models numerically outperformed feature-based approaches on the AD detection task, but the difference was not statistically significant. Our main contribution is the observation that when trained on the same, demographically balanced dataset and tested on independent, unseen data, both domain-knowledge and pre-trained linguistic models have good predictive performance for detecting AD based on speech. It is notable that linguistic information alone is capable of achieving comparable, and even numerically better, performance than models including both acoustic and linguistic features here. We also try to shed light on the inner workings of the more black-box natural language processing model by performing an interpretability analysis, and find that attention weights reveal interesting patterns, such as higher attribution to more important information content units in the picture description task, as well as to pauses and filler words.
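          The attention-based interpretability analysis described above can be sketched as follows (a simplified illustration with hypothetical weights, not output from the actual model): average the attention that the [CLS] position pays to each token across heads, then rank tokens by received attention to see which words the classifier attends to most.

```python
# Illustrative sketch of attention-weight attribution. The token list
# and per-head weights below are made-up toy values, not model output.
def rank_tokens_by_attention(tokens, cls_attention_per_head):
    """cls_attention_per_head: one row per head, each row holding the
    attention weights from [CLS] to every token (same length as tokens).
    Returns (token, mean_attention) pairs, highest attention first."""
    n_heads = len(cls_attention_per_head)
    avg = [sum(head[i] for head in cls_attention_per_head) / n_heads
           for i in range(len(tokens))]
    return sorted(zip(tokens, avg), key=lambda p: p[1], reverse=True)

tokens = ["[CLS]", "the", "boy", "um", "cookie"]
heads = [
    [0.1, 0.1, 0.3, 0.2, 0.3],  # head 1: attends to content words
    [0.1, 0.1, 0.2, 0.4, 0.2],  # head 2: attends to the filler "um"
]
ranked = rank_tokens_by_attention(tokens, heads)
```

          With real models, the per-head weights would come from the fine-tuned transformer's attention outputs; the aggregation and ranking step is the same.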

          Conclusion: This approach supports the value of well-performing machine learning and linguistically focused processing techniques to detect AD from speech, and highlights the need to compare model performance on carefully balanced datasets, using consistent training parameters and independent test datasets, in order to determine the best-performing predictive model.


                Author and article information

                Contributors
                Journal
                Frontiers in Aging Neuroscience (Front. Aging Neurosci.)
                Frontiers Media S.A.
                ISSN: 1663-4365
                Published: 27 April 2021
                Volume: 13
                Article: 635945
                Affiliations
                1. Winterlight Labs Inc., Toronto, ON, Canada
                2. Department of Computer Science, University of Toronto, Toronto, ON, Canada
                3. Vector Institute for Artificial Intelligence, Toronto, ON, Canada
                4. Unity Health Toronto, Toronto, ON, Canada
                Author notes

                Edited by: Saturnino Luz, University of Edinburgh, United Kingdom

                Reviewed by: Kewei Chen, Banner Alzheimer's Institute, United States; Juan José García Meilán, University of Salamanca, Spain

                *Correspondence: Jekaterina Novikova, jekaterina@winterlightlabs.com
                Article
                DOI: 10.3389/fnagi.2021.635945
                PMCID: PMC8110916
                PMID: 33986655
                Copyright © 2021 Balagopalan, Eyre, Robin, Rudzicz and Novikova.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 30 November 2020
                Accepted: 24 March 2021
                Page count
                Figures: 2, Tables: 11, Equations: 0, References: 54, Pages: 12, Words: 8950
                Categories
                Neuroscience
                Methods

                Keywords: Alzheimer's disease, dementia detection, MMSE regression, BERT, feature engineering, transfer learning
