
      Enabling Early Health Care Intervention by Detecting Depression in Users of Web-Based Forums using Language Models: Longitudinal Analysis and Evaluation

      research-article


          Abstract

          Background

          Major depressive disorder is a common mental disorder affecting 5% of adults worldwide. Early contact with health care services is critical for achieving accurate diagnosis and improving patient outcomes. Key symptoms of major depressive disorder (depression hereafter) such as cognitive distortions are observed in verbal communication, which can also manifest in the structure of written language. Thus, the automatic analysis of text outputs may provide opportunities for early intervention in settings where written communication is rich and regular, such as social media and web-based forums.

          Objective

          The objective of this study was 2-fold. We sought to gauge the effectiveness of different machine learning approaches in identifying users of the mass web-based forum Reddit who eventually disclose a diagnosis of depression. We then aimed to determine whether the time between a forum post and a depression diagnosis date was a relevant factor in this detection.

          Methods

          A total of 2 Reddit data sets, containing posts from users with and without a history of depression diagnosis, were obtained. The intersection of these data sets yielded the users for whom an estimated date of depression diagnosis could be derived. This derived data set served as input for several machine learning classifiers, including transformer-based language models (LMs).
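The derivation of the labeled data set described above can be sketched in outline. The toy records, user names, and labeling helper below are hypothetical stand-ins, not the authors' actual pipeline; the sketch only illustrates intersecting a post corpus with a set of users who have estimated diagnosis dates and keeping pre-diagnosis posts as positive examples.

```python
from datetime import date

# Hypothetical toy stand-ins for the two Reddit corpora described above.
# posts: user -> list of (post_date, text); diagnosis_dates: users who
# disclosed a diagnosis, with the date estimated from the disclosure post.
posts = {
    "u1": [(date(2020, 1, 5), "feeling fine"), (date(2020, 3, 2), "tired lately")],
    "u2": [(date(2020, 2, 1), "great day at the park")],
    "u3": [(date(2020, 1, 20), "hard to sleep again")],
}
diagnosis_dates = {"u1": date(2020, 3, 15), "u3": date(2020, 2, 10)}

def build_dataset(posts, diagnosis_dates):
    """Label each post: 1 if its author later disclosed a diagnosis and the
    post predates the estimated diagnosis date, 0 for control users."""
    examples = []
    for user, history in posts.items():
        diag = diagnosis_dates.get(user)
        for post_date, text in history:
            if diag is None:
                examples.append((user, text, 0))
            elif post_date < diag:  # keep only pre-diagnosis posts
                examples.append((user, text, 1))
    return examples

dataset = build_dataset(posts, diagnosis_dates)
```

The labeled examples would then be fed to the classifiers; with transformer-based LMs this typically means tokenizing the post text and fine-tuning for binary classification.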

          Results

          Bidirectional Encoder Representations from Transformers (BERT) and MentalBERT transformer-based LMs proved the most effective in distinguishing forum users with a known depression diagnosis from those without. Each obtained a mean F1-score of 0.64 across the experimental setups used for binary classification. The results also suggested that the final 12 to 16 weeks (about 3-4 months) of posts before a depressed user's estimated diagnosis date are the most indicative of their illness, with earlier data not improving detection accuracy. Furthermore, in the 4- to 8-week period before the estimated diagnosis date, users' posts exhibited more negative sentiment than in any other 4-week period of their post history.
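The F1-score underlying the reported mean is the harmonic mean of precision and recall, averaged over the experimental setups. A minimal sketch of that computation, with illustrative labels and predictions rather than the study's actual data:

```python
def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mean_f1(setups):
    """Average F1 over several (y_true, y_pred) experimental setups."""
    return sum(f1_score(t, p) for t, p in setups) / len(setups)
```

Because F1 balances precision and recall, it penalizes a classifier that flags too many healthy users as at risk as well as one that misses diagnosed users, which matters for a screening task like this one.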

          Conclusions

          Transformer-based LMs may be used on data from web-based social media forums to identify users at risk for psychiatric conditions such as depression. Language features picked up by these classifiers might predate depression onset by weeks to months, enabling proactive mental health care interventions to support those at risk for this condition.
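The weeks-to-months lead time described above can be made concrete by binning each user's posts into 4-week windows counted back from the estimated diagnosis date, as in the sentiment analysis reported in the Results. A minimal sketch, assuming per-post sentiment scores come from some external model (the function and its inputs are illustrative, not the authors' implementation):

```python
from collections import defaultdict
from datetime import date

def bin_by_window(history, diagnosis_date, window_weeks=4):
    """Group (post_date, sentiment_score) pairs into 4-week windows counted
    back from the estimated diagnosis date: window 0 covers the 0-3 weeks
    immediately before diagnosis, window 1 covers weeks 4-7, and so on.
    Returns the mean sentiment per window."""
    windows = defaultdict(list)
    for post_date, score in history:
        weeks_before = (diagnosis_date - post_date).days // 7
        if weeks_before >= 0:  # ignore posts after the diagnosis date
            windows[weeks_before // window_weeks].append(score)
    return {w: sum(s) / len(s) for w, s in sorted(windows.items())}
```

A dip in the mean sentiment of windows 1-2 (roughly 4 to 8 weeks before the estimated diagnosis) would mirror the pattern the study reports.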


                Author and article information

                Journal
                JMIR AI
                ISSN: 2817-1705
                24 March 2023; 10 July 2023; 31 July 2023
                Volume 2, article e41205
                Affiliations
                [1 ]School of Computer Science and Informatics, Cardiff University, Cardiff, United Kingdom
                [2 ]Centre for Medical Education, School of Medicine, Cardiff University, Cardiff, United Kingdom
                [3 ]Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, United Kingdom
                Author notes
                Corresponding Author: David Owen, MSc, School of Computer Science and Informatics, Cardiff University, Abacws, Senghennydd Road, Cardiff, CF24 4AG, United Kingdom, Phone: 44 (0)29 2087 4812, owendw1@cardiff.ac.uk

                Edited by K El Emam

                Article
                EMS178744
                DOI: 10.2196/41205
                7614849

                This work is licensed under a CC BY 4.0 International license.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR AI, is properly cited. The complete bibliographic information, a link to the original publication on https://www.ai.jmir.org/, as well as this copyright and license information must be included.


                Keywords: mental health, depression, internet, natural language processing, transformers, language models, sentiment
