      Chronic stress in practice assistants: An analytic approach comparing four machine learning classifiers with a standard logistic regression model

      research-article


          Abstract

          Background

          Occupational stress is associated with adverse outcomes for medical professionals and patients. In our cross-sectional study with 136 general practices, 26.4% of 550 practice assistants showed high chronic stress. As machine learning strategies offer the opportunity to improve understanding of chronic stress by exploiting complex interactions between variables, we used data from our previous study to derive the best analytic model for chronic stress: four common machine learning (ML) approaches are compared to a classical statistical procedure.

          Methods

          We applied four machine learning classifiers (random forest, support vector machine, K-nearest neighbors, and artificial neural network) and logistic regression as the standard approach to analyze factors contributing to chronic stress in practice assistants. Chronic stress had been measured with the standardized, self-administered TICS-SSCS questionnaire. Model performance was compared in terms of predictive accuracy, based on the area under the receiver operating characteristic curve (AUC), sensitivity, and positive predictive value.
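As a rough illustration of the three performance measures named in the Methods, the sketch below computes AUC, sensitivity, and positive predictive value for a tiny set of made-up labels and scores. The data, the function names (`auc_rank`, `sensitivity_and_ppv`), and the 0.5 decision threshold are assumptions for illustration only; the abstract does not say how the authors computed these quantities.

```python
from typing import Sequence, Tuple


def auc_rank(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs in which the positive
    case receives the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_and_ppv(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); positive predictive value = TP / (TP + FP)."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, predicted) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, predicted) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(y_true, predicted) if y == 0 and p == 1)
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("sensitivity or PPV is undefined for these predictions")
    return tp / (tp + fn), tp / (tp + fp)


# Made-up example: 1 = high chronic stress, 0 = not; scores are model outputs.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.7]

auc = auc_rank(y_true, scores)
sensitivity, ppv = sensitivity_and_ppv(y_true, scores)
print(f"AUC={auc:.4f} sensitivity={sensitivity:.2f} PPV={ppv:.2f}")
```

The pairwise definition of AUC is quadratic in the number of cases, which is fine for a sample of a few hundred participants; the same function can be applied to the scores of each classifier in turn to compare them on equal terms.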

          Findings

          Compared to the standard logistic regression model (AUC 0.636, 95% CI 0.490–0.674), all machine learning models improved prediction: random forest +20.8% (AUC 0.844, 95% CI 0.684–0.843), artificial neural network +12.4% (AUC 0.760, 95% CI 0.605–0.777), support vector machine +15.1% (AUC 0.787, 95% CI 0.634–0.802), and K-nearest neighbours +7.1% (AUC 0.707, 95% CI 0.556–0.735). As best prediction model, random forest showed a sensitivity of 99% and a positive predictive value of 79%. Using the variable frequencies at the decision nodes of the random forest model, the following five work characteristics influence chronic stress: too much work, high demand to concentrate, time pressure, complicated tasks, and insufficient support by practice leaders.
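The percentage gains quoted in the Findings appear to be absolute differences in AUC expressed in percentage points, not relative improvements. This reading is inferred from the reported numbers and is not stated in the abstract. The sketch below redoes the arithmetic from the AUC values given above; the function names are hypothetical.

```python
# AUC values as reported in the Findings paragraph.
BASELINE_AUC = 0.636  # standard logistic regression
ML_AUC = {
    "random forest": 0.844,
    "support vector machine": 0.787,
    "artificial neural network": 0.760,
    "K-nearest neighbours": 0.707,
}


def auc_gain_points(model_auc: float, baseline_auc: float) -> float:
    """Absolute AUC difference in percentage points, rounded to one decimal."""
    return round((model_auc - baseline_auc) * 100, 1)


def auc_gain_relative(model_auc: float, baseline_auc: float) -> float:
    """Relative AUC improvement over the baseline in percent, one decimal."""
    return round((model_auc - baseline_auc) / baseline_auc * 100, 1)


gains = {name: auc_gain_points(auc, BASELINE_AUC) for name, auc in ML_AUC.items()}
for name, points in gains.items():
    relative = auc_gain_relative(ML_AUC[name], BASELINE_AUC)
    print(f"{name}: +{points} points absolute, +{relative}% relative")
```

Under this reading the absolute differences reproduce the quoted figures (+20.8, +15.1, +12.4, +7.1), whereas the relative improvement for random forest would be noticeably larger.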

          Conclusions

          Regarding chronic stress prediction, machine learning classifiers, especially random forest, provided more accurate prediction compared to classical logistic regression. Interventions to reduce chronic stress in practice personnel should primarily address the identified workplace characteristics.


                Author and article information

                Contributors
                Role: Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – original draft
                Role: Writing – review & editing
                Role: Methodology, Project administration, Supervision
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 4 May 2021
                Volume 16, Issue 5, e0250842
                Affiliations
                [1] Institute of General Practice and Family Medicine, University Hospital Bonn, University of Bonn, Bonn, Germany
                [2] Institute for General Medicine, University Hospital Essen, University of Duisburg Essen, Essen, Germany
                Universitat Politecnica de Catalunya, SPAIN
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0003-3020-1332
                Article
                Manuscript ID: PONE-D-20-23593
                DOI: 10.1371/journal.pone.0250842
                PMCID: PMC8096078
                PMID: 33945572
                d60f64f2-1199-4ed9-9843-3023cf0c7520
                © 2021 Bozorgmehr et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 30 July 2020
                Accepted: 15 April 2021
                Page count
                Figures: 2, Tables: 7, Pages: 15
                Funding
                The authors received no specific funding for this work.
                Categories
                Research Article
                Computer and Information Sciences > Artificial Intelligence > Machine Learning
                Computer and Information Sciences > Artificial Intelligence > Artificial Neural Networks
                Biology and Life Sciences > Computational Biology > Computational Neuroscience > Artificial Neural Networks
                Biology and Life Sciences > Neuroscience > Computational Neuroscience > Artificial Neural Networks
                Computer and Information Sciences > Artificial Intelligence > Machine Learning > Support Vector Machines
                Physical Sciences > Mathematics > Applied Mathematics > Algorithms > Machine Learning Algorithms
                Research and Analysis Methods > Simulation and Modeling > Algorithms > Machine Learning Algorithms
                Computer and Information Sciences > Artificial Intelligence > Machine Learning > Machine Learning Algorithms
                Medicine and Health Sciences > Mental Health and Psychiatry > Psychological Stress
                Biology and Life Sciences > Psychology > Psychological Stress
                Social Sciences > Psychology > Psychological Stress
                Research and Analysis Methods > Mathematical and Statistical Techniques > Statistical Methods > Forecasting
                Physical Sciences > Mathematics > Statistics > Statistical Methods > Forecasting
                Medicine and Health Sciences > Health Care > Health Care Providers > Physicians
                People and Places > Population Groupings > Professions > Medical Personnel > Physicians
                Medicine and Health Sciences > Vascular Medicine > Blood Pressure
                Custom metadata
                The manuscript’s data cannot be shared publicly because of ethical restrictions, as our dataset includes potentially identifying information of personnel in general practices. Data requests may be sent to the institutional ethics committee of Universitätsklinikum Bonn (ethik@ukbonn.de).
