      Is Open Access

      Supervised Machine Learning Algorithms Can Classify Open-Text Feedback of Doctor Performance With Human-Level Accuracy

      research-article


          Abstract

          Background

          Machine learning techniques may be an effective and efficient way to classify open-text reports on doctors’ activity for the purposes of quality assurance, safety, and continuing professional development.

          Objective

          The objective of the study was to evaluate the accuracy of machine learning algorithms trained to classify open-text reports of doctor performance and to assess the potential for classifications to identify significant differences in doctors’ professional performance in the United Kingdom.

          Methods

          We used 1636 open-text comments (34,283 words) relating to the performance of 548 doctors collected from a survey of clinicians’ colleagues using the General Medical Council Colleague Questionnaire (GMC-CQ). We coded 77.75% (1272/1636) of the comments into 5 global themes (innovation, interpersonal skills, popularity, professionalism, and respect) using a qualitative framework. We trained 8 machine learning algorithms to classify comments and assessed their performance using several training samples. We evaluated doctor performance using the GMC-CQ and compared scores between doctors with different classifications using t tests.

          Results

          Individual algorithm performance was high (F score range=.68 to .83). Interrater agreement between the algorithms and the human coder was highest for the “popular” (recall=.97), “innovator” (recall=.98), and “respected” (recall=.87) codes and was lower for the “interpersonal” (recall=.80) and “professional” (recall=.82) codes. A 10-fold cross-validation demonstrated similar performance in each analysis. When the algorithms were combined into an ensemble, mean human-computer interrater agreement was .88. Comments that were classified as “respected,” “professional,” and “interpersonal” were related to higher doctor scores on the GMC-CQ than comments that were not classified (P<.05). Scores did not vary between doctors who were rated as popular or innovative and those who were not rated at all (P>.05).
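The metrics reported above (recall and F score per code, and agreement for an ensemble) can be made concrete with a small sketch. Several things here are assumptions rather than details from the abstract: the ensemble is combined by simple majority vote, each comment carries a single code, and the labels and the helper names `recall_precision_f1` and `majority_vote` are hypothetical.

```python
from collections import Counter


def recall_precision_f1(truth, predicted, code):
    """Score one code, treating the human coder's labels as ground truth."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == code and p == code for t, p in pairs)  # true positives
    fn = sum(t == code and p != code for t, p in pairs)  # false negatives
    fp = sum(t != code and p == code for t, p in pairs)  # false positives
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return recall, precision, f1


def majority_vote(predictions):
    """Combine per-algorithm label lists into one label per comment.

    Ties go to the label that appears first among the votes.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical labels for six comments, invented for illustration only
human = ["respected", "professional", "respected",
         "interpersonal", "respected", "professional"]
algo_a = ["respected", "professional", "respected",
          "respected", "respected", "professional"]
algo_b = ["respected", "respected", "professional",
          "interpersonal", "respected", "professional"]
algo_c = ["professional", "professional", "respected",
          "interpersonal", "respected", "interpersonal"]

ensemble = majority_vote([algo_a, algo_b, algo_c])
single = recall_precision_f1(human, algo_a, "respected")
combined = recall_precision_f1(human, ensemble, "respected")
print(single, combined)
```

In this toy data each algorithm makes different mistakes, so the vote recovers the human coding even though no single algorithm does; that is the usual motivation for an ensemble.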

          Conclusions

          Machine learning algorithms can classify open-text feedback on doctor performance into multiple themes derived by human raters with high accuracy. Colleague open-text comments that signal respect, professionalism, and interpersonal skills may be key indicators of doctors’ performance.


                Author and article information

                Journal
                Journal of Medical Internet Research (J Med Internet Res; JMIR)
                JMIR Publications (Toronto, Canada)
                ISSN: 1439-4456, 1438-8871
                15 March 2017
                Volume 19, issue 3, e65
                Affiliations
                [1] Centre for Health Services Research, University of Cambridge, Cambridge, United Kingdom
                [2] The Psychometrics Centre, University of Cambridge, Cambridge, United Kingdom
                [3] Leeds Institute for Health Sciences, University of Leeds, Leeds, United Kingdom
                [4] Primary Care Research Group, University of Exeter, Exeter, United Kingdom
                Author notes
                Corresponding Author: Chris Gibbons cg598@cam.ac.uk
                Author information
                http://orcid.org/0000-0002-4732-7305
                http://orcid.org/0000-0003-1416-0569
                http://orcid.org/0000-0002-9299-1555
                http://orcid.org/0000-0002-6752-3493
                Article
                v19i3e65
                DOI: 10.2196/jmir.6533
                PMCID: PMC5371715
                PMID: 28298265
                9d1585fb-2ecd-4daa-bcfb-2a0b89506a1c
                ©Chris Gibbons, Suzanne Richards, Jose Maria Valderas, John Campbell. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 15.03.2017.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.

                History
                : 23 August 2016
                : 12 September 2016
                : 30 September 2016
                : 29 November 2016
                Categories
                Original Paper
                Medicine
                machine learning, surveys and questionnaires, feedback, data mining, work performance
