
      Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View

      research-article


          Abstract

          Background

          As more researchers turn to big data for new opportunities for biomedical discovery, machine learning models, the backbone of big data analysis, are appearing more often in biomedical journals. However, owing to their inherent complexity, machine learning methods are prone to misuse. Moreover, because machine learning models can be specified so flexibly, results are often insufficiently reported in research articles, hindering reliable assessment of model validity and consistent interpretation of model outputs.

          Objective

          To develop a set of guidelines on the use of machine learning predictive models in clinical settings, ensuring that the models are correctly applied and sufficiently reported so that true discoveries can be distinguished from random coincidence.

          Methods

          A multidisciplinary panel of machine learning experts, clinicians, and traditional statisticians was interviewed using an iterative process in accordance with the Delphi method.

          Results

          The process produced a set of guidelines that consists of (1) a list of reporting items to be included in a research article and (2) a set of practical sequential steps for developing predictive models.

          Conclusions

          A set of guidelines was generated to enable correct application of machine learning models and consistent reporting of model specifications and results in biomedical research. We believe that such guidelines will accelerate the adoption of big data analysis, particularly with machine learning methods, in the biomedical research community.
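          The Objective above speaks of distinguishing true discoveries from random coincidence. One common way to probe this for a predictive model is a label permutation test on held-out predictions. The following is a minimal sketch under stated assumptions, not a procedure taken from the guidelines themselves: the function names, the toy labels, and the choice of accuracy as the metric are all hypothetical and chosen only for illustration.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_p_value(y_true, y_pred, n_permutations=1000, seed=0):
    """Estimate how often shuffled labels score at least as well as the real ones.

    Returns (observed_accuracy, p_value). A small p-value suggests the
    observed accuracy is unlikely to arise from chance alone.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = accuracy(y_true, y_pred)
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between labels and predictions
        if accuracy(shuffled, y_pred) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Hypothetical held-out labels and two sets of hypothetical predictions.
y_true = [0, 1] * 20
informative_pred = [1 - y for y in y_true[:4]] + y_true[4:]  # 4 of 40 wrong
constant_pred = [0] * len(y_true)  # ignores the input entirely

obs_good, p_good = permutation_p_value(y_true, informative_pred)
obs_const, p_const = permutation_p_value(y_true, constant_pred)
print(f"informative: accuracy={obs_good:.2f}, p={p_good:.4f}")
print(f"constant:    accuracy={obs_const:.2f}, p={p_const:.4f}")
```

          In this toy setup the constant predictor scores the same accuracy under every shuffle, so its p-value is 1 and its apparent performance carries no evidence of a real signal. Reporting such a check alongside the headline metric is one way a reader can assess whether a result reflects more than coincidence.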


                Author and article information

                Contributors
                Journal
                Journal of Medical Internet Research (J Med Internet Res; JMIR)
                JMIR Publications (Toronto, Canada)
                1439-4456
                1438-8871
                December 2016
                16 December 2016
                Volume: 18
                Issue: 12
                Article number: e323
                Affiliations
                [1] Centre for Pattern Recognition and Data Analytics, School of Information Technology, Deakin University, Geelong, Australia
                [2] Deakin University, Geelong, Australia
                [3] Philips Research, Briarcliff Manor, NY, United States
                [4] Japan Advanced Institute of Science and Technology, Nomi, Japan
                Author notes
                Corresponding Author: Wei Luo, wei.luo@deakin.edu.au
                Author information
                http://orcid.org/0000-0002-4711-7543
                http://orcid.org/0000-0002-9977-8247
                http://orcid.org/0000-0001-6531-8907
                http://orcid.org/0000-0002-3308-1930
                http://orcid.org/0000-0003-2247-850X
                http://orcid.org/0000-0003-1814-0856
                http://orcid.org/0000-0002-0849-3271
                http://orcid.org/0000-0002-7562-6767
                http://orcid.org/0000-0003-4091-9233
                http://orcid.org/0000-0001-5951-643X
                http://orcid.org/0000-0001-8675-6631
                http://orcid.org/0000-0002-5554-6946
                Article
                Publisher ID: v18i12e323
                DOI: 10.2196/jmir.5870
                PMCID: PMC5238707
                PMID: 27986644
                bf567a2c-b473-41a3-800f-57f4a77a6eae
                ©Wei Luo, Dinh Phung, Truyen Tran, Sunil Gupta, Santu Rana, Chandan Karmakar, Alistair Shilton, John Yearwood, Nevenka Dimitrova, Tu Bao Ho, Svetha Venkatesh, Michael Berk. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 16.12.2016.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.

                History
                : 12 April 2016
                : 27 July 2016
                : 4 November 2016
                : 23 November 2016
                Categories
                Original Paper

                Medicine
                machine learning, clinical prediction rule, guideline
