
Electronic Health Record Driven Prediction for Gestational Diabetes Mellitus in Early Pregnancy


      Abstract

      Gestational diabetes mellitus (GDM) is conventionally confirmed with an oral glucose tolerance test (OGTT) at 24 to 28 weeks of gestation, but it remains uncertain whether it can be predicted in early pregnancy through secondary use of electronic health records (EHRs). To this end, a cost-sensitive hybrid model (CSHM) and five conventional machine learning methods are used to construct predictive models that capture the future risk of GDM from temporally aggregated EHRs. The experimental data come from a nested case-control study cohort of 33,935 pregnant women at West China Second Hospital. After data cleaning, the data set retains 4,378 cases and 50 attributes. After the most feasible method is selected, the cost parameter of CSHM is adapted to deal with the imbalance of the data set. In the experiment, 3,940 samples are used for training and the remaining 438 samples for testing. Although the accuracy on positive samples is barely acceptable (62.16%), the results suggest that the vast majority (98.4%) of instances predicted positive are real positives. To our knowledge, this is the first study to apply machine learning models with EHRs to predict GDM, which will facilitate personalized medicine in maternal health management in the future.
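      The two figures quoted in the abstract are easier to read side by side with a small worked example. The sketch below is not the authors' CSHM, whose details the abstract does not give. It assumes that "accuracy on positive samples" means sensitivity (recall) and that the second figure is precision, and it uses a generic cost-weighted decision rule to show how a cost parameter can shift the balance between the two on an imbalanced data set. The function names, the cost values, and the toy labels and probabilities are all hypothetical.

```python
def cost_sensitive_predict(prob_positive, cost_fn=5.0, cost_fp=1.0):
    """Predict 1 when missing a case is expected to cost more than a false alarm.

    cost_fn and cost_fp are hypothetical misclassification costs,
    not values taken from the article.
    """
    # Expected cost of predicting negative: p * cost_fn
    # Expected cost of predicting positive: (1 - p) * cost_fp
    return 1 if prob_positive * cost_fn > (1.0 - prob_positive) * cost_fp else 0


def sensitivity_and_precision(y_true, y_pred):
    """Return (sensitivity, precision) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision


# Hypothetical toy data: 3 positives among 8 samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.3, 0.1, 0.2, 0.05, 0.4, 0.1, 0.15]

# Equal costs behave like a 0.5 threshold; a higher cost_fn lowers the threshold.
equal_cost = [cost_sensitive_predict(p, cost_fn=1.0, cost_fp=1.0) for p in probs]
weighted = [cost_sensitive_predict(p, cost_fn=5.0, cost_fp=1.0) for p in probs]

print(sensitivity_and_precision(y_true, equal_cost))
print(sensitivity_and_precision(y_true, weighted))
```

      On this toy data, equal costs give high precision with low sensitivity, the same qualitative pattern the abstract reports, while raising the cost of a missed case trades some precision for sensitivity.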

      Most cited references 40


      Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach.

      Methods of evaluating and comparing the performance of diagnostic tests are of increasing importance as new tests are developed and marketed. When a test is based on an observed variable that lies on a continuous or graded scale, an assessment of the overall value of the test can be made through the use of a receiver operating characteristic (ROC) curve. The curve is constructed by varying the cutpoint used to determine which values of the observed variable will be considered abnormal and then plotting the resulting sensitivities against the corresponding false positive rates. When two or more empirical curves are constructed based on tests performed on the same individuals, statistical analysis on differences between curves must take into account the correlated nature of the data. This paper presents a nonparametric approach to the analysis of areas under correlated ROC curves, by using the theory on generalized U-statistics to generate an estimated covariance matrix.
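      The construction described in this abstract can be sketched in a few lines. The code below is a minimal illustration with hypothetical scores and labels: it sweeps the cutpoint to obtain (false positive rate, sensitivity) pairs, integrates them with the trapezoid rule, and checks that the area equals the Mann-Whitney U-statistic form, i.e. the fraction of (positive, negative) pairs ranked correctly. It covers only the area estimate; the covariance estimation for correlated curves that the paper presents is not implemented here.

```python
def roc_points(scores, labels):
    """Return (false positive rate, sensitivity) pairs, one per cutpoint."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # A sample is called abnormal when its score is at or above the cutpoint.
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the empirical ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def auc_u_statistic(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical test scores for six individuals (1 = abnormal, 0 = normal).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print(points)
print(auc_trapezoid(points), auc_u_statistic(scores, labels))
```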

        pROC: an open-source package for R and S+ to analyze and compare ROC curves

        Background: Receiver operating characteristic (ROC) curves are useful tools to evaluate classifiers in biomedical and bioinformatics applications. However, conclusions are often reached through inconsistent use or insufficient statistical analysis. To support researchers in their ROC curve analysis we developed pROC, a package for R and S+ that contains a set of tools for displaying, analyzing, smoothing and comparing ROC curves in a user-friendly, object-oriented and flexible interface.
        Results: With data previously imported into the R or S+ environment, the pROC package builds ROC curves and includes functions for computing confidence intervals, statistical tests for comparing total or partial area under the curve or the operating points of different classifiers, and methods for smoothing ROC curves. Intermediary and final results are visualised in user-friendly interfaces. A case study based on published clinical and biomarker data shows how to perform a typical ROC analysis with pROC.
        Conclusions: pROC is a package for R and S+ specifically dedicated to ROC analysis. It proposes multiple statistical tests to compare ROC curves, and in particular partial areas under the curve, allowing proper ROC interpretation. pROC is available in two versions: in the R programming language or with a graphical user interface in the S+ statistical software. It is accessible at http://expasy.org/tools/pROC/ under the GNU General Public License. It is also distributed through the CRAN and CSAN public repositories, facilitating its installation.

          Learning from Imbalanced Data

           Haibo He,  E.A. Garcia (2009)

            Author and article information

            Affiliations
            [1] ISNI 0000 0004 0369 4060, GRID grid.54549.39, Big Data Research Center, University of Electronic Science and Technology of China, Chengdu, 611731 Sichuan, China
            [2] ISNI 0000 0004 0369 4060, GRID grid.54549.39, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731 Sichuan, China
            [3] ISNI 0000 0001 0381 4112, GRID grid.411587.e, School of Economics and Management, Chongqing University of Posts and Telecommunications, Chongqing, 400065 Chongqing, China
            [4] ISNI 0000 0001 2097 4281, GRID grid.29857.31, Department of Statistics, The Pennsylvania State University, University Park, PA 16802-2111, United States
            [5] ISNI 0000 0001 0807 1581, GRID grid.13291.38, Division of Obstetrics, West China Second University Hospital, Sichuan University, Chengdu, 610041 Sichuan, China
            [6] ISNI 0000 0001 0807 1581, GRID grid.13291.38, Division of Information Management, West China Second University Hospital, Sichuan University, Chengdu, 610041 Sichuan, China
            [7] Chengdu Shulianyikang Technology Co., Ltd, Chengdu, 610041 Sichuan, China
            [8] ISNI 0000 0004 1790 5236, GRID grid.411307.0, School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225 Sichuan, China
            Contributors
            yhy188@gmail.com
            tomlsd@163.com
            Journal
            Scientific Reports (Sci Rep)
            Nature Publishing Group UK (London)
            ISSN: 2045-2322
            Published: 27 November 2017
            Volume: 7
            PMID: 29180800
            PMCID: PMC5703904
            Article number: 16665
            DOI: 10.1038/s41598-017-16665-y
            © The Author(s) 2017

            Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

            Categories
            Article