
      Distribution-Sensitive Unbalanced Data Oversampling Method for Medical Diagnosis.

          Abstract

          To address the low accuracy of classification learning algorithms caused by severely imbalanced sample sets in medical diagnostic applications, this paper proposes a distribution-sensitive oversampling algorithm for imbalanced data. The algorithm divides the minority samples into noise samples, unstable samples, boundary samples, and stable samples according to their location. Each category is processed differently so that the most suitable samples are selected for synthesizing new samples. For the synthesis step, a distribution-sensitive method is adopted: the synthesis method is chosen according to each sample's distance from the surrounding minority samples, so that the newly synthesized samples share the characteristics of the original minority samples. Tests on real medical diagnostic data show that the algorithm improves the accuracy of classification learning algorithms compared with existing sampling algorithms, especially the accuracy and recall for minority classes.
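
The abstract does not give the categorization rule or the synthesis formulas, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a sample's category depends on the share of minority points among its `k` nearest neighbours, with illustrative thresholds that are not taken from the paper, and it substitutes plain SMOTE-style linear interpolation for the paper's distribution-sensitive synthesis. The function names `categorize_minority` and `synthesize` are hypothetical.

```python
import numpy as np

def categorize_minority(X, y, minority_label=1, k=5):
    """Label each minority sample by the minority share among its k nearest neighbours.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    categories = {}
    for i in np.flatnonzero(y == minority_label):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf                      # exclude the sample itself
        neighbours = np.argsort(dist)[:k]
        share = np.mean(y[neighbours] == minority_label)
        if share == 0.0:
            categories[int(i)] = "noise"      # surrounded only by majority samples
        elif share < 0.5:
            categories[int(i)] = "unstable"
        elif share < 1.0:
            categories[int(i)] = "boundary"
        else:
            categories[int(i)] = "stable"     # surrounded only by minority samples
    return categories

def synthesize(X, categories, n_new, seed=0):
    """SMOTE-style interpolation between non-noise minority samples.

    This is an assumed stand-in, not the paper's synthesis method.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    seeds = np.array([i for i, c in categories.items() if c != "noise"])
    if len(seeds) < 2:
        raise ValueError("need at least two non-noise minority samples")
    new_samples = []
    for _ in range(n_new):
        i = rng.choice(seeds)
        others = seeds[seeds != i]
        # nearest non-noise minority neighbour of the chosen seed
        j = others[np.argmin(np.linalg.norm(X[others] - X[i], axis=1))]
        gap = rng.random()                    # position along the segment, in [0, 1)
        new_samples.append(X[i] + gap * (X[j] - X[i]))
    return np.array(new_samples)

# Toy data: a tight minority cluster, one isolated minority point, and majority points around it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8],
              [10, 10],
              [10, 11], [11, 10], [9, 10], [10, 9], [11, 11]], dtype=float)
y = np.array([1] * 7 + [0] * 5)

cats = categorize_minority(X, y, minority_label=1, k=3)
new = synthesize(X, cats, n_new=4)
```

In this toy setup the isolated minority point is excluded from synthesis, and every synthetic point lies on a segment between two clustered minority samples. A faithful implementation would replace both the thresholds and the interpolation rule with those defined in the full article.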

          Author and article information

          Journal
          J Med Syst
          Journal of medical systems
          Springer Science and Business Media LLC
          1573-689X
          0148-5598
          Jan 10 2019
          Volume: 43
          Issue: 2
          Affiliations
          [1] Institute of Advanced Technology in Cyberspace, Guangzhou University, Guangzhou, 510006, Guangdong, China. hanweihong@gzhu.edu.cn.
          [2] Institute of Electronic and Information Engineering of UESTC in Guangdong, Guangzhou, Guangdong, China. hanweihong@gzhu.edu.cn.
          [3] School of Computer of National University of Defense Technology, Changsha, 410073, Hunan, China.
          [4] Institute of Advanced Technology in Cyberspace, Guangzhou University, Guangzhou, 510006, Guangdong, China.
          Article
          10.1007/s10916-018-1154-8
          30631957

          Keywords: Classification learning, Undersampling, Oversampling, Medical diagnosis, Imbalanced data, Data resampling
