      A filter approach for feature selection in classification: application to automatic atrial fibrillation detection in electrocardiogram recordings

      research-article
      BMC Medical Informatics and Decision Making
      BioMed Central
      Computational Intelligence methods for Bioinformatics and Biostatistics (CIBB)
      4-6 September 2019
      $\gamma$-metric, Machine learning, Feature selection, Classification, Clinical decision making, Atrial fibrillation detection


          Abstract

          Background

          In high-dimensional data analysis, the complexity of predictive models can be reduced by selecting the most relevant features, which is crucial to reduce data noise and increase model accuracy and interpretability. Thus, in the field of clinical decision making, only the most relevant features from a set of medical descriptors should be considered when determining whether a patient is healthy or not. This statistical approach, known as feature selection, can be performed through regression or classification, in a supervised or unsupervised manner. Several feature selection approaches using different mathematical concepts have been described in the literature. In the field of classification, a new approach has recently been proposed that uses the $\gamma$-metric, an index measuring separability between different classes in heart rhythm characterization. The present study proposes a filter approach for feature selection in classification using this $\gamma$-metric, and evaluates its application to automatic atrial fibrillation detection.

          Methods

          The stability and prediction performance of the $\gamma$-metric feature selection approach were evaluated using the support vector machine model on two heart rhythm datasets, one extracted from the PhysioNet database and the other from the database of Marseille University Hospital Center, France (Timone Hospital). Both datasets contained electrocardiogram recordings grouped into two classes: normal sinus rhythm and atrial fibrillation. The performance of this feature selection approach was compared to that of three other approaches: two based on the Random Forest technique and one based on receiver operating characteristic curve analysis.
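To make the idea of a filter approach concrete, the sketch below ranks features by a per-feature class-separability score computed independently of any classifier, then keeps the top k. The abstract does not define the $\gamma$-metric, so the score used here is the two-class Fisher score, swapped in purely for illustration; it is not the authors' index. The function names and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class separability of one feature (stand-in for the gamma-metric).

    Score = (difference of class means)^2 / (sum of class variances).
    """
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    gap = (mean(a) - mean(b)) ** 2
    spread = pvariance(a) + pvariance(b)
    if spread == 0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_features(rows, labels):
    """Return feature indices sorted from most to least separable."""
    n_features = len(rows[0])
    scores = [
        fisher_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def select_top_k(rows, labels, k):
    """Filter step: keep the k best-ranked features, no classifier involved."""
    return rank_features(rows, labels)[:k]


# Toy data (invented): 6 recordings, 3 features, label 0 = normal sinus
# rhythm, label 1 = atrial fibrillation. Feature 1 is deliberately noisy.
rows = [
    [0.80, 5.0, 0.10],
    [0.82, 1.0, 0.12],
    [0.79, 4.0, 0.11],
    [0.40, 2.0, 0.30],
    [0.45, 5.0, 0.33],
    [0.42, 1.0, 0.29],
]
labels = [0, 0, 0, 1, 1, 1]

ranking = rank_features(rows, labels)
selected = select_top_k(rows, labels, k=2)
```

In a full pipeline, the selected columns would then be passed to a classifier such as a support vector machine; because the ranking never consults the classifier, the selection step stays cheap and model-agnostic.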

          Results

          The $\gamma$-metric approach showed satisfactory results, especially for models with a smaller number of features. For the training dataset, all prediction indicators were higher for our approach (accuracy greater than 99% for models with 5 to 17 features), as was stability (greater than 0.925 regardless of the number of features included in the model). For the validation dataset, the features selected with the $\gamma$-metric approach differed from those selected with the other approaches; sensitivity was higher for our approach, but other indicators were similar.
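The indicators reported above can be illustrated with a short sketch. Accuracy, sensitivity and specificity follow their standard confusion-matrix definitions. The abstract does not say which stability measure was used, so the mean pairwise Jaccard index between feature subsets selected on different resamples is assumed here as one common choice; it is not necessarily the authors' measure. All names and toy values are hypothetical.

```python
from itertools import combinations


def prediction_indicators(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Share of true positives (e.g. atrial fibrillation) that are detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of true negatives (e.g. normal rhythm) correctly left alone.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def jaccard(a, b):
    """Jaccard index of two feature subsets (1.0 means identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(subsets):
    """Mean pairwise Jaccard index over subsets chosen on different resamples.

    Assumed stability measure; the source abstract does not specify one.
    """
    if len(subsets) < 2:
        raise ValueError("need at least two feature subsets")
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Toy labels (invented): 1 = atrial fibrillation, 0 = normal sinus rhythm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
indicators = prediction_indicators(y_true, y_pred)

# Toy feature subsets (invented) selected on three resamples of the data.
subsets = [[0, 2, 5], [0, 2, 5], [0, 2, 7]]
stability = selection_stability(subsets)
```

A stability close to 1 means the same features are chosen whichever resample is used, which is what makes a selected feature set credible for clinical interpretation.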

          Conclusion

          This filter approach for feature selection in classification opens up new methodological avenues for atrial fibrillation detection using short electrocardiogram recordings.

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s12911-021-01427-8.


                Author and article information

                Contributors
                pierre.michel@univ-amu.fr
                Conference
                BMC Medical Informatics and Decision Making (BMC Med Inform Decis Mak)
                BioMed Central (London)
                ISSN: 1472-6947
                Published: 4 May 2021
                Volume: 21
                Issue: Suppl 4
                Issue sponsor: Publication of this supplement has not been supported by sponsorship. Information about the source of funding for publication charges can be found in the individual articles. The articles have undergone the journal's standard peer review process for supplements. The Supplement Editors declare that they have no competing interests.
                Article number: 130
                Affiliations
                [1] CNRS, EHESS, Centrale Marseille, AMSE, Aix-Marseille Univ, Marseille, France (GRID grid.5399.6; ISNI 0000 0001 2176 4817)
                [2] INSERM, IRD, SESSTIM, Sciences Economiques & Sociales de la Santé & Traitement de l’Information Médicale, Aix Marseille Univ, Marseille, France (GRID grid.5399.6; ISNI 0000 0001 2176 4817)
                [3] WitMonki SAS, Marseille, France
                [4] INSERM, INRAE, C2VN, Aix Marseille Univ, Marseille, France (GRID grid.5399.6; ISNI 0000 0001 2176 4817)
                [5] Hôpital Nord, Service des Explorations Fonctionnelles Respiratoires, Pôle cardiovasculaire, APHM, Marseille, France (GRID grid.414336.7; ISNI 0000 0001 0407 1584)
                [6] APHM, INSERM, IRD, Sciences Economiques & Sociales de la Santé & Traitement de l’Information Médicale (SESSTIM), Hop Timone, Biostatistique et Technologies de l’Information et de la Communication (BioSTIC), Aix Marseille Univ, Marseille, France (GRID grid.5399.6; ISNI 0000 0001 2176 4817)
                Article
                Article ID: 1427
                DOI: 10.1186/s12911-021-01427-8
                PMCID: PMC8094578
                PMID: 33947379
                © The Author(s) 2021

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                Computational Intelligence methods for Bioinformatics and Biostatistics
                CIBB
                Bergamo, Italy
                4-6 September 2019
                History
                Received: 25 January 2021
                Accepted: 9 February 2021
                Categories
                Research

                Bioinformatics & Computational biology
