      Multi-view feature selection for identifying gene markers: a diversified biological data driven approach

      research-article
      BMC Bioinformatics
      BioMed Central
      8th Workshop on Computational Advances in Molecular Epidemiology (CAME 2019)
      07 September 2019
      Gene selection, Sample classification, Gene ontology (GO), Protein–protein interaction network (PPIN), Multi-view learning, Multi-objective clustering, Gene similarity measures

          Abstract

          Background

          In recent years, the use of multiple genomic and proteomic sources to investigate challenging bioinformatics problems has become immensely popular among researchers. One such problem is feature (gene) selection: identifying relevant and non-redundant marker genes from high-dimensional gene expression data sets. In that context, designing an efficient feature selection algorithm that exploits knowledge from multiple biological resources may be an effective way to understand the spectrum of cancer or other diseases, with applications in the epidemiology of specific populations.

          Results

          In the current article, we formulate feature selection and marker gene detection as a multi-view multi-objective clustering problem. To that end, we propose an Unsupervised Multi-View Multi-Objective clustering-based gene selection approach called UMVMO-select. Three important resources of biological data (gene ontology, protein interaction data, and protein sequence), along with gene expression values, are collectively utilized to design two different views. UMVMO-select aims to reduce the gene space with little or no loss in sample classification efficiency, and it determines relevant and non-redundant gene markers from three cancer gene expression benchmark data sets.
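
The general idea of clustering-based gene selection can be illustrated with a minimal sketch. This is not the UMVMO-select algorithm: it assumes a single view (expression values only), a single objective, and a plain k-medoids loop with a farthest-first start, and it keeps one medoid gene per cluster as the representative marker. The function names `correlation_distance` and `select_genes_by_medoids` are hypothetical and chosen only for illustration.

```python
import numpy as np


def correlation_distance(expr):
    """Pairwise gene distance 1 - |Pearson r|; expr has shape (genes, samples).

    Assumes no gene has constant expression (its correlation would be NaN).
    """
    return 1.0 - np.abs(np.corrcoef(expr))


def select_genes_by_medoids(expr, k, n_iter=50):
    """Cluster genes with k-medoids and return one representative per cluster.

    Illustrative single-view stand-in, not the multi-view multi-objective
    method described in the abstract.
    """
    dist = correlation_distance(expr)

    # Farthest-first initialisation: start from the most central gene, then
    # repeatedly add the gene farthest from all medoids chosen so far.
    medoids = [int(np.argmin(dist.sum(axis=1)))]
    while len(medoids) < k:
        nearest = dist[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    medoids = np.array(medoids)

    for _ in range(n_iter):
        # Assignment step: each gene joins its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # Update step: the medoid minimises total distance within the cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids

    medoids = np.sort(medoids)
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels
```

Calling `select_genes_by_medoids(expr, k)` on a genes-by-samples matrix returns the indices of the `k` retained genes and a cluster label for every gene. Genes that are highly correlated fall into the same cluster, so keeping only the medoids removes redundancy while still covering each expression pattern.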

          Conclusion

          A thorough comparative analysis has been performed against five clustering methods and nine existing feature selection methods with respect to several internal and external validity metrics. The obtained results indicate the superiority of the proposed method. The reported results are also validated through a biological significance test and heatmap plotting.


                Author and article information

                Contributors
                sudiptaacharya.2012@gmail.com
                cuilz@szu.edu.cn
                yipan@gsu.edu
                Conference
                BMC Bioinformatics
                BioMed Central (London )
                1471-2105
                30 December 2020
                2020; 21(Suppl 18): 483
                Affiliations
                [1] GRID grid.263488.3, ISNI 0000 0001 0472 9649, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, People’s Republic of China
                [2] GRID grid.256304.6, ISNI 0000 0004 1936 7400, Department of Computer Science, Georgia State University, Atlanta, USA
                Article
                3810
                10.1186/s12859-020-03810-0
                7772934
                33375940
                7718f1a2-071a-400a-bc7e-079380d05e28
                © The Author(s) 2020

                Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                8th Workshop on Computational Advances in Molecular Epidemiology (CAME 2019)
                Niagara Falls, NY, USA
                07 September 2019
                History
                : 1 October 2020
                : 13 October 2020
                Funding
                Funded by: National Key R&D Program of China
                Award ID: 2018YFB1800302
                Funded by: National Natural Science Foundation of China
                Award ID: 61772345
                Funded by: Major Fundamental Research Project in the Science and Technology Plan of Shenzhen
                Award ID: JCYJ20190808142207420
                Funded by: Pearl River Young Scholars funding of Shenzhen University
                Categories
                Methodology
                Bioinformatics & Computational biology
