
      CamurWeb: a classification software and a large knowledge base for gene expression data of cancer

      research-article
      BMC Bioinformatics
      BioMed Central
      Italian Society of Bioinformatics (BITS): Annual Meeting 2017
      05-07 July 2017
      Classification, Knowledge extraction, Big data, Cancer


          Abstract

          Background

          The rapid growth of Next Generation Sequencing data demands new knowledge extraction methods. In particular, RNA sequencing gene expression data stand out for case-control studies on cancer, which can be addressed with supervised machine learning techniques that extract human-interpretable models composed of genes and their relation to the investigated disease. State-of-the-art rule-based classifiers are designed to extract a single classification model, possibly composed of a few relevant genes. In contrast, we aim to create a large knowledge base composed of many rule-based models, and thus determine which genes could potentially be involved in the analyzed tumor. Such a comprehensive, open-access knowledge base is needed to disseminate novel insights about cancer.

          Results

          We propose CamurWeb, a new method and web-based software that extracts multiple, equivalent classification models in the form of logic formulas (“if then” rules) and creates a knowledge base of these rules that can be queried and analyzed. The method is based on an iterative classification procedure and an adaptive feature elimination technique that together enable the computation of many rule-based models related to the cancer under study. Additionally, CamurWeb includes a user-friendly interface for running the software, querying the results, and managing the performed experiments. The user can create a profile, upload gene expression data, run the classification analyses, and interpret the results with predefined queries. To validate the software, we apply it to all publicly available RNA sequencing datasets from The Cancer Genome Atlas database, obtaining a large open-access knowledge base about cancer. CamurWeb is available at http://bioinformatics.iasi.cnr.it/camurweb.
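The idea of iterating a classifier while eliminating the features it has already used can be illustrated with a small sketch. This is not the CamurWeb implementation: it assumes a toy rule learner that picks a single-gene threshold rule, measures accuracy on the training samples only, and removes one gene per iteration. The names `Rule`, `best_rule`, `extract_rules`, the gene labels, the expression values, and the `min_accuracy` cutoff are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional, Sequence, Set


class Rule(NamedTuple):
    """A one-gene 'if then' rule (toy stand-in for a rule-based model)."""
    gene: str
    threshold: float
    above_is_case: bool  # True: 'gene > threshold' predicts case
    accuracy: float      # accuracy on the training samples (assumption)

    def __str__(self) -> str:
        op = ">" if self.above_is_case else "<="
        return f"if {self.gene} {op} {self.threshold:g} then case"


def best_rule(samples: Sequence[Dict[str, float]],
              labels: Sequence[bool],
              genes: Set[str]) -> Optional[Rule]:
    """Return the most accurate single-gene threshold rule, or None."""
    best: Optional[Rule] = None
    for gene in sorted(genes):
        values = sorted({s[gene] for s in samples})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for above_is_case in (True, False):
                correct = sum(
                    ((s[gene] > thr) == above_is_case) == y
                    for s, y in zip(samples, labels)
                )
                acc = correct / len(samples)
                if best is None or acc > best.accuracy:
                    best = Rule(gene, thr, above_is_case, acc)
    return best


def extract_rules(samples: Sequence[Dict[str, float]],
                  labels: Sequence[bool],
                  min_accuracy: float = 0.9,
                  max_iterations: int = 100) -> List[Rule]:
    """Learn a rule, drop its gene, and repeat while accuracy stays high."""
    remaining = set(samples[0])
    rules: List[Rule] = []
    for _ in range(max_iterations):
        rule = best_rule(samples, labels, remaining)
        if rule is None or rule.accuracy < min_accuracy:
            break
        rules.append(rule)
        remaining.discard(rule.gene)  # feature elimination step
    return rules


# Hypothetical expression values; True marks a case (tumor) sample.
samples = [
    {"GENE_A": 9.0, "GENE_B": 1.0, "GENE_C": 5.0},
    {"GENE_A": 8.0, "GENE_B": 2.0, "GENE_C": 2.0},
    {"GENE_A": 2.0, "GENE_B": 6.0, "GENE_C": 4.0},
    {"GENE_A": 1.0, "GENE_B": 7.0, "GENE_C": 3.0},
]
labels = [True, True, False, False]

for rule in extract_rules(samples, labels):
    print(rule)
```

On this toy data the loop collects one rule for `GENE_A` and one for `GENE_B`, then stops because no rule on `GENE_C` alone reaches the accuracy cutoff. A realistic setup would evaluate each model on held-out samples and would use a proper rule learner that can combine several genes in one model.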

          Conclusions

          The experiments demonstrate the validity of CamurWeb, yielding many classification models and thus several genes that are associated with 21 different cancer types. Finally, the comprehensive knowledge base about cancer and the software tool are released online; interested researchers have free access to them for further studies and for designing biological experiments in cancer research.


                Author and article information

                Contributors
                emanuel@iasi.cnr.it
                silviadilauro@hotmail.it
                eleonora.cappelli@uniroma3.it
                paola.bertolazzi@iasi.cnr.it
                giovanni.felici@iasi.cnr.it
                Conference
                Journal: BMC Bioinformatics
                Publisher: BioMed Central (London)
                ISSN: 1471-2105
                Published: 15 October 2018
                Volume: 19
                Issue: Suppl 10
                Issue sponsor: Publication of this supplement has not been supported by sponsorship. Information about the source of funding for publication charges can be found in the individual articles. The articles have undergone the journal's standard peer review process for supplements. The Supplement Editors declare that they have no competing interests.
                Pages: 245-256
                Affiliations
                [1] Department of Engineering, Uninettuno International University, Corso Vittorio Emanuele II 39, Rome, 00186 Italy
                [2] ISNI 0000 0001 1940 4177, GRID grid.5326.2, Institute of Systems Analysis and Computer Science “A. Ruberti”, National Research Council, Via dei Taurini 19, Rome, 00185 Italy
                [3] ISNI 0000 0001 2162 2106, GRID grid.8509.4, Department of Engineering, Roma Tre University, Via della Vasca Navale 79, Rome, 00146 Italy
                [4] ISNI 0000 0001 2174 1754, GRID grid.7563.7, SYSBIO.IT Center for Systems Biology, Milano Bicocca University, Piazza della Scienza 2, Milan, 20126 Italy
                Article
                2299
                10.1186/s12859-018-2299-7
                6191971
                © The Author(s) 2018

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                Italian Society of Bioinformatics (BITS): Annual Meeting 2017
                Cagliari, Italy
                05-07 July 2017
                Categories
                Research

                Bioinformatics & Computational biology