
      GCNCDA: A new method for predicting circRNA-disease associations based on Graph Convolutional Network Algorithm

      research-article


          Abstract

          Considerable evidence indicates that circular RNAs (circRNAs) are widely involved in the occurrence and development of diseases. Identifying associations between circRNAs and diseases plays a crucial role in exploring the pathogenesis of complex diseases and in improving their diagnosis and treatment. However, because the mechanisms linking circRNAs and diseases are complex, discovering new circRNA-disease associations by biological experiment is expensive and time-consuming. There is therefore an increasingly urgent need for computational methods that predict novel circRNA-disease associations. In this study, we propose a computational method called GCNCDA, based on the deep learning algorithm Fast learning with Graph Convolutional Networks (FastGCN), to predict potential disease-associated circRNAs. Specifically, the method first forms a unified descriptor by fusing disease semantic similarity information with disease and circRNA Gaussian Interaction Profile (GIP) kernel similarity information derived from known circRNA-disease associations. The FastGCN algorithm is then used to extract the high-level features contained in the fused descriptor. Finally, new circRNA-disease associations are predicted by the Forest by Penalizing Attributes (Forest PA) classifier. In 5-fold cross-validation on the circR2Disease benchmark dataset, GCNCDA achieved 91.2% accuracy and 92.78% sensitivity, with an AUC of 90.90%. In comparisons with different classifier models, feature extraction models and other state-of-the-art methods, GCNCDA shows strong competitiveness. Furthermore, we conducted case studies on breast cancer, glioma and colorectal cancer: 16, 15 and 17, respectively, of the top 20 candidate circRNAs with the highest prediction scores were confirmed by relevant literature and databases. These results suggest that GCNCDA can effectively predict potential circRNA-disease associations and provide highly credible candidates for biological experiments.
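
          The GIP kernel similarity named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition, in which two circRNAs (or two diseases) have similarity exp(-gamma * ||IP(i) - IP(j)||^2) over their binary interaction profiles, with gamma normalized by the mean squared profile norm. The abstract does not give the bandwidth setting, so `gamma_prime = 1.0`, the function name `gip_kernel` and the toy association matrix are illustrative assumptions, not the authors' code.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel between the rows of a binary
    association matrix (assumed standard definition, gamma_prime is a guess)."""
    n = len(profiles)
    # The mean squared norm of the profiles normalizes the kernel bandwidth.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("at least one known association is required")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel

# Hypothetical toy data: rows are circRNAs, columns are diseases,
# 1 marks a known circRNA-disease association.
associations = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]

circ_sim = gip_kernel(associations)
# Transposing the matrix gives one profile per disease.
disease_sim = gip_kernel([list(col) for col in zip(*associations)])
```

          In this toy example the first two circRNAs share a disease and therefore receive a higher similarity than the first and third, which share none.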

          Author summary

          Recognizing circRNA-disease associations is key to disease diagnosis and treatment, and is of great significance for exploring the pathogenesis of complex diseases. Computational methods can predict potential disease-related circRNAs quickly and accurately. Based on the hypothesis that circRNAs with similar functions tend to be associated with similar diseases, the GCNCDA model is proposed to predict potential associations between circRNAs and diseases using the FastGCN algorithm. The performance of the model was verified by cross-validation experiments and by comparison experiments with different feature extraction algorithms and classifier models. Furthermore, 16, 15 and 17 of the top 20 candidate circRNAs with the highest prediction scores for breast cancer, glioma and colorectal cancer, respectively, were confirmed by relevant literature and databases. It is anticipated that the GCNCDA model can prioritize the most promising circRNA-disease associations on a large scale, providing reliable candidates for further biological experiments.
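
          The case-study evaluation described above (rank candidates for one disease by prediction score, then count how many of the top-ranked ones are independently confirmed) can be sketched as follows. This is only an illustration under stated assumptions: the circRNA names, scores and confirmation set are hypothetical, the helper names `rank_candidates` and `count_confirmed` are invented here, and excluding already-known associations before ranking is an assumed design choice rather than something the summary states.

```python
def rank_candidates(scores, known, k=20):
    """Return the k highest-scoring circRNAs that are not already known
    to be associated with the disease (exclusion is an assumption)."""
    candidates = [(name, s) for name, s in scores.items() if name not in known]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in candidates[:k]]

def count_confirmed(top, confirmed):
    """Count how many top-ranked candidates appear in the confirmed set."""
    return sum(1 for name in top if name in confirmed)

# Hypothetical prediction scores for a single disease.
scores = {"circ_A": 0.91, "circ_B": 0.15, "circ_C": 0.78,
          "circ_D": 0.66, "circ_E": 0.97}
known = {"circ_E"}                 # already in the training associations
confirmed = {"circ_A", "circ_D"}   # e.g. found later in literature or databases

top3 = rank_candidates(scores, known, k=3)
hits = count_confirmed(top3, confirmed)
```

          With k set to 20 and real prediction scores, `hits` corresponds to the kind of count reported in the summary (for example, 16 of the top 20 for breast cancer).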


                Author and article information

                Contributors
                Roles: Conceptualization, Funding acquisition, Methodology, Writing – original draft
                Roles: Funding acquisition, Project administration, Writing – review & editing
                Roles: Formal analysis, Resources
                Roles: Data curation, Software
                Roles: Investigation, Validation
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                Publication date: 20 May 2020
                Volume: 16
                Issue: 5
                Article: e1007568
                Affiliations
                [1 ] College of Information Science and Engineering, Zaozhuang University, Zaozhuang, China
                [2 ] Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
                [3 ] Department of Electrical Computer and Telecommunications Engineering Technology, Rochester Institute of Technology, Rochester, United States of America
                [4 ] School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
                [5 ] Department of Computing, Hong Kong Polytechnic University, Hong Kong, China
                University of Calgary, CANADA
                Author notes

                The authors declare that they have no competing interests.

                ‡ These authors are joint first authors on this work.

                Author information
                http://orcid.org/0000-0003-0184-307X
                http://orcid.org/0000-0003-1266-2696
                Article
                Manuscript ID: PCOMPBIOL-D-19-02030
                DOI: 10.1371/journal.pcbi.1007568
                PMCID: PMC7266350
                PMID: 32433655
                © 2020 Wang et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 22 November 2019
                Accepted: 23 March 2020
                Page count
                Figures: 6, Tables: 7, Pages: 19
                Funding
                Funded by: National Natural Science Foundation of China (funder ID: http://dx.doi.org/10.13039/501100001809)
                Award ID: 61702444
                Funded by: National Natural Science Foundation of China (funder ID: http://dx.doi.org/10.13039/501100001809)
                Award ID: 61722212
                Funded by: Chinese Postdoctoral Science Foundation
                Award ID: 2019M653804
                Funded by: West Light Foundation of the Chinese Academy of Sciences (funder ID: http://dx.doi.org/10.13039/501100013494)
                Award ID: 2018-XBQNXZ-B-008
                This work is supported in part by the NSFC Excellent Young Scholars Program, under Grant 61722212; in part by the National Natural Science Foundation of China, under Grants 61702444 and 61572506; in part by the Pioneer Hundred Talents Program of the Chinese Academy of Sciences; in part by the Chinese Postdoctoral Science Foundation, under Grant 2019M653804; and in part by the West Light Foundation of the Chinese Academy of Sciences, under Grant 2018-XBQNXZ-B-008. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Physical Sciences > Mathematics > Applied Mathematics > Algorithms
                Research and Analysis Methods > Simulation and Modeling > Algorithms
                Social Sciences > Linguistics > Semantics
                Medicine and Health Sciences > Infectious Diseases > Disease Vectors
                Biology and Life Sciences > Species Interactions > Disease Vectors
                Medicine and Health Sciences > Oncology > Cancers and Neoplasms > Breast Tumors > Breast Cancer
                Biology and Life Sciences > Ecology > Ecosystems > Forests
                Ecology and Environmental Sciences > Ecology > Ecosystems > Forests
                Ecology and Environmental Sciences > Terrestrial Environments > Forests
                Engineering and Technology > Management Engineering > Decision Analysis > Decision Trees
                Research and Analysis Methods > Decision Analysis > Decision Trees
                Medicine and Health Sciences > Oncology > Cancers and Neoplasms > Colorectal Cancer
                Medicine and Health Sciences > Oncology > Cancers and Neoplasms > Neurological Tumors > Glioma
                Medicine and Health Sciences > Neurology > Neurological Tumors > Glioma
                Custom metadata
                Version of record update to uncorrected proof: 2020-06-02
                All relevant files are available from https://github.com/look0012/GCNCDA/.

                Quantitative & Systems biology
