
      FusionGDB 2.0: fusion gene annotation updates aided by deep learning

      research-article


          Abstract

          A knowledgebase of systematic functional annotations of fusion genes is critical for understanding the genomic context of breakage and for developing therapeutic strategies. FusionGDB is a unique functional annotation database of human fusion genes and has been widely used in studies with diverse aims. In this study, we report fusion gene annotation updates aided by deep learning (FusionGDB 2.0), available at https://compbio.uth.edu/FusionGDB2/. FusionGDB 2.0 contains substantial content updates: up-to-date human fusion genes; a fusion gene breakage tendency score from the FusionAI deep learning model, based on the 20 kb DNA sequence around the breakpoint (BP); analysis of the overlap between fusion breakpoints and 44 human genomic features across five categories of cellular roles; transcribed chimeric sequences followed by open reading frame (ORF) analysis, with coding potential assessed by a deep learning approach using Ribo-seq read features; and a rigorous investigation of the retention of protein features of individual fusion partner genes at the protein level. Among ∼102k fusion genes, about 15k kept their ORF in-frame, twice as many as in the previous version, FusionGDB. FusionGDB 2.0 provides eight categories of annotations and will serve as a reference knowledgebase of fusion gene annotations for diverse human genomic studies.


                Author and article information

                Journal
                Nucleic Acids Res
                nar
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                07 January 2022
                10 November 2021
                10 November 2021
                Volume: 50
                Issue: D1
                Pages: D1221-D1230
                Affiliations
                School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
                Intellectual Information Team, Future Medicine Division, Korea Institute of Oriental Medicine, Daejeon, South Korea
                Department of Neurology, Asan Medical Center, Seoul, Korea
                McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
                School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
                Author notes
                To whom correspondence should be addressed. Tel: +1 713 500 3636; Email: pora.kim@uth.tmc.edu
                Author information
                https://orcid.org/0000-0002-8321-6864
                https://orcid.org/0000-0003-4090-113X
                https://orcid.org/0000-0001-7191-6495
                Article
                gkab1056
                10.1093/nar/gkab1056
                8728198
                34755868
                9a2936a0-7f95-46ee-9296-3806ec58b83a
                © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                History
                Accepted: 03 November 2021
                Revised: 10 October 2021
                Received: 14 September 2021
                Page count
                Pages: 10
                Funding
                Funded by: National Institutes of Health, DOI 10.13039/100000002;
                Award ID: R35GM138184
                Funded by: University of Texas Health Science Center at Houston, DOI 10.13039/100012615;
                Categories
                AcademicSubjects/SCI00010
                Database Issue

                Genetics
