Gene4Denovo: an integrated database and analytic platform for de novo mutations in humans

Zhao, Guihu; Li, Kuokuo; Li, Bin; Wang, Zheng; Fang, Zhenghuan; Wang, Xiaomeng; Zhang, Yi; Luo, Tengfei; Zhou, Qiao; Wang, Lin; Xie, Yali; Wang, Yijing; Chen, Qian Qian; Xia, Lu; Tang, Yu R.; Tang, Beisha; Xia, Kun; Li, Jinchen

doi:10.1093/nar/gkz923


Is Open Access

research-article

Author(s): Guihu Zhao ¹,², Kuokuo Li ³, Bin Li ¹,², Zheng Wang ¹, Zhenghuan Fang ³, Xiaomeng Wang ³, Yi Zhang ¹, Tengfei Luo ³, Qiao Zhou ¹, Lin Wang ³, Yali Xie ¹, Yijing Wang ³, Qian Chen ¹, Lu Xia ³, Yu Tang ¹, Beisha Tang ¹,², Kun Xia ³, Jinchen Li ¹,²,³

Publication date (Electronic): 23 October 2019

Journal: Nucleic Acids Research

Publisher: Oxford University Press


Abstract

De novo mutations (DNMs) significantly contribute to sporadic diseases, particularly in neuropsychiatric disorders. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) provide effective methods for detecting DNMs and prioritizing candidate genes. However, it remains a challenge for scientists, clinicians, and biologists to conveniently access and analyse data regarding DNMs and candidate genes from scattered publications. To fill this unmet need, we integrated 580 799 DNMs, including 30 060 coding DNMs detected by WES/WGS from 23 951 individuals across 24 phenotypes, and prioritized a list of candidate genes with different degrees of statistical evidence, including 346 genes with false discovery rates <0.05. We then developed a database called Gene4Denovo (http://www.genemed.tech/gene4denovo/), which allows these genetic data to be conveniently catalogued, searched, browsed, and analysed. In addition, Gene4Denovo integrated data from >60 genomic sources to provide comprehensive variant-level and gene-level annotation and information regarding the DNMs and candidate genes. Furthermore, Gene4Denovo enables end-users with limited bioinformatics skills to analyse their own genetic data, perform comprehensive annotation, and prioritize candidate genes using custom parameters. In conclusion, Gene4Denovo conveniently allows for the accelerated interpretation of DNM pathogenicity and the clinical implication of DNMs in humans.
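To make the false discovery rate threshold in the abstract concrete, the sketch below shows one common way per-gene p-values can be turned into FDR-adjusted q-values and filtered at 0.05. It assumes the Benjamini-Hochberg procedure, which the abstract does not name, so it is an illustration of the general idea rather than the authors' method. The gene names, p-values, and function names are all hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as p_values."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def genes_below_fdr(gene_p_values, threshold=0.05):
    """Return the sorted gene names whose q-value is below threshold."""
    genes = list(gene_p_values)
    q_values = benjamini_hochberg([gene_p_values[g] for g in genes])
    return sorted(g for g, q in zip(genes, q_values) if q < threshold)


# Hypothetical per-gene p-values (not taken from Gene4Denovo).
example = {
    "GENE_A": 0.0001,
    "GENE_B": 0.004,
    "GENE_C": 0.04,
    "GENE_D": 0.2,
    "GENE_E": 0.6,
}
print(genes_below_fdr(example))
```

In this toy input, GENE_C has a raw p-value below 0.05 but does not pass after adjustment, which is the reason an FDR cut-off is stricter than a per-gene p-value cut-off when many genes are tested at once.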

Most cited references 76

Database resources of the National Center for Biotechnology Information

Richa Agarwala, Tanya Barrett, Jeff Beck … (2017)

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. The Entrez system provides search and retrieval operations for most of these data from 39 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. New resources released in the past year include PubMed Data Management, RefSeq Functional Elements, genome data download, variation services API, Magic-BLAST, QuickBLASTp, and Identical Protein Groups. Resources that were updated in the past year include the genome data viewer, a human genome resources page, Gene, virus variation, OSIRIS, and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.


DANN: a deep learning approach for annotating the pathogenicity of genetic variants.

Daniel Quang, Yifei Chen, Xiaohui Xie (2015)

Annotating genetic variants, especially non-coding variants, for the purpose of identifying pathogenic variants remains a challenge. Combined annotation-dependent depletion (CADD) is an algorithm designed to annotate both coding and non-coding variants, and has been shown to outperform other annotation algorithms. CADD trains a linear kernel support vector machine (SVM) to differentiate evolutionarily derived, likely benign, alleles from simulated, likely deleterious, variants. However, SVMs cannot capture non-linear relationships among the features, which can limit performance. To address this issue, we have developed DANN. DANN uses the same feature set and training data as CADD to train a deep neural network (DNN). DNNs can capture non-linear relationships among features and are better suited than SVMs for problems with a large number of samples and features. We exploit Compute Unified Device Architecture-compatible graphics processing units and deep learning techniques such as dropout and momentum training to accelerate the DNN training. DANN achieves about a 19% relative reduction in the error rate and about a 14% relative increase in the area under the curve (AUC) metric over CADD's SVM methodology. All data and source code are available at https://cbcl.ics.uci.edu/public_data/DANN/.


InterVar: Clinical Interpretation of Genetic Variants by the 2015 ACMG-AMP Guidelines.

Quan Li, Kai Wang (2017)

In 2015, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) published updated standards and guidelines for the clinical interpretation of sequence variants with respect to human diseases on the basis of 28 criteria. However, variability between individual interpreters can be extensive because of reasons such as the different understandings of these guidelines and the lack of standard algorithms for implementing them, yet computational tools for semi-automated variant interpretation are not available. To address these problems, we propose a suite of methods for implementing these criteria and have developed a tool called InterVar to help human reviewers interpret the clinical significance of variants. InterVar can take a pre-annotated or VCF file as input and generate automated interpretation on 18 criteria. Furthermore, we have developed a companion web server, wInterVar, to enable user-friendly variant interpretation with an automated interpretation step and a manual adjustment step. These tools are especially useful for addressing severe congenital or very early-onset developmental disorders with high penetrance. Using results from a few published sequencing studies, we demonstrate the utility of InterVar in significantly reducing the time to interpret the clinical significance of sequence variants.


Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (publisher-id): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): 08 January 2020

Publication date (Electronic): 23 October 2019

Publication date PMC-release: 23 October 2019

Volume: 48

Issue: D1

Pages: D913-D926

Affiliations

[1] National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, China

[2] Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China

[3] Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China

Author notes

To whom correspondence should be addressed. Tel: +8673189752406; Fax: +8673184327332; Email: lijinchen@csu.edu.cn

The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.

Author information

Guihu Zhao http://orcid.org/0000-0003-4033-1843

Article

Publisher ID: gkz923

DOI: 10.1093/nar/gkz923

PMC ID: 7145562

PubMed ID: 31642496

SO-VID: 27fe6f76-77bf-4b1e-9735-169f580c1dbb

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

History

Date accepted: 08 October 2019

Date revision received: 19 September 2019

Date received: 15 August 2019

Page count

Pages: 14

Funding

Funded by: National Natural Science Foundation of China 10.13039/501100001809

Award ID: 81801133

Funded by: CAST 10.13039/100010097

Award ID: 2018QNRC001

Award ID: 20180033040004

Funded by: Natural Science Foundation for Young Scientists of Hunan Province, China
