CDD: a Conserved Domain Database for protein classification

Marchler-Bauer, Aron; Anderson, John B.; Cherukuri, Praveen F.; DeWeese-Scott, Carol; Geer, Lewis Y.; Gwadz, Marc; He, Siqian; Hurwitz, David I.; Jackson, John D.; Ke, Zhaoxi; Lanczycki, Christopher J.; Liebert, Cynthia A.; Liu, Chunlei; Lu, Jiu-Fu; Marchler, Gabriele H.; Mullokandov, Mikhail; Shoemaker, Benjamin A.; Simonyan, Vahan; Song, James S.; Thiessen, Paul A.; Yamashita, Roxanne A.; Yin, Jodie J.; Zhang, Dachuan; Bryant, Stephen H.

doi:10.1093/nar/gki069


Author(s): Aron Marchler-Bauer, John B. Anderson, Praveen F. Cherukuri, Carol DeWeese-Scott, Lewis Y. Geer, Marc Gwadz, Siqian He, David I. Hurwitz, John D. Jackson, Zhaoxi Ke, Christopher J. Lanczycki, Cynthia A. Liebert, Chunlei Liu, Fu Lu, Gabriele H. Marchler, Mikhail Mullokandov, Benjamin A. Shoemaker, Vahan Simonyan, James S. Song, Paul A. Thiessen, Roxanne A. Yamashita, Jodie J. Yin, Dachuan Zhang, Stephen H. Bryant

Publication date (Electronic): 17 December 2004

Journal: Nucleic Acids Research

Publisher: Oxford University Press

Abstract

The Conserved Domain Database (CDD) is the protein classification component of NCBI's Entrez query and retrieval system. CDD is linked to other Entrez databases such as Proteins, Taxonomy and PubMed®, and can be accessed at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. CD-Search, which is available at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, is a fast, interactive tool to identify conserved domains in new protein sequences. CD-Search results for protein sequences in Entrez are pre-computed to provide links between proteins and domain models, and computational annotation visible upon request. Protein–protein queries submitted to NCBI's BLAST search service at http://www.ncbi.nlm.nih.gov/BLAST are scanned for the presence of conserved domains by default. While CDD started out as essentially a mirror of publicly available domain alignment collections, such as SMART, Pfam and COG, we have continued an effort to update, and in some cases replace these models with domain hierarchies curated at the NCBI. Here, we report on the progress of the curation effort and associated improvements in the functionality of the CDD information retrieval system.

Most cited references 7

CDART: protein homology by domain architecture.

Lewis Geer, Michael Domrachev, David Lipman … (2002)

The Conserved Domain Architecture Retrieval Tool (CDART) performs similarity searches of the NCBI Entrez Protein Database based on domain architecture, defined as the sequential order of conserved domains in proteins. The algorithm finds protein similarities across significant evolutionary distances using sensitive protein domain profiles rather than by direct sequence similarity. Proteins similar to a query protein are grouped and scored by architecture. Relying on domain profiles allows CDART to be fast, and, because it relies on annotated functional domains, informative. Domain profiles are derived from several collections of domain definitions that include functional annotation. Searches can be further refined by taxonomy and by selecting domains of interest. CDART is available at http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi.
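The idea of a domain architecture as "the sequential order of conserved domains" can be illustrated with a minimal Python sketch. This is not the CDART algorithm: the protein identifiers, domain names, coordinates, and the helper functions `architecture`, `group_by_architecture`, and `shared_domains` are all hypothetical, and the overlap count is only a stand-in for whatever scoring CDART actually uses.

```python
from collections import defaultdict


def architecture(domain_hits):
    """Return domain names ordered by start position (the 'architecture')."""
    return tuple(name for _, name in sorted(domain_hits))


def group_by_architecture(proteins):
    """Group protein IDs that share exactly the same ordered domain list."""
    groups = defaultdict(list)
    for protein_id, hits in proteins.items():
        groups[architecture(hits)].append(protein_id)
    return dict(groups)


def shared_domains(query_arch, other_arch):
    """Toy similarity: number of distinct domain names the two share."""
    return len(set(query_arch) & set(other_arch))


# Hypothetical toy data: each hit is (start position, domain name).
proteins = {
    "protA": [(10, "SH3"), (120, "SH2"), (260, "Pkinase")],
    "protB": [(130, "SH2"), (5, "SH3"), (300, "Pkinase")],
    "protC": [(40, "SH2"), (200, "Pkinase")],
}

groups = group_by_architecture(proteins)
for arch, members in groups.items():
    print(" - ".join(arch), "->", members)
```

Here `protA` and `protB` fall into the same group because their domains occur in the same order, while `protC` forms its own group but still shares two domains with them.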

CDD: a database of conserved domain alignments with links to domain three-dimensional structure.

A. Marchler-Bauer (2002)

The Conserved Domain Database (CDD) is a compilation of multiple sequence alignments representing protein domains conserved in molecular evolution. It has been populated with alignment data from the public collections Pfam and SMART, as well as with contributions from colleagues at NCBI. The current version of CDD (v.1.54) contains 3693 such models. CDD alignments are linked to protein sequence and structure data in Entrez. The molecular structure viewer Cn3D serves as a tool to interactively visualize alignments and three-dimensional structure, and to link three-dimensional residue coordinates to descriptions of evolutionary conservation. CDD can be accessed on the World Wide Web at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. Protein query sequences may be compared against databases of position-specific score matrices derived from alignments in CDD, using a service named CD-Search, which can be found at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. CD-Search runs reverse-position-specific BLAST (RPS-BLAST), a variant of the widely used PSI-BLAST algorithm. CD-Search is run by default for protein-protein queries submitted to NCBI's BLAST service at http://www.ncbi.nlm.nih.gov/BLAST.

CDD: a curated Entrez database of conserved domain alignments.

A. Marchler-Bauer (2003)

The Conserved Domain Database (CDD) is now indexed as a separate database within the Entrez system and linked to other Entrez databases such as MEDLINE®. This allows users to search for domain types by name, for example, or to view the domain architecture of any protein in Entrez's sequence database. CDD can be accessed on the World Wide Web at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. Users may also employ the CD-Search service to identify conserved domains in new sequences, at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. CD-Search results, and pre-computed links from Entrez's protein database, are calculated using the RPS-BLAST algorithm and Position Specific Score Matrices (PSSMs) derived from CDD alignments. CD-Searches are also run by default for protein-protein queries submitted to BLAST® at http://www.ncbi.nlm.nih.gov/BLAST. CDD mirrors the publicly available domain alignment collections SMART and PFAM, and now also contains alignment models curated at NCBI. Structure information is used to identify the core substructure likely to be present in all family members, and to produce sequence alignments consistent with structure conservation. This alignment model allows NCBI curators to annotate 'columns' corresponding to functional sites conserved among family members.
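To make the notion of a PSSM derived from an alignment more concrete, the following Python sketch builds a simple log-odds matrix from a toy ungapped alignment and slides it along a query sequence. It is an illustration under stated assumptions, not RPS-BLAST: the alignment, the uniform background frequency, the pseudocount, and the function names `build_pssm` and `best_hit` are all hypothetical, and the real tool uses gapped alignment, search heuristics, and statistical significance estimates that are omitted here.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_pssm(alignment, pseudocount=1.0):
    """Build per-column log-odds scores from equal-length aligned sequences.

    Assumes an ungapped alignment and a uniform background (both are
    simplifications made for this sketch).
    """
    background = 1.0 / len(ALPHABET)
    pssm = []
    for column in zip(*alignment):
        total = len(column) + pseudocount * len(ALPHABET)
        scores = {}
        for aa in ALPHABET:
            freq = (column.count(aa) + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def best_hit(pssm, sequence):
    """Return (start, score) of the best-scoring ungapped window.

    Assumes the sequence contains only standard residues and is at least
    as long as the matrix; returns None otherwise for the length case.
    """
    width = len(pssm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[aa] for col, aa in zip(pssm, window))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Hypothetical four-column alignment of a short conserved motif.
alignment = ["GKST", "GKSS", "GKTT", "AKST"]
pssm = build_pssm(alignment)
hit = best_hit(pssm, "MAAGKSTLLV")
print(hit)
```

Each matrix column corresponds to one alignment column, which is also what makes column-level annotation of conserved sites possible in such a model.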

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): 1 January 2005

Publication date (Electronic): 17 December 2004

Volume: 33

Issue: Database Issue

Pages: D192-D196

Affiliations

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38 A, Room 8N805, 8600 Rockville Pike, Bethesda, MD 20894, USA

Author notes

[*]

To whom correspondence should be addressed. Tel: +1 301 435 4919; Fax: +1 301 480 9241; Email: bauer@ncbi.nlm.nih.gov

[a]

The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use permissions, please contact journals.permissions@oupjournals.org.

Article

Publisher ID: gki069

DOI: 10.1093/nar/gki069

PMC ID: 540023

PubMed ID: 15608175

History

Date received: 22 September 2004

Date accepted: 5 October 2004

