Abstract
The database of Clusters of Orthologous Groups of proteins (COGs), which represents
an attempt at a phylogenetic classification of the proteins encoded in complete genomes,
currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria,
archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG).
In addition, a supplement to the COGs is available that includes proteins encoded in the
genomes of two multicellular eukaryotes, the nematode Caenorhabditis elegans and the
fruit fly Drosophila melanogaster, that are shared with bacteria and/or archaea.
The new features added to the COG database include information pages with structural
and functional details on each COG and literature references, improvements to the
COGNITOR program that is used to fit new proteins into the COGs, and classifications
of genomes and COGs constructed using principal component analysis.