Abstract
In conventional hierarchical clustering methods, any object can belong to only one
class or cluster. We present here an application of the pyramidal classification method
to biological objects, which illustrates the intuitively appealing idea that some
objects may belong simultaneously to two classes. In a first step, we performed an
all-by-all comparison of the open reading frames in the genomes of S. cerevisiae,
M. jannaschii, E. coli, H. influenzae and Synechocystis. In a second step, a series
of connex classes was built, each connex class containing all those sequences that
were linked by a Z-value (obtained after 100 sequence shufflings) greater than a given
threshold. Finally, each connex class was submitted to a pyramidal classification.
Three examples of such classifications are given, concerning two sets of multi-domain
protein sequences and a family of aminoacyl-tRNA synthetases. They make it clear that
the linear order among the classified objects that results from the pyramidal classification
is useful in deciphering the multiple relationships that can exist between the objects
under study. A program for calculating and displaying a pyramidal classification from
a dissimilarity matrix is available from http://genome.genetique.uvsq.fr/Pyramids.
The pyramidal classifications of the connex classes from the five organisms (intra-
and inter-genomic comparisons) are available from http://www.gene-it.com under the
family item.
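
The second step described above (grouping sequences linked by a Z-value above a threshold) can be illustrated with a minimal Python sketch. It is not the authors' program. It assumes the usual definition of a Z-value as the observed alignment score minus the mean score over shuffled sequences, divided by the standard deviation of those scores, and it treats a connex class as a connected component of the graph whose edges are the pairs above the threshold. The names `z_value`, `connex_classes`, `score_fn` and `threshold` are hypothetical, as are the choice to shuffle only the second sequence and the example values; the abstract states neither the scoring method nor the threshold used.

```python
import random
from statistics import mean, pstdev


def z_value(score_fn, seq_a, seq_b, n_shuffles=100, rng=None):
    """Z-value of an alignment score against shuffled versions of seq_b.

    score_fn is a hypothetical callable (seq_a, seq_b) -> float; the
    abstract does not say which alignment score was used.
    """
    rng = rng or random.Random(0)
    observed = score_fn(seq_a, seq_b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        letters = list(seq_b)
        rng.shuffle(letters)  # keeps composition, destroys residue order
        shuffled_scores.append(score_fn(seq_a, "".join(letters)))
    sd = pstdev(shuffled_scores)
    if sd == 0:
        return 0.0  # assumption: no spread means no measurable significance
    return (observed - mean(shuffled_scores)) / sd


def connex_classes(z_values, threshold):
    """Group sequence ids into connected components.

    z_values maps (id_a, id_b) -> Z-value; two sequences fall in the same
    class when a chain of pairs with Z-value > threshold links them.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), z in z_values.items():
        root_a, root_b = find(a), find(b)
        if z > threshold:
            parent[root_a] = root_b  # merge the two classes

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return sorted((sorted(c) for c in classes.values()), key=lambda c: c[0])


# Illustrative Z-values only, not data from the article.
example = {("a", "b"): 12.0, ("b", "c"): 9.5, ("c", "d"): 3.0, ("d", "e"): 20.0}
result = connex_classes(example, threshold=8.0)
```

In this sketch, sequences a and c end up in the same class even though they are not directly linked, because both are linked to b. Each class would then be turned into a dissimilarity matrix and submitted to the pyramidal classification.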