A Rotation Invariant Shape Representation based on Wavelet Transform

This paper introduces an object shape representation that supports similarity search in databases with a large number of images. The proposed algorithm uses two techniques: 1) the complex-valued wavelet transform, to utilize the phase information, and 2) the translation-invariant wavelet representation, to normalize the shape orientation. Experimental results show that the proposed algorithm normalizes shape orientation more accurately than the normalized Fourier descriptors [1] and performs better in similarity search than the planar curve descriptors [2].


Introduction
Image database systems are required to provide a similarity search mechanism based on multiple attributes such as color, texture and shape, in order to meet various content-based retrieval requirements. To retrieve objects that are similar in shape from the database, the objects' shapes need to be characterized by orientation-invariant features. Moreover, when the number of objects in the database is very large (e.g., more than 100,000 objects), brute force search and template matching are very time consuming and not appropriate for similarity search, even though their time complexity is O(N). If the objects' features are represented in the vector space model, similarity search can be done efficiently using a multidimensional index such as the R-tree or the kd-tree.
We are developing ExSight, a prototype multimedia information retrieval system, which provides object-based image retrieval [3]. The system extracts objects from an input image automatically and characterizes them using features such as color, shape, size and position. The similarity search is done by comparing the stored objects with the reference object specified by a user. More specifically, it is performed by looking for the k-nearest neighbors of the reference, based on the similarity measure defined in the multidimensional feature vector space. The Euclidean distance or the Manhattan distance is used as the similarity measure. The system also has high-performance data access methods based on a multidimensional index, in order to carry out the similarity search efficiently. This paper introduces a vector space representation of the object shape which is rotation invariant and less sensitive to deformation.
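The k-nearest-neighbor search described above can be sketched as follows. This is only a minimal illustration: it uses a linear scan rather than the multidimensional index the system relies on, and the function names and the dictionary layout of the stored objects are assumptions made for this example.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_nearest(reference, stored, k, distance=euclidean):
    """Return the k (distance, object_id) pairs closest to the reference.

    `stored` maps a hypothetical object id to its feature vector.
    This is a brute-force scan; an R-tree or kd-tree would replace it
    in a system that has to avoid visiting every stored object.
    """
    scored = ((distance(reference, vec), obj_id) for obj_id, vec in stored.items())
    return heapq.nsmallest(k, scored)
```

Passing `distance=manhattan` switches the similarity measure without changing the search itself.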

A Rotation Invariant Shape Representation
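A rotation invariant shape representation is calculated as follows: 1. parameterize the contour as a discrete periodic signal, 2. apply the complex-valued wavelet transform to utilize the phase information of the contour, 3. apply shift normalization to the wavelet coefficients, 4. apply rotation normalization, 5. apply size normalization.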

The Arc Length
Given an object image, the object's contour can be detected using the contour tracking algorithm [7]. To apply the dyadic wavelet transform to the contour, it must be sampled to have $2^n$ points. We use the arc length to parameterize and sample the contour. We trace the contour counterclockwise and denote its coordinates by $(x'_i, y'_i)$, $i = 1, \cdots, M$. The parameterized contour is denoted by $(x(t), y(t))$, with $t = l_k / L$ at the $k$-th contour point, where $t$ is the normalized arc length, $l_k$ is the arc length along the contour from $(x'_1, y'_1)$ to $(x'_k, y'_k)$, and $L$ is the total arc length ($L = l_M$). Then we can sample $2^n$ points from the contour by $(x(j/2^n), y(j/2^n))$. Figure 1 illustrates the arc length parameterization. The contour (center) is detected from the given image (left). The contour is sampled so that the length between points along the contour is constant.
Figure 1: Image, contour and its parameterization.
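The arc length sampling can be sketched as below. This is a rough illustration under stated assumptions: the contour is treated as a closed polygon (the last point is joined back to the first, so the total length includes that closing segment), samples are obtained by linear interpolation between traced points, and the function name is hypothetical.

```python
import math


def resample_by_arc_length(points, n):
    """Resample a closed contour to 2**n points equally spaced in arc length.

    `points` is a list of (x, y) tuples in tracing order. The closing
    segment from the last point back to the first is included, which is
    an assumption of this sketch.
    """
    m = len(points)
    # cum[k] is the arc length from the first point to point k (cum[m] is the total).
    cum = [0.0]
    for k in range(m):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % m]
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]

    count = 2 ** n
    samples = []
    seg = 0
    for j in range(count):
        target = total * j / count  # arc length of the j-th sample
        # Advance to the segment that contains the target arc length.
        while cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        frac = 0.0 if seg_len == 0.0 else (target - cum[seg]) / seg_len
        x0, y0 = points[seg]
        x1, y1 = points[(seg + 1) % m]
        samples.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return samples
```

Representing each sample as the complex number `x + 1j * y` then gives the discrete periodic signal that the wavelet transform operates on.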

The Wavelet Transform
The wavelet transform is a technique that decomposes a given signal into components localized in both time and frequency by multiresolution analysis. The scaling function $\varphi(x)$ and the wavelet function $\psi(x)$, which construct the multiresolution analysis, must satisfy the two-scale relation. If we have a representation of a signal at resolution level $j + 1$,

then we can also decompose it into its low-pass and high-pass components at level $j$, expressed in terms of $\varphi_{j,k}(t)$ with $j, k \in \mathbb{Z}$, i.e., a translated and dilated version of $\varphi(t)$. If the scaling function $\varphi(x)$ is orthonormal, the low-pass and high-pass coefficients at level $j$ can be calculated by the Fast Wavelet Transform, where $(\cdot)^{*}$ denotes the complex conjugate.
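For orientation, the relations referred to in this section can be written in a common textbook form. The filter symbols $h_k$ and $g_k$, the coefficient symbols $c_{j,k}$ and $d_{j,k}$, and the $\sqrt{2}$ normalization are assumptions of this sketch and may differ from the notation originally used here.

```latex
% Two-scale relations for the scaling function and the wavelet
\begin{align}
  \varphi(t) &= \sqrt{2} \sum_{k} h_k \, \varphi(2t - k), &
  \psi(t)    &= \sqrt{2} \sum_{k} g_k \, \varphi(2t - k), \\
  \varphi_{j,k}(t) &= 2^{j/2} \, \varphi(2^{j} t - k), &
  \psi_{j,k}(t)    &= 2^{j/2} \, \psi(2^{j} t - k).
\end{align}
% A signal at level j+1 splits into low-pass and high-pass parts at level j
\begin{equation}
  \sum_{k} c_{j+1,k} \, \varphi_{j+1,k}(t)
  = \sum_{l} c_{j,l} \, \varphi_{j,l}(t) + \sum_{l} d_{j,l} \, \psi_{j,l}(t).
\end{equation}
% Fast Wavelet Transform (analysis step) for an orthonormal scaling function
\begin{equation}
  c_{j,l} = \sum_{k} h^{*}_{k-2l} \, c_{j+1,k}, \qquad
  d_{j,l} = \sum_{k} g^{*}_{k-2l} \, c_{j+1,k}.
\end{equation}
```

The analysis step follows from taking the inner product of the level $j+1$ representation with $\varphi_{j,l}$ and $\psi_{j,l}$, which is where the complex conjugate of the filter coefficients appears.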
The wavelets are characterized by the following properties: (i) compactness of the support, (ii) symmetry, (iii) accuracy of the approximation, (iv) orthonormality. It is well known that only complex-valued wavelets can achieve all of the above properties [4]. By analogy with the Fourier descriptors, we use the complex Daubechies wavelets [5] to characterize the contour in the complex plane.

The Shift Normalization
The wavelet coefficients of two signals may be quite different, even if the two signals differ only by a time shift. Figure 2 shows the contour of a cat image (left) and its wavelet coefficients (right). The black and the grey polylines correspond to the wavelet coefficients of the contours that start at the black and the grey points, respectively. It can be seen that a shift of the starting point makes the coefficients quite different from the original.
The translation-invariant representation is achieved by computing the wavelet transform for all the circular time shifts of a signal and selecting the wavelet coefficients of the time shift that minimizes the cost function [6]. In this paper, we adopt the vector entropy, which is a measure of the concentration of the energy of the coefficients, as the cost function. The vector entropy $e$ of the wavelet coefficients $c = \{c_k\}$ is given by

$$e(c) = -\sum_{k} \left| \frac{c_k}{\|c\|} \right|^{2} \log_2 \left| \frac{c_k}{\|c\|} \right|^{2},$$

where $\|c\| = \left( \sum_k |c_k|^2 \right)^{1/2}$. If the shift that minimizes the cost function is not unique, we choose the one that maximizes $S$, the area of the triangle formed by $c_1$, $c_2$, $c_3$ in the complex plane, which is expressed using $\mathrm{Im}(\cdot)$, the imaginary part.
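The entropy-based shift selection can be sketched as follows. Two simplifications are assumed here and are not part of the method as described: a periodic Haar transform stands in for the complex Daubechies wavelets, and every circular shift is transformed separately by brute force. The tie-breaking rule based on the triangle area is omitted, and all function names are hypothetical.

```python
import math


def haar_transform(signal):
    """Full Haar decomposition of a (complex) list of length 2**n.

    Stand-in for the complex Daubechies transform. Coefficients are
    returned coarse-to-fine, starting with the single approximation value.
    """
    root2 = math.sqrt(2.0)
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / root2 for a, b in pairs]
        approx = [(a + b) / root2 for a, b in pairs]
        details = detail + details  # coarser levels go in front
    return approx + details


def vector_entropy(coeffs):
    """Entropy of the normalized coefficient energies, in bits."""
    energy = sum(abs(c) ** 2 for c in coeffs)
    if energy == 0.0:
        return 0.0
    total = 0.0
    for c in coeffs:
        p = abs(c) ** 2 / energy
        if p > 0.0:
            total -= p * math.log2(p)
    return total


def best_shift(signal, transform=haar_transform):
    """Return (shift, coeffs) for the circular shift with minimum entropy.

    `signal` must be a list so that slices can be concatenated.
    """
    best = None
    for s in range(len(signal)):
        coeffs = transform(signal[s:] + signal[:s])
        e = vector_entropy(coeffs)
        if best is None or e < best[0]:
            best = (e, s, coeffs)
    return best[1], best[2]
```

Because the set of circular shifts is the same for a contour and for any re-started copy of it, both reach the same minimum entropy, which is what makes the selected coefficients independent of the starting point when the minimum is unique.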
Challenge of Image Retrieval, Newcastle upon Tyne, 1998
The period of $\tan$ is $\pi$, so $P(\theta)$ has 4 extrema: 2 maxima and 2 minima. We select the $\theta$ such that $\mathrm{Re}(c_1 \cdot e^{i\theta}) > 0$.

The Size Normalization
The size normalization is done simply by scaling the wavelet coefficients to a unit vector: $c \leftarrow c / \|c\|$.

Experiments
This section reports the numerical experiments on the proposed algorithm. The proposed method was tested on digital clip art. By applying the contour tracking algorithm [7] to each image, the contours of the objects in the image were extracted from a background of uniform color. Excluding identical shapes with different colors, 699 shapes were selected. The object contours were sampled to have 256 data points with the method in §2.1 (The Arc Length). The shape features were approximated by 30 coefficients: we chose the 16 complex coefficients of lower resolution, which carry the global approximation, and excluded the lowest one, which contains the position of the shape. Each complex coefficient has a real and an imaginary part, so $(16 - 1) \times 2 = 30$. Thus all objects were represented by vectors in a 30-dimensional space. The number of coefficients affects the discernibility of shapes, the space (memory) requirement, and the search cost of the multidimensional index. If the number of coefficients decreases, the accuracy of similarity search degrades because the differences between shapes become small. If the number of coefficients increases, the performance of the multidimensional index degrades. We chose 30 based on experience. The similarity of two objects was measured by the Euclidean distance.
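The construction of the 30-dimensional feature vector can be illustrated with the short sketch below. It assumes that the coefficients are ordered from coarse to fine with the position-carrying coefficient first, and that the unit-length scaling is applied to the retained coefficients; neither detail is stated in the text, and the function name is hypothetical.

```python
import math


def shape_feature_vector(coeffs, n_complex=16):
    """Build a real feature vector from the low-resolution coefficients.

    Drops the first (lowest) coefficient, keeps the next n_complex - 1,
    scales them to unit length, and interleaves real and imaginary parts,
    giving (n_complex - 1) * 2 real values.
    """
    kept = coeffs[1:n_complex]
    norm = math.sqrt(sum(abs(c) ** 2 for c in kept))
    features = []
    for c in kept:
        features.append(c.real / norm)
        features.append(c.imag / norm)
    return features
```

With the default `n_complex=16`, the result has 30 entries and unit Euclidean length, so it can be stored directly in a multidimensional index.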

Accuracy of the Shift Normalization
To evaluate the shape orientation normalization, we applied the proposed method to the contours of images rotated by 15, 30, 45, 60 and 75 degrees and compared the starting points with those of the original contour. (A rotation of 90 degrees only switches the x and y coordinates.) The contour shape of the rotated image is slightly different from the original due to quantization distortion. We also evaluated the effect of simple deformation on the proposed method using images with a changed aspect ratio, i.e., enlargement in the horizontal or vertical direction. We compared the proposed algorithm with the Normalized Fourier Descriptors (NFD) [1]. We measured the shift of the starting point from the original and classified it as an error if the shift was greater than 16 points. The experimental results are summarized in Table 1. These results show that the proposed method is more accurate than the NFD.
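The error criterion can be sketched as below. The sketch assumes that the shift between starting points is measured circularly along the 256-point contour, which the text does not state explicitly, and the function names are hypothetical.

```python
def circular_shift_error(start_a, start_b, n_points=256):
    """Smallest number of points separating two starting indices on a closed contour."""
    d = abs(start_a - start_b) % n_points
    return min(d, n_points - d)


def error_rate(pairs, threshold=16, n_points=256):
    """Fraction of (original, rotated) starting-point pairs whose shift exceeds the threshold."""
    errors = sum(
        1 for a, b in pairs if circular_shift_error(a, b, n_points) > threshold
    )
    return errors / len(pairs)
```

With the threshold of 16 points, a normalization is counted as correct when the starting point moves by at most 1/16 of the sampled contour.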
Table 1: The error rate of the starting point normalization.

Accuracy of Similarity Search
Next, we evaluated the accuracy of similarity search with the proposed method. The conventional measures, recall and precision, were used for our evaluation:

$$\mathrm{recall}(x) = \frac{\text{number of objects found and relevant to } x}{\text{total number of objects relevant to } x}$$

$$\mathrm{precision}(x) = \frac{\text{number of objects found and relevant to } x}{\text{total number of objects found}}$$

To illustrate the effectiveness of the proposed method, we compared it with the planar curve descriptors [2], which represent the x and y coordinate components separately using real-valued wavelets. In these experiments, the spline function was used as the wavelet basis. Since the planar curve descriptors cannot normalize the starting points of contours, we modified them by adding the shift normalization step described in §2.3 (The Shift Normalization). Figures 4, 5 and 6 show the average performance of similarity search. Figure 4 is the case where the database has $699 \times 6$ objects (shapes rotated by 0, 15, 30, 45, 60 and 75 degrees) (case 1). In this case, we measured the precision and recall for retrieving 6 relevant shapes from 4194 images and averaged the performance over the 699 kinds of images. The proposed method performs better than the planar curve descriptors. The complex-valued wavelet transform appears to be superior to the real-valued wavelets for contour representation.
Figure 5 is the case where the database has $699 \times 6 \times 3$ objects (original size, 10% horizontal enlargement, and 10% vertical enlargement) (case 2). In this case, 18 relevant shapes were retrieved from 12582 images. Figure 6 is the case where the database has $699 \times 6 \times 5$ objects (original size, 10% and 20% horizontal enlargement, and 10% and 20% vertical enlargement) (case 3). The performance becomes worse as the change in aspect ratio increases. However, the proposed method always performs better than the planar curve descriptors because of the complex-valued wavelets.
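The recall and precision measures used in this evaluation can be computed as in the following minimal sketch. The function name and the use of object identifiers in Python sets are assumptions made for illustration.

```python
def recall_precision(found, relevant):
    """Return (recall, precision) for one query.

    `found` holds the identifiers returned by the search and `relevant`
    the identifiers that are truly relevant to the query object.
    """
    found, relevant = set(found), set(relevant)
    hits = len(found & relevant)
    recall = hits / len(relevant)
    precision = hits / len(found) if found else 0.0
    return recall, precision
```

In case 1, for example, each query has 6 relevant shapes, so a result list of 4 objects containing 2 of them gives a recall of 2/6 and a precision of 2/4.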

Conclusion
In this paper, we proposed an object shape representation that supports similarity search in databases with a large number of images. The method uses two techniques: 1) the complex-valued wavelet transform, to utilize the phase information, and 2) the translation-invariant wavelet representation, to normalize the shape orientation.
Experimental results showed that the proposed method can normalize the shape orientation even if the aspect ratio changes, and that it is more accurate than the normalized Fourier descriptors [1]. The performance measurements on similarity search showed that the proposed method is superior to the conventional algorithm [2].
The proposed method is well suited to multidimensional indexing techniques, which provide efficient similarity search, i.e., top-k ranking of the objects similar to the reference without a brute force computation.

3. apply shift normalization to the wavelet coefficients, 4. apply rotation normalization, 5. apply size normalization.

Figure 3: The effect of rotation on the wavelet coefficients.