Performance Analysis of Three Likelihood Measures for Color Image Processing Performance Analysis of Three Likelihood Measures for Color Image Processing

Performance Analysis of Three Likelihood Measures for Color Image Processing

Arash Abadpour

Shohreh Kasaei

Mathematics Science Department, Sharif University of Technology, Tehran, Iran

Computer Engineering Department, Sharif University of Technology, Tehran, Iran

Abstract

Image segmentation is a low-level operation, concerned with partitioning an image into homogeneous regions. In a large number of applications, segmentation plays a fundamental role for the subsequent higher-level operations; such as recognition, object-based image/video compression, object tracking, scene analysis, and object-based image editing. Until recently, attention was focused on segmentation of grayscale images, but the advances in computational power and instrumentation has evolved the research on color image segmentation. Although, many researchers have tried to extend the methods of grayscale image segmentation to the color images, working in this multidimensional field, enables the implementation of more efficient methods. At the heart of any color image segmentation method, relies an appropriate likelihood measure or a suitable homogeneity criteria. In this paper, the performance of the three paradigms of Euclidean distance, Mahalonobis distance, and reconstruction error are analyzed in terms of achieving perfect likelihood measures, robustness, and leading to promising homogeneity decisions. While the Euclidean distance is proved to perform poor in all cases, the proposed reconstruction error out-performs the Mahalonobis distance with much lower computation cost.

Principle Component Analysis Color Image Processing Segmentation Fuzzy Sets Mahalonobis Distance Euclidean Distance.

1 Introduction

Image segmentation is a low-level operation, concerned with partitioning an image into homogeneous regions [1,2,3]. Although, there are many different descriptions for homogeneity, such as the intensity, texture, and color, to name a few, color is known as a more appropriate homogeneity descriptor, because of its simplicity and intuitionism [4]. In a Large number of applications (in image processing, computer vision, and computer graphics), segmentation plays a fundamental role before applying the higher-level operations like recognition, object-based image/video compression, object tracking, scene analysis, and object-based image editing [5,6].

As image segmentation is a spatial-spectral process (finding clusters in the spectral histogram which result in meaningful homogeneous regions in the spatial domain), the problem is solved in a give-and-take fashion between satisfying (sometimes) contradictory spectral and spatial concerns. It is proved that any image segmentation approach must solve the problem of finding either a suitable likelihood measure or a meaningful region homogeneity criteria [4]. In this paper, we refer to the problem of finding a meaningful likelihood measure and a region homogeneity criteria.

Some simple methods are reported in the literature, that extend grayscale segmentation methods to the color image segmentation era (e.g. see [7]). Investigating this extended-grayscale methods makes clear that they are also using a min/max fashion likelihood measure.

Using the Euclidean distance is generally accepted as a proper likelihood measure, while different authors add some extra processing to improve the results [4]. For example in [8] the author uses the CIE-Lab color space. Having in mind that the Euclidean distance only depends on the central point of the cluster and not on the marginal values and the spread of it, it is clear that (theoretically) The Euclidean distance is a simple but not a perfect solution. Although in [9] the author claims to use a better designed fuzzy scheme in the CIE-Lab space, at the heart of the method the Euclidean distance of the point and the cluster center are ruling the likelihood. In [10] the authors uses the Euclidean distance without any manipulations.

To overcome the shortcomings of the Euclidean distance-based likelihood measures, in [11] the authors uses a weighted Euclidean distance, which when tuned well, converges to the well-known Mahalonobis distance. In this approach, not only the central point of the cluster is considered, but also the spread of the data is incorporated as a set of normalizing factors.

As the likelihood measures tend to compute the distance between a point and a cluster, they rank better members with smaller numbers, as is the purpose of any distance measure. This is in fact in contrast with the ordinary fuzzyfication approach which marks better members with larger numbers in the range of [0,1] and vice versa. Thus, a mapping function is needed to produce the desired result. The general definition of such a mapping is y=f([x/(s)]), while f is an odd function, mapping [-¥,¥]®[0,1] in an upside down scheme. Considering a cluster r, on which the distance function is defined as e_r([c\vec]), leads to the likelihood measure of f([(e_r([c\vec]))/(s)]). It is clear that for all [c\vec] Î r, we expect the likelihood measure to be near 1. Thus, s must be statistically computed in terms of {e_r([c\vec])|[c\vec] Î r} while f must have a flat ceil; we expect f to map [-1,1] to the vicinity of 1. Note that, this is in contrast with the generally-used Gaussian functions [10].

The main purpose of this paper is to compare the three likelihood definitions of Euclidean distance ([e\tilde]^E_r,p([c\vec])), Mahalonobis distance ([e\tilde]^M_r,p([c\vec])), and the proposed reconstruction error [12] ([e\tilde]^R_r,p([c\vec])), in terms of image fuzzyfication and homogeneity decision. In this paper we compare [e\tilde]^E_r,p([c\vec]), [e\tilde]^M_r,p([c\vec]), and [e\tilde]^R_r,p([c\vec]) as likelihood measures. Also, we consider whether ||e^E_r||_r,p, ||e^M_r||_r,p, and ||e^R_r||_r,p are well-defined homogeneity criteria.

The following sections of this paper are organized as follows, Section 2 describes the method used in this paper, while Section 3 discusses the experimental results and Section 4 concludes the paper.

2 Proposed Algorithm

The Euclidean distance is the most generally used likelihood measure, defined as:

e^E_r(

)=(

)^T(

)

(1)

where [c\vec] is the vector, we want to measure its distance to the cluster r, and [(h)\vec] is the expectation of the color vectors of cluster r.

The Mahalonobis distance is a well-known likelihood measure, defining the membership of [c\vec] to the cluster r as:

e^M_r(

)=(

)^T C^-1 (

)

(2)

where, C is the covariance matrix of the color vectors of cluster r, with its columns indicating the direction of the principal components ordered in a descending fashion (corresponding to the associated eigenvalues).

In [12] the authors proposed to use the error made by neglecting the two less important principal components as a likelihood measure. The likelihood of the vector [c\vec] to the cluster r is defined as [12]:

e^R_r(

)=|| \acute

(

)

)||

(3)

while [v\vec] shows the direction of the first principal component and ||[x\vec]|| denotes the normalized L₁ norm:

||=

S_i=1^N |x_i|

(4)

Investigating (1), (2), and (3) makes clear that, to make e^X_r([c\vec]) comparable over different clusters, a normalization scheme is crucial (X Î {E,M,R}).

In [12] the authors proposed to use the following stochastic margin as the normalization factor:

||f||_r,p=arg_e

æ
è

P_{[x\vec] Î r}{f(

) £ e} ³ p

ö
ø

(5)

where p is the inclusion percentage. Equation (5) leads to the definition of the normalized likelihood functions [12]:

X
r,p

(

e^X_r(

)

||e^X_r||_r,p

(6)

Also, the ||e^X_r||_r,p can be used as a homogeneity criteria for usages such as Quad-tree decomposition [13] and region growth [2].

It is clear that all e^X_r,p([c\vec]) are giving lower values for the color vectors similar to those exist in r. Thus, a fuzzy membership function is needed to map [-1,1]®[1,1-e] and [-¥,-1]È[1,¥]®[1-e,0]. In [12] the authors proposed a manipulated form of the well-known low-pass Butterworth filter for the sake of tunability and simplicity, as:

B_{a, b}(x)=

æ
è

t_a,b

ö
ø

2N_{a, b}

ö
ø

-¹/₂

(7)

where N_{a, b} and t_{a, b} are defined as [12]:

N_{a, b}=

é
ë

log₂

æ
è

1-b²

1- a²

ö
ø

ù
û

(8)

t_{a, b} = a^{[1/(N_{a, b})]}

æ
è

1-a²

ö
ø

-[1/(2N_{a, b})]

(9)

where [x] denotes the nearest integer value to x. The function is designed in the way that satisfies:

B_{a, b}(1)=a, B_{a, b}(2)=b

(10)

It is clear that selecting a large member of ]0,1[ as the a value and an small member of ]0,a[ as the b value, leads to a desired fuzzyfication. Note that the above definition of membership functions is in contrast with the general selection of the Gaussian functions.

Although in the definition of the Euclidean distance and the Mahalonobis distance, Principal Component Analysis(PCA) [14,15,16] is not referred, but they are both marginal samples of using PCA for measuring the likelihood. In the terminology of PCA, the expectation of the cluster is defined as [16]:

= arg_[v\vec]

æ
è

min

é
ë

E {|

|²}

ù
û

ö
ø

(11)

which is the zero dimensional representation of the data. It can be proved that the above definition leads to the canonical definition of expectation [16]. Defining the k-th residue of [x\vec] as [16]:

D_k(

k
å
i=1

T
i

(

)

(12)

for k=0¼N (N is the original dimension of the cluster), where [v\vec]_i is the direction of the i-th principal component of the cluster, it is clear that the Euclidean distance equals ||D₀([x\vec])||₂ [16], while reconstruction equals ¹/₃||D₁([x\vec])||₁ [12]. The relation of the Mahalonobis distance to the PCA theory is straight-forward: The Mahalonobis distance is the square root of the squared principal components weighted by their importance (the eigenvalues) [16]. To see details about using the PCA in color image processing see [17,12].

Although RGB color space is used in this paper, but having in mind that the PCA representation is translation, rotation, and scale invariant [16], while all measures and criteria investigated in this paper rely on PCA, it is clear that the conclusions of this paper applies to any color space, linearly derived from RGB in a reversible fashion [18] (such as CMYK [19], YC_bC_r [20], YIQ [21], YUV [22], and I₁I₂I₃ [23]). This does not hold for non-linearly derived color spaces (e.g. HSI [24], HSV [25,26], CIE-XYZ, CIE-La^*b^*, CIE-Lu^*v^*, CIE-L^*H^oC^* [27,28,29], and HMMD [30]) or irreversible linear ones.

3 Experimental Results

To compare the performance of different likelihood measures, 42 color images (including the standard images Lena, Mandrill, Airplane, Peppers, Girl, and Couple), in RGB color space were used (all low-compressed jpeg files). In each image 10 rectangular homogeneous regions were selected, and the corresponding fuzzyfication results were compared subjectively. To test the robustness of the methods, tests were performed with regions containing up-to 25% irrelevant points and with different values of p. Also, the complexity of the methods were computed in terms of flops. To compare different homogeneity criteria, 49 randomly selected rectangular regions in each test image, were sorted according to each homogeneity criterion in a descending fashion. All experiments were performed on a large database of color images [31] and some samples representing the whole results were selected for better illustration of the results.

Figure 1 shows the log-magnitude of the membership function, B_a,b for a = 0.9 with different values of b. It is clear that while larger values of b increases the domain of the typical members (those with larger likelihood values), the function has an area of almost-one values for the interior points of the cluster. This event does not happen for the Gaussian function, in which one must increase the domain of better members, by increasing s, leading to an overall increase of membership values. Also, the sharp transition between the points (1,a) and (2,b), which enables a crisp classification of points in the proposed membership functions, does not exist in the Gaussian function.

Figure 1: Log-magnitude of the membership function (B_a,b) for a = 0.9 with different values of b.

The computational complexity of each likelihood method can be separated into two phases of: data-collection and utilization. It is clear that in terms of data-collection complexity, the Euclidean distance is the simplest (just needs the expectation of the cluster), while the Mahalonobis distance is the most complex (relies on computation of the expectation and also all eigenvalues and eigenvectors). The utilization phase of the Euclidean distance consists of 3+3+1=7 flops, while the Mahalonobis distance needs 2×9×(3+3)+3=111 flops, while the reconstruction error uses only 3+3+3+3+3+7=22 flops. Thus, in terms of utilization complexity, the Mahalonobis distance is the more consuming, while the Euclidean is the fastest. The same occurs for the amount of the memory needed to code the likelihood measure of a region in the three methods (3, 12, and 6 real numbers for the three methods of Euclidean distance, Mahalonobis distance, and reconstruction error, respectively).

Figure 2 shows the results of fuzzyfication one of the samples with these three methods, based on four typical base regions. In all cases, the parameters are set as a = 0.95, b = 0.4, p=0.9. Note that in the entire paper, to give a better printing result, the fuzzyficated images are printed in an inverted fashion. It is clear that while non of the figures 2-b, 2-f, 2-j, and 2-n (samples of The Euclidean fuzzyfication) have a desirable fuzzyfication and do not enabling the user to classify members and non members with a reasonable margin of certainty, samples corresponding to the Mahalonobis distance and the reconstruction error are promising.


(a)	(b)	(c)	(d)

(e)	(f)	(g)	(h)

(i)	(j)	(k)	(l)

(m)	(n)	(o)	(p)

Figure 2: Comparison of fuzzyfication methods, from left to right, original images adopted from [31] along with the base region, the Euclidean distance, the Mahalonobis distance, and the reconstruction error.

Figure 3 shows a few results of comparing the robustness of different methods against partially non-homogeneous base regions. In this test, base regions were contaminated with at most 25% irrelevant points. In figure 3-a a small portion of the leaf is added to the apple, resulting in a desperate Euclidean distance-based fuzzyfication show in figure 3-b, which both include the apple and the leaf. While the two other methods led to more acceptable results, the reconstruction error gave a better neglecting of the irrelevant points (compare Figures 3-c and 3-d). The same happens for the case of initializing all methods with almost entirely grass points and a small portion of the red ribbon (Figure 3-e). The Euclidean distance not only has failed to reject ribbon points, it also have marked the apple points with high likelihood (Figure 3-f). The two other methods have better compensated the effect of irrelevant point. Comparing Figures 3-g and 3-h makes it clear that the reconstruction error has neglected both the ribbon points and the apple points better than the Mahalonobis distance.


(a)	(b)	(c)	(d)

(e)	(f)	(g)	(h)

Figure 3: Comparison of the robustness of fuzzyfication methods against non-homogeneous base regions. From left to right, original images adopted from [31] along with the base region, the Euclidean distance, the Mahalonobis distance, and the reconstruction error. (Original image shows a traditional Iranian Seven-S tablecloth.).

Figure 4 shows the results of comparing the robustness of the three different methods against parameter selection. As the effects of selecting parameters a and b is the same for all methods, here just the effect of selecting p is investigated. Using the image and the base region shown in Figure 4-a, the set of images in Figures 4-b, 4-c, and 4-d show the results of fuzzyfication by the Euclidean distance, the Mahalonobis distance, and the reconstruction error, respectively: while p is set to the members of {0.9,0.8,¼,0.1} iteratively, in each case. In each set of images the top left image shows the result of selecting the largest value of p (0.9), while the bottom right image belongs to the smallest choice of p (0.1). Comparing the case of p=0.9 in the three sets, shows another shortcoming of the Euclidean distance-based fuzzyfication, in which increasing p results in inclusion of irrelevant points in the cluster. In the case of p=0.9 for the two other methods, the fuzzyfication result contains holes of low values corresponding to the yellow flowers, which are not the members of the clusters for sure, but in the Euclidean distance the holes are filled. Comparing the sequence of fuzzyfication results in the three methods shows another promising result of the Mahalonobis distance and reconstruction error. In contrast with the Euclidean distance, in which by decreasing p, the membership function declines in all points, in the two other methods, decreasing p forces the cluster to tune more fine, rejecting less typical members and drawing the cluster edge around the set of best members.


(a)



(b)



(c)



(d)

Figure 4: Robustness of different fuzzyfication Methods against selecting different values of p. (a) Original image adopted from [31]. (b) The Euclidean distance. (c) The Mahalonobis distance. (d) The reconstruction error.

Figure 5 shows the sample image, out of which 49 random rectangular regions are selected to compare different homogeneity criteria. Figure 6 shows the randomly selected regions sorted in terms of different homogeneity criteria in a descending fashion. In each set of regions, the top left image is known to be the most homogeneous, while the bottom right region is believed to have the poorest homogeneity. In the above discussion we call the regions by the order of their homogeneity computed by each method. Investigating the order of regions computed by the Euclidean distance (Figure 5-a) makes t clear that the ordering does not comply with the human perception. For example compare the regions 40 to 43 with the 37-th and 38-th one. Although, the Euclidean distance shows that the 37-th and 38-th regions are more homogeneous, the human observers order them inversely. The same contradiction happens in the set of regions ordered by the Mahalonobis distance (Figure 5-b). For example compare the 15-th and the 14-th or the 28-th and the 4-th regions. Investigating Figure 5-c shows the similarity of the human perception and the reconstruction error homogeneity criteria. Looking at the set of regions ordered by the reconstruction error, from the first to the last, makes clear that the first regions (1¼24-th) are the regions containing almost just one dominant color, while the regions of the two and three colors are coming afterwards.

Figure 5: Sample image for homogeneity criteria comparison. (Adopted from [31].).

(a)

(b)

(c)

Figure 6: Randomly selected regions of figure 5 sorted in terms of different homogeneity criteria in a descending fashion (a) The Euclidean distance. (b) The Mahalonobis distance. (c) the reconstruction error.

4 Conclusion

The three vector-to-cluster distance measures of Euclidean distance, Mahalonobis distance, and reconstruction error are compared, both as the likelihood measures and the homogeneity criteria. Although, the Euclidean distance (which is used commonly by the researchers) is the fastest method and needs the least memory capacity, it neither shows applicable fuzzyfication results, nor leads to a proper homogeneity criteria. Comparing the two other methods, the reconstruction error is proved to be more robust against improper base region selection, leading to a promising homogeneity criteria, while also needing less memory with low computational cost. In contrast with the common use of the Gaussian function as the membership function, the butter-worth low-pass is used and its higher performance is proved mathematically and experimentally.

Acknowledgement

The first author wishes to thank Ms. Azadeh Yadollahi for her encouragement and invaluable ideas. We also thank the generous photographers Ms. Shohreh Tabatabaii Seifi and Ali Qanavati for their incorporation of their rich digital image archive.

References

[1]: K. Fu, J. Mui, A survey on image segmentation, Pattern Recognition 13 (1981) 3-16.
[2]: A. Rosenfeld, A. Kak, Digital Picture Processing, Vol. 2, 2^nd ed., Academic Press, New York, NY, 1982.
[3]: R. Haralick, L. Shapiro, Image segmentation techniques, Computer Vision, Grpahics, and Image Processing 29 (1) (Jan. 1985) 100-132.
[4]: L. Lucchese, S. Mitra, Color image segmentation: A state-of-the-art survey (invited paper), Image Processing, Vision, and Pattern Recognition, Proc. of the Indian National Science Academy (INSA-A) 67, A, No. 2 (March 2001) pp. 207-221.
[5]: J. Foley, A. V. Dam, S. Feiner, J. Hughes, Computer Grpahics: Principles and Practice, Addison-Wesley, Reading MA, 1990.
[6]: M. A. Ruzon, C. Tomasi, Alpha estimation in natural images, in: IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head, South Carolina, June 13 - 15, 2000.
[7]: J. Bruce, T. Balch, M. Veloso, Fast and cheap color image segmentation for interactive robots, in: Proceedings of IROS-2000, Japan, 2000.
[8]: K.-K. Ma, J. Wang, Color distance histogram: A novel descriptor for color image segmentation, in: Seventh International Conference on Control, Automation, Robotics, and Vision (ICARCV'02), Singapore, December 2002, pp. 1228-1232.
[9]: X. Dai, I. Maeda, Fuzzy-based segmentation of color natural images, in: IEEE ICSP'02, 2002, pp. 969-972.
[10]: T. Q. Chen, Y. Lu, Color image segmentation - an innovative approach, Pattern Recognition 35 (2002) 395-405.
[11]: Q. Ye, W. Gao, W. Zeng, Color image segmentation using density-based clustering, in: Proceedings of 2003 IEEE International Conference on Acoustic and Signal Processing, Hog Kong, April 6-10 2003, pp. 401-404.
[12]: A. Abadpour, S. Kasaei, New principle component analysis based methods for color image processing, Submitted to the Journal of Visual Communication and Image Representation.
[13]: H. Samet, Region representation: Quadtrees from boundary codes, Comm. ACM 21 (March 1980) 163:170.
[14]: I. T. Jolliffe, Principal Component Anslysis, Springer-Verlag, Heidelberg-New York, 1986.
[15]: M. Kendall, Multivariate Analysis, Charlas Griffin and Co., 1975.
[16]: A. Hyvärinen, E. Oja, Independent component analysis: Algorithms and applications, Neural Networks 13(4-5) (2000) 411-430.
[17]: S.-C. Cheng, S.-C. Hsia, Fast algorithm's for color image processing by principal component analysis, Jurnal of Visual Communication and Image Representation 14 (2003) 184-203.
[18]: P. Hao, Q. Shi, Comparative study of colortransforms for image coding and derivation of integer reversible color transform, 2000, pp. 224-227.
[19]: G. W. W.S. Stiles, Color Science Concepts and Methods, Quantitative Data and Formula, John Wiley and Sons Inc., New York, 2000.
[20]: ITU-R Recomendation BT-601-5: Studio Encoding Parameters of Digital Television for Standard 4:3 and Widescreen 16:9 Aspect Ratios, http://www.itu.ch/, Geneva, 1994.
[21]: R. b. J. C. W. Benson, K.B., Television Engineering Handbook, Mc Graw-Hill, New York, London, 1992.
[22]: J. Slater, Modern Television System to HDTV and Beyond, Pitman, London, 1991.
[23]: Y. K. Y. I. Ohta, T. Saki, Color information for region segmentation, Computer Graphics and Image Processing 13, No.3 (July 1980) 222-241.
[24]: G. T. W. S. Tenenbaum, J.M., H. Welf, An interactive facility for scene analysis research, Tech. Rep. 87, Stanford Research Institute, AI Centre (1974).
[25]: J. Foley, A. Van Dam, Fundamentals of Interactive Computer Graphics, The System Programming Series,, Addison-Wesley, Reading, MA., 1982, Reprinted 1984 with corrections.
[26]: N. M. Ledley, S., T. Golab, Fundamentals of true color image processing, in: tenth International Conference on Pattern Recognition, Atlantic City, 1990, pp. 791-795.
[27]: CIE. International Lighting Vocabulary, CIE Publications, 4th edition, 1989.
[28]: H. G. Levkowitz H., Color scales for image data, IEEE Computer Graphics and Application.
[29]: G. W. W.S. Stiles, Color Science Concepts and Methods, Quantitative Data and Formula, John Wiley and Sons Inc., New York, 2000.
[30]: B. Manjunath, J.-R. Ohm, V. Vasuderan, A. Yamada, Color and texture descriptors, IEEE Transaction on Circuits and Systems for Video Processing 11 No. 6 (June 2001) 703-715.
[31]: S. T. Seifi, A. Qanavati, Digital color image archive, Qnavati@mehr.sharif.edu.

File translated from T_EX by T_TH, version 3.72.
On 01 Aug 2006, 12:13.