Performance Analysis of Three Likelihood Measures for Color Image Processing Performance Analysis of Three Likelihood Measures for Color Image Processing
Performance Analysis of Three Likelihood Measures for Color Image Processing
Arash Abadpour
,
Shohreh Kasaei
Mathematics Science Department, Sharif University of Technology, Tehran, Iran
Computer Engineering Department, Sharif University of Technology, Tehran, Iran
Abstract
Image segmentation is a low-level operation, concerned with
partitioning an image into homogeneous regions. In a large number
of applications, segmentation plays a fundamental role for the
subsequent higher-level operations; such as recognition,
object-based image/video compression, object tracking, scene
analysis, and object-based image editing. Until recently,
attention was focused on segmentation of grayscale images, but
the advances in computational power and instrumentation has
evolved the research on color image segmentation. Although, many
researchers have tried to extend the methods of grayscale image
segmentation to the color images, working in this
multidimensional field, enables the implementation of more
efficient methods. At the heart of any color image segmentation
method, relies an appropriate likelihood measure or a suitable
homogeneity criteria. In this paper, the performance of the three
paradigms of Euclidean distance, Mahalonobis
distance, and reconstruction error are analyzed in terms
of achieving perfect likelihood measures, robustness, and leading
to promising homogeneity decisions. While the Euclidean distance
is proved to perform poor in all cases, the proposed
reconstruction error out-performs the Mahalonobis distance with
much lower computation cost.
Principle Component Analysis Color Image Processing Segmentation Fuzzy Sets Mahalonobis Distance Euclidean Distance.
Image segmentation is a low-level operation, concerned with
partitioning an image into homogeneous
regions [1,2,3]. Although, there are
many different descriptions for homogeneity, such as the
intensity, texture, and color, to name a few, color is known as a
more appropriate homogeneity descriptor, because of its simplicity
and intuitionism [4]. In a Large number of
applications (in image processing, computer vision, and computer
graphics), segmentation plays a fundamental role before applying
the higher-level operations like recognition, object-based
image/video compression, object tracking, scene analysis, and
object-based image editing [5,6].
As image segmentation is a spatial-spectral process
(finding clusters in the spectral histogram which result in
meaningful homogeneous regions in the spatial domain), the problem
is solved in a give-and-take fashion between satisfying
(sometimes) contradictory spectral and spatial concerns. It is
proved that any image segmentation approach must solve the
problem of finding either a suitable likelihood measure or a
meaningful region homogeneity criteria [4]. In this
paper, we refer to the problem of finding a meaningful likelihood
measure and a region homogeneity criteria.
Some simple methods are reported in the literature, that extend
grayscale segmentation methods to the color image segmentation era
(e.g. see [7]). Investigating this
extended-grayscale methods makes clear that they are also
using a min/max fashion likelihood measure.
Using the Euclidean distance is generally accepted as a proper
likelihood measure, while different authors add some extra
processing to improve the results [4]. For example
in [8] the author uses the CIE-Lab color space.
Having in mind that the Euclidean distance only depends on the
central point of the cluster and not on the marginal
values and the spread of it, it is clear that (theoretically) The
Euclidean distance is a simple but not a perfect solution.
Although in [9] the author claims to use a better
designed fuzzy scheme in the CIE-Lab space, at the heart of the
method the Euclidean distance of the point and the cluster center
are ruling the likelihood. In [10] the authors uses the
Euclidean distance without any manipulations.
To overcome the shortcomings of the Euclidean distance-based
likelihood measures, in [11] the authors uses a
weighted Euclidean distance, which when tuned well, converges to
the well-known Mahalonobis distance. In this approach, not
only the central point of the cluster is considered, but also the
spread of the data is incorporated as a set of normalizing
factors.
As the likelihood measures tend to compute the distance between a
point and a cluster, they rank better members with smaller
numbers, as is the purpose of any distance measure. This is in
fact in contrast with the ordinary fuzzyfication approach which
marks better members with larger numbers in the range of [0,1]
and vice versa. Thus, a mapping function is needed to produce the
desired result. The general definition of such a mapping is
y=f([x/(s)]), while f is an odd function, mapping
[-¥,¥]®[0,1] in an upside down scheme.
Considering a cluster r, on which the distance function is
defined as er([c\vec]), leads to the likelihood measure of
f([(er([c\vec]))/(s)]). It is clear that for all [c\vec] Î r, we expect the likelihood measure to be near 1. Thus,
s must be statistically computed in terms of {er([c\vec])|[c\vec] Î r} while f must have a flat ceil; we expect f
to map [-1,1] to the vicinity of 1. Note that, this is in
contrast with the generally-used Gaussian
functions [10].
The main purpose of this paper is to compare the three likelihood
definitions of Euclidean distance ([e\tilde]Er,p([c\vec])), Mahalonobis distance ([e\tilde]Mr,p([c\vec])),
and the proposed reconstruction error [12]
([e\tilde]Rr,p([c\vec])), in terms of image fuzzyfication and
homogeneity decision. In this paper we compare [e\tilde]Er,p([c\vec]), [e\tilde]Mr,p([c\vec]), and [e\tilde]Rr,p([c\vec]) as likelihood measures. Also, we consider
whether ||eEr||r,p, ||eMr||r,p, and
||eRr||r,p are well-defined homogeneity criteria.
The following sections of this paper are organized as follows,
Section 2 describes the method used in this paper,
while Section 3 discusses the experimental results
and Section 4 concludes the paper.
The Euclidean distance is the most generally used likelihood
measure, defined as:
eEr(
®
c
)=(
®
c
-
®
h
)T(
®
c
-
®
h
)
(1)
where [c\vec] is the vector, we want to measure its distance to
the cluster r, and [(h)\vec] is the expectation of the color
vectors of cluster r.
The Mahalonobis distance is a well-known likelihood measure,
defining the membership of [c\vec] to the cluster r as:
eMr(
®
c
)=(
®
c
-
®
h
)T C-1 (
®
c
-
®
h
)
(2)
where, C is the covariance matrix of the color vectors of
cluster r, with its columns indicating the direction of the
principal components ordered in a descending fashion
(corresponding to the associated eigenvalues).
In [12] the authors proposed to use the error made
by neglecting the two less important principal components as a
likelihood measure. The likelihood of the vector [c\vec] to the
cluster r is defined as [12]:
eRr(
®
c
)=|| \acute
®
v
(
®
c
-
®
h
)
®
v
-(
®
c
-
®
h
)||
(3)
while [v\vec] shows the direction of the first principal
component and ||[x\vec]|| denotes the normalized L1 norm:
||
®
x
||=
1
N
Si=1N |xi|
(4)
Investigating (1), (2), and (3) makes
clear that, to make eXr([c\vec]) comparable over different
clusters, a normalization scheme is crucial (X Î {E,M,R}).
In [12] the authors proposed to use the following
stochastic margin as the normalization factor:
||f||r,p=arge
æ è
P[x\vec] Î r{f(
®
x
) £ e} ³ p
ö ø
(5)
where p is the inclusion percentage. Equation (5) leads
to the definition of the normalized likelihood
functions [12]:
~
e
X r,p
(
®
c
)=
eXr(
®
c
)
||eXr||r,p
(6)
Also, the ||eXr||r,p can be used as a homogeneity criteria
for usages such as Quad-tree decomposition [13]
and region growth [2].
It is clear that all eXr,p([c\vec]) are giving lower values
for the color vectors similar to those exist in r. Thus,
a fuzzy membership function is needed to map
[-1,1]®[1,1-e] and
[-¥,-1]È[1,¥]®[1-e,0].
In [12] the authors proposed a manipulated form of
the well-known low-pass Butterworth filter for the sake
of tunability and simplicity, as:
where [x] denotes the nearest integer value to x. The function
is designed in the way that satisfies:
Ba, b(1)=a, Ba, b(2)=b
(10)
It is clear that selecting a large member of ]0,1[ as the
a value and an small member of ]0,a[ as the b
value, leads to a desired fuzzyfication. Note that the above
definition of membership functions is in contrast with the
general selection of the Gaussian functions.
Although in the definition of the Euclidean distance and the
Mahalonobis distance, Principal Component
Analysis(PCA) [14,15,16] is not referred, but they
are both marginal samples of using PCA for measuring the
likelihood. In the terminology of PCA, the expectation of the
cluster is defined as [16]:
®
h
= arg[v\vec]
æ è
min
é ë
E {|
®
x
-
®
v
|2}
ù û
ö ø
(11)
which is the zero dimensional representation of the data. It can
be proved that the above definition leads to the canonical
definition of expectation [16]. Defining the k-th
residue of [x\vec] as [16]:
Dk(
®
x
)=
®
x
-
®
h
-
k å
i=1
®
v
T i
(
®
x
-
®
h
)
®
v
i
(12)
for k=0¼N (N is the original dimension of the cluster),
where [v\vec]i is the direction of the i-th principal
component of the cluster, it is clear that the Euclidean distance
equals ||D0([x\vec])||2 [16], while reconstruction
equals 1/3||D1([x\vec])||1 [12]. The
relation of the Mahalonobis distance to the PCA theory is
straight-forward: The Mahalonobis distance is the square root of
the squared principal components weighted by their importance (the
eigenvalues) [16]. To see details about using the PCA in
color image processing see [17,12].
Although RGB color space is used in this paper, but having in mind
that the PCA representation is translation, rotation, and scale
invariant [16], while all measures and criteria
investigated in this paper rely on PCA, it is clear that the
conclusions of this paper applies to any color space, linearly
derived from RGB in a reversible fashion [18] (such
as CMYK [19], YCbCr [20], YIQ [21],
YUV [22], and I1I2I3 [23]). This does not
hold for non-linearly derived color spaces (e.g.
HSI [24], HSV [25,26], CIE-XYZ,
CIE-La*b*, CIE-Lu*v*,
CIE-L*HoC* [27,28,29], and HMMD [30])
or irreversible linear ones.
To compare the performance of different likelihood measures, 42
color images (including the standard images Lena,
Mandrill, Airplane, Peppers, Girl,
and Couple), in RGB color space were used (all
low-compressed jpeg files). In each image 10
rectangular homogeneous regions were selected, and the
corresponding fuzzyfication results were compared subjectively.
To test the robustness of the methods, tests were performed with
regions containing up-to 25% irrelevant points and with
different values of p. Also, the complexity of the methods were
computed in terms of flops. To compare different
homogeneity criteria, 49 randomly selected rectangular regions
in each test image, were sorted according to each homogeneity
criterion in a descending fashion. All experiments were performed
on a large database of color images [31] and some
samples representing the whole results were selected for better
illustration of the results.
Figure 1 shows the log-magnitude of the membership
function, Ba,b for a = 0.9 with different
values of b. It is clear that while larger values of b
increases the domain of the typical members (those with larger
likelihood values), the function has an area of almost-one values
for the interior points of the cluster. This event does not happen
for the Gaussian function, in which one must increase the domain
of better members, by increasing s, leading to an overall
increase of membership values. Also, the sharp transition between
the points (1,a) and (2,b), which enables a
crisp classification of points in the proposed membership
functions, does not exist in the Gaussian function.
Figure 1: Log-magnitude of the membership function
(Ba,b) for a = 0.9 with different values of
b.
The computational complexity of each likelihood method can be
separated into two phases of: data-collection and
utilization. It is clear that in terms of data-collection
complexity, the Euclidean distance is the simplest (just needs the
expectation of the cluster), while the Mahalonobis distance is the
most complex (relies on computation of the expectation and also
all eigenvalues and eigenvectors). The utilization phase of the
Euclidean distance consists of 3+3+1=7 flops, while the
Mahalonobis distance needs 2×9×(3+3)+3=111 flops,
while the reconstruction error uses only 3+3+3+3+3+7=22 flops.
Thus, in terms of utilization complexity, the Mahalonobis distance
is the more consuming, while the Euclidean is the fastest. The
same occurs for the amount of the memory needed to code the
likelihood measure of a region in the three methods (3, 12,
and 6 real numbers for the three methods of Euclidean distance,
Mahalonobis distance, and reconstruction error, respectively).
Figure 2 shows the results of fuzzyfication one of the
samples with these three methods, based on four typical base
regions. In all cases, the parameters are set as a = 0.95,
b = 0.4, p=0.9. Note that in the entire paper, to give a
better printing result, the fuzzyficated images are printed in an
inverted fashion. It is clear that while non of the
figures 2-b, 2-f, 2-j, and
2-n (samples of The Euclidean fuzzyfication) have a
desirable fuzzyfication and do not enabling the user to classify
members and non members with a reasonable margin of certainty,
samples corresponding to the Mahalonobis distance and the
reconstruction error are promising.
(a)
(b)
(c)
(d)
(e)
(f)
(g)
(h)
(i)
(j)
(k)
(l)
(m)
(n)
(o)
(p)
Figure 2: Comparison of fuzzyfication methods, from left to right,
original images adopted from [31] along with the base
region, the Euclidean distance, the Mahalonobis distance, and the
reconstruction error.
Figure 3 shows a few results of comparing the robustness
of different methods against partially non-homogeneous base
regions. In this test, base regions were contaminated with at
most 25% irrelevant points. In figure 3-a a small
portion of the leaf is added to the apple, resulting in a
desperate Euclidean distance-based fuzzyfication show in
figure 3-b, which both include the apple and the leaf.
While the two other methods led to more acceptable results, the
reconstruction error gave a better neglecting of the irrelevant
points (compare Figures 3-c and 3-d). The same
happens for the case of initializing all methods with almost
entirely grass points and a small portion of the red ribbon
(Figure 3-e). The Euclidean distance not only has failed
to reject ribbon points, it also have marked the apple points with
high likelihood (Figure 3-f). The two other methods
have better compensated the effect of irrelevant point. Comparing
Figures 3-g and 3-h makes it clear that the
reconstruction error has neglected both the ribbon points and the
apple points better than the Mahalonobis distance.
(a)
(b)
(c)
(d)
(e)
(f)
(g)
(h)
Figure 3: Comparison of the robustness of fuzzyfication methods
against non-homogeneous base regions. From left to right,
original images adopted from [31] along with the base
region, the Euclidean distance, the Mahalonobis distance, and the
reconstruction error. (Original image shows a traditional Iranian
Seven-S tablecloth.).
Figure 4 shows the results of comparing the robustness of
the three different methods against parameter selection. As the
effects of selecting parameters a and b is the same
for all methods, here just the effect of selecting p is
investigated. Using the image and the base region shown in
Figure 4-a, the set of images in Figures 4-b,
4-c, and 4-d show the results of
fuzzyfication by the Euclidean distance, the Mahalonobis distance,
and the reconstruction error, respectively: while p is set to
the members of {0.9,0.8,¼,0.1} iteratively, in each
case. In each set of images the top left image shows the result of
selecting the largest value of p (0.9), while the bottom
right image belongs to the smallest choice of p (0.1).
Comparing the case of p=0.9 in the three sets, shows another
shortcoming of the Euclidean distance-based fuzzyfication, in
which increasing p results in inclusion of irrelevant points in
the cluster. In the case of p=0.9 for the two other methods, the
fuzzyfication result contains holes of low values corresponding
to the yellow flowers, which are not the members of the clusters
for sure, but in the Euclidean distance the holes are filled.
Comparing the sequence of fuzzyfication results in the three
methods shows another promising result of the Mahalonobis
distance and reconstruction error. In contrast with the Euclidean
distance, in which by decreasing p, the membership function
declines in all points, in the two other methods, decreasing p
forces the cluster to tune more fine, rejecting less typical
members and drawing the cluster edge around the set of best
members.
(a)
(b)
(c)
(d)
Figure 4: Robustness of different fuzzyfication Methods against
selecting different values of p. (a) Original image adopted
from [31]. (b) The Euclidean distance. (c) The
Mahalonobis distance. (d) The reconstruction error.
Figure 5 shows the sample image, out of which 49 random
rectangular regions are selected to compare different homogeneity
criteria. Figure 6 shows the randomly selected regions
sorted in terms of different homogeneity criteria in a descending
fashion. In each set of regions, the top left image is known to be
the most homogeneous, while the bottom right region is believed to
have the poorest homogeneity. In the above discussion we call the
regions by the order of their homogeneity computed by each method.
Investigating the order of regions computed by the Euclidean
distance (Figure 5-a) makes t clear that the ordering
does not comply with the human perception. For example compare
the regions 40 to 43 with the 37-th and 38-th one.
Although, the Euclidean distance shows that the 37-th and
38-th regions are more homogeneous, the human observers order
them inversely. The same contradiction happens in the set of
regions ordered by the Mahalonobis distance
(Figure 5-b). For example compare the 15-th and the
14-th or the 28-th and the 4-th regions. Investigating
Figure 5-c shows the similarity of the human perception
and the reconstruction error homogeneity criteria. Looking at the
set of regions ordered by the reconstruction error, from the
first to the last, makes clear that the first regions
(1¼24-th) are the regions containing almost just one
dominant color, while the regions of the two and three colors are
coming afterwards.
Figure 5: Sample image for homogeneity criteria comparison.
(Adopted from [31].).
(a)
(b)
(c)
Figure 6: Randomly selected regions of figure 5 sorted in
terms of different homogeneity criteria in a descending fashion
(a) The Euclidean distance. (b) The Mahalonobis distance. (c) the
reconstruction error.
The three vector-to-cluster distance measures of
Euclidean distance, Mahalonobis distance, and reconstruction error
are compared, both as the likelihood measures and the homogeneity
criteria. Although, the Euclidean distance (which is used commonly
by the researchers) is the fastest method and needs the least
memory capacity, it neither shows applicable fuzzyfication
results, nor leads to a proper homogeneity criteria. Comparing the
two other methods, the reconstruction error is proved to be more
robust against improper base region selection, leading to a
promising homogeneity criteria, while also needing less memory
with low computational cost. In contrast with the common use of
the Gaussian function as the membership function, the
butter-worth low-pass is used and its higher performance is
proved mathematically and experimentally.
Acknowledgement
The first author wishes to thank Ms. Azadeh Yadollahi for her
encouragement and invaluable ideas. We also thank the generous
photographers Ms. Shohreh Tabatabaii Seifi and Ali Qanavati for
their incorporation of their rich digital image archive.
L. Lucchese, S. Mitra, Color image segmentation: A state-of-the-art survey
(invited paper), Image Processing, Vision, and Pattern Recognition, Proc. of
the Indian National Science Academy (INSA-A) 67, A, No. 2 (March 2001) pp.
207-221.
M. A. Ruzon, C. Tomasi, Alpha estimation in natural images, in: IEEE Conference
on Computer Vision and Pattern Recognition, Hilton Head, South Carolina, June
13 - 15, 2000.
K.-K. Ma, J. Wang, Color distance histogram: A novel descriptor for color image
segmentation, in: Seventh International Conference on Control, Automation,
Robotics, and Vision (ICARCV'02), Singapore, December 2002, pp. 1228-1232.
Q. Ye, W. Gao, W. Zeng, Color image segmentation using density-based
clustering, in: Proceedings of 2003 IEEE International Conference on Acoustic
and Signal Processing, Hog Kong, April 6-10 2003, pp. 401-404.
A. Abadpour, S. Kasaei, New principle component analysis based methods for
color image processing, Submitted to the Journal of Visual Communication and
Image Representation.
S.-C. Cheng, S.-C. Hsia, Fast algorithm's for color image processing by
principal component analysis, Jurnal of Visual Communication and Image
Representation 14 (2003) 184-203.
ITU-R Recomendation BT-601-5: Studio Encoding Parameters of Digital Television
for Standard 4:3 and Widescreen 16:9 Aspect Ratios, http://www.itu.ch/,
Geneva, 1994.
G. T. W. S. Tenenbaum, J.M., H. Welf, An interactive facility for scene
analysis research, Tech. Rep. 87, Stanford Research Institute, AI Centre
(1974).
J. Foley, A. Van Dam, Fundamentals of Interactive Computer Graphics, The System
Programming Series,, Addison-Wesley, Reading, MA., 1982, Reprinted 1984 with
corrections.
N. M. Ledley, S., T. Golab, Fundamentals of true color image processing, in:
tenth International Conference on Pattern Recognition, Atlantic City, 1990,
pp. 791-795.
B. Manjunath, J.-R. Ohm, V. Vasuderan, A. Yamada, Color and texture
descriptors, IEEE Transaction on Circuits and Systems for Video Processing 11
No. 6 (June 2001) 703-715.