Problems In Image Retrieval

Published: 7 June 2012; last modified: 23 July 2024.

Local image features such as color, texture, and shape have become pervasive in computer vision (CV) and image retrieval and classification (IRC). Robust local descriptors, such as the Scale-Invariant Feature Transform (SIFT), the Gradient Location and Orientation Histogram (GLOH), Speeded-Up Robust Features (SURF), and wavelet histograms, are used to overcome image variability caused by changes in viewpoint and angle, occlusions, and varying illumination [1].
Many learning models build on the idea that any machine learning problem can be made easy with the right set of texture and color features. The trick, of course, is discovering that 'right set of feature descriptors', which in general is very difficult to do. SVMs are one attempt at a model that does this [2]. The idea behind the SVM is to use a (nonlinear) mapping function φ that transforms data from the input space into a feature space in such a way as to render the problem linearly separable [5]. The SVM then automatically discovers the optimal separating hyperplane (which, when mapped back into the input space via φ⁻¹, can be a complex decision surface). SVMs are rather interesting in that they enjoy both a sound theoretical basis and state-of-the-art success in real-world applications. To illustrate the basic ideas, we will begin with an RBF (Radial Basis Function) SVM, that is, a model that assumes the data is separable under a radial basis function kernel.
CBIR is used in a number of areas such as video streaming, speech and image processing, data mining, and GIS (geographic information systems). All of these applications require a high degree of accuracy with minimal user involvement. Various methods are used for the retrieval and classification of images based on visual features such as color, texture, and shape [6]. Most of the top methods use sophisticated, time-consuming image retrieval and classification techniques to learn the semantic content of the image dataset. For example, if we want to study separate regions of interest (ROI) in an image, then a suitable color or texture segmentation algorithm is used to separate the homogeneous regions for further analysis and feature-based classification.
More formally, a support vector machine (SVM) constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other related tasks. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data point of any class (the so-called functional margin), since in general the larger the margin, the lower the generalization error of the classifier.
Whereas the original problem may be stated in a finite-dimensional space, it often happens that the sets to discriminate are not linearly separable in that space. For this reason, it was proposed that the original finite-dimensional space be mapped into a much higher-dimensional space, presumably making the separation easier there. To keep the computational load reasonable, the mappings used by SVM schemes are designed so that dot products can be computed easily in terms of the variables in the original space, by defining them via a kernel function K(x, y) selected to suit the given problem. The hyperplanes in the higher-dimensional space are defined as the set of points whose dot product with a vector in that space is constant [2]. The vectors defining the hyperplanes can be chosen as linear or nonlinear combinations, with parameters αᵢ, of images of feature vectors that occur in the database. With this choice of hyperplane, the points x in feature space that are mapped into the hyperplane are defined by the relation Σᵢ αᵢ K(xᵢ, x) = constant. Note that if K(x, y) becomes small as y grows further away from x, each term in the sum measures the degree of closeness of the test point x to the corresponding database point xᵢ. In this way, the sum of kernels above can be used to measure the relative nearness of each test point to the data points originating in one or the other of the sets to be discriminated. Note also that the set of points x mapped into any hyperplane can be quite convoluted, allowing much more complex discrimination between sets that are not at all convex in the original space.
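As an illustration of how a decaying kernel turns the sum above into a nearness measure, here is a minimal Python sketch (not from the thesis; the points, weights, and gamma value are invented for illustration):

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: decays toward 0 as y moves away from x."""
    return float(np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(y)) ** 2)))

def kernel_score(x, database, alphas, gamma=0.5):
    """Weighted sum of kernels: measures the relative nearness of the
    test point x to the stored database points."""
    return sum(a * rbf_kernel(x, xi, gamma) for a, xi in zip(alphas, database))

# Two small clusters acting as the two sets to be discriminated.
set_a = [[0.0, 0.0], [0.2, 0.1]]
set_b = [[3.0, 3.0], [3.1, 2.9]]
x = [0.1, 0.0]  # test point near set A

score_a = kernel_score(x, set_a, [1.0, 1.0])
score_b = kernel_score(x, set_b, [1.0, 1.0])
```

Because the kernel decays with distance, the score against the nearby set dominates, which is exactly the discrimination mechanism described above.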

1.1 PROBLEMS IN IMAGE RETRIEVAL:
Existing CBIR schemes that rank relevant and similar output may suffer from several problems in completing their search in a single interaction, especially on the Internet. First, when searching for an image in the Google search engine [26]-[30], many images are retrieved after classification, from which inconsistency and redundancy arise. Second, it is too time-consuming and difficult to label a large number of negative and asymmetric examples with sufficient variety. Third, general and registered users may introduce extra noisy examples into the query. To address these problems, we use the SIFT method to compute texture feature descriptors. The descriptors of the image dataset are then classified through SVM-COACO. The proposed SVM uses the concept of optimization.
1.2 MOTIVATION OF WORK:
To improve the performance of retrieved images, we propose describing input images as a dataset consisting of approximately 20,000 images. These images are classified into labels or classes. We then find the SIFT descriptors of the down-sampled image dataset [9]. SIFT descriptors are extracted from the original images and are used to represent the complete image by identifying corresponding patches. The down-sampled image dataset plays an important role in building the retrieved set of images corresponding to a query image: it is used to verify every retrieved image patch and to guide the search for corresponding image patches in the given image dataset. The down-sampled image dataset consists of various correlated images. Using the correlation between the SIFT features of the down-sampled image dataset and the SIFT descriptor of the query image, the correct image patch is verified [11]. We first identify the locations, scales, and orientations of the SIFT descriptors and then use them to classify prediction vectors from the extracted features of the down-sampled image dataset, so that high-dimensional SIFT vectors can be efficiently classified and retrieved through the proposed method.
Image classification and retrieval is the task of generating a set of correlated images belonging to the same class. Many approaches have been proposed to train on SIFT descriptors belonging to a specific region of an image. The proposed approach uses SIFT descriptors to extract feature descriptors of images; these features are then optimized through COACO (Continuous Orthogonal Ant Colony Optimization) [14] and fed to the proposed classifier. Image composition is a long-standing research topic.
Classifying data is a common task in machine learning. Suppose some given data points each belong to one of two classes, and the goal is to decide which class a new data point will be in. In the case of support vector machines, a data point is viewed as a p-dimensional vector (a list of p numbers), and we want to know whether we can separate such points with a (p − 1)-dimensional hyperplane. This is called a linear classifier. There are many hyperplanes that might classify the data. One reasonable choice for the best hyperplane is the one that represents the largest separation, or margin, between the two classes. So we choose the hyperplane such that the distance from it to the nearest data point on each side is maximized. If such a hyperplane exists, it is known as the maximum-margin hyperplane, and the linear classifier it defines is known as a maximum-margin classifier, or equivalently, the perceptron of optimal stability [18].
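The maximum-margin idea can be shown with a brute-force Python sketch (illustrative only; the data points and the three candidate hyperplanes are invented):

```python
import numpy as np

def margin(w, b, X, y):
    """Smallest signed distance from the points to the hyperplane
    w.x + b = 0; negative if any point is misclassified."""
    d = (X @ w + b) * y / np.linalg.norm(w)
    return d.min()

# Two linearly separable classes in 2-D.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]])
y = np.array([-1, -1, 1, 1])

# Candidate separating hyperplanes (vertical lines x0 = c): all of them
# classify the data, but only the middle one maximizes the margin.
candidates = [(np.array([1.0, 0.0]), -c) for c in (0.5, 1.5, 2.5)]
best_w, best_b = max(candidates, key=lambda h: margin(h[0], h[1], X, y))
```

The winner is the line midway between the two classes, which is the maximum-margin hyperplane among the candidates.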
1.3 ORGANIZATION OF THESIS:

This project comprises six chapters. The first chapter gives a brief introduction to the need for image pre-processing and classification, and the environment used for the project. Chapter 2 gives a detailed literature review of the SIFT method, the support vector machine, and COAC optimization. Chapter 3 describes the basic techniques used, such as scale-invariant feature descriptors and the basic concepts of the support vector machine with COACO learning strategies. Chapter 4 discusses the proposed SVM-COACO method. Chapter 5 presents the results and analysis. Chapter 6 gives the conclusion and some future work, followed by the references.
1.4 SOFTWARE ISSUES:

MATLAB is a high-performance, interactive language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation and graphical form. Typical uses include:
- math and other computation
- algorithm development
- data acquisition
- modeling, image processing, simulation, and prototyping
- data analysis, exploration, and visualization
- scientific and engineering drawing and graphics
- application development, including graphical user interface building
MATLAB is an interactive programming tool whose basic data element is an array (matrix) that does not require dimensioning. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar, non-interactive language such as C or Fortran. The name MATLAB stands for matrix laboratory. MATLAB was originally written to provide easy access to matrix software developed by the LINPACK and EISPACK projects. Today, MATLAB engines incorporate the LAPACK libraries, embedding the state of the art in software for matrix computation.

Figure 1.4.1: Matlab Command Window
MATLAB has evolved over a period of years with input from many users. In university environments, it is the standard instructional tool for introductory and advanced courses in mathematics, engineering, and medical science. In industry, MATLAB is the tool of choice for high-productivity research, development, and analysis. MATLAB features a family of add-on, application-specific solutions called toolboxes. Very important to most users of MATLAB, toolboxes allow you to learn and apply specialized technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) and MEX-files that extend the MATLAB environment to solve particular classes of problems. Areas in which toolboxes are available include signal processing, control systems, neural networks, fuzzy logic, wavelets, simulation, and many others.
1.5 SUMMARY:

This chapter gave a brief description of correlated images. Section 1.1 described the main problems concerning correlated images. Section 1.2 listed the motivations for the current work. Section 1.3 gave the organization of the project. Section 1.4 discussed the software used in the project; MATLAB provides an image processing toolbox that is a very useful platform for developing MATLAB code.
CHAPTER 2
LITERATURE REVIEW
This dissertation essentially finds the descriptors of different features in an image; these features are optimized, and the best features are classified/retrieved according to the query feature descriptor. The methods and approaches are taken from the existing literature. The eight essential research papers covered in this work are listed below.
(1)Topic: ‘Improved content-based classification and retrieval of images using support
vector machine’

The proposed work in this paper is described as: The basic steps of the proposed CBIR system using SVM include:

- Constructing an SVM model of a classification network using the text-indexed image histogram features and discrete wavelet decomposition of the training images.
- Classifying the input image using the trained model.
- Retrieving all the best-matching images from the matching class of the input image using a simple distance metric.

Supervised classification has been used in this research to categorize the images, as the SIMPLIcity dataset contains images with well-defined labels. Input images to the supervised classifier are labelled as Africa, beach, buildings, buses, dinosaurs, elephants, flowers, horses, mountains, and food [3]. The feature descriptors of the images, in the WH format, are given to the supervised classifier to infer a rule that assigns a label to each image. The classifier assigns a class label to the output value y that best matches the given input pattern, denoted by Ck, k = 1, 2, …, K, where K is the number of classes. Here, K = 10. The input data source is the feature descriptor obtained by fusing the color-indexed histogram and the output of the wavelet transformation [4]. A vector of 263 real numbers forms the input vector, denoted by x. The vector format of the image features is suitable for an SVM classifier. The ability of a supervised classifier to map an input vector x to the desired output class y depends on the performance of the learning algorithm. SVM, which was originally designed for binary classification, has been extended to support multi-class classification through one-against-all (1AA) and one-against-one (1A1) strategies. The 1A1 strategy decomposes the multi-class problem into a set of binary classifiers [6]. For n output classes, n(n − 1)/2 classifiers are constructed, and each one is trained with data from two classes i and j. A separate decision boundary is independently created between every pair of classes.
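The 1A1 classifier count can be checked with a few lines of Python (illustrative): every unordered pair of classes gets its own binary classifier.

```python
def num_one_vs_one_classifiers(n_classes):
    """Each unordered pair of classes (i, j) gets its own binary SVM."""
    return n_classes * (n_classes - 1) // 2

# Enumerating the pairs explicitly for K = 10 classes, as in this paper.
pairs = [(i, j) for i in range(10) for j in range(i + 1, 10)]
```

For K = 10 this gives 45 binary classifiers, one per decision boundary between a pair of classes.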

(2)Topic: ‘Cloud Based Image Coding for Mobile Devices Toward Thousand to
One Compression’

The proposed method in this paper is described as follows: the cloud is characterized by a large number of images, e.g., Google Street View images. When you randomly take a picture with your phone on the street, you can often find highly correlated images in the cloud that were taken at the same location at different viewpoints and angles, focal lengths, and illuminations. If you try to share the photo with friends through the cloud [2], it is problematic to use conventional image coding (e.g., JPEG), which usually provides only an 8:1 compression ratio. Moreover, state-of-the-art image coding, consisting of directional intra prediction and transform, makes it hard to take advantage of highly correlated images in the cloud. Intra prediction uses decoded neighboring pixels from the same image to generate predictions, and the pixels are then coded by subtracting the predictions. Image search has been demonstrated as a successful application on the Internet. By submitting the description of one image, including semantic content, outline, and local feature descriptors, one can easily retrieve many similar images. Near- and partial-duplicate image detection is a hot research topic in this field. Recent efforts have shed light on using a large-scale image database to recover, compose, and even reconstruct an image. In particular, Weinzaepfel et al. were the first to reconstruct an image by using SIFT (Scale-Invariant Feature Transform) local feature descriptors [6]. Follow-up work tries to reconstruct an image using SIFT and SURF (Speeded-Up Robust Features) descriptors. However, it is a very challenging problem to reconstruct a visually pleasing image using local feature descriptors only. Second, we use all SIFT descriptors in a patch to estimate the patch transformation with the RANSAC algorithm, and a perspective projection model is used. Image inpainting is the task of filling in or replacing a region of an image.
SIFT descriptors present distinctive invariant features of images that consist of location, scale, orientation, and a feature vector [12], and they have a good interpretation in terms of the response properties of complex neurons in the visual cortex. Ke et al. propose applying Principal Component Analysis (PCA) to greatly reduce the dimension of the feature vector. Hua et al. propose linear discriminant analysis to reduce the dimension of the feature vector. Chandrasekhar et al. propose transform coding of the feature vectors. For near- and partial-duplicate image retrieval, the geometric relationship of visual words plays an important role. To utilize this information [7], Wu et al. propose bundling a maximally stable region and visual words together. Zhou et al. propose using spatial coding to represent spatial relationships among SIFT descriptors in an image [10]. Image alignment is a historic research topic. When SIFT descriptors are available for two images that are to be aligned, the most popular approach for estimating the transformation between them is RANSAC. Torr et al. improve the algorithm by using maximum likelihood estimation instead of the number of inliers [12]. Philip et al. introduce a pyramid structure with ascending resolutions to improve the performance of RANSAC [14]. Chum et al. greatly speed up the approach by introducing condense match [15].

The proposed method in this paper is described as follows: one key design task when constructing image databases is the creation of an active relevance feedback component. While it is sometimes possible to arrange images within an image database by creating a hierarchy, or by hand-labeling each image with descriptive words, this is often time-consuming, costly, and subjective [8]. Alternatively, requiring the end-user to specify an image query in terms of low-level features (such as color and spatial relationships) is challenging, because an image query is hard to articulate, and articulation can again be subjective. Thus, there is a need for a way to allow a user to implicitly inform a database of his or her desired output or query concept. To address this requirement, relevance feedback can be used as a query refinement scheme to derive or learn a user's query concept. To solicit feedback, the refinement scheme displays a few image instances and the user labels each image as "relevant" or "not relevant". Based on the answers, another set of images from the database is brought up to the user for labeling. After some number of such querying rounds, the refinement scheme returns a number of items in the database that it believes will be of interest to the user [11].
The construction of such a query refinement scheme (we call it a query concept learner, or learner, hereafter) can be regarded as a machine learning task. In particular, it can be seen as a case of pool-based active learning. In pool-based active learning, the learner has access to a pool of unlabeled data and can request the user's label for a certain number of instances in the pool. In the image retrieval domain, the unlabeled pool would be the entire database of images [13]. An instance would be an image, and the two possible labelings of an image would be "relevant" and "not relevant". The goal for the learner is to learn the user's query concept; in other words, to assign a label to each image within the database such that for any image, the learner's labeling and the user's labeling agree.
The main issue with active learning is finding a way to choose informative images within the pool to ask the user to label. We call such a request for the label of an image a pool-query. Most machine learning algorithms are passive in the sense that they are generally applied using a randomly selected training set. The key idea in active learning is that the learner should choose its next pool-query based upon the answers to previous pool-queries.
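A minimal sketch of one common pool-query rule, uncertainty sampling, in Python (illustrative; the scores and the two-round budget are invented, and the paper's own selection criterion may differ):

```python
import numpy as np

def most_uncertain(scores, labeled):
    """Pick the unlabeled pool image whose classifier score is closest to
    the decision boundary (score 0) -- the most informative pool-query."""
    best, best_idx = np.inf, None
    for i, s in enumerate(scores):
        if i not in labeled and abs(s) < best:
            best, best_idx = abs(s), i
    return best_idx

# Hypothetical classifier scores for a pool of 6 images (sign = predicted
# relevance, magnitude = confidence).
scores = [0.9, -0.8, 0.05, 0.6, -0.02, 0.4]
labeled = set()
queries = []
for _ in range(2):  # two rounds of user feedback
    idx = most_uncertain(scores, labeled)
    queries.append(idx)
    labeled.add(idx)  # the user labels this image relevant / not relevant
```

Each round queries the image the current classifier is least sure about, rather than a random one, which is the active-learning idea described above.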

SVMs are a set of related supervised learning methods used for classification and regression. They belong to the family of generalized linear classifiers. A special property of SVMs is that they simultaneously minimize the empirical classification error and maximize the geometric margin; hence SVMs are called maximum-margin classifiers. SVM is based on structural risk minimization (SRM). An SVM maps the input vectors to a higher-dimensional space where a maximal separating hyperplane is constructed [7]. Two parallel hyperplanes are constructed, one on each side of the hyperplane that separates the data. The separating hyperplane is the one that maximizes the distance between the two parallel hyperplanes. An assumption is made that the larger the margin, or distance between these parallel hyperplanes, the better the generalization error of the classifier.
Training vectors xi are mapped into a higher- (possibly infinite-) dimensional space by the function φ. The SVM then finds a linear separating hyperplane with the maximal margin in this higher-dimensional space. C > 0 is the penalty parameter of the error term. Furthermore, K(xi, xj) ≡ φ(xi)ᵀφ(xj) is called the kernel function [10]. There are many kernel functions for SVMs, so how to select a good kernel function is itself a research issue. However, for general purposes, there are some popular kernel functions:

- Linear kernel: K(xi, xj) = xiᵀxj
- Polynomial kernel: K(xi, xj) = (γ xiᵀxj + r)^d, γ > 0
- RBF kernel: K(xi, xj) = exp(−γ ‖xi − xj‖²), γ > 0
- Sigmoid kernel: K(xi, xj) = tanh(γ xiᵀxj + r)
Here, γ, r, and d are kernel parameters. Among these popular kernel functions, the RBF kernel is the primary choice for the following reasons:
1. The RBF kernel nonlinearly maps samples into a higher-dimensional space, unlike the linear kernel.

2. The RBF kernel has fewer hyperparameters than the polynomial kernel.
3. The RBF kernel presents fewer numerical difficulties.
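The four kernels listed above can be written directly as small Python functions (an illustrative sketch; the default parameter values are arbitrary, not tuned choices):

```python
import numpy as np

def linear_kernel(xi, xj):
    # K(xi, xj) = xi^T xj
    return float(np.dot(xi, xj))

def poly_kernel(xi, xj, gamma=1.0, r=1.0, d=2):
    # K(xi, xj) = (gamma * xi^T xj + r)^d
    return float((gamma * np.dot(xi, xj) + r) ** d)

def rbf_kernel(xi, xj, gamma=0.5):
    # K(xi, xj) = exp(-gamma * ||xi - xj||^2)
    return float(np.exp(-gamma * np.sum((xi - xj) ** 2)))

def sigmoid_kernel(xi, xj, gamma=0.5, r=0.0):
    # K(xi, xj) = tanh(gamma * xi^T xj + r)
    return float(np.tanh(gamma * np.dot(xi, xj) + r))

xi = np.array([1.0, 2.0])
xj = np.array([2.0, 0.0])
```

Note that the RBF kernel of any point with itself is exactly 1, which is what makes it behave as a similarity measure.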
(5)Topic: ‘Orthogonal Methods Based Ant Colony Search for Solving Continuous
Optimization Problems’

The proposed method in this paper is described as follows: in order to solve continuous optimization problems effectively, this paper develops a continuous orthogonal ant colony (COAC) algorithm by using the orthogonal design method. The orthogonal design method, which was proposed more than fifty years ago, is an experimental design method and has since been widely applied in scientific research, manufacturing, agricultural experiments, and quality management [6]. It can be used for planning experiments and provides an efficient way to find a near-best sample for multi-factor experiments. As every variable in a problem can be regarded as a factor, the orthogonal design method can help solve optimization problems. Traditional ant colony optimization (ACO) is a framework for discrete optimization problems. The agents in ACO work as teammates in an exploration team, and the cooperation between the ants is based on pheromone communication. In the traveling salesman problem (TSP), which is a discrete optimization problem, the mission of the agents is to find the shortest way to traverse all the cities and return to the original city. The number of roads interconnecting the cities is limited and fixed. However, optimization problems in the continuous domain are different: there are no longer "fixed" roads for agents to explore, and walking in any direction in the n-dimensional (n > 1) domain can lead to quite a different result. The difficulty in solving such a problem lies in finding an efficient way to direct these agents to search the n-dimensional domain.
(6)Topic: ‘Continuous Function Optimization Using Hybrid Ant Colony Approach with
Orthogonal Design Scheme’

This paper proposes a hybrid ant colony algorithm with an orthogonal scheme (OSACO) for continuous function optimization problems. The methodology integrates the advantages of Ant Colony Optimization (ACO) and the Orthogonal Design Scheme (ODS). The proposed algorithm has been successfully applied to 10 benchmark test functions, and its performance and a comparison with CACO and FEP have been studied. The idea underlying ACO is to simulate the autocatalytic, positive-feedback process of the foraging behavior of real ants. Once an ant finds a path successfully, pheromone is deposited on the path. By sensing the pheromone, ants can follow the path discovered by other ants [7]. This collective pheromone-laying and pheromone-following behavior of ants has become the inspiring source of ACO. The goal of orthogonal design is to perform a minimum number of tests while acquiring the most valuable information about the considered problem. It works by judiciously selecting a subset of level combinations using a particular type of array called an orthogonal array (OA). As a result, well-balanced subsets of level combinations are chosen. The characteristics of OSACO are mainly the following: a) each independent variable space (IVS) of CFO is dispersed into a number of random nodes; b) the carriers of ACO pheromone are shifted to the nodes; c) an SP is obtained by each ant choosing one appropriate node from each IVS; d) with the ODS, the best SP is further improved.
Informally, its procedural steps are summarized as follows. Step 1) Initialization: the nodes and their pheromone values are initialized. Step 2) Solution construction: ants follow the ACO mechanism to select nodes separately using pheromone values and form new SPs. Step 3) Sorting and orthogonal search: the SPs in this iteration are sorted and the orthogonal search procedure is applied to the global best SP. Step 4) Pheromone updating: pheromone values on all nodes of all SPs are updated using the pheromone depositing rule and the pheromone evaporation rule. Step 5) SP reconstruction: a number of the worst SPs are regenerated. Step 6) Termination test: if the test is passed, stop; otherwise go to Step 2. To facilitate understanding and explanation of the proposed algorithm, we take the optimization task to be minimizing a D-dimensional function f(X), X = (x1, x2, …, xD). The lower and upper bounds of variable xi are lowBoundi and upBoundi. Nevertheless, without loss of generality, this scheme can also be applied to other continuous-space optimization problems.
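The iteration above can be sketched, in heavily simplified form, as a toy 1-D continuous ant colony search (illustrative Python only; the node count, evaporation rate, and regeneration rule are invented, and the orthogonal search step is omitted):

```python
import random

def simple_continuous_aco(f, low, high, n_nodes=20, n_ants=10,
                          iters=60, rho=0.1, seed=1):
    """Toy 1-D continuous ACO: the search interval is dispersed into
    random nodes carrying pheromone; ants pick nodes proportionally to
    pheromone, the iteration-best node gets a deposit, all pheromone
    evaporates, and the worst node is regenerated near the global best."""
    rng = random.Random(seed)
    nodes = [rng.uniform(low, high) for _ in range(n_nodes)]
    tau = [1.0] * n_nodes                      # pheromone on each node
    best_x = min(nodes, key=f)
    for _ in range(iters):
        # Step 2: ants construct solutions by pheromone-weighted choice.
        picks = rng.choices(range(n_nodes), weights=tau, k=n_ants)
        it_best = min(picks, key=lambda i: f(nodes[i]))
        if f(nodes[it_best]) < f(best_x):
            best_x = nodes[it_best]
        # Step 4: evaporation, then deposit on the iteration best.
        tau = [(1 - rho) * t for t in tau]
        tau[it_best] += 1.0
        # Step 5: regenerate the worst node near the current best.
        worst = max(range(n_nodes), key=lambda i: f(nodes[i]))
        nodes[worst] = best_x + rng.gauss(0, 0.1 * (high - low))
    return best_x

# Minimize (x - 2)^2 on [-5, 5]; the search should settle near x = 2.
best = simple_continuous_aco(lambda x: (x - 2.0) ** 2, -5.0, 5.0)
```

This keeps the pheromone-on-nodes idea of characteristics (a)-(c) above while dropping the orthogonal refinement of (d).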

(7) Topic: ‘Distinctive image features from scale-invariant keypoints’

The proposed method in this paper is described as: Image matching is a fundamental aspect of many problems in computer vision, including object or scene recognition, solving for 3D structure from multiple images, stereo correspondence, and motion tracking. This paper describes image features that have many properties that make them suitable for matching differing images of an object or scene. The features are invariant to image scaling and rotation, and partially invariant to change in illumination and 3D camera viewpoint. They are well localized in both the spatial and frequency domains, reducing the probability of disruption by occlusion, clutter, or noise. Large numbers of features can be extracted from typical images with efficient algorithms. In addition, the features are highly distinctive, which allows a single feature to be correctly matched with high probability against a large database of features, providing a basis for object and scene recognition[2], [10].
The cost of extracting these features is minimized by taking a cascade filtering approach, in which the more expensive operations are applied only at locations that pass an initial test. Following are the major stages of computation used to generate the set of image features:
Scale-space extrema detection: The first stage of computation searches over all scales and image locations. It is implemented efficiently by using a difference-of-Gaussian function to identify potential interest points that are invariant to scale and orientation.
Keypoint localization: At each candidate location, a detailed model is fit to determine location and scale. Keypoints are selected based on measures of their stability.
Orientation assignment: One or more orientations are assigned to each keypoint location based on local image gradient directions. All future operations are performed on image data that has been transformed relative to the assigned orientation, scale, and location for each feature, thereby providing invariance to these transformations.
Keypoint descriptor: The local image gradients are measured at the selected scale in the region around each keypoint. These are transformed into a representation that allows for significant levels of local shape distortion and change in illumination.
This approach has been named the Scale Invariant Feature Transform (SIFT), as it transforms image data into scale-invariant coordinates relative to local features.
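The first stage above, difference-of-Gaussian filtering, can be illustrated with a tiny 1-D sketch (illustrative Python, not the thesis implementation): the DoG is the difference of two Gaussian blurs whose scales differ by a constant factor k, and its response is strongest at distinctive local structure such as an impulse.

```python
import numpy as np

def gaussian_blur_1d(signal, sigma):
    """Blur a 1-D signal by convolving with a sampled Gaussian kernel."""
    radius = max(1, int(3 * sigma))
    xs = np.arange(-radius, radius + 1)
    kernel = np.exp(-xs ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    return np.convolve(signal, kernel, mode="same")

def difference_of_gaussian(signal, sigma, k=2 ** (1 / 3)):
    """DoG: subtract two blurs whose scales differ by the factor k."""
    return gaussian_blur_1d(signal, k * sigma) - gaussian_blur_1d(signal, sigma)

# An impulse: the DoG response is largest in magnitude right at it.
signal = np.zeros(41)
signal[20] = 1.0
dog = difference_of_gaussian(signal, sigma=1.6)
```

The extremum of the response sits at the impulse, which is exactly the property the scale-space extrema detection stage exploits to localize interest points.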
(8)Topic: ‘SSVM: A Simple SVM Algorithm’
Author: S.V.N. Vishwanathan, M. Narasimha Murty
Publication: Journal of Theoretical and Applied Information Technology, 2003

Our algorithm maintains a candidate support vector set. It initializes the set with the closest pair of points from opposite classes, like the DirectSVM algorithm. As soon as the algorithm finds a violating point in the dataset, it greedily adds it to the candidate set. It may happen that the addition of the violating point as a support vector is prevented by other candidate support vectors already present in the set; we simply prune away all such points from the candidate set. To ensure that the KKT conditions are satisfied, we make repeated passes through the dataset until no violators can be found. We use the quadratic penalty formulation to ensure linear separability of the data points in the kernel space [11].
A. Finding the closest pair of points: First of all, we observe that finding the closest pair of points in kernel space requires n² kernel computations, where n is the total number of data points. But if we use a distance-preserving kernel, like the exponential kernel, the nearest neighbors in the feature space are the same as the nearest neighbors in the kernel space. Hence we need not perform any costly kernel evaluations for the initialization step.
B. Adding a Point to the Support Vector Set: Given a set S which contains only Support Vectors, we wish to add another Support Vector c to S.
C. Pruning: In the discussion above we tacitly assumed that (αp + Δαp) > 0 for all p ∈ S. But this condition may be violated if some point in S blocks c. When we say that a point p in S is blocking the addition of c to S, we mean that the αp of that point may become negative due to the addition of c to S. Physically, this implies that p is making a transition from S to the well-classified set R.
D. Our algorithm: Using the ideas discussed above, an iterative algorithm can be designed which scans through the dataset looking for violators. Using the ideas presented in Section II-B, the violator is made a support vector. Blocking points are identified and pruned away using the ideas presented in Section II-C. The algorithm stops when all points are classified within an error bound, i.e., yi f(xi) > 1 − εi. The outline of our algorithm is presented below.
Algorithm Simple SVM:
candidateSV = { closest pair from opposite classes }
while there are violating points do
    Find a violator c
    candidateSV = candidateSV ∪ { c }
    if any αp < 0 due to the addition of c then
        candidateSV = candidateSV \ { p }
        repeat until all such points are pruned
    end if
end while
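A greatly simplified, hypothetical Python analogue of this violator-scan loop (a kernel-perceptron-style update rather than the paper's exact KKT-based update with pruning; the data and gamma are invented):

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def simple_sv_loop(X, y, gamma=1.0, max_passes=50):
    """Repeatedly sweep the dataset; whenever a point violates the margin
    under the current kernel expansion, add/reinforce it in the candidate
    set. Stops when a full pass finds no violators."""
    n = len(X)
    alpha = np.zeros(n)
    for _ in range(max_passes):
        violated = False
        for i in range(n):
            f = sum(alpha[j] * y[j] * rbf(X[j], X[i], gamma) for j in range(n))
            if y[i] * f <= 0:       # point i is a violator
                alpha[i] += 1.0     # make it a candidate support vector
                violated = True
        if not violated:
            break
    support = np.nonzero(alpha)[0]
    return alpha, support

X = np.array([[0.0, 0.0], [0.1, 0.1], [2.0, 2.0], [2.1, 1.9]])
y = np.array([-1, -1, 1, 1])
alpha, support = simple_sv_loop(X, y)
preds = [np.sign(sum(alpha[j] * y[j] * rbf(X[j], x) for j in range(4)))
         for x in X]
```

Only a subset of the points end up in the candidate set, while the resulting kernel expansion classifies all of them correctly; the real SimpleSVM additionally prunes blocked candidates as in Section II-C.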

CHAPTER 3

TECHNIQUES USED IN PROPOSED WORK

Some of the techniques from existing research work are used in this thesis. These techniques are listed below.
3.1 SCALE INVARIANT FEATURE DESCRIPTOR (SIFT):
Images of one scene may be taken from different viewpoints or may undergo transformations such as rotation or noise, so it is likely that two images of the same scene will differ. The task of finding similarity correspondences between two images of the same scene or object has thus become a challenging problem in a number of vision applications. Such applications range from image registration, camera calibration, object recognition, and scene localization in navigation systems to image-retrieval-based search engines. For image matching, features must be extracted from the images that provide reliable matching between different viewpoints of the same image. Feature detection operates within an image and seeks to describe only those parts of the image where we can obtain unique information or signatures (i.e., feature descriptors). During training, feature descriptors are extracted from sample images and stored. In classification, the feature descriptors of a query image are matched with all trained image features, and the trained image giving the maximum correspondence is considered the best match. Feature descriptor matching can be based on distances such as Euclidean or Mahalanobis distance, or on distance ratios. To detect feature descriptors, we use the SIFT method [9]. The SIFT feature detector has four main stages, namely:
scale- space extrema detection
keypoint localization
orientation computation
key point descriptor extraction.
To build the SIFT descriptor, we perform the following process:
Take a 16×16 window around the interest point (i.e., at the scale detected).
Divide it into a 4×4 grid of cells.
Compute a histogram of image gradient directions in each cell (8 bins each).
16 histograms × 8 orientations = 128 features
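The steps above can be sketched in a minimal NumPy implementation. This is a simplified illustration, not Lowe's full method (no Gaussian weighting, partial voting, or clipping), and `sift_like_descriptor` is a hypothetical helper name:

```python
import numpy as np

def sift_like_descriptor(patch):
    """Simplified 128-D SIFT-style descriptor for a 16x16 patch:
    4x4 grid of cells, 8-bin gradient-orientation histogram per cell."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)            # [0, 2*pi)
    bins = np.minimum((ang / (2 * np.pi) * 8).astype(int), 7)
    desc = np.zeros(128)
    for cy in range(4):
        for cx in range(4):
            cell_bins = bins[cy*4:cy*4+4, cx*4:cx*4+4].ravel()
            cell_mag = mag[cy*4:cy*4+4, cx*4:cx*4+4].ravel()
            hist = np.bincount(cell_bins, weights=cell_mag, minlength=8)
            desc[(cy*4 + cx)*8:(cy*4 + cx)*8 + 8] = hist
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc

patch = np.outer(np.arange(16), np.ones(16))  # simple vertical ramp
d = sift_like_descriptor(patch)
print(d.shape)  # (128,)
```

The 16 cell histograms of 8 bins each are concatenated, giving the 128-dimensional feature vector described above.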

Image Gradients Keypoint Descriptor
Figure 3.1.1 : Computing Process of Keypoint Descriptor

3.1.1 SCALE SPACE EXTREMA DETECTION:

In this step we extract scale- and rotation-invariant interest points (i.e., keypoints):
find local maxima of the Hessian in space, and of
the DoG (Difference of Gaussian) in scale.

Figure 3.1.1.1: Evaluate Scale Space with its Octave

DoG images are grouped by octaves (i.e., doubling of σ0), with a fixed number of levels per octave.

Figure 3.1.1.2: Finding the DoG from different Gaussian scales

where images within each octave are separated by a constant factor k. If each octave is divided into s intervals, then k^s = 2, i.e., k = 2^(1/s).
The Gaussian scale space of the first octave can be generated as below; this space is generated for every region of interest of the image.

Figure 3.1.1.3 Gaussian Scale Space
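The construction of one octave of the scale space can be sketched as follows. For simplicity, each level is blurred directly from the base image rather than incrementally, and `blur` and `dog_octave` are hypothetical helper names:

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur: convolve rows, then columns."""
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 0, tmp)

def dog_octave(img, sigma0=1.6, s=3):
    """One octave: s+3 Gaussian levels separated by k = 2^(1/s),
    differenced pairwise to give s+2 DoG levels."""
    k = 2 ** (1.0 / s)
    gaussians = [blur(img, sigma0 * k**i) for i in range(s + 3)]
    return [g2 - g1 for g1, g2 in zip(gaussians, gaussians[1:])]

img = np.random.rand(64, 64)
dogs = dog_octave(img)
print(len(dogs))  # 5
```

With s = 3 scales per octave, the octave holds six Gaussian images and five DoG images, matching the pyramid structure described above.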

Parameters (i.e., scales per octave, σ0, etc.) were chosen experimentally based on keypoint
(i) repeatability,
(ii) localization, and
(iii) matching accuracy.

From Lowe’s paper:

Number of scales per octave: 3

σ0 = 1.6

Extract local extrema (i.e., minima or maxima) in DoG pyramid. Compare each point to its 8 neighbors at the same level, 9 neighbors in the level above, and 9 neighbors in the level below (i.e., 26 total).

The following figure represents a 3-level scale space; each candidate point is surrounded by its 26 nearest neighbors. The keypoint descriptors generated by the SIFT method reflect different features of the image, such as shape, texture and color.

Figure 3.1.1.4 Scale Space with its neighbors
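The 26-neighbor comparison described above can be sketched directly; `is_local_extremum` is a hypothetical helper operating on three adjacent DoG levels:

```python
import numpy as np

def is_local_extremum(dog, i, y, x):
    """Check whether dog[i][y, x] is a maximum or minimum over the
    3x3x3 cube of its 26 neighbours (same level, level above, level below)."""
    cube = np.stack([d[y-1:y+2, x-1:x+2] for d in dog[i-1:i+2]])
    v = dog[i][y, x]
    return v == cube.max() or v == cube.min()

# toy example: the centre of the middle level is a clear maximum
dog = [np.zeros((5, 5)) for _ in range(3)]
dog[1][2, 2] = 1.0
print(is_local_extremum(dog, 1, 2, 2))  # True
```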

3.1.2 KEYPOINTS LOCALIZATION:

Determine the location and scale of keypoints to sub-pixel and sub-scale accuracy by fitting a 3D quadratic polynomial around each detected keypoint location; the fitted offset gives the sub-pixel, sub-scale estimated location. This gives a substantial improvement to matching and stability. Keypoints with low contrast (i.e., sensitive to noise) are rejected, assuming image values have been normalized to [0, 1]. Points lying on edges (or close to edges) are also rejected. Harris uses the auto-correlation matrix:

R(AW) = det(AW) − k trace²(AW), or R(AW) = λ1 λ2 − k (λ1 + λ2)²
SIFT uses the Hessian matrix. i.e., Hessian encodes principal curvatures

Let α = the largest eigenvalue (λmax) and β = the smallest eigenvalue (λmin), proportional to the principal curvatures, and let r = α/β. Reject a keypoint if trace²(H)/det(H) > (r + 1)²/r (SIFT uses r = 10).

Figure 3.1.2.1: (a) 233×189 image (b) 832 DoG extrema (c) 729 left after low contrast threshold (d) 536 left after testing ratio based on Hessian
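The Hessian-based edge test above can be sketched as follows, with `passes_edge_test` a hypothetical helper taking the second derivatives of the DoG at a keypoint:

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Reject edge-like keypoints via the Hessian trace/det ratio test:
    accept only if trace(H)^2 / det(H) < (r + 1)^2 / r (Lowe uses r = 10)."""
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:            # principal curvatures have different signs: reject
        return False
    return tr * tr / det < (r + 1) ** 2 / r

print(passes_edge_test(2.0, 2.1, 0.1))   # blob-like curvature: True
print(passes_edge_test(10.0, 0.1, 0.0))  # one dominant curvature (edge): False
```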

3.1.3 ORIENTATION ASSIGNMENT:

Create histogram of gradient directions, within a region around the keypoint, at selected scale

Figure 3.1.3.1: Histogram form of Orientation

There are 36 bins (i.e., 10° per bin). Histogram entries are weighted by (i) the gradient magnitude and (ii) a Gaussian function with σ equal to 1.5 times the scale of the keypoint. The canonical orientation is assigned at the peak of the smoothed histogram (a parabola is fit to better localize the peak).

Figure 3.1.3.2 : Processing of Orientation

When there are peaks within 80% of the highest peak, multiple orientations are assigned to the keypoint. About 15% of keypoints have multiple orientations assigned, which significantly improves the stability of matching.
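The 36-bin orientation histogram and the 80% peak rule can be sketched as follows (a simplified illustration without the Gaussian weighting or parabola fit; `dominant_orientations` is a hypothetical helper):

```python
import numpy as np

def dominant_orientations(mags, angs, peak_ratio=0.8):
    """Build a 36-bin (10 degrees each) magnitude-weighted orientation
    histogram and return all peaks within peak_ratio of the highest."""
    hist, _ = np.histogram(angs, bins=36, range=(0, 360), weights=mags)
    peaks = np.flatnonzero(hist >= peak_ratio * hist.max())
    return peaks * 10.0  # start angle of each peak bin, in degrees

# two strong directions: ~5 deg (10 samples) and ~95 deg (9 samples)
angs = np.array([5.0] * 10 + [95.0] * 9)
mags = np.ones_like(angs)
print(dominant_orientations(mags, angs))  # peaks at 0 and 90 degrees
```

The second peak is within 80% of the first, so the keypoint would receive two orientations, as described above.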
3.1.4 KEYPOINTS DESCRIPTOR EXTRACTION:

Take a 16×16 window around the detected interest point, divide it into a 4×4 grid of cells, and compute a histogram in each cell.

Figure 3.1.4.1: Alignment of Keypoint Descriptor

Each histogram entry is weighted by (i) the gradient magnitude and (ii) a Gaussian function with σ equal to 0.5 times the width of the descriptor window.

Figure 3.1.4.2: Process of finding Keypoint Descriptor
Partial voting: distribute histogram entries into adjacent bins (i.e., additional robustness to shifts). Each entry is added to the adjacent bins, multiplied by a weight of 1 − d, where d is the distance from the bin to which it belongs.

Figure 3.1.4.3: Histogram level of each keypoint in image ROI

The descriptor depends on two main parameters:

(1) the number of orientations r
(2) an n × n array of orientation histograms
The number of features is r·n²; SIFT uses r = 8 and n = 4, giving 128 features.
Invariance to linear illumination changes: normalization to unit length is sufficient for the 128 features. For non-linear illumination changes, the feature vector is further modified. First, the vector is normalized to unit length. A change in image contrast, in which each pixel value is multiplied by a constant, multiplies the gradients by the same constant, so this contrast change is canceled by the normalization. A brightness change, in which a constant is added to each pixel, does not affect the gradient values, as they are computed from pixel differences. The descriptor is therefore invariant to affine changes in illumination. However, non-linear illumination changes can also occur, due to camera saturation or illumination changes that affect 3D surfaces with differing orientations by different amounts.
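The normalization step above can be sketched as follows. The clipping threshold of 0.2 follows Lowe's scheme for damping non-linear illumination effects; `normalize_descriptor` is a hypothetical helper name:

```python
import numpy as np

def normalize_descriptor(desc, clip=0.2):
    """Normalize to unit length, clip large entries (non-linear illumination),
    then renormalize, following Lowe's descriptor post-processing."""
    d = desc / max(np.linalg.norm(desc), 1e-12)
    d = np.minimum(d, clip)
    return d / max(np.linalg.norm(d), 1e-12)

np.random.seed(0)
d = normalize_descriptor(np.random.rand(128) * 5)
print(round(float(np.linalg.norm(d)), 6))  # 1.0
```

Multiplying the raw descriptor by any positive constant (a contrast change) leaves the normalized result unchanged, which is exactly the invariance argued above.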
3.2 CONTINUOUS ORTHOGONAL ANT COLONY OPTIMIZATION:

The proposed method of this dissertation uses the concept of optimization for finding optimal feature descriptors. We explore image retrieval using an optimizer: the Continuous Orthogonal Ant Colony Optimization (COACO) technique combined with the existing SVM method. This optimizer improves the performance of image retrieval from a large image dataset. The basic concepts of COACO are as follows [6], [7].
3.2.1 PRINCIPLE OF THE CONTINUOUS ORTHOGONAL DESIGN APPROACH:

The number of SIFT feature-descriptor combinations to be tested in the orthogonal design is much smaller than in the full-scale experiment. Generally, the orthogonal design method is a partial experimental method over all levels of all factors, but a full-scale experiment over any two factors. The levels of the orthogonal array are made to be mutually "orthogonal". Take three factors and draw a cube, as shown in the following figure. The three factors are denoted A, B and C, with subscripts indexing the levels. There are in total 27 combinations of the three factors, illustrated as spots in the figure. Based on the first three columns of OA(9, 4, 3), nine combinations are selected, illustrated as hollow spots. In the cube, there are three hollow spots on every face (including the inside faces) and one hollow spot on every edge (including the inside edges). These nine combinations approximately reflect the solution space. Although the best combination among these sampled experiments may not be the best one in the full-scale experiment, this method reduces the number of tests and gives a direction toward the optimal combinations.
In fact, there are different types of orthogonal combinations with three factors. Any three columns of OA(9, 4, 3), without duplication, can form nine orthogonal combinations composed of different spots. Any orthogonal array with at least as many columns as the number of factors in the given multifactor problem can be used; an orthogonal array with some columns removed is still an orthogonal array. However, an orthogonal array with more factors comes with more combinations, which consume exponentially more time to complete the experiment. The goal of orthogonal design is to perform a minimum number of tests while acquiring the most valuable information about the considered problem. It works by judiciously selecting a subset of level combinations using a particular type of array called the orthogonal array (OA). As a result, well-balanced subsets of level combinations are chosen.

Figure. 3.2.1.1 Distribution model of three factors with three levels.
An orthogonal array complies with three elementary transformations. If any two factors of an orthogonal array are swapped, or any two levels of an orthogonal array are swapped, or any two levels of a factor are swapped, the resulting array is still an orthogonal array. In this work, the columns of the orthogonal array are chosen randomly in order to construct various kinds of orthogonal neighboring points in the proposed algorithm.
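The defining property of OA(9, 4, 3) — that any two columns contain every ordered level pair exactly once — can be checked directly. The array below is the standard L9 orthogonal array; `is_orthogonal` is a hypothetical checker:

```python
import itertools
import numpy as np

# OA(9, 4, 3): 9 runs, 4 factors, 3 levels (standard L9 array)
OA = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 2, 2, 2],
    [1, 0, 1, 2],
    [1, 1, 2, 0],
    [1, 2, 0, 1],
    [2, 0, 2, 1],
    [2, 1, 0, 2],
    [2, 2, 1, 0],
])

def is_orthogonal(oa, s=3):
    """Every pair of columns must contain all s*s ordered level pairs."""
    for c1, c2 in itertools.combinations(range(oa.shape[1]), 2):
        if len(set(zip(oa[:, c1], oa[:, c2]))) != s * s:
            return False
    return True

print(is_orthogonal(OA))            # True
print(is_orthogonal(OA[:, [0, 2, 3]]))  # any 3 columns are still an OA: True
```

The second check illustrates the claim above that dropping columns from an orthogonal array leaves an orthogonal array.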
3.2.2 CONTINUOUS ORTHOGONAL ANT COLONY OPTIMIZATION (COACO):
The ants sent to find the optimal location in the given domain use pheromone and orthogonal exploration to accomplish the mission. The domain is divided into multiple regions of various sizes. Every region has multiple properties: the search radii, the coordinates of the center, the amount of pheromone, and a rank by its desirability. The desirability is evaluated by the objective function. This approach consists of two parts.
3.2.2.1 ORTHOGONAL EXPLORATION:
In each iteration, m ants are dispatched. There are a number of regions in the domain, which are randomly generated or inherited from the previous iteration. Based on the amount of pheromone in the regions, each ant chooses which region to explore first. A user-defined probability q0 determines whether to choose the region with the highest pheromone or the region selected by roulette wheel selection. The rule by which an ant chooses region j is

j = argmax over u ∈ SR of τu, if q ≤ q0; otherwise j = RWS(SR)

where SR is the set of regions, τj is the pheromone value of region j, and q is a uniform random number in [0, 1]. RWS stands for roulette wheel selection, here based on the pheromone values of the regions.
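The region-selection rule above can be sketched as follows (a simple illustration of the pseudorandom-proportional rule; `choose_region` is a hypothetical helper and the pheromone values are made up):

```python
import random

def choose_region(pheromone, q0=0.9, rng=random):
    """With probability q0 exploit the region of highest pheromone;
    otherwise pick a region by roulette wheel selection."""
    if rng.random() <= q0:
        return max(range(len(pheromone)), key=lambda j: pheromone[j])
    total = sum(pheromone)
    r = rng.uniform(0, total)
    acc = 0.0
    for j, tau in enumerate(pheromone):
        acc += tau
        if r <= acc:
            return j
    return len(pheromone) - 1

random.seed(0)
tau = [0.2, 1.5, 0.3]
picks = [choose_region(tau) for _ in range(100)]
# region 1 (highest pheromone) dominates the choices
```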
The orthogonal exploration for an ant in a region can be outlined as follows.
Step 1: Choose a region by above equation.
Step 2: Randomly choose n different columns of the given orthogonal array OA(N, k, s) as a new orthogonal array.
Step 3: Generate N neighboring points.
Step 4: Adaptively adjust the radii of the region.
Step 5: Move the region center to the best point.
All m ants perform the above steps and then a globally best point is found. If the best point ever found does not change for a predefined number of iterations, it is discarded and replaced by a randomly generated new point. This step adds diversity to the algorithm and avoids trapping in local optima.

3.2.2.2 GLOBAL MODULATION:
Global modulation is performed after each orthogonal exploration and can be outlined as follows.
Step 1. Set the variable ranking = 1 and S′R = ∅.
Step 2. Find the best region j in SR.
Step 3. Set rankj = ranking and update the pheromone value of region j. Move region j into S′R.
Step 4. Update ranking = ranking + 1.
Step 5. If ranking exceeds the predefined number of retained regions, go to Step 6; else go to Step 2.
Step 6. Randomly generate regions to replace the regions left in SR. Move all regions in S′R into the new SR.
3.2.2.3 BASIC CONCEPT OF COACO:
The main steps in the continuous orthogonal ant colony (COAC) algorithm are the orthogonal exploration and the global modulation. An overall flowchart of COAC is illustrated in Figure 3.2.2.3.1, where MAXITER is the predefined maximum number of iterations.

Figure 3.2.2.3.1: Flowchart of the continuous orthogonal ant colony (COACO).
From the above flowchart, we use the continuous orthogonal design approach to determine the best locations of keypoint descriptors in an image. A counter parameter t is used to search the descriptors at all locations; MAXITER represents the total number of locations in the orthogonal design. The ant orthogonal exploration scheme determines the best path of keypoint descriptors, and global modulation updates the region of interest in the image. The characteristics of COACO are mainly the following: a) each independent variable space (IVS) is dispersed into a number of random nodes; b) the pheromone carriers of ACO are shifted to the nodes; c) a solution path (SP) is obtained by each ant choosing one appropriate node from each IVS; d) with the orthogonal design scheme, the best SP is further improved. Informally, the procedural steps are:
Step 1) Initialization: nodes and the pheromone values of nodes are initialized.
Step 2) Solution Construction: ants follow the ACO mechanism to select nodes separately using pheromone values and form new SPs.
Step 3) Sorting and Orthogonal Search: the SPs of this iteration are sorted and the orthogonal search procedure is applied to the globally best SP.
Step 4) Pheromone Updating: pheromone values on all nodes of all SPs are updated using the pheromone depositing and evaporating rules.
Step 5) SP Reconstruction: a number of the worst SPs are regenerated.
Step 6) Termination Test: if the test is passed, stop; otherwise go to Step 2.

3.3 SUPPORT VECTOR MACHINES:

In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples into one category or the other, making it a non-probabilistic binary linear classifier[3],[15],[16]. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.

Figure 3.3.1: Flowchart of SVM

Support vector machine is a machine learning technique that is well founded in statistical learning theory. Statistical learning theory is not only a tool for theoretical analysis but also a tool for creating practical algorithms for pattern recognition. This abstract theoretical analysis allows us to discover a general model of generalization. On the basis of the VC dimension concept, constructive distribution-independent bounds on the rate of convergence of learning processes can be obtained, and the structural risk minimization principle has been found. The new understanding of the mechanisms behind generalization not only changes the theoretical foundation of generalization, but also changes the algorithmic approaches to pattern recognition [21], [22]. The generalized algorithm of the simple Support Vector Machine is as follows:
Algorithm of Simple SVM
candidateSV = { closest pair from opposite classes }
while there are violating points do
    Find a violator
    candidateSV = candidateSV ∪ {violator}
    if any αp < 0 due to addition of c to S then
        candidateSV = candidateSV \ {p}
        repeat till all such points are pruned
    end if
end while

The Support Vector Machine (SVM) was first proposed by Vapnik and has since attracted a high degree of interest in the machine learning research community. Several studies have reported that SVMs are generally capable of delivering higher classification accuracy than other data classification algorithms. SVMs have been employed in a wide range of real-world problems such as text categorization, handwritten digit recognition, tone recognition, image classification and object detection, microarray gene expression data analysis, and data classification [23], [24], [25]. It has been shown that SVMs are consistently superior to other supervised learning methods. However, for some datasets the performance of the SVM is very sensitive to how the cost parameter and kernel parameters are set, so the user normally needs to conduct extensive cross validation to find the optimal parameter setting; this process is commonly referred to as model selection. One practical issue with model selection is that it is very time consuming. We have experimented with a number of parameters associated with the SVM algorithm that can impact the results, including the choice of kernel function, the standard deviation of the Gaussian kernel, the relative weights associated with slack variables to account for the non-uniform distribution of labeled data, and the number of training examples.

Figure 3.3.2: Neural Network for SVM

SVMs are a set of related supervised learning methods used for classification and regression. They belong to the family of generalized linear classifiers. A special property of SVMs is that they simultaneously minimize the empirical classification error and maximize the geometric margin; hence they are also called maximum margin classifiers. The SVM is based on structural risk minimization (SRM). It maps the input vectors into a higher-dimensional space where a maximal separating hyperplane is constructed. Two parallel hyperplanes are constructed on each side of the hyperplane that separates the data; the separating hyperplane is the one that maximizes the distance between the two parallel hyperplanes. The assumption is that the larger the margin between these parallel hyperplanes, the better the generalization error of the classifier. We consider data points of the form

{(x1, y1), (x2, y2), (x3, y3), …, (xn, yn)},

where yi = 1 or −1 is a constant denoting the class to which the point xi belongs, n is the number of samples, and each xi is a p-dimensional real vector. Scaling is important to guard against attributes with larger variance. We can view this training data by means of the dividing (or separating) hyperplane, which takes the form

w · x + b = 0    (1)

where b is a scalar and w is a p-dimensional vector. The vector w is perpendicular to the separating hyperplane. Adding the offset parameter b allows us to increase the margin; absent b, the hyperplane is forced to pass through the origin, restricting the solution. As we are interested in the maximum margin, we are interested in the parallel hyperplanes, which can be described by the equations

w · x + b = 1
w · x + b = −1

If the training data are linearly separable, we can select these hyperplanes so that there are no points between them, and then try to maximize their distance. By geometry, the distance between the hyperplanes is 2/‖w‖, so we want to minimize ‖w‖. To exclude data points from the margin, we need to ensure that for all i either

w · xi − b ≥ 1 or w · xi − b ≤ −1

This can be written as

yi (w · xi − b) ≥ 1 ,  1 ≤ i ≤ n    (2)

Figure 3.3.3: Maximum margin hyperplanes for a SVM trained with samples from two classes

Samples along the hyperplanes are called support vectors (SVs). A separating hyperplane with the largest margin, defined by M = 2/‖w‖, specifies the support vectors, i.e., the training data points closest to it, which satisfy

yi [wᵀ · xi + b] = 1    (3)

The Optimal Canonical Hyperplane (OCH) is a canonical hyperplane having a maximum margin. For all the data, the OCH should satisfy the following constraints:

yi [wᵀ · xi + b] ≥ 1 ;  i = 1, 2, …, l    (4)

where l is the number of training data points. In order to find the optimal separating hyperplane having a maximal margin, a learning machine should minimize ‖w‖² subject to the inequality constraints

yi [wᵀ · xi + b] ≥ 1 ;  i = 1, 2, …, n

Figure 3.3.4: (a) Training Data apply in SVM (b) Query Data use as Prediction

This optimization problem is solved at the saddle points of the Lagrange function

L(w, b, α) = ½ ‖w‖² − Σi αi ( yi [wᵀ · xi + b] − 1 )    (5)

where αi is a Lagrange multiplier. The search for an optimal saddle point (w0, b0, α0) is necessary because the Lagrangian must be minimized with respect to w and b, and maximized with respect to the nonnegative αi (αi ≥ 0). The problem can be solved either in the primal form (in terms of w and b) or in the dual form (in terms of αi). Equations (4) and (5) are convex, and the KKT conditions are necessary and sufficient for a maximum of equation (4). Partially differentiating equation (5) with respect to the saddle point (w0, b0, α0) gives

∂L/∂w0 = 0, i.e., w0 = Σi αi yi xi    (6)

∂L/∂b0 = 0, i.e., Σi αi yi = 0    (7)

Substituting equations (6) and (7) into equation (5) changes the primal form into the dual form.

Figure 3.3.5: Machine Learning Classifier

In order to find the optimal hyperplane, the dual Lagrangian Ld(α) has to be maximized with respect to the nonnegative αi (i.e., αi must lie in the nonnegative quadrant) and with respect to the equality constraint:

Σi αi yi = 0 ,  αi ≥ 0 ,  i = 1, 2, …, l

Note that the dual Lagrangian Ld(α) is expressed in terms of the training data and depends only on the scalar products of input patterns (xiᵀ xj). Training vectors xi are mapped into a higher (possibly infinite) dimensional space by a function φ; the SVM then finds a linear separating hyperplane with maximal margin in this higher-dimensional space. C > 0 is the penalty parameter of the error term. Furthermore, K(xi, xj) ≡ φ(xi)ᵀ φ(xj) is called the kernel function. There are many kernel functions for SVMs, and how to select a good kernel function is itself a research issue. For general purposes, the following kernel functions are popular:

• Linear kernel: K(xi, xj) = xiᵀ xj
• Polynomial kernel: K(xi, xj) = (γ xiᵀ xj + r)^d , γ > 0
• RBF kernel: K(xi, xj) = exp(−γ ‖xi − xj‖²) , γ > 0
• Sigmoid kernel: K(xi, xj) = tanh(γ xiᵀ xj + r)

Here γ, r and d are kernel parameters. Among these popular kernel functions, the RBF kernel is the main choice for the following reasons:

1. The RBF kernel nonlinearly maps samples into a higher-dimensional space, unlike the linear kernel.
2. The RBF kernel has fewer hyperparameters than the polynomial kernel.
3. The RBF kernel has fewer numerical difficulties.

The SVM handles non-separable feature vectors by relaxing the hyperplane constraints: a cost term is added for points falling inside the separating margin. In most practical applications where classification is required, the data to be separated are not linearly separable, and a solution that ignores the few outlying instances is preferable. The SVM approach is considered a good candidate for classification because of its high generalization performance without the need for a priori knowledge, even when the dimension of the input space is very high. Although the SVM can be applied to various optimization problems such as regression, the classic problem is data classification. SVMs have been widely used in CBIR as a learning technique based on relevance feedback. All such methods have the limitation that they require the user's involvement: the decisions made by the system are tailored to the needs of individuals and may not be applicable generally.

Figure 3.3.6: SVM as a Supervised Learning Model
Model selection is also an important issue in SVMs. SVMs have shown good performance in data classification, but their success depends on the tuning of several parameters which affect the generalization error; this tuning procedure is called model selection. If you use a linear SVM, you only need to tune the cost parameter C. Unfortunately, linear SVMs only suit linearly separable problems, and many problems are non-linearly separable (for example, satellite data and shuttle data are not linearly separable). Therefore, we often apply a nonlinear kernel, and so need to select the cost parameter C and the kernel parameters (γ, d). We usually use grid search with cross validation to select the best parameter set, then apply this parameter set to the training dataset to obtain the classifier, and finally use the classifier to classify the testing dataset to obtain the generalization accuracy.
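The grid-search model selection above can be sketched generically. Here `score` stands in for a k-fold cross-validation accuracy function (a made-up surrogate peaking at C = 10, γ = 0.1 for illustration); `grid_search` is a hypothetical helper:

```python
import itertools

def grid_search(score, Cs, gammas):
    """Return the (C, gamma) pair maximizing a cross-validation score."""
    return max(itertools.product(Cs, gammas), key=lambda p: score(*p))

# stand-in for CV accuracy: a smooth function with its peak at C=10, gamma=0.1
score = lambda C, g: -((C - 10) ** 2 + (g - 0.1) ** 2)
best = grid_search(score, Cs=[0.1, 1, 10, 100], gammas=[0.01, 0.1, 1])
print(best)  # (10, 0.1)
```

In practice the grid is usually logarithmic in both C and γ, and the score callback would train and validate an SVM on each fold.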
CHAPTER 4

PROPOSED METHOD (SVM-COACO)

4.1 INTRODUCTION:

The proposed method of this thesis is SVM-COACO (Support Vector Machine + Continuous Orthogonal Ant Colony Optimization). The SVM technique is a supervised machine learning approach used to classify and retrieve image descriptors. COACO is used to optimize the descriptors, reducing descriptor complexity. This method increases the accuracy of the precision-recall curve. Here we explain the proposed CBIR (Content Based Image Retrieval) system based on the SIFT (Scale Invariant Feature Transform) method. The potential of SVM-COACO is illustrated on a 3D object recognition task using the COIL database and on an image classification and retrieval task using the Corel database. The images are represented either by a matrix of their pixel values (bitmap representation) or by a color histogram. In both cases, the proposed system requires feature extraction and performs recognition on images regarded as points in a high-dimensional space. The feature extraction is performed by the SIFT scheme. We also propose an extension of the basic color histogram which retains more of the information contained in the images.

4.2 ALGORITHM OF PROPOSED WORK:

The Algorithm of proposed method is explained below:

Algorithm: [retimageArray] = SVM_COACO(query_image, imagedataset)
Step 1: Take the query image at 200×300 size; the images of the dataset are also resized to 200×300.
Step 2: The query image and all images of the dataset are partitioned into a 2×3 array of patches.
Step 3: Find the SIFT descriptors of each image patch of the cell array, for the query image as well as for the images of the dataset. The SIFT method performs the following sequence of steps to find the keypoint descriptors for the texture feature:
Scale-space extrema detection: The first stage of computation searches over all scales and image locations. It is implemented efficiently by using a difference-of-Gaussian function to identify potential interest points that are invariant to scale and orientation.
Keypoint localization: At each candidate location, a detailed model is fit to determine location and scale. Keypoints are selected based on measures of their stability.
Orientation assignment: One or more orientations are assigned to each keypoint location based on local image gradient directions. All future operations are performed on image data that has been transformed relative to the assigned orientation, scale, and location for each feature, thereby providing invariance to these transformations.
Keypoint descriptor: The local image gradients are measured at the selected scale in the region around each keypoint. These are transformed into a representation that allows for significant levels of local shape distortion and change in illumination.
Step 4: The above steps are performed repeatedly, and all the image descriptors are stored. Now apply the COACO method to the image dataset for retrieving images. The sequence of steps for finding the best descriptor points using the COAC optimizer is as follows:
1.1 Ant Orthogonal Exploration: Decide the number of iterations for each region of interest, then apply the iterative procedure of ant orthogonal exploration in the following steps.
1.1.1 Choose a region in image patch.
1.1.2 Randomly choose n different columns of the given orthogonal array OA(N, k, s) as a new orthogonal array.
1.1.3 Generate N neighboring points.
1.1.4 Adaptively adjust the radii of the region.
1.1.5 Move the region center to the best point.
1.2 Global Modulation: the above procedure finds the best descriptor point in the selected image patch. Global modulation is then applied as follows:
1.2.1 Set the variable ranking = 1 and S′R = ∅.
1.2.2 Find the best region j in SR.
1.2.3 Set rankj = ranking and update the pheromone value of region j. Move region j into S′R.
1.2.4 Update ranking = ranking + 1.
1.2.5 If ranking exceeds the predefined number of retained regions, go to step 1.2.6; else go to step 1.2.2.
1.2.6 Randomly generate regions to replace the regions left in SR. Move all regions in S′R into the new SR.
1.3 Now create the optimized dataset of image feature descriptors.
Step 5: Train the SVM network with training feature set.
Step 6: Find the class label of query image using SVM.
Step 7: Find all images of the same class as the query image in the image database.
Step 8: Find the nearest matching images to the query image within that class of images using a simple distance metric.
Step 9: Return the resultant best matching images for the query image.
We perform the above steps to retrieve the best matching images. SVM-COACO can be trained even if the number of images is much lower than the dimensionality of the input space. We also point out the need to investigate kernels that are well suited to the data representation; different kernel functions can be used to implement SVM-COACO. The best matching images can be matched on edge, color, or texture features, and the number of images to retrieve from the dataset can be selected.
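The image-partitioning preprocessing of Steps 1–2 can be sketched as follows; `partition_2x3` is a hypothetical helper, and the 200×300 size matches Step 1:

```python
import numpy as np

def partition_2x3(img):
    """Split a 200x300 image into a 2x3 grid of 100x100 patches (Step 2)."""
    rows = np.split(img, 2, axis=0)            # two 100x300 halves
    return [p for r in rows for p in np.split(r, 3, axis=1)]

img = np.zeros((200, 300))
patches = partition_2x3(img)
print(len(patches), patches[0].shape)  # 6 (100, 100)
```

Each of the six patches would then be passed to the SIFT stage to produce its own descriptor set.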
4.3 FLOWCHART OF PROPOSED WORK:
The algorithm of the proposed work can also be explained through flowcharts. First, we take an image dataset of approximately 20,000 images, classified into 10 classes of approximately 2,000 images each. Each image is divided into a 2×3 array, and each image ROI (region of interest) is fed as input into the SIFT method to find the image descriptors. These descriptors are fed into COACO to optimize the descriptors of each image ROI. After optimization, the descriptors are fed into the SVM classifier to find the best matching descriptors for the query image descriptors.
The basic flowchart of the proposed work is given below; it uses call-out operators where the SIFT, COACO and SVM methods are each explained through their own flowcharts.

Figure 4.3.1: Flowchart of proposed method SVM-COACO

The reference of above flowchart is explained below.

Figure 4.3.2: Flowchart of SIFT Technique

Figure 4.3.3: Flowchart of COACO

Figure 4.3.4: Flowchart of Ant Orthogonal Exploration


Figure 4.3.5: Flowchart of GM

Figure 4.3.6: Flowchart of Support Vector Machine

The above proposed method gives better results than the existing CBIR approach. The SIFT descriptors of every image are fed into the COACO technique for optimization, and the outcome of COACO is then fed to the SVM method. The SVM is used as the classifier and retriever of images.

CHAPTER 5

RESULT AND DISCUSSION

More images can be retrieved through the proposed SVM-COACO method. Input query images of different sizes are each resized to a standard size and passed to the SIFT method, and the descriptors are stored in a row vector. In a similar way, the descriptors of the image dataset (approximately 20,000 images) are stored in a .mat file and COACO is applied. The best descriptor points are then obtained; these points are used for matching against the query descriptors to retrieve the best images. A kernel function is used when training the data with the SVM; mostly we use the Radial Basis Function (RBF) and polynomial kernels, which are discussed in the following categories. We obtained the following results.

5.1 RETRIEVE IMAGES ANALYSIS:

In this case, we set the distance metric to 1 for the Manhattan distance and 2 for the Euclidean distance. This distance metric affects the SVM classifier. The kernel function is used in two ways: the Radial Basis Function (RBF) and the polynomial kernel, each with its own kernel parameters.
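The two distance metrics and the 1/2 switch described above can be sketched as follows; `distance` is a hypothetical helper mirroring the metric selector:

```python
import numpy as np

def distance(a, b, metric=1):
    """metric=1: Manhattan (L1); metric=2: Euclidean (L2)."""
    d = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    return d.sum() if metric == 1 else np.sqrt((d ** 2).sum())

print(distance([0, 0], [3, 4], 1))  # 7.0
print(distance([0, 0], [3, 4], 2))  # 5.0
```

The same descriptor pair can thus rank differently under the two metrics, which is why the retrieved image sets in the two modes differ.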

A. MANHATTAN DISTANCE METRIC:

The GUI layout for retrieving images through SVM and SVM-COACO with RBF kernel is as follows:

Figure 5.1.1: Retrieve Best Matching images using SVM (RBF Kernel)

Figure 5.1.2: Retrieve Best Matching images using SVM-COACO (RBF Kernel)

In a similar way, the GUI layout for retrieving images through SVM and SVM-COACO with the polynomial kernel is as follows:

Figure 5.1.3: Retrieve Best Matching images using SVM (Polynomial Kernel)

Figure 5.1.4: Retrieve Best Matching images using SVM-COACO (Polynomial Kernel)
In the above retrieval scheme, we use the SVM and SVM-COACO techniques with the RBF and polynomial kernels. We observe that the SVM CBIR produces inconsistent results compared with SVM-COACO, and the retrieval accuracy of the SVM CBIR is lower than that of the SVM-COACO CBIR.

B. EUCLIDEAN DISTANCE METRIC:

In this case, the distance-metric parameter is set to 2, selecting the Euclidean distance. As before, we use the Radial Basis Function (RBF) and Polynomial kernels with their respective kernel parameters. The GUI layouts for retrieving images through SVM and SVM-COACO with the RBF kernel are as follows:

Figure 5.1.5: Best-matching images retrieved using SVM (RBF kernel)

Figure 5.1.6: Best-matching images retrieved using SVM-COACO (RBF kernel)
In a similar way, the GUI layout for retrieving images through SVM and SVM-COACO with the Polynomial kernel is as follows:

Figure 5.1.7: Best-matching images retrieved using SVM (Polynomial kernel)

Figure 5.1.8: Best-matching images retrieved using SVM-COACO (Polynomial kernel)

The MDM (Manhattan Distance Metric) and EDM (Euclidean Distance Metric) differ from each other; both metrics produce good results, but the retrieved sets are quite different. We set the distance metric to either Euclidean or Manhattan mode, together with the number of images to retrieve. The results above include both consistently and inconsistently retrieved images from the image dataset. The accuracy of consistent image retrieval for SVM and SVM-COACO is discussed in Section 5.3.

5.2 PRECISION-RECALL RELATIONSHIP ANALYSIS:

Recall is defined as the number of relevant documents retrieved by a search divided by the total number of existing relevant documents, while precision is defined as the number of relevant documents retrieved by a search divided by the total number of documents retrieved by that search. In a classification task, the precision for a class is the number of true positives divided by the total number of elements labeled as belonging to the positive class. Recall in this context is defined as the number of true positives divided by the total number of elements that actually belong to the positive class.
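These definitions can be sketched for a single query as follows, treating the retrieved and relevant images as sets of image ids. This is a generic illustration in Python, not the evaluation code used to produce the curves below:

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.
    retrieved: set of image ids returned by the search
    relevant:  set of image ids that are actually relevant"""
    tp = len(retrieved & relevant)  # true positives
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall

retrieved = {1, 2, 3, 4, 5}
relevant = {3, 4, 5, 6, 7, 8}
p, r = precision_recall(retrieved, relevant)
print(p, r)  # precision = 3/5 = 0.6, recall = 3/6 = 0.5
```

Varying the number of retrieved images and recomputing these two values at each cutoff yields the precision-recall (PR) curves shown below.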

A. MANHATTAN DISTANCE METRIC:

Figure 5.2.1: PR Curve for SVM and SVM-COACO (RBF Kernel)

Figure 5.2.2: PR Curve for SVM and SVM-COACO (Polynomial Kernel)
With the MDM (Manhattan Distance Metric), the precision and recall of SVM are somewhat higher than those of SVM-COACO, so the SVM method retrieves more images. However, the images retrieved by SVM are less relevant to the query image than those retrieved by SVM-COACO, because SVM-COACO returns fewer inconsistently retrieved images.

B. EUCLIDEAN DISTANCE METRIC:

Figure 5.2.3: PR Curve for SVM and SVM-COACO (RBF Kernel)

Figure 5.2.4: PR Curve for SVM and SVM-COACO (Polynomial Kernel)
With the EDM (Euclidean Distance Metric), the precision and recall of SVM are likewise somewhat higher than those of SVM-COACO, so the SVM method retrieves more images. However, the images retrieved by SVM are less relevant to the query image than those retrieved by SVM-COACO, because SVM-COACO returns fewer inconsistently retrieved images.

5.3 CONFUSION MATRIX ANALYSIS
A confusion matrix is a table that reports the numbers of false positives, false negatives, true positives, and true negatives. It allows more detailed analysis than the mere proportion of correct guesses (accuracy); retrieval accuracy alone is not a reliable measure of a classifier's real performance, because it yields misleading results on an unbalanced dataset. In our confusion matrices, the rows labeled 0, 1, ..., 9 are the actual classes and the columns labeled 0, 1, ..., 9 are the predicted classes.
The confusion matrix shows the number of matching images in each class for a given query image. The following figures show eight confusion matrices, one for each combination of the two distance metrics (MDM and EDM), the two kernel tricks (RBF and Polynomial), and the two CBIR methods (SVM and SVM-COACO).
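The construction of such a matrix can be sketched as follows. The toy labels here are illustrative; the real matrices use the ten classes 0-9:

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    # rows = actual class, columns = predicted class
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

actual    = [0, 0, 1, 1, 1, 2]
predicted = [0, 1, 1, 1, 2, 2]
cm = confusion_matrix(actual, predicted, n_classes=3)
print(cm[1, 1])  # 2 images of actual class 1 correctly predicted as class 1
```

The diagonal entries count correctly matched images per class; everything off the diagonal is a misclassification, which is exactly what the figures below visualize.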

A. MANHATTAN DISTANCE METRIC:

Figure 5.3.1: Confusion Matrix for SVM (RBF Kernel, MDM)

Figure 5.3.2: Confusion Matrix for SVM-COACO (RBF Kernel, MDM)

Figure 5.3.3: Confusion Matrix for SVM (Polynomial Kernel, MDM)

Figure 5.3.4: Confusion Matrix for SVM-COACO (Polynomial Kernel, MDM)

Under this metric, Figure 5.3.1 shows that 41 images of actual class 1 are correctly matched to predicted class 1, while Figure 5.3.2 shows 10 such images. Similarly, Figure 5.3.3 shows 95 correctly matched images of actual class 1, while Figure 5.3.4 shows 6. The consistent accuracy of SVM-COACO is therefore better than that of SVM.
B. EUCLIDEAN DISTANCE METRIC:

Figure 5.3.5: Confusion Matrix for SVM (RBF Kernel, EDM)

Figure 5.3.6: Confusion Matrix for SVM-COACO (RBF Kernel, EDM)

Figure 5.3.7: Confusion Matrix for SVM (Polynomial Kernel, EDM)

Figure 5.3.8: Confusion Matrix for SVM-COACO (Polynomial Kernel, EDM)

Under this metric, Figure 5.3.5 shows that 32 images of actual class 1 are correctly matched to predicted class 1, while Figure 5.3.6 shows 10 such images. Similarly, Figure 5.3.7 shows 43 correctly matched images of actual class 1, while Figure 5.3.8 shows 9. The consistent accuracy of SVM-COACO is therefore better than that of SVM.

5.4 CONSISTENT RETRIEVAL ACCURACY ANALYSIS

The table below reports the accuracy of consistently retrieved images (in percent) from the image dataset; this accuracy reflects the overall performance of the CBIR system. Accuracy is measured for three image sizes. Each image dataset consists of images classified into classes (labels), and every image in a dataset has the same size, enforced with MATLAB's resize function. We use three datasets with image sizes of 384×256, 410×320 and 500×360. As discussed earlier, the MDM and EDM are used to evaluate the distance between feature descriptors, and the RBF and POLY kernel tricks are implemented through MATLAB's svmtrain function. In addition to accuracy, we use the Mean Squared Error (MSE) as a performance parameter: for both the MDM and the EDM, the MSE of SVM-COACO (the proposed method) is considerably lower than that of the conventional SVM scheme. On the basis of MSE, SVM-COACO is therefore the better approach.
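Both performance measures can be sketched in a few lines. This generic Python version (not the MATLAB evaluation code) computes the fraction of correctly classified retrievals and the MSE between actual and predicted class labels:

```python
import numpy as np

def retrieval_accuracy(actual, predicted):
    # fraction of retrieved images whose predicted class matches the actual class
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.mean(actual == predicted)

def mse(actual, predicted):
    # mean squared error between actual and predicted class labels
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return np.mean((actual - predicted) ** 2)

actual    = [1, 1, 2, 3, 3, 3]
predicted = [1, 2, 2, 3, 3, 1]
print(retrieval_accuracy(actual, predicted))  # 4/6 = 0.666...
print(mse(actual, predicted))                 # (0+1+0+0+0+4)/6 = 0.833...
```

A lower MSE means predicted labels fall closer to the true labels on average, which is why it complements plain accuracy as a performance parameter.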

Figure 5.4.1: Consistent accuracy of retrieved images

From the table above, the first four columns are plotted in the bar chart of Figure 5.4.2 and columns five through eight in the bar chart of Figure 5.4.3.

Figure 5.4.2: Comparison between the consistent accuracy of SVM and SVM-COACO with the MDM

In Figure 5.4.2, the first group of three colored bars shows the results of SVM (with RBF and MDM) for each image size, labeled 1 on the x-axis. In the same way, the results of SVM-COACO (RBF, MDM), SVM (POLY, MDM) and SVM-COACO (POLY, MDM) are labeled 2, 3 and 4 on the x-axis.

Figure 5.4.3: Comparison between the consistent accuracy of SVM and SVM-COACO with the EDM

In Figure 5.4.3, the first group of three colored bars shows the results of SVM (with RBF and EDM) for each image size, labeled 1 on the x-axis. In the same way, the results of SVM-COACO (RBF, EDM), SVM (POLY, EDM) and SVM-COACO (POLY, EDM) are labeled 2, 3 and 4 on the x-axis.

The results above clearly show that the accuracy of consistent image retrieval remains higher for SVM-COACO (the proposed method) than for conventional SVM CBIR, for both kernel tricks (RBF and Polynomial) and both distance metrics (MDM, the Manhattan Distance Metric, and EDM, the Euclidean Distance Metric).
CHAPTER 6

CONCLUSION AND FUTURE WORK

The basic conclusions of this thesis are as follows:

We have demonstrated that active learning with support vector machines provides a powerful tool for searching image databases, outperforming a number of traditional query-refinement schemes. SVM-COACO not only achieves consistently high accuracy across a wide variety of desired results, but does so quickly and maintains high precision when asked to deliver consistently retrieved images. Moreover, unlike recent systems based on plain SVM, it does not require an explicit semantic layer to perform well.

There are a number of interesting directions we wish to pursue. The running time of our algorithm scales linearly with the size of the image database, both for the relevance-feedback phase and for the retrieval of the top-k images: in each querying round we scan the database for the twenty images closest to the current SVM boundary, and in the retrieval phase we scan the entire database for the top k images most relevant to the learned concept. SVM-COACO is therefore practical for image databases containing a few thousand images, but we would like to find ways to scale it to larger databases.

In the proposed scheme, feature aggregation was formulated as a binary classification and retrieval problem and solved by the support vector machine with continuous orthogonal ant colony optimization (SVM-COACO) in a feature-dissimilarity space. Incorporating data cleaning and a noise-tolerant classifier, a new two-step strategy was proposed to handle noisy positive examples. In step 1, an ensemble of SVM-COACO classifiers trained in the feature-dissimilarity space is used as a consensus filter to identify and eliminate noisy positive examples. In step 2, a noise-tolerant relevance calculation associates each retained positive example with a relevance probability to further alleviate the influence of noise.
The experimental results show that the proposed scheme outperforms the competing feature-aggregation-based image retrieval schemes when noisy positive examples are present in the query. The main outcomes of the proposed method are as follows:

1. We reduce the time needed to retrieve images from the dataset through the COACO method, which uses the concept of an orthogonal array; the best keypoint descriptors are found in an orthogonal design matrix of descriptors.
2. We obtain a significantly better confusion matrix, which reports how retrieved images match across the different classes.
3. SVM-COACO provides better accuracy of consistent image retrieval from databases than traditional SVM-based CBIR.

Future directions for this work are as follows:

1. SURF, CHoG, Fast SIFT or Dense SIFT could be used to find the keypoint descriptors.
2. Max-Min ACO (Ant Colony Optimization) or rank-based ACO techniques could be used to compute optimized descriptors.
3. An Artificial Neural Network could be used as a supervised learner to classify and retrieve images.

Overall, we have presented a new algorithm, SVM-COACO, that is efficient, intuitive and fast. We show that the algorithm significantly outperforms other iterative algorithms such as plain SVM in terms of the number of kernel computations. Because of the approach used to build the support vector set, our algorithm does not suffer from the numerical instabilities and round-off errors that plague other numerical algorithms for the SVM problem. In this work, we have demonstrated the potential of support vector machines for image retrieval and image classification. Unlike most learning techniques, SVM-COACO can be trained even when the number of examples is much lower than the dimensionality of the input space. We also pointed out the need to investigate kernels that are well suited to the data representation; here we used two kernel tricks, the Radial Basis Function (RBF) and the Polynomial (POLY) kernel. These results can be extended to other problems and provide a general technique for histogram and density classification. Nevertheless, the image classification problem remains open, since a color histogram may not provide enough information to obtain an efficient classifier.

CHAPTER 7

REFERENCES

[1] V. Karpagam and R. Rangarajan, "Improved content-based classification and retrieval of images using support vector machine," Current Science, vol. 105, no. 9, 10 November 2013.

[2] H. Yue, X. Sun, J. Yang, and F. Wu, "Cloud based image coding for mobile devices: toward thousand to one compression," IEEE Transactions on Multimedia, vol. 15, no. 4, June 2013.

[3] S. Tong and E. Chang, "Support vector machine active learning for image retrieval," IEEE International Conference on Pattern Recognition and Computer Vision, 2011.

[4] J. C. Platt, "Fast training of support vector machines using sequential minimal optimization," 2010.

[5] D. K. Shrivastava and L. Bhambhu, "Data classification using support vector machine," Journal of Theoretical and Applied Information Technology, 2009.

[6] X.-M. Hu, J. Zhang, and Y. Li, "Orthogonal methods based ant colony search for solving continuous optimization problems," Journal of Computer Science and Technology, 23(1): 2-18, Jan. 2008.

[7] J. Zhang, W.-N. Chen, J.-H. Zhong, X. Tan, and Y. Li, "Continuous function optimization using hybrid ant colony approach with orthogonal design scheme," Springer-Verlag Berlin Heidelberg, 2006.

[8] M. S. Lew, N. Sebe, C. Djeraba, and R. Jain, "Content-based multimedia information retrieval: state of the art and challenges," ACM Trans. Multimedia Computing, Commun. Appl., vol. 2, pp. 1-19, 2006.

[9] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 2004.

[10] "Feature subset selection for support vector machines through discriminative function pruning analysis," IEEE Trans. Systems, Man, and Cybernetics, vol. 34, no. 1, pp. 60-67, 2004.

[11] S. V. N. Vishwanathan and M. Narasimha Murty, "SSVM: a simple SVM algorithm," Journal of Theoretical and Applied Information Technology, 2003.

[12] "Image representations and feature selection for multimedia database search," IEEE Trans. Knowledge and Data Engineering, vol. 15, no. 4, pp. 911-920, 2003.

[13] C.-W. Hsu and C. J. Lin. A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 13(2):415-425, 2002.

[14] S. V. N. Vishwanathan and M. Narasimha Murty. Geometric SVM: A fast and intuitive SVM algorithm. Technical Report IISC-CSA-2001-14, Dept. of CSA, Indian Institute of Science, Bangalore, India, November 2001. Submitted to ICPR 2002.

[15] Danny Roobaert. DirectSVM: A simple support vector machine perceptron. Journal of VLSI Signal Processing Systems, 2001.

[16] C.-C. Chang and C.-J. Lin (2001). LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm

[17] V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 2nd edition, 2000.

[18] Kristin P. Bennett and Erin J. Bredensteiner. Duality and geometry in SVM classifiers. In P. Langley, editor, Proceedings of the 17th International Conference on Machine Learning, pages 57-64, San Francisco, California, 2000. Morgan Kaufmann.

[19] S. S. Keerthi, S. K. Shevade, C. Bhattacharyya, and K. R. K. Murthy. A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Transactions on Neural Networks, 11(1):124-136, 2000.

[20] Danny Roobaert. DirectSVM: A fast and simple support vector machine perceptron. In Proceedings of IEEE International Workshop on Neural Networks for Signal Processing, Sydney, Australia, December 2000.

[21] Y. Rui, T. S. Huang, and S. F. Chang, "Image retrieval: current techniques, promising directions, open issues," J. Visual Commun. Image Representation, vol. 10, pp. 39-62, 1999.

[22] J. C. Platt. Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods: Support Vector Machines. Cambridge MA: MIT Press, December 1998.

[23] J. R. Smith and S. F. Chang, "VisualSEEk: a fully automated content-based image query system," ACM Multimedia, pp. 87-98, 1996.

[24] Wikipedia: KNN, http://en.wikipedia.org/wiki/KNN

[25] Wikipedia: Support vector machine, http://en.wikipedia.org/wiki/Support_vector_machine

[26] http://john.cs.olemiss.edu/~ychen/DDSVM

[27]http://svr-www.eng.cam.ac.uk/~kkc21/thesis_main/node31.html

[28] http://www.cs.waikato.ac.nz/ml/weka/
