

Classification of Bee Images Using Support Vector Machines and
Principal Component Analysis

Sharvan Jeet, B.Tech IT, MSIT Delhi, imsh [email protected]
Prasoon Tripathi, B.Tech IT, MSIT Delhi, tripathi. [email protected]

Object classification is an important task within the field of computer vision. Image
classification refers to labelling images into one of several predefined categories. A
classification pipeline includes image sensing, image pre-processing, object detection, object
segmentation, feature extraction and object classification. Many classification techniques have
been developed for image classification. In this project our task is to develop a model which
can classify a given bee image and tell whether it is Bombus (bumble bee) or Apis (honey
bee). We used Histogram of Oriented Gradients (HOG) and Dense DAISY for feature extraction.
We combined Support Vector Machines (SVM) with Principal Component Analysis (PCA) and
developed a model which can classify bee images and predict whether each shows a bumble bee or
a honey bee. We then fine-tuned our model and cross-validated it to finally obtain an accuracy of
91%.
1 Introduction
1.1 Motivation
Being able to identify bee species from images is a task that would ultimately allow researchers
to collect field data more quickly and effectively. Pollinating bees play critical roles in both
ecology and agriculture, and diseases like colony collapse disorder threaten these species.
Identifying different species of bees in the wild means that we can better understand the
prevalence and growth of these important insects.

Figure 1 Bee classes (Bumble bee and Honey bee)
1.2 Task Definition
Our basic task is to create a model which can classify a bee image. The input for this task is
images of bees from the training dataset, while the output is the predicted species on the test dataset.


The dataset for this project is taken from the BeeSpotter website. Our training set contains
20,000 images: 10,000 images of bumble bees and 10,000 images of honey bees. The
images are of different sizes and resolutions, so they have been scaled and cropped to 100×100.
Our learning task is to learn a classification model that determines the decision boundary for the
training dataset. The whole process is illustrated in Figure 2, from which we can see that the input
for the learning task is images from the training dataset, while the output is the learned
classification model.

Figure 2 Architecture for Task

Our performance task is to apply the learned classification model to classify images from the
test dataset, and then evaluate the classification accuracy. As seen in Figure 2, the input is
images from the test dataset, and the output is the predicted species.
1.3 Our Solution
In our solution, we first did our pre-processing, scaling all images and cropping
them to 100×100. For feature extraction we first used HOG and then also used the Dense DAISY
algorithm. As our feature matrix was too large, we reduced its dimensionality using PCA and
then trained our model using SVM. Finally, we did fine-tuning and cross-validation to achieve
high accuracy.
The outline of our paper is as follows. We introduce the first feature extraction approach in
section 3.1. The second approach is described in section 3.2. Finally, we summarize our work
and potential future work in section 9.
2 Image Manipulation
2.1 rgb2grey
We used the rgb2grey function to return a greyscale image for every coloured image.
The rgb2grey function computes the luminance of an RGB image using the following formula:


Y = 0.2125 R + 0.7154 G + 0.0721 B
Image data is represented as a matrix, where the depth is the number of channels. An RGB
image has three channels (red, green, and blue), whereas the returned greyscale image has only
one channel. Accordingly, the original colour image has dimensions 100x100x3, but after
calling rgb2grey the result has only the single luminance channel, making the
dimensions 100x100.
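As an illustration (our sketch, not the project's actual code), the same luminance formula can be applied directly with NumPy; the weights below are the ones used by scikit-image's rgb2grey:

```python
import numpy as np

def rgb2grey(image):
    """Luminance conversion: Y = 0.2125 R + 0.7154 G + 0.0721 B."""
    weights = np.array([0.2125, 0.7154, 0.0721])
    return image @ weights  # collapses the channel axis

rgb = np.random.rand(100, 100, 3)  # stand-in for a 100x100 colour bee image
grey = rgb2grey(rgb)
print(rgb.shape, grey.shape)       # (100, 100, 3) (100, 100)
```

Note that the channel axis is collapsed entirely, so the greyscale result is a plain 2D matrix.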

Figure 3 rgb2grey operation on raw image
3 Feature Extraction
3.1 Method One: Using Histogram of Oriented Gradients
Images need to be turned into something that a machine learning algorithm can understand.
Traditional computer vision techniques have relied on mathematical transforms to turn images
into useful features. For example, we may want to detect edges of objects in an image, increase
the contrast, or filter out particular colors.

Figure 4 HOG features


The idea behind HOG is that an object's shape within an image can be inferred from its edges,
and one way to identify edges is to look at the direction of intensity gradients (i.e. changes in
luminance). An image is divided in a grid fashion into cells, and for the pixels within each
cell, a histogram of gradient directions is compiled. To improve invariance to highlights and
shadows in an image, cells are block-normalized, meaning an intensity value is calculated for
a larger region of the image called a block and used to contrast-normalize all cell-level
histograms within each block. The HOG feature vector for the image is the concatenation of
these cell-level histograms.
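To make the cell-level step concrete, here is a minimal sketch (our illustration, not a full HOG implementation) of compiling a gradient-orientation histogram for a single cell, weighting each pixel's vote by its gradient magnitude:

```python
import numpy as np

def cell_hog(cell, n_bins=9):
    """Histogram of gradient orientations for one cell (the core HOG step)."""
    gy, gx = np.gradient(cell.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientations in [0, 180) degrees, as in standard HOG.
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180
    hist, _ = np.histogram(orientation, bins=n_bins,
                           range=(0, 180), weights=magnitude)
    return hist

# An 8x8 cell with a horizontal intensity ramp: every gradient points one way.
cell = np.tile(np.arange(8), (8, 1))
print(cell_hog(cell))  # all 64 units of weight fall in the first bin
```

A real HOG descriptor then block-normalizes and concatenates these per-cell histograms, as described above.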
3.1.1 Result
When we used HOG features along with PCA to train our SVM model, we were able to attain
79% accuracy. The performance was not improving further, so we tried other methods.

Figure 5 AUC for HOG+PCA+SVM model
3.2 Method Two: Using Daisy Features
The DAISY algorithm consists of three building blocks: T, S and N. Typically, the input is a
square monochrome image patch and the output is a vector of bytes. For every pixel in the
input patch, the T-block computes a vector of k feature detector responses. These responses
have positive values, and there is a choice of three different feature algorithms based on
gradients or steerable filters. The S-block combines T-block filter
responses spatially by pooling them using 2D Gaussian weighting profiles. The Gaussian
pooling centres are arranged in a log-polar configuration like flower petals, with size increasing
away from the centre of the patch. For each of the N pooling centres, the k feature responses
are independently pooled over space, resulting in a vector of kN numbers. Spatial
parameters for the Gaussian weighting functions are provided for various values of N. The
normalization, or N-block, involves unit-normalization of the vector coming from the S-block,
which introduces robustness to lighting changes. This includes a clipping stage, which was
proposed by David Lowe and is very important to the performance of the resulting descriptors.
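The N-block's normalize, clip, and renormalize scheme can be sketched as follows; the clipping threshold of 0.2 is borrowed from Lowe's SIFT descriptor and is an assumption here, not a value taken from this project:

```python
import numpy as np

def n_block(v, clip=0.2):
    """Unit-normalize, clip large components (Lowe-style), then renormalize."""
    v = v / (np.linalg.norm(v) + 1e-12)
    v = np.minimum(v, clip)  # damp dominant responses for lighting robustness
    return v / (np.linalg.norm(v) + 1e-12)

descriptor = n_block(np.array([10.0, 1.0, 1.0, 0.5]))
print(np.linalg.norm(descriptor))  # ~1.0: the output is unit length again
```

Clipping prevents one unusually strong response (e.g. a specular highlight) from dominating the descriptor.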

Figure 6 DAISY features
3.2.1 Result
When we used DAISY features along with PCA to train our SVM model, we attained
71% accuracy, but we observed something quite interesting: images that had not been
classified correctly before were now being classified correctly. So, we decided to combine both.

Figure 7 AUC for DAISY+PCA+SVM model


4 Feature Reduction
4.1 Scale feature matrix + Principal Component Analysis
Our features weren't ready yet. Many machine learning methods work best
with data that has a mean of 0 and unit variance, so we applied StandardScaler to our feature matrix.
PCA is a way of linearly transforming the data such that most of the information in the data is
contained within a smaller number of features called components. Below is a
visual example from an image dataset containing handwritten digits. The image on the left
is the original image with 784 components. We can see that the image on the right (post-PCA)
captures the shape of the digit quite effectively even with only 59 components.

Figure 8 Feature Reduction using PCA
Our dataset contains almost 20,000 images, while the extracted HOG and DAISY features
have around 40,000 dimensions. As this feature matrix was too large, we reduced its
dimensionality using PCA.
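A sketch of this reduction step with scikit-learn; the shapes here are small placeholders, not the project's actual 20,000 × 40,000 matrix:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))  # stand-in feature matrix: 200 images, 500 features

X_scaled = StandardScaler().fit_transform(X)          # zero mean, unit variance per feature
X_reduced = PCA(n_components=50).fit_transform(X_scaled)
print(X_reduced.shape)  # (200, 50)
```

The same fitted scaler and PCA must later be applied, without refitting, to the test features.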
5 Training Model
5.1 Support Vector Machine
SVM is a type of supervised machine learning model used for regression, classification, and
outlier detection. An SVM model is a representation of the examples as points in space,
mapped so that the examples of the separate categories are divided by a clear gap that is as
wide as possible. New examples are then mapped into that same space and predicted to belong
to a category based on which side of the gap they fall on.
Here’s a visualization of the maximum margin separating two classes using an SVM classifier
with a linear kernel.


Figure 9 SVM classifier with linear kernel
Since we had a binary classification task (honey bee or bumble bee), we used the support vector
classifier (SVC), a type of SVM.
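A toy example of fitting an SVC on two well-separated clusters; the data here is our synthetic stand-in, not the bee features:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two separable clusters standing in for Apis (0) and Bombus (1).
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear", probability=True).fit(X, y)
print(clf.predict([[2.0, 2.0]]))  # [1]
```

Passing probability=True is what later enables predict_proba for the ROC analysis.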
6 Fine -tuning parameters and cross -validatio n
6.1 GridSearchCV
GridSearchCV combines an estimator with a grid search to tune hyper-parameters.
The method picks the optimal parameters from the grid search and uses them with the estimator
selected by the user. GridSearchCV inherits its methods from the classifier, so we can use the
.score, .predict, etc. methods directly through the GridSearchCV interface. We used
.best_params_ to return the best hyper-parameters, and then passed them to our
estimator separately. We used 3-fold, 5-fold and 8-fold cross-validation for our three
models respectively.
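A minimal sketch of this tuning loop; the data is synthetic and the grid values are illustrative, not the ones we actually searched:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)  # 5-fold cross-validation
search.fit(X, y)
print(search.best_params_)  # best C/kernel combination found on the grid
```

Every grid point is refit and scored on each fold, so the search cost grows with both the grid size and the number of folds.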
7 Accuracy metrics
7.1 ROC curve + AUC
We used svm.predict_proba to get the probability that each class is the true label. For
example, predict_proba returns (0.46195176, 0.53804824) for the first image, meaning there is
a 46% chance the bee in the image is an Apis (0.0) and a 54% chance the bee in the image is a
Bombus (1.0). Note that the two probabilities for each image always sum to 1.
With the default settings, probabilities of 0.5 or above are assigned a class label of 1.0 and
those below are assigned 0.0. However, this threshold can be adjusted.
The receiver operating characteristic curve (ROC curve) plots the false positive rate against the
true positive rate at different thresholds. ROC curves are judged visually by how close they are to
the upper left-hand corner.
The area under the curve (AUC) is also calculated, where 1 means every predicted label was
correct. Generally, the worst score for AUC is 0.5, which is the performance of a model that
guesses randomly.
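Both quantities can be computed with scikit-learn; the four labels and probabilities below are a hand-made toy example, not our test-set outputs:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

y_true = np.array([0, 0, 1, 1])           # true labels: Apis = 0, Bombus = 1
y_prob = np.array([0.1, 0.4, 0.35, 0.8])  # predicted P(label == 1)

fpr, tpr, thresholds = roc_curve(y_true, y_prob)
print(roc_auc_score(y_true, y_prob))  # 0.75
```

Here three of the four positive/negative pairs are ranked correctly, hence AUC = 3/4; plotting fpr against tpr gives the ROC curve itself.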


8 Final Result
We combined our two previous models by stacking both feature matrices together, i.e. the HOG
feature matrix and the DAISY feature matrix, since we had observed that the two models
behaved differently and each was able to classify correctly some images that the other
could not. After combining both feature matrices and performing all of the above-mentioned steps, we
were able to attain 91% accuracy.
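Stacking the two feature matrices amounts to a column-wise concatenation; a sketch with placeholder shapes:

```python
import numpy as np

hog_features = np.random.rand(120, 300)    # placeholder: 120 images, 300 HOG dims
daisy_features = np.random.rand(120, 200)  # placeholder: 120 images, 200 DAISY dims

# Rows stay aligned per image; columns from both descriptors sit side by side.
combined = np.hstack([hog_features, daisy_features])
print(combined.shape)  # (120, 500)
```

Scaling and PCA are then applied to this combined matrix before training the final SVM.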

Figure 10 AUC for HOG+DAISY+PCA+SVM model
9 Conclusion and Future Scope
9.1 Conclusion
In this report, we first briefly explained our motivation for this project. Then, we illustrated our
task, including the learning task and the performance task. After that, we introduced our
solution in detail.
Our approach is a traditional pattern recognition model, in which we learned the classification
model from HOG features and DAISY features. For better performance we scaled and cropped
the images to 100×100, since the images differ in resolution, size and background.
We converted the images to greyscale before feature extraction. After feature extraction we applied
standard scaling and PCA, and trained our SVM model on the resulting feature matrix. Then we
did fine-tuning and cross-validation using GridSearchCV and finally measured accuracy
using the ROC curve and AUC. We trained three SVM models with different feature extraction methods.


Table 1 Best Performance on the Test Dataset for Different Models

9.2 Future Scope
In the future, we can explore more to achieve better pe rformance. For instance, we can try to
change the architecture and parameter se ttings. Also, t his model can be used for identifying
sub -species if trained with more da ta and fine -tuned properly. This model gives great output
which can be used as input for more complex models like neural networks. We can use this for
classification of other objects by training with different images.

