ashwin2507/PCA-KNN_imageclassifier

Image classifier using PCA and KNN

In this project I have used four different types of fruit; you can add more types by adding images to the training and testing data. The image classification steps are 1) resizing the training and testing images, 2) feature elimination (dimensionality reduction) using PCA, and 3) predicting the class label using K-Nearest Neighbors. The dataset is downloaded from Kaggle.
Learning outcomes:
1) Understanding the importance of PCA by implementing the algorithm from scratch.
2) Feature elimination using the scree plot and visualizing the transformed data.
3) Understanding the KNN algorithm by implementing it from scratch.


For an in-depth explanation, please refer to the project report.

Programming Language Used: Python 3.6
--------------------------

----------------
CODE STRUCTURE
-----------------
1) Load the training data and extract the label of each image: OpenCV (cv2) is used to load the image data and extract the label of each image, and each image is resized to 28*28.
2) Implement PCA as a function: first flatten the image data to 1899 (number of training samples) by 2352 (28*28*3), then standardize the data, compute the covariance matrix, and find its eigenvalues and eigenvectors. The proportion-of-variance graph is then plotted to decide how many principal components to keep. In this case I picked 2 principal components and transformed the data to 2-D using the eigenvectors with the two largest eigenvalues.
3) Load the testing data, resize it to 28*28, and perform PCA on it as well.
4) Find the K nearest data points (using Euclidean distance) and take a majority vote to predict the label.
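
The steps above can be sketched roughly as follows. This is a minimal illustration and not the code in image_classifier.py: the function names (pca_fit, pca_transform, knn_predict), the fruit labels, and the synthetic 4*4*3 "images" are all made up for the example. It also assumes that the test data is projected with the mean, standard deviation, and eigenvectors learned from the training data, which may differ from how the project handles the test set.

```python
import numpy as np
from collections import Counter


def pca_fit(X, n_components=2):
    """Fit PCA on X of shape (n_samples, n_features) via the covariance matrix."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # constant features: avoid dividing by zero
    Z = (X - mean) / std                     # standardize
    cov = np.cov(Z, rowvar=False)            # (n_features, n_features)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()      # proportion of variance (scree plot values)
    return mean, std, eigvecs[:, :n_components], explained


def pca_transform(X, mean, std, components):
    """Project data onto the principal components found by pca_fit."""
    return ((X - mean) / std) @ components


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


# Synthetic stand-in for flattened images (assumption: 4*4*3 = 48 values each,
# two hypothetical classes that differ in overall brightness).
rng = np.random.default_rng(0)
train_X = np.vstack([
    rng.normal(0.2, 0.05, size=(20, 48)),
    rng.normal(0.8, 0.05, size=(20, 48)),
])
train_y = np.array(["apple"] * 20 + ["banana"] * 20)

mean, std, components, explained = pca_fit(train_X, n_components=2)
train_2d = pca_transform(train_X, mean, std, components)

# Assumption: the test image reuses the training statistics and eigenvectors.
test_img = rng.normal(0.8, 0.05, size=48)
test_2d = pca_transform(test_img, mean, std, components)
label = knn_predict(train_2d, train_y, test_2d, k=3)
print(label)
```

The explained array holds the values a proportion-of-variance plot would show, so it can be inspected to choose the number of components before fixing it at 2.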

--------------------
HOW TO RUN THE CODE
--------------------
Please ensure you have these packages:
import numpy as np
import cv2
import matplotlib.pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
import glob
import os
from sklearn.decomposition import PCA # To check that the output of sklearn's PCA matches my implementation
from sklearn.preprocessing import StandardScaler # For standardizing the data before using sklearn's PCA
import math
import operator

The dataset is also in this folder. Copy it to your machine and change the dataset path in the code to match the location where it is stored on your computer.

Command
-------
python3 image_classifier.py

The program will output graphs and plots, and lastly will prompt the user to enter the desired value of k for KNN.
