Semester Project: Multi-Class Image Classification

Computer VisionPythonStreamLitLBPHOGPCATensorflow

Friday, May 10, 2024

This project focuses on developing a multi-class image classification model using Python, OpenCV, NumPy, and TensorFlow. The primary objective is to classify images into one of the ten classes provided by the CIFAR-10 dataset, which includes categories such as dog, cat, airplane, fish, and more.

Dataset

The CIFAR-10 dataset is a well-known dataset for image classification tasks, consisting of 60,000 32x32 color images in 10 different classes. It is structured in a specific format that requires some preprocessing to convert it into a usable image format for our model.

Data Preparation

  • Extracting Data: The first step involves extracting the CIFAR-10 data from its batch files.
  • Correcting Image Dimensions: After extraction, an additional step is necessary to correct the dimensions of the images, as described on the official CIFAR-10 dataset page.

Data Augmentation

To enhance the dataset, data augmentation techniques were applied using TensorFlow's ImageDataGenerator. This step is crucial for improving the generalization of the model by introducing variations in the training data.

Feature Extraction

Local features of the images were extracted using Histogram of Oriented Gradients (HOG) and Local Binary Patterns (LBP). These features help in capturing important characteristics of the images that aid in classification.

Model Training

The VGG16 model was chosen for training the dataset. VGG16 is a pre-trained model with a feature vector of size 1x1000, which is sufficient for this project.

Dimensionality Reduction

Though optional, Principal Component Analysis (PCA) was performed to reduce the feature vector from 1000 to 256 dimensions to meet project guidelines and improve computational efficiency.

Model Evaluation and Saving

The model was evaluated for its performance, and the final trained model was saved using Python's pickle module for future use.

User Interface

A Streamlit interface was created to allow users to upload an image and receive its predicted class. The only requirement is that the image must have dimensions of 32x32x3.

Getting Started

  • Clone the Repository:
git clone https://github.com/yourusername/multiclass-image-classification.git
  • Install Dependencies:
 pip install -r requirements.txt
  • Run the Streamlit Interface:
streamlit run app.py
  • Upload an Image:

Use the provided interface to upload a 32x32x3 image and get the classification result.

Conclusion

This project demonstrates a complete pipeline for multi-class image classification, from data preparation and augmentation to feature extraction, model training, and deployment with a user-friendly interface.