This project explores the impact of label noise on the CIFAR-10 dataset and evaluates strategies to mitigate its effects. The study incorporates symmetric and asymmetric noise at varying levels (10%, 30%, 50%, 80%, and 90%) and applies several machine learning methodologies to enhance noise robustness. Three model architectures are developed and compared:
- Base CNN with Cross Entropy Loss
- Regularized CNN with Symmetric Cross Entropy Loss
- Advanced Regularized CNN with Normalized Cross Entropy and Reverse Cross Entropy Loss
The CIFAR-10 dataset consists of 60,000 32x32 color images across 10 classes, divided into:
- Training set: 50,000 images
- Test set: 10,000 images
Label noise is introduced to simulate real-world data imperfections:
- Symmetric Noise: Labels are flipped uniformly at random across all classes.
- Asymmetric Noise: Labels of specific classes are systematically flipped into predefined target classes (both schemes are sketched below).
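The helpers below are a minimal sketch of how such noise can be injected, assuming symmetric noise flips a label uniformly to one of the other nine classes and asymmetric noise follows the class-dependent mapping common in the CIFAR-10 literature (truck → automobile, bird → airplane, deer → horse, cat ↔ dog); the repository's exact scheme may differ.

```python
import numpy as np

def add_symmetric_noise(labels, noise_rate, num_classes=10, seed=0):
    """Flip each label with probability noise_rate to a uniformly chosen other class."""
    rng = np.random.default_rng(seed)
    labels = np.array(labels)
    flip = rng.random(len(labels)) < noise_rate
    # Adding an offset in [1, num_classes) guarantees the new label differs.
    offsets = rng.integers(1, num_classes, size=len(labels))
    labels[flip] = (labels[flip] + offsets[flip]) % num_classes
    return labels

def add_asymmetric_noise(labels, noise_rate, seed=0):
    """Flip labels of selected classes into fixed target classes with probability noise_rate."""
    # CIFAR-10 indices: 0 airplane, 1 automobile, 2 bird, 3 cat, 4 deer,
    # 5 dog, 6 frog, 7 horse, 8 ship, 9 truck.
    mapping = {9: 1, 2: 0, 4: 7, 3: 5, 5: 3}  # assumed class-flip mapping
    rng = np.random.default_rng(seed)
    labels = np.array(labels)
    flip = rng.random(len(labels)) < noise_rate
    for i in np.where(flip)[0]:
        labels[i] = mapping.get(int(labels[i]), int(labels[i]))
    return labels
```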
- Base CNN:
  - A simple convolutional neural network architecture.
  - Uses the standard Cross Entropy Loss.
  - Designed as the baseline for comparison; a minimal sketch follows.
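A minimal sketch of what such a baseline might look like for 32x32 CIFAR-10 inputs; the layer widths below are illustrative assumptions, not the repository's exact architecture.

```python
import torch.nn as nn

class BaseCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Trained with the standard cross-entropy loss:
# criterion = nn.CrossEntropyLoss()
```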
- Regularized CNN:
  - Enhances the Base CNN with batch normalization and dropout layers.
  - Implements Symmetric Cross Entropy Loss to balance robustness and accuracy (sketched below).
  - Introduces Kaiming and Xavier weight initialization.
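A sketch of Symmetric Cross Entropy (Wang et al., 2019): a weighted sum of the usual cross entropy and a reverse cross entropy in which the log of the one-hot target's zero entries is truncated by clamping. The weights alpha and beta and the clamping constant below are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class SCELoss(nn.Module):
    def __init__(self, alpha=0.1, beta=1.0, num_classes=10):
        super().__init__()
        self.alpha, self.beta, self.num_classes = alpha, beta, num_classes

    def forward(self, logits, targets):
        # Active term: the usual cross entropy.
        ce = F.cross_entropy(logits, targets)
        # Passive term: reverse cross entropy; clamping the one-hot target's
        # zero entries to 1e-4 truncates log(0) to a finite constant.
        pred = F.softmax(logits, dim=1).clamp(min=1e-7, max=1.0)
        one_hot = F.one_hot(targets, self.num_classes).float().clamp(min=1e-4, max=1.0)
        rce = -(pred * one_hot.log()).sum(dim=1).mean()
        return self.alpha * ce + self.beta * rce

# Kaiming / Xavier weight initialization can be applied with torch.nn.init,
# e.g. nn.init.kaiming_normal_(conv.weight) and nn.init.xavier_uniform_(fc.weight).
```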
- Advanced Regularized CNN:
  - Employs the Active Passive Loss framework, combining Normalized Cross Entropy (NCE) and Reverse Cross Entropy (RCE); a sketch follows.
  - Optimized with SGD and fine-tuned hyperparameters.
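A sketch of the NCE+RCE combination from the Active Passive Loss framework (Ma et al., 2020): NCE divides the per-sample cross entropy by the cross entropy summed over all possible labels, and RCE is as in the SCE sketch above. The alpha and beta weights are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class NCEandRCE(nn.Module):
    def __init__(self, alpha=1.0, beta=1.0, num_classes=10):
        super().__init__()
        self.alpha, self.beta, self.num_classes = alpha, beta, num_classes

    def forward(self, logits, targets):
        log_probs = F.log_softmax(logits, dim=1)
        one_hot = F.one_hot(targets, self.num_classes).float()
        # Active term: Normalized CE, the per-sample CE divided by the CE
        # summed over all possible labels.
        nce = ((-(one_hot * log_probs).sum(dim=1))
               / (-log_probs.sum(dim=1))).mean()
        # Passive term: Reverse CE with the truncated log(0), as in SCE.
        pred = log_probs.exp().clamp(min=1e-7, max=1.0)
        rce = -(pred * one_hot.clamp(min=1e-4, max=1.0).log()).sum(dim=1).mean()
        return self.alpha * nce + self.beta * rce
```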
- Noise Incorporation: Simulate symmetric and asymmetric noise in the training data while keeping the validation and test data clean.
- Data Preprocessing (a sketch follows this list):
  - Convert images to tensors.
  - Normalize pixel values (mean = 0.5, standard deviation = 0.5 per channel).
  - Batch size: 64.
- Training: Train each model on the noisy datasets at every noise level, using the loss functions and optimizers described above.
- Evaluation: Compare test accuracies under each noise level for every model.
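A minimal sketch of this pipeline, reusing the hypothetical `add_symmetric_noise` helper sketched earlier; only the training labels are corrupted, and the test set stays clean.

```python
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=transform)

# Corrupt only the training labels; the test labels are left untouched.
train_set.targets = add_symmetric_noise(train_set.targets, noise_rate=0.3).tolist()

train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=64, shuffle=False)
```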
The models' performance under symmetric and asymmetric noise conditions:
- Base CNN: Struggles at high noise levels, overfitting to the noisy labels.
- Regularized CNN: Improved robustness but shows limitations under extreme noise.
- Advanced Regularized CNN: Demonstrates superior resilience, maintaining higher accuracy across noise levels.
- NCE+RCE Loss outperforms Symmetric Cross Entropy Loss in high-noise scenarios.
- Regularization techniques like dropout and batch normalization enhance model generalization.
- Experiments were conducted on Google Colab Pro using an NVIDIA Tesla V100 GPU.
- Each model was trained with 10 random seeds to ensure the reported results are robust (a seeding sketch follows this list).
- Training configurations include fixed epoch counts, learning-rate tuning, and noise-specific parameter adjustments.
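A sketch of one way to run the 10-seed protocol; the repository's exact seed handling and aggregation may differ.

```python
import random
import numpy as np
import torch

def set_seed(seed):
    """Seed every source of randomness used during training."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

# One training run per seed; test accuracies can then be averaged across runs.
for seed in range(10):
    set_seed(seed)
    # ... build the model, train on the noisy data, record test accuracy ...
```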
- Python (>=3.7)
- PyTorch
- NumPy
- Matplotlib
- Clone the repository.
- Install required dependencies.
- Run the scripts to reproduce experiments:
```
python train_model.py --model {base|regularized|advanced} --noise {symmetric|asymmetric} --level {10|30|50|80|90}
```