This project classifies grayscale images into 100 sports categories. The core architecture is a customized, pretrained VGG19 CNN with frozen early layers and a new classifier head. Grad-CAM heatmaps are used to make the model's predictions easier to interpret. The dataset comes from Kaggle.
- Python: Popular language for implementing neural networks and AI projects.
- Jupyter Notebook: Best tool for running Python code cell by cell in an interactive environment.
- Google Colab: Free cloud platform for running Jupyter Notebooks with GPU/TPU support and no local setup needed.
- OpenCV: Powerful library for image processing, computer vision, and real-time applications.
- NumPy: Essential library for numerical operations and array-based computing in Python.
- TQDM: Lightweight library for adding smart progress bars to loops and iterable tasks.
- CUDA: NVIDIA's parallel computing platform for accelerating deep learning on GPUs.
- PyTorch: Deep learning framework known for flexibility, dynamic computation graphs, and strong GPU support.
- TorchVision: Companion library to PyTorch for image-related deep learning tasks, datasets, and transforms.
- Kaggle: Online platform for data science competitions, datasets, and collaborative coding.
- Matplotlib: Versatile library for creating static, animated, and interactive plots in Python.
- Pillow: Python Imaging Library (PIL) fork used for opening, editing, and saving images easily.
You can run this code on Google Colab by clicking this badge.
We use the VGG19 model and freeze the first 15 layers of `vgg.features`.
| Model Type / Technique | Usage in This Project | Library / Source |
|---|---|---|
| CNN (Convolutional NN) | VGG19 as the base architecture | `torchvision.models` |
| Pretrained Model | `vgg19(weights=VGG19_Weights.DEFAULT)` | `torchvision.models` |
| Transfer Learning | Using pretrained VGG19 with partial freezing | `torch.nn`, `torchvision` |
| Freezing Layers | First 15 layers of `vgg.features` frozen | `requires_grad = False` on each `torch.nn.Parameter` |
| Custom NN (Classifier) | Custom `nn.Sequential` fully connected layers | `torch.nn` |
| Grayscale Input Conv | Modified `Conv2d` to accept grayscale input | `torch.nn.Conv2d` |
| Dropout | `nn.Dropout`, `nn.Dropout2d` for regularization | `torch.nn` |
| Batch Normalization | `nn.BatchNorm1d` for stable training | `torch.nn` |
| Activation Function | `nn.ReLU` used in classifier | `torch.nn` |
| Weight Initialization | Xavier (Glorot) init in classifier | `torch.nn.init` |
| Adaptive Pooling | `nn.AdaptiveAvgPool2d((7,7))` | `torch.nn` |
| Flatten Layer | `nn.Flatten()` | `torch.nn` |
- Input Image Size: 512×512
- Optimizer: `Adam`
- Loss Function: `CrossEntropyLoss`
- Epochs: `5`
- Batch Size: `32`
- Dropout: `0.5` (for better generalization)
- Device: GPU via CUDA (if available)
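
The settings above translate into a fairly standard PyTorch training loop. The sketch below is illustrative only: the helper names `make_training_setup` and `train_one_epoch` are hypothetical, and the learning rate of `1e-4` is an assumption, since the learning rate is not listed here.

```python
import torch
import torch.nn as nn


def make_training_setup(model, lr=1e-4):
    """Pick the device, loss, and optimizer. The lr default is an assumption."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    # Only pass parameters that are still trainable (frozen layers are skipped).
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.Adam(trainable, lr=lr)
    return model, criterion, optimizer, device


def train_one_epoch(model, loader, optimizer, criterion, device):
    """Run one pass over the loader and return (mean loss, accuracy)."""
    model.train()
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * labels.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        seen += labels.size(0)
    return total_loss / seen, correct / seen
```

With these helpers, training for the listed 5 epochs amounts to calling `train_one_epoch` five times in a `for` loop over a `DataLoader` built with a batch size of 32.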
The dataset contains images of 100 different sports classes from Kaggle.
- Source: Kaggle Sports Dataset
- Format: `.jpg` images in folders by class
- Training Accuracy: 88.28% (can go up to 99.99%)
- Validation Accuracy: 93.20%
- Test Accuracy: 94.40%
- Training Time: 150 minutes on a Colab GPU (CUDA)
- Strong generalization after the first epoch (80% test accuracy)
This project is licensed under the MIT License.
