
Custom CNN for Multi-Class Image Classification with PyTorch: Complete Training and Deployment Guide

Build a custom CNN for image classification with PyTorch: a complete tutorial covering data loading, model training, and deployment on the CIFAR-10 dataset.

I’ve been tackling image classification challenges recently, particularly with PyTorch, and wanted to share a practical walkthrough. Many resources cover fragments of the process, but stitching together a complete pipeline—from raw data to deployable model—reveals fascinating nuances. Why not explore this together using CIFAR-10? It’s approachable yet complex enough to demonstrate real-world considerations.

Setting up the environment is straightforward. We’ll need these core packages:

pip install torch torchvision matplotlib seaborn scikit-learn tensorboard

Here’s our foundational import block:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
import torchvision.transforms as transforms
from torchvision.datasets import CIFAR10

# Configure device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Active device: {device}")

CIFAR-10 contains 60,000 tiny 32x32 RGB images across 10 categories, split into 50,000 training and 10,000 test images. Small images force models to learn efficient features; have you considered how spatial compression affects feature extraction? We implement aggressive augmentation to simulate real-world variations:

train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    transforms.RandomErasing(p=0.1)
])

test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
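
The training loop further down expects train_loader, val_loader, and val_dataset, so let's build them now. Here's a minimal sketch of one way to do it: I carve 5,000 of the 50,000 training images out as a validation set and use a batch size of 128; both numbers are my own choices, not requirements. Loading the training set twice lets the validation subset use the test-time transform:

from torch.utils.data import Subset

# Two views of the same training data: one augmented, one clean for validation
full_train = CIFAR10(root="./data", train=True, download=True, transform=train_transform)
full_val = CIFAR10(root="./data", train=True, download=True, transform=test_transform)
test_dataset = CIFAR10(root="./data", train=False, download=True, transform=test_transform)

# Deterministic 45,000 / 5,000 split of the training indices
indices = torch.randperm(len(full_train), generator=torch.Generator().manual_seed(42)).tolist()
train_dataset = Subset(full_train, indices[:45000])
val_dataset = Subset(full_val, indices[45000:])

train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True, num_workers=2)
val_loader = DataLoader(val_dataset, batch_size=128, shuffle=False, num_workers=2)
test_loader = DataLoader(test_dataset, batch_size=128, shuffle=False, num_workers=2)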

Our CNN architecture balances complexity and efficiency. Notice the incremental channel expansion—why do you think this pattern works better than arbitrary layer sizes?

class CompactCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.Conv2d(32, 32, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Dropout(0.25),
            
            nn.Conv2d(32, 64, 3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.Conv2d(64, 64, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Dropout(0.35),
        )
        self.classifier = nn.Sequential(
            nn.Linear(64*8*8, 512),
            nn.ReLU(),
            nn.BatchNorm1d(512),
            nn.Dropout(0.5),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)
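
Before wiring up training, a quick sanity check is cheap insurance. This throwaway snippet just confirms that the flattened feature size (64*8*8) lines up with the classifier input and reports how many parameters we're training:

check = CompactCNN()
dummy = torch.randn(4, 3, 32, 32)   # fake batch of four 32x32 RGB images
print(check(dummy).shape)           # expect torch.Size([4, 10])
print(sum(p.numel() for p in check.parameters()), "trainable parameters")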

Training incorporates several optimizations. The learning rate scheduler is particularly crucial—how might adaptive rate adjustment prevent overfitting?

model = CompactCNN().to(device)
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.5, patience=3, verbose=True
)
criterion = nn.CrossEntropyLoss()

for epoch in range(30):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
    
    # Validation phase
    model.eval()
    val_loss, correct = 0, 0
    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            val_loss += criterion(outputs, labels).item()
            _, predicted = torch.max(outputs.data, 1)
            correct += (predicted == labels).sum().item()
    
    val_acc = 100 * correct / len(val_dataset)
    print(f"Epoch {epoch+1:02d} | val loss: {val_loss/len(val_loader):.4f} | val acc: {val_acc:.2f}%")
    scheduler.step(val_acc)  # Adjust learning rate when validation accuracy plateaus
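
One refinement I'd layer on top (not shown in the loop above) is keeping the checkpoint with the best validation accuracy, so the final model predates any late-stage overfitting. The helper name and file path below are my own conventions:

def save_if_best(model, val_acc, best_acc, path="best_cifar10_cnn.pth"):
    # Persist weights whenever validation accuracy improves; return the running best
    if val_acc > best_acc:
        torch.save(model.state_dict(), path)
        return val_acc
    return best_acc

Initialize best_acc = 0.0 before the loop, call best_acc = save_if_best(model, val_acc, best_acc) right after scheduler.step(val_acc), and restore the winner afterwards with model.load_state_dict(torch.load("best_cifar10_cnn.pth", map_location=device)).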

Evaluation goes beyond accuracy. This confusion matrix snippet reveals class-specific weaknesses:

from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

def plot_confusion_matrix(model, loader):
    model.eval()
    all_preds, all_labels = [], []
    with torch.no_grad():
        for images, labels in loader:
            images = images.to(device)
            outputs = model(images)
            _, preds = torch.max(outputs, 1)
            all_preds.extend(preds.cpu().numpy())
            all_labels.extend(labels.numpy())
    
    cm = confusion_matrix(all_labels, all_preds)
    plt.figure(figsize=(10,8))
    sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.show()
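
A per-class report complements the matrix nicely; scikit-learn's classification_report summarizes precision, recall, and F1 for each category. The class names below follow CIFAR-10's standard label order:

from sklearn.metrics import classification_report

CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]

def print_class_report(model, loader):
    model.eval()
    all_preds, all_labels = [], []
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            all_preds.extend(preds.cpu().numpy())
            all_labels.extend(labels.numpy())
    print(classification_report(all_labels, all_preds, target_names=CIFAR10_CLASSES))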

For deployment, we export with TorchScript:

model.eval()  # ensure dropout and batch norm run in inference mode before export
scripted_model = torch.jit.script(model.cpu())
scripted_model.save("cifar10_cnn.pt")

This preserves model architecture while decoupling from Python runtime—essential for production systems. What other deployment considerations might arise in your projects?
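
On the serving side, the exported file loads without the original class definition. Here's a minimal inference sketch; example.jpg is a placeholder file name, and the preprocessing mirrors test_transform with an added resize so arbitrary images can be fed in:

from PIL import Image

loaded = torch.jit.load("cifar10_cnn.pt")
loaded.eval()

preprocess = transforms.Compose([
    transforms.Resize((32, 32)),   # the network expects 32x32 inputs
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

image = Image.open("example.jpg").convert("RGB")   # placeholder input image
with torch.no_grad():
    logits = loaded(preprocess(image).unsqueeze(0))
print("Predicted class index:", logits.argmax(dim=1).item())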

The journey from data to deployable model involves numerous design choices. Each decision—augmentation intensity, regularization strength, topology depth—creates tradeoffs between accuracy, speed, and robustness. I’ve found iterative refinement based on validation metrics yields the best results. What techniques have worked well in your projects?

If you found this walkthrough helpful, share it with others exploring PyTorch. Questions or insights? Let’s discuss in the comments—I’ll respond to thoughts and suggestions.
