
Complete PyTorch Image Classification Tutorial: From Custom CNNs to Production API Deployment

Learn to build and deploy a PyTorch image classification system from scratch. Covers CNN design, transfer learning, optimization, and production deployment with FastAPI.


I’ve been thinking about image classification systems a lot lately. After building several prototypes that worked well in the lab but struggled in production, I realized there’s a gap between training models and deploying them effectively. Many tutorials cover either training or deployment, but rarely both. This article bridges that gap by walking through a complete workflow using PyTorch. Let’s build something together that actually works in the real world.

Setting up our environment is straightforward. We create a virtual environment and install key packages:

python -m venv venv
source venv/bin/activate
pip install torch torchvision fastapi uvicorn pillow python-multipart

Note that Docker itself is installed at the system level, not through pip; the docker package on PyPI is only a client SDK and isn't needed here. Pillow handles image loading, and python-multipart lets FastAPI parse file uploads.

Our project structure organizes components logically. Why do you think separation of concerns matters in machine learning projects?

image_classifier/
├── src/
│   ├── data/       # Dataset handling
│   ├── models/     # Model architectures
│   ├── training/   # Training logic
│   ├── optimization/ # Model optimization
│   └── deployment/ # Production deployment
├── data/           # Raw datasets
├── models/         # Trained models
└── Dockerfile      # Container configuration

For data handling, we create a flexible dataset class. Notice how it handles both directory-based and CSV-based datasets:

import os
from PIL import Image
from torch.utils.data import Dataset

class ImageClassificationDataset(Dataset):
    def __init__(self, data_dir, csv_file=None, transform=None):
        self.data_dir = data_dir
        self.transform = transform
        self.image_paths = []
        self.labels = []
        
        if csv_file:
            self._load_from_csv(csv_file)
        else:
            self._load_from_directory()
            
    def _load_from_directory(self):
        # Each subdirectory name becomes a class label
        class_names = sorted(os.listdir(self.data_dir))
        self.class_to_idx = {name: idx for idx, name in enumerate(class_names)}
        
        for class_name in class_names:
            class_dir = os.path.join(self.data_dir, class_name)
            for img_name in os.listdir(class_dir):
                self.image_paths.append(os.path.join(class_dir, img_name))
                self.labels.append(self.class_to_idx[class_name])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert("RGB")
        if self.transform:
            image = self.transform(image)
        return image, self.labels[idx]
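A Dataset that implements __len__ and __getitem__ plugs directly into PyTorch's DataLoader for shuffled, batched loading. Here is a runnable sketch with a toy in-memory dataset standing in for the image dataset above (the ToyImageDataset class is illustrative, so the example runs without files on disk):

```python
import torch
from torch.utils.data import Dataset, DataLoader

# Toy stand-in for ImageClassificationDataset: returns random "images"
# so the batching pipeline can be exercised without files on disk.
class ToyImageDataset(Dataset):
    def __init__(self, num_samples=8, num_classes=2):
        self.num_samples = num_samples
        self.num_classes = num_classes

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        image = torch.randn(3, 224, 224)   # fake RGB image tensor
        label = idx % self.num_classes     # deterministic fake label
        return image, label

loader = DataLoader(ToyImageDataset(), batch_size=4, shuffle=True, num_workers=0)
images, labels = next(iter(loader))
print(images.shape)  # torch.Size([4, 3, 224, 224])
```

The default collate function stacks individual samples into a batch tensor, which is why __getitem__ returns a single unbatched image.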

Data augmentation significantly improves model robustness. We implement advanced techniques like CutMix:

def cutmix(batch, labels, alpha=1.0):
    batch_size = batch.size(0)
    lam = np.random.beta(alpha, alpha)
    rand_index = torch.randperm(batch_size)
    
    # Generate a random bounding box covering (1 - lam) of the image area,
    # clamped so it never extends past the image border
    W, H = batch.size(3), batch.size(2)
    rw, rh = int(W * np.sqrt(1 - lam)), int(H * np.sqrt(1 - lam))
    rx = np.random.randint(0, W - rw + 1)
    ry = np.random.randint(0, H - rh + 1)
    
    # Paste the patch from shuffled images, then correct lam to the exact
    # fraction of pixels kept from the original images
    batch[:, :, ry:ry+rh, rx:rx+rw] = batch[rand_index, :, ry:ry+rh, rx:rx+rw]
    lam = 1 - (rw * rh) / (W * H)
    return batch, labels, labels[rand_index], lam
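During training, CutMix's mixing ratio carries over to the loss: the standard formulation weights the cross-entropy between the original and the shuffled labels by lam. A minimal sketch (the cutmix_loss helper is illustrative, not from the article, and assumes cutmix also returned the permuted labels and lam):

```python
import torch
import torch.nn.functional as F

# CutMix loss: each image is lam parts original, (1 - lam) parts pasted
# patch, so the loss mixes the two label sets in the same proportion.
def cutmix_loss(logits, labels_a, labels_b, lam):
    return (lam * F.cross_entropy(logits, labels_a)
            + (1 - lam) * F.cross_entropy(logits, labels_b))

logits = torch.randn(4, 10)                 # fake model outputs
labels_a = torch.tensor([0, 1, 2, 3])       # original labels
labels_b = torch.tensor([3, 2, 1, 0])       # labels of the pasted patches
loss = cutmix_loss(logits, labels_a, labels_b, lam=0.7)
print(f"mixed loss: {loss.item():.4f}")
```

With lam equal to 1 this reduces to plain cross-entropy on the original labels, so the same training loop handles both augmented and unaugmented batches.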

For model architecture, we balance custom networks and transfer learning. When would you choose a custom CNN over a pre-trained model? Here’s a lightweight custom architecture:

class CustomCNN(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        # Two 2x2 pools halve 224x224 inputs twice, leaving 64 x 56 x 56
        self.classifier = nn.Sequential(
            nn.Linear(64 * 56 * 56, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

For deployment, we use FastAPI to create a production-ready endpoint:

app = FastAPI()
model.eval()  # inference mode: disables dropout and batch-norm updates

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    image = Image.open(file.file).convert("RGB")
    tensor = transform(image).unsqueeze(0)
    with torch.no_grad():
        prediction = model(tensor).squeeze(0)
    return {"class": class_names[prediction.argmax().item()]}

Containerization ensures consistent environments. Our Dockerfile captures dependencies:

FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "src.deployment.api:app", "--host", "0.0.0.0", "--port", "8000"]
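Building and running the container then takes two commands (the image name image-classifier is arbitrary):

```shell
# Build the image from the Dockerfile in the current directory,
# then run the API with the container's port 8000 mapped to the host
docker build -t image-classifier .
docker run -p 8000:8000 image-classifier
```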

Optimization techniques like quantization reduce model size for deployment:

model = torch.quantization.quantize_dynamic(
    model,
    {nn.Linear},
    dtype=torch.qint8
)
# Save the whole module: a quantized state_dict only loads back into
# an already-quantized skeleton, so this avoids a reload pitfall
torch.save(model, "quantized_model.pth")
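To verify the size reduction, serialize both versions and compare byte counts. A self-contained sketch with a Linear-heavy stand-in model, since dynamic quantization only targets the layer types listed (here nn.Linear):

```python
import io

import torch
import torch.nn as nn

# Stand-in model dominated by Linear weights, the layers dynamic
# quantization converts; a real run would use the trained classifier.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))

quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def serialized_size(m):
    # Serialize to an in-memory buffer and measure the byte count
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes

fp32, int8 = serialized_size(model), serialized_size(quantized)
print(f"fp32: {fp32} bytes, int8: {int8} bytes")
```

For Linear-heavy models the ratio lands near 4x, since each fp32 weight shrinks to one int8 byte plus a small overhead of scales and zero points.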

Monitoring is crucial for production systems. We track key metrics:

def log_prediction(input_data, prediction):
    with open("predictions.log", "a") as f:
        f.write(f"{datetime.now()}, {input_data.shape}, {prediction}\n")

Throughout this process, I learned that deployment challenges often reveal training weaknesses. Have you noticed how production environments test assumptions made during development?

We’ve covered the full lifecycle: data preparation, model architecture, training techniques, optimization, and production deployment. The complete system balances accuracy with practical constraints. What surprised me most was how quantization maintained performance while reducing model size by 4x.

This approach has served me well in several projects. If you implement similar systems, focus on the integration points between components. Try building your own version: start small with a custom CNN before scaling to transfer learning.

If you found this valuable, share it with others facing similar challenges. What part of this workflow do you find most challenging? Comment below with your experiences or questions about production machine learning systems.

Keywords: image classification pytorch, pytorch cnn tutorial, pytorch transfer learning, pytorch model deployment, pytorch image recognition, fastapi pytorch deployment, pytorch model optimization, pytorch production deployment, pytorch docker containerization, pytorch model training tutorial



Similar Posts
Build Real-Time Object Detection System with YOLOv8 and OpenCV in Python Tutorial

Learn to build a powerful real-time object detection system using YOLOv8 and OpenCV in Python. Complete tutorial with code examples and deployment tips.

Build Sentiment Analysis with BERT: Complete PyTorch Guide from Pre-training to Custom Fine-tuning

Learn to build a complete sentiment analysis system using BERT transformers in PyTorch. Master pre-trained models, custom fine-tuning, and production deployment. Start building today!

How to Build Real-Time Object Detection with YOLOv8 and OpenCV Python Tutorial

Learn to build a real-time object detection system using YOLOv8 and OpenCV in Python. Complete tutorial with code examples, setup, and optimization tips. Start detecting objects now!

Build Multi-Class Image Classifier with TensorFlow Transfer Learning and Fine-Tuning Complete Guide

Learn to build powerful multi-class image classifiers using TensorFlow transfer learning and fine-tuning techniques. Complete tutorial with code examples.

How to Build Custom Attention Mechanisms in PyTorch: Complete Implementation Guide

Learn to build custom attention mechanisms in PyTorch from scratch. Complete guide covering theory, multi-head attention, optimization, and real-world implementation. Master PyTorch attention today!

How to Quantize Neural Networks for Fast, Efficient Edge AI Deployment

Learn how to shrink and speed up AI models using quantization techniques for real-time performance on edge devices.