
Build a Real-Time Image Classification System with PyTorch and FastAPI: Complete Tutorial

Learn to build a real-time image classification system using PyTorch and FastAPI. Complete tutorial covering CNN architecture, transfer learning, API deployment, and production optimization techniques.


I’ve been fascinated by how quickly computers can understand images. It’s a problem I’ve wrestled with for some time: how to build systems that see like we do, but at machine speed. That curiosity led me to create a real-time image classification system, and today I’ll show you how I did it using PyTorch and FastAPI. Follow along as we build something practical together. I think you’ll find it useful for your own projects.

Setting up the environment is our first step. We need a clean structure to keep everything organized. Here’s how I arrange my projects:

mkdir -p src/{models,data,training,api,utils} tests
touch requirements.txt config.yaml README.md

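Next come the dependencies. This requirements.txt lists the packages used in this tutorial, including the ones needed for configuration, preprocessing, monitoring, and testing:

torch>=2.0.0
torchvision>=0.15.0
fastapi>=0.104.0
uvicorn[standard]>=0.24.0
python-multipart>=0.0.6
Pillow>=10.0.0
PyYAML
albumentations
prometheus-fastapi-instrumentator
pytest
httpx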

Configuration management saves time later. I use YAML files because they’re human-readable and flexible. This config.yaml handles our settings:

model:
  name: "resnet18"
  num_classes: 10
  pretrained: true

api:
  port: 8000
  max_file_size: 10485760  # 10 MiB, in bytes

But how do we use this in code? This configuration loader makes settings accessible anywhere:

# src/utils/config.py
import yaml

class Config:
    def __init__(self, config_path="config.yaml"):
        with open(config_path) as f:
            self.settings = yaml.safe_load(f)
        
    def get(self, key):
        return self.settings[key]

config = Config()
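
The `get` method above only reaches top-level keys, so a nested value needs chained indexing such as `config.get("model")["name"]`. As a minimal sketch, a dotted-path lookup with a fallback could look like this. The helper name `get_nested` and its `default` parameter are assumptions for illustration, not part of the loader above, and the dictionary stands in for the parsed config.yaml.

```python
def get_nested(settings, dotted_key, default=None):
    """Look up 'model.name'-style keys in a nested dict (hypothetical helper)."""
    node = settings
    for part in dotted_key.split("."):
        # Stop early if the path runs into a missing key or a non-dict value
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


# Stand-in for what yaml.safe_load returns for the config.yaml above
settings = {
    "model": {"name": "resnet18", "num_classes": 10, "pretrained": True},
    "api": {"port": 8000, "max_file_size": 10485760},
}

print(get_nested(settings, "model.name"))
print(get_nested(settings, "api.timeout", default=30))
```

Returning a default instead of raising keeps optional settings out of the YAML file until they are needed.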

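Data preparation often takes more time than modeling. I created a custom dataset handler that finds every .jpg under a directory, however deeply nested, and uses each image's parent folder name as its label:

# src/data/dataset.py
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset

class CustomImageDataset(Dataset):
    def __init__(self, data_dir, transform=None):
        self.image_paths = sorted(Path(data_dir).rglob('*.jpg'))
        # Assumption: one folder per class, e.g. data/cat/001.jpg
        self.classes = sorted({p.parent.name for p in self.image_paths})
        self.transform = transform

    def __len__(self):
        # DataLoader needs the dataset size
        return len(self.image_paths)

    def __getitem__(self, idx):
        path = self.image_paths[idx]
        img = Image.open(path).convert('RGB')  # force 3 channels
        if self.transform:
            img = self.transform(img)
        return img, self.classes.index(path.parent.name)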

Why did I choose this approach? Because real-world data is messy, and the recursive search handles nested folder layouts gracefully. For preprocessing, I recommend Albumentations. Its transforms run on the CPU rather than the GPU, but they are built on optimized NumPy and OpenCV routines, which keeps preprocessing fast:

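import albumentations as A
import numpy as np
from albumentations.pytorch import ToTensorV2

pipeline = A.Compose([
    A.Resize(224, 224),
    A.Normalize(),  # ImageNet mean and std by default
    ToTensorV2()
])

def transform(img):
    # Albumentations takes a NumPy array by keyword and returns a dict
    return pipeline(image=np.array(img))["image"]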
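
To make the normalization step concrete, here is the per-pixel arithmetic that `A.Normalize` applies with its default settings: scale to [0, 1], subtract the ImageNet channel mean, and divide by the channel standard deviation. This is a plain-Python illustration of the formula, not the library's implementation, and `normalize_pixel` is a name made up for this sketch.

```python
# Defaults used by A.Normalize (ImageNet statistics)
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)
MAX_PIXEL_VALUE = 255.0


def normalize_pixel(rgb):
    """Scale an (R, G, B) pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / MAX_PIXEL_VALUE - mean) / std
        for value, mean, std in zip(rgb, MEAN, STD)
    )


# A pixel equal to the ImageNet mean colour maps to roughly zero in every channel
mean_pixel = tuple(m * MAX_PIXEL_VALUE for m in MEAN)
print(normalize_pixel(mean_pixel))
print(normalize_pixel((255, 255, 255)))
```

The same statistics must be used at training and inference time, which is why the API should reuse the training transform.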

When building models, I start simple. This custom CNN gives a good baseline:

# src/models/custom_cnn.py
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(3, 16, 3),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32*54*54, num_classes)  # 54x54 feature map assumes 224x224 input
        )
    
    def forward(self, x):
        return self.layers(x)
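
Where does `32*54*54` come from? Each unpadded 3x3 convolution trims two pixels from the width and height, and each 2x2 max-pool halves them (rounding down). The sketch below traces a square input through the layers so you can recompute the `nn.Linear` input size for other resolutions. The helper names `conv_out` and `flat_features` are my own for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a conv or pooling layer (floor division, as in PyTorch)."""
    return (size + 2 * padding - kernel) // stride + 1


def flat_features(input_size, channels=32):
    """Trace a square input through SimpleCNN's conv and pooling layers."""
    size = conv_out(input_size, kernel=3)        # Conv2d(3, 16, 3)
    size = conv_out(size, kernel=2, stride=2)    # MaxPool2d(2)
    size = conv_out(size, kernel=3)              # Conv2d(16, 32, 3)
    size = conv_out(size, kernel=2, stride=2)    # MaxPool2d(2)
    return channels * size * size


print(flat_features(224))  # 224 -> 222 -> 111 -> 109 -> 54, so 32 * 54 * 54 = 93312
```

For a 128x128 input the same trace gives a 30x30 feature map, so the linear layer would need `32*30*30` inputs instead.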

For production, transfer learning works better. ResNet models balance speed and accuracy well:

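# src/models/transfer_learning.py
import torch.nn as nn
from torchvision import models

def get_model(config):
    cfg = config.get("model")
    # "DEFAULT" loads the best available pretrained weights; None starts from scratch
    weights = "DEFAULT" if cfg["pretrained"] else None
    model = models.get_model(cfg["name"], weights=weights)
    # Swap the final layer so the output size matches our classes
    model.fc = nn.Linear(model.fc.in_features, cfg["num_classes"])
    return model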

Training requires careful monitoring. I use this training loop, which validates after every epoch and saves a checkpoint whenever accuracy improves:

# src/training/trainer.py
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, epochs, device):
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters())
    best_acc = 0
    
    for epoch in range(epochs):
        model.train()
        for images, labels in train_loader:
            optimizer.zero_grad()  # clear gradients left over from the previous batch
            outputs = model(images.to(device))
            loss = criterion(outputs, labels.to(device))
            loss.backward()
            optimizer.step()
        
        # Validation phase
        model.eval()
        total, correct = 0, 0
        with torch.no_grad():
            for images, labels in val_loader:
                outputs = model(images.to(device))
                predicted = outputs.argmax(dim=1)
                total += labels.size(0)
                correct += (predicted.cpu() == labels).sum().item()
        
        acc = 100 * correct / total
        print(f'Epoch {epoch+1}: Accuracy {acc:.2f}%')
        
        # Checkpoint the best model so far
        if acc > best_acc:
            best_acc = acc
            torch.save(model.state_dict(), 'best_model.pth')
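
The loop above saves the best weights but still runs for every epoch. To actually halt training once validation accuracy stalls, a small patience counter can be added. This is a minimal sketch: the class name `EarlyStopping`, its `patience` and `min_delta` parameters, and the accuracy values are assumptions for illustration.

```python
class EarlyStopping:
    """Signals a stop when accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta    # smallest gain that counts as an improvement
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, accuracy):
        """Record one epoch's accuracy; return True if training should stop."""
        if accuracy > self.best + self.min_delta:
            self.best = accuracy
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation accuracies, for illustration only
stopper = EarlyStopping(patience=2)
for epoch, acc in enumerate([61.0, 70.5, 70.1, 70.3, 75.0], start=1):
    if stopper.step(acc):
        print(f"Stopping after epoch {epoch}, best accuracy {stopper.best:.2f}%")
        break
```

Inside `train`, you would call `stopper.step(acc)` right after computing `acc` and break out of the epoch loop when it returns True.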

The API is where everything comes together. FastAPI makes endpoint creation surprisingly simple:

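# src/api/main.py
import torch
from fastapi import FastAPI, UploadFile
from PIL import Image
from torchvision import models

from src.utils.config import config
from src.data.transforms import transform  # assumed location of the preprocessing transform

cfg = config.get("model")
# Rebuild the architecture, then load the checkpoint saved during training
model = models.get_model(cfg["name"], num_classes=cfg["num_classes"])
model.load_state_dict(torch.load("best_model.pth", map_location="cpu"))
model.eval()
# Placeholder labels: replace with your class names, in training order
class_names = [f"class_{i}" for i in range(cfg["num_classes"])]

app = FastAPI()

@app.post("/classify")
async def classify_image(file: UploadFile):
    img = Image.open(file.file).convert("RGB")
    tensor = transform(img).unsqueeze(0)
    with torch.no_grad():
        prediction = model(tensor).argmax(dim=1).item()
    return {"class": class_names[prediction]}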

To run it:

uvicorn src.api.main:app --reload --port 8000

Now you can send images via curl:

curl -X POST -F "file=@test.jpg" http://localhost:8000/classify
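
The config sets `max_file_size`, but the endpoint never checks it. A minimal sketch of such a check reads the upload in chunks and gives up as soon as the limit is crossed, so an oversized file is never held in memory in full. The names `FileTooLarge` and `read_limited` are hypothetical, and an in-memory stream stands in for `file.file`.

```python
import io


class FileTooLarge(Exception):
    """Raised when an upload exceeds the configured limit (hypothetical name)."""


def read_limited(stream, max_bytes, chunk_size=64 * 1024):
    """Read a binary stream in chunks, failing fast once max_bytes is exceeded."""
    chunks, total = [], 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise FileTooLarge(f"upload exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


# In-memory stream standing in for `file.file` from the endpoint
data = read_limited(io.BytesIO(b"x" * 1000), max_bytes=10485760)
print(len(data))
```

In the endpoint, you could catch `FileTooLarge` and raise FastAPI's `HTTPException` with status code 413, then open the image with `Image.open(io.BytesIO(data))`.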

What about performance? I optimize by:

  1. Using ONNX for model export
  2. Enabling TorchScript compilation
  3. Implementing request batching
  4. Using async preprocessing
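
Request batching, the third item above, can be sketched with nothing more than asyncio. Each request puts its input on a queue and waits on a future, while a background worker collects inputs for a few milliseconds and runs them through the model in one call. The class name `MicroBatcher`, its parameters, and `fake_model` are assumptions for illustration. In the real service, `batch_fn` would stack the tensors and run a single forward pass.

```python
import asyncio


class MicroBatcher:
    """Collects single requests and runs them through batch_fn together."""

    def __init__(self, batch_fn, max_batch=8, max_wait=0.01):
        self.batch_fn = batch_fn      # takes a list of inputs, returns a list of outputs
        self.max_batch = max_batch
        self.max_wait = max_wait      # seconds to wait for more requests
        self.queue = asyncio.Queue()

    async def submit(self, item):
        # Each caller gets a future that resolves once its batch has run
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((item, future))
        return await future

    async def run(self):
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self.queue.get()]   # block until the first request arrives
            deadline = loop.time() + self.max_wait
            while len(batch) < self.max_batch:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            results = self.batch_fn([item for item, _ in batch])
            for (_, future), result in zip(batch, results):
                future.set_result(result)


async def demo():
    def fake_model(batch):
        # Stand-in for a batched forward pass
        return [x * 2 for x in batch]

    batcher = MicroBatcher(fake_model, max_batch=4, max_wait=0.05)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.submit(i) for i in range(4)))
    worker.cancel()
    return results


print(asyncio.run(demo()))
```

The trade-off is latency for throughput: `max_wait` bounds how long a lone request waits for company. A production version would also need to pass exceptions from `batch_fn` back to the waiting futures.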

For monitoring, I add Prometheus metrics:

from prometheus_fastapi_instrumentator import Instrumentator
Instrumentator().instrument(app).expose(app)

Testing is non-negotiable. These pytest cases catch regressions:

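# tests/test_api.py
from fastapi.testclient import TestClient
from src.api.main import app

client = TestClient(app)

def test_classify_endpoint():
    # Expects a sample image named test.jpg in the working directory
    with open("test.jpg", "rb") as f:
        response = client.post("/classify", files={"file": f})
    assert response.status_code == 200
    assert "class" in response.json()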

Building this changed how I see deployment pipelines. The PyTorch/FastAPI combination stays developer-friendly while leaving room for the production optimizations listed above. What surprised me most was how quickly we went from experiment to usable API: hours rather than days.

Give this approach a try in your next computer vision project. If you found this useful, share it with others facing similar challenges. I’d love to hear about your implementation - drop a comment about your experience!



