
Complete PyTorch Image Classification Tutorial: From Custom CNNs to Production API Deployment

Learn to build and deploy a PyTorch image classification system from scratch. Covers CNN design, transfer learning, optimization, and production deployment with FastAPI.


I’ve been thinking about image classification systems a lot lately. After building several prototypes that worked well in the lab but struggled in production, I realized there’s a gap between training models and deploying them effectively. Many tutorials cover either training or deployment, but rarely both. This article bridges that gap by walking through a complete workflow using PyTorch. Let’s build something together that actually works in the real world.

Setting up our environment is straightforward. We create a virtual environment and install key packages:

python -m venv venv
source venv/bin/activate
pip install torch torchvision fastapi uvicorn pillow python-multipart

Note that Docker itself is installed at the system level, not through pip; the docker package on PyPI is only a client SDK and isn't needed here. Pillow handles image loading, and python-multipart lets FastAPI parse file uploads.

Our project structure organizes components logically. Why do you think separation of concerns matters in machine learning projects?

image_classifier/
├── src/
│   ├── data/       # Dataset handling
│   ├── models/     # Model architectures
│   ├── training/   # Training logic
│   ├── optimization/ # Model optimization
│   └── deployment/ # Production deployment
├── data/           # Raw datasets
├── models/         # Trained models
└── Dockerfile      # Container configuration

For data handling, we create a flexible dataset class. Notice how it handles both directory-based and CSV-based datasets:

import os
from PIL import Image
from torch.utils.data import Dataset

class ImageClassificationDataset(Dataset):
    def __init__(self, data_dir, csv_file=None, transform=None):
        self.data_dir = data_dir
        self.transform = transform
        self.image_paths = []
        self.labels = []
        
        if csv_file:
            self._load_from_csv(csv_file)
        else:
            self._load_from_directory()
            
    def _load_from_directory(self):
        # Each subdirectory name becomes a class label
        class_names = sorted(os.listdir(self.data_dir))
        self.class_to_idx = {name: idx for idx, name in enumerate(class_names)}
        
        for class_name in class_names:
            class_dir = os.path.join(self.data_dir, class_name)
            for img_name in os.listdir(class_dir):
                self.image_paths.append(os.path.join(class_dir, img_name))
                self.labels.append(self.class_to_idx[class_name])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert("RGB")
        if self.transform:
            image = self.transform(image)
        return image, self.labels[idx]
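A Dataset that implements __len__ and __getitem__ plugs directly into PyTorch's DataLoader for shuffled, batched loading. Here is a runnable sketch with a toy in-memory dataset standing in for the image dataset above (the ToyImageDataset class is illustrative, so the example runs without files on disk):

```python
import torch
from torch.utils.data import Dataset, DataLoader

# Toy stand-in for ImageClassificationDataset: returns random "images"
# so the batching pipeline can be exercised without files on disk.
class ToyImageDataset(Dataset):
    def __init__(self, num_samples=8, num_classes=2):
        self.num_samples = num_samples
        self.num_classes = num_classes

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        image = torch.randn(3, 224, 224)   # fake RGB image tensor
        label = idx % self.num_classes     # deterministic fake label
        return image, label

loader = DataLoader(ToyImageDataset(), batch_size=4, shuffle=True, num_workers=0)
images, labels = next(iter(loader))
print(images.shape)  # torch.Size([4, 3, 224, 224])
```

The default collate function stacks individual samples into a batch tensor, which is why __getitem__ returns a single unbatched image.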

Data augmentation significantly improves model robustness. We implement advanced techniques like CutMix:

def cutmix(batch, labels, alpha=1.0):
    batch_size = batch.size(0)
    lam = np.random.beta(alpha, alpha)
    rand_index = torch.randperm(batch_size)
    
    # Generate a random bounding box covering (1 - lam) of the image area,
    # clamped so it never extends past the image border
    W, H = batch.size(3), batch.size(2)
    rw, rh = int(W * np.sqrt(1 - lam)), int(H * np.sqrt(1 - lam))
    rx = np.random.randint(0, W - rw + 1)
    ry = np.random.randint(0, H - rh + 1)
    
    # Paste the patch from shuffled images, then correct lam to the exact
    # fraction of pixels kept from the original images
    batch[:, :, ry:ry+rh, rx:rx+rw] = batch[rand_index, :, ry:ry+rh, rx:rx+rw]
    lam = 1 - (rw * rh) / (W * H)
    return batch, labels, labels[rand_index], lam
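During training, CutMix's mixing ratio carries over to the loss: the standard formulation weights the cross-entropy between the original and the shuffled labels by lam. A minimal sketch (the cutmix_loss helper is illustrative, not from the article, and assumes cutmix also returned the permuted labels and lam):

```python
import torch
import torch.nn.functional as F

# CutMix loss: each image is lam parts original, (1 - lam) parts pasted
# patch, so the loss mixes the two label sets in the same proportion.
def cutmix_loss(logits, labels_a, labels_b, lam):
    return (lam * F.cross_entropy(logits, labels_a)
            + (1 - lam) * F.cross_entropy(logits, labels_b))

logits = torch.randn(4, 10)                 # fake model outputs
labels_a = torch.tensor([0, 1, 2, 3])       # original labels
labels_b = torch.tensor([3, 2, 1, 0])       # labels of the pasted patches
loss = cutmix_loss(logits, labels_a, labels_b, lam=0.7)
print(f"mixed loss: {loss.item():.4f}")
```

With lam equal to 1 this reduces to plain cross-entropy on the original labels, so the same training loop handles both augmented and unaugmented batches.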

For model architecture, we balance custom networks and transfer learning. When would you choose a custom CNN over a pre-trained model? Here’s a lightweight custom architecture:

class CustomCNN(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        # Two 2x2 pools halve 224x224 inputs twice, leaving 64 x 56 x 56
        self.classifier = nn.Sequential(
            nn.Linear(64 * 56 * 56, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

For deployment, we use FastAPI to create a production-ready endpoint:

app = FastAPI()
model.eval()  # inference mode: disables dropout and batch-norm updates

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    image = Image.open(file.file).convert("RGB")
    tensor = transform(image).unsqueeze(0)
    with torch.no_grad():
        prediction = model(tensor).squeeze(0)
    return {"class": class_names[prediction.argmax().item()]}

Containerization ensures consistent environments. Our Dockerfile captures dependencies:

FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "src.deployment.api:app", "--host", "0.0.0.0", "--port", "8000"]
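Building and running the container then takes two commands (the image name image-classifier is arbitrary):

```shell
# Build the image from the Dockerfile in the current directory,
# then run the API with the container's port 8000 mapped to the host
docker build -t image-classifier .
docker run -p 8000:8000 image-classifier
```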

Optimization techniques like quantization reduce model size for deployment:

model = torch.quantization.quantize_dynamic(
    model,
    {nn.Linear},
    dtype=torch.qint8
)
# Save the whole module: a quantized state_dict only loads back into
# an already-quantized skeleton, so this avoids a reload pitfall
torch.save(model, "quantized_model.pth")
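To verify the size reduction, serialize both versions and compare byte counts. A self-contained sketch with a Linear-heavy stand-in model, since dynamic quantization only targets the layer types listed (here nn.Linear):

```python
import io

import torch
import torch.nn as nn

# Stand-in model dominated by Linear weights, the layers dynamic
# quantization converts; a real run would use the trained classifier.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))

quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def serialized_size(m):
    # Serialize to an in-memory buffer and measure the byte count
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes

fp32, int8 = serialized_size(model), serialized_size(quantized)
print(f"fp32: {fp32} bytes, int8: {int8} bytes")
```

For Linear-heavy models the ratio lands near 4x, since each fp32 weight shrinks to one int8 byte plus a small overhead of scales and zero points.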

Monitoring is crucial for production systems. We track key metrics:

def log_prediction(input_data, prediction):
    with open("predictions.log", "a") as f:
        f.write(f"{datetime.now()}, {input_data.shape}, {prediction}\n")

Throughout this process, I learned that deployment challenges often reveal training weaknesses. Have you noticed how production environments test assumptions made during development?

We’ve covered the full lifecycle: data preparation, model architecture, training techniques, optimization, and production deployment. The complete system balances accuracy with practical constraints. What surprised me most was how quantization maintained performance while reducing model size by 4x.

This approach has served me well in several projects. If you implement similar systems, focus on the integration points between components. Try building your own version: start small with a custom CNN before scaling to transfer learning.

If you found this valuable, share it with others facing similar challenges. What part of this workflow do you find most challenging? Comment below with your experiences or questions about production machine learning systems.

Keywords: image classification pytorch, pytorch cnn tutorial, pytorch transfer learning, pytorch model deployment, pytorch image recognition, fastapi pytorch deployment, pytorch model optimization, pytorch production deployment, pytorch docker containerization, pytorch model training tutorial



Similar Posts
Build Real-Time Object Detection System with YOLOv8 and OpenCV in Python Tutorial

Learn to build a powerful real-time object detection system using YOLOv8 and OpenCV in Python. Complete tutorial with code examples and deployment tips.

Build Sentiment Analysis with BERT: Complete PyTorch Guide from Pre-training to Custom Fine-tuning

Learn to build a complete sentiment analysis system using BERT transformers in PyTorch. Master pre-trained models, custom fine-tuning, and production deployment. Start building today!

How to Build Real-Time Object Detection with YOLOv8 and OpenCV Python Tutorial

Learn to build a real-time object detection system using YOLOv8 and OpenCV in Python. Complete tutorial with code examples, setup, and optimization tips. Start detecting objects now!

Build Multi-Class Image Classifier with TensorFlow Transfer Learning and Fine-Tuning Complete Guide

Learn to build powerful multi-class image classifiers using TensorFlow transfer learning and fine-tuning techniques. Complete tutorial with code examples.

How to Build Custom Attention Mechanisms in PyTorch: Complete Implementation Guide

Learn to build custom attention mechanisms in PyTorch from scratch. Complete guide covering theory, multi-head attention, optimization, and real-world implementation. Master PyTorch attention today!

How to Quantize Neural Networks for Fast, Efficient Edge AI Deployment

Learn how to shrink and speed up AI models using quantization techniques for real-time performance on edge devices.