
Build a Distributed Rate Limiting System with Redis, FastAPI, and the Sliding Window Algorithm

Learn to build a production-ready distributed rate limiting system using Redis, FastAPI, and sliding window algorithms for scalable API protection.


I was building a production API recently when I noticed something concerning. Our rate limiting system was inconsistent across different server instances, allowing some users to make far more requests than they should have. That’s when I realized we needed a distributed solution that would work reliably across our entire infrastructure.

Have you ever wondered why some APIs feel perfectly responsive while others suddenly start rejecting valid requests? The answer often lies in their rate limiting implementation.

Let me show you how to build a robust distributed rate limiting system using Redis, FastAPI, and the sliding window algorithm. This approach gives you precise control while maintaining consistency across multiple application instances.

Here’s a practical implementation of the sliding window algorithm using Redis Lua scripts:

local key = KEYS[1]
local window = tonumber(ARGV[1])
local limit = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local weight = tonumber(ARGV[4]) or 1
-- a caller-supplied unique id; without it, two requests arriving at the
-- same timestamp would collide on the same sorted-set member
local request_id = ARGV[5] or ''

-- drop entries that have slid out of the window
local cutoff = now - window
redis.call('ZREMRANGEBYSCORE', key, 0, cutoff)

-- sum the weights encoded in the members; ZCARD would count each entry
-- as 1 and undercount weighted requests
local current = 0
for _, member in ipairs(redis.call('ZRANGE', key, 0, -1)) do
    current = current + (tonumber(string.match(member, ':(%d+)$')) or 1)
end

if current + weight > limit then
    return {0, limit, 0, now + window}
end

redis.call('ZADD', key, now, now .. ':' .. request_id .. ':' .. weight)
redis.call('EXPIRE', key, window)

return {1, limit, limit - current - weight, now + window}

This script handles the complex logic atomically within Redis. But why use Lua scripts instead of multiple Redis commands? Because they ensure that no other operations can interfere while we’re checking and updating the rate limit.
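On the application side, the script can be wrapped in a small async storage class. This is a sketch, assuming redis-py's asyncio client (`redis.asyncio.Redis`), whose `register_script()` returns a callable that runs the script via EVALSHA and falls back to EVAL if Redis has flushed its script cache. The `check_sliding_window` signature and `rule_name` parameter mirror the interface used by the middleware below; the class name itself is an assumption of this sketch.

```python
import time
import uuid

class RedisSlidingWindowStorage:
    """Async wrapper around the sliding-window Lua script.

    `redis_client` is assumed to be a redis.asyncio.Redis instance;
    `script_source` is the Lua script shown above.
    """

    def __init__(self, redis_client, script_source: str):
        # register_script caches the script and invokes it by SHA
        self.script = redis_client.register_script(script_source)

    async def check_sliding_window(self, identifier: str, rule_name: str,
                                   window_seconds: int, max_requests: int,
                                   weight: int = 1):
        key = f"ratelimit:{rule_name}:{identifier}"
        # the uuid gives each request a unique sorted-set member, so two
        # requests arriving at the same timestamp cannot collide
        allowed, limit, remaining, reset = await self.script(
            keys=[key],
            args=[window_seconds, max_requests, time.time(), weight,
                  uuid.uuid4().hex],
        )
        return bool(allowed), limit, remaining, reset
```

Because the wrapper depends only on `register_script()`, it can be unit-tested with a stub client before ever pointing it at a real Redis.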

Now let’s integrate this with FastAPI middleware:

from fastapi import Request
from fastapi.responses import JSONResponse

class RateLimitMiddleware:
    def __init__(self, redis_storage):
        self.storage = redis_storage

    async def __call__(self, request: Request, call_next):
        # request.client can be None (e.g. under some test clients)
        client_id = request.client.host if request.client else "unknown"
        endpoint = request.url.path

        allowed, limit, remaining, reset = await self.storage.check_sliding_window(
            identifier=f"{client_id}:{endpoint}",
            rule_name="api",
            window_seconds=60,
            max_requests=100
        )

        headers = {
            "X-RateLimit-Limit": str(limit),
            "X-RateLimit-Remaining": str(remaining),
            "X-RateLimit-Reset": str(reset)
        }

        if not allowed:
            # Exceptions raised inside HTTP middleware bypass FastAPI's
            # exception handlers, so return the 429 response directly
            return JSONResponse(
                status_code=429,
                content={"detail": "Rate limit exceeded"},
                headers=headers
            )

        response = await call_next(request)
        response.headers.update(headers)
        return response

# Register via Starlette's BaseHTTPMiddleware, using the instance as
# the dispatch function:
#   from starlette.middleware.base import BaseHTTPMiddleware
#   app.add_middleware(BaseHTTPMiddleware, dispatch=RateLimitMiddleware(storage))

What happens when you need different rate limits for different user tiers? The system becomes even more valuable. You can easily extend it to handle premium users with higher limits or specific endpoints with stricter controls.

Here’s how you might implement tier-based rate limiting:

async def get_user_tier_limits(user_id: str) -> dict:
    # In practice, this might check a database or cache
    user_tiers = {
        "free": {"requests_per_minute": 100, "burst_capacity": 10},
        "premium": {"requests_per_minute": 1000, "burst_capacity": 100},
        "enterprise": {"requests_per_minute": 10000, "burst_capacity": 1000}
    }
    
    # Default to free tier if user not found
    return user_tiers.get(await get_user_tier(user_id), user_tiers["free"])
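To connect those tiers to the limiter, the per-minute figure becomes the `max_requests` of a 60-second window. A small pure helper (hypothetical name, mirroring the dictionary above) keeps that mapping testable on its own:

```python
def sliding_window_params(tier: str) -> dict:
    """Translate a tier name into sliding-window check parameters.
    Unknown tiers fall back to the free tier."""
    user_tiers = {
        "free": {"requests_per_minute": 100, "burst_capacity": 10},
        "premium": {"requests_per_minute": 1000, "burst_capacity": 100},
        "enterprise": {"requests_per_minute": 10000, "burst_capacity": 1000},
    }
    config = user_tiers.get(tier, user_tiers["free"])
    # a per-minute quota maps directly onto a 60-second sliding window
    return {"window_seconds": 60, "max_requests": config["requests_per_minute"]}
```

The middleware can then unpack these into the check, e.g. `await storage.check_sliding_window(identifier=client_id, rule_name=tier, **sliding_window_params(tier))`.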

Monitoring is crucial for any production system. You’ll want to track how often rate limits are being hit and by whom. This data helps you optimize your limits and identify potential abuse patterns.
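Even an in-process counter makes those patterns visible as a starting point. This sketch (a hypothetical class, not part of the implementation above) records each rejection per client and endpoint; in production you would more likely export Prometheus counters or aggregate the counts in Redis so every instance shares one view.

```python
from collections import Counter

class RateLimitMetrics:
    """Minimal in-process tally of rate-limit rejections."""

    def __init__(self):
        self.rejections = Counter()

    def record_rejection(self, client_id: str, endpoint: str) -> None:
        # called by the middleware whenever it returns a 429
        self.rejections[(client_id, endpoint)] += 1

    def top_offenders(self, n: int = 10):
        # most frequently rejected (client, endpoint) pairs with counts
        return self.rejections.most_common(n)
```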

Did you know that proper rate limiting can actually improve your API’s reliability during traffic spikes? By controlling the flow of requests, you prevent your servers from becoming overwhelmed and ensure consistent performance for all users.

One challenge I faced was handling burst traffic. The sliding window algorithm naturally accommodates bursts while still enforcing the overall limit. However, you might need to adjust your window size and limits based on your specific use case.
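That tuning can be explored offline with a single-process model of the same algorithm. This simulator (an illustrative stand-in for the Redis version, not production code) mirrors the Lua script's prune-then-count logic, so you can replay burst patterns against candidate window sizes and limits:

```python
import bisect

class SlidingWindowSimulator:
    """In-memory model of the sliding-window limiter for offline tuning."""

    def __init__(self, window: float, limit: int):
        self.window = window
        self.limit = limit
        self.hits: list[float] = []  # sorted request timestamps

    def allow(self, now: float) -> bool:
        # prune timestamps that fell out of the window
        # (the ZREMRANGEBYSCORE step in the Lua script)
        cutoff = now - self.window
        del self.hits[:bisect.bisect_right(self.hits, cutoff)]
        if len(self.hits) >= self.limit:
            return False
        self.hits.append(now)
        return True
```

With a 60-second window and a limit of 10, a burst of 10 simultaneous requests is fully absorbed, the 11th is rejected, and capacity returns only as old timestamps slide out of the window.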

The beauty of this distributed approach is that it works seamlessly whether you’re running one instance or hundreds. Each instance consults the same Redis database, ensuring consistent enforcement across your entire infrastructure.

Remember that rate limiting isn’t just about preventing abuse. It’s about ensuring fair resource allocation and maintaining quality of service for all your users. A well-implemented system becomes invisible to legitimate users while effectively blocking malicious traffic.

I’ve found this implementation to be remarkably reliable in production. The combination of Redis for storage, Lua scripts for atomic operations, and FastAPI for the web framework creates a robust foundation that scales well.

What other challenges have you faced with API rate limiting? I’d love to hear about your experiences and solutions. If you found this guide helpful, please share it with others who might benefit, and feel free to leave comments with your thoughts or questions.

Keywords: distributed rate limiting, Redis rate limiting, FastAPI middleware, sliding window algorithm, rate limiting system, Lua scripts Redis, API rate limiting, distributed caching, FastAPI Redis integration, production rate limiter



Similar Posts
Build Production-Ready Background Task Systems with Celery, Redis, and FastAPI Integration

Learn to build scalable background task systems with Celery, Redis & FastAPI. Master distributed queues, task monitoring, production deployment & error handling.

Build High-Performance Redis Cache Systems with FastAPI and Pydantic: Complete Production Guide

Build high-performance Redis cache systems with FastAPI and Pydantic. Learn distributed caching patterns, type-safe operations, and production deployment strategies.

How to Build a Lightweight Python ORM Using Metaclasses and Descriptors

Learn how to create a custom Python ORM from scratch using metaclasses and descriptors for full control and transparency.

Build High-Performance Real-Time APIs with FastAPI WebSockets and Redis Streams

Learn to build high-performance real-time APIs using FastAPI, WebSockets, and Redis Streams. Master scalable event processing, connection management, and optimization techniques for production-ready applications.

Build Scalable Event-Driven Apps with Python AsyncIO and Redis Streams: Complete Performance Guide

Learn to build scalable event-driven apps with AsyncIO and Redis Streams. Master real-time processing, error handling, and production deployment. Expert guide included.

Production-Ready FastAPI Microservices: SQLAlchemy Async, Celery Tasks, and Advanced Architecture Guide

Learn to build scalable, production-ready microservices with FastAPI, SQLAlchemy async operations, Celery task processing, and comprehensive deployment strategies.