How to Build a Lightweight Python ORM Using Metaclasses and Descriptors

Learn how to create a custom Python ORM from scratch using metaclasses and descriptors for full control and transparency.

Have you ever been frustrated by an ORM that felt like a black box? That’s what got me thinking about this. I was working on a project where the existing tools either did too much or not enough. I needed something that fit perfectly in the middle—a custom data layer that was transparent, efficient, and built exactly to my application’s needs. So, I decided to build my own. In this article, I’ll show you how you can do the same by combining two of Python’s most powerful features: metaclasses and descriptors. Let’s get started.

Think of a class in Python. When you write class User: pass, what is User? It’s an object. And every object has a type. The type of User is type. This is our first key idea: classes are instances of a metaclass, which is typically type. This means we can customize how classes themselves are built. That’s the foundation for an ORM. How do you think we could intercept the creation of a class to add database-aware behavior?
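A quick way to convince yourself of this is to poke at it in the interpreter. The sketch below relies only on standard behavior of type; the TracingMeta metaclass, the Account class, and the created_by attribute are made-up names for illustration.

```python
# Every class is an object whose type is its metaclass (by default, type).
class User:
    pass

print(type(User))              # <class 'type'>
print(isinstance(User, type))  # True

# type can also build a class dynamically: type(name, bases, namespace).
Dynamic = type("Dynamic", (), {"greeting": "hello"})
print(Dynamic.greeting)        # hello


# A hypothetical metaclass that intercepts class creation and tags the class.
class TracingMeta(type):
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        cls.created_by = mcs.__name__  # illustrative attribute only
        return cls


class Account(metaclass=TracingMeta):
    pass

print(type(Account).__name__)  # TracingMeta
print(Account.created_by)      # TracingMeta
```

The hook in TracingMeta.__new__ is the same place where an ORM can inspect a model's attributes.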

Descriptors are the other piece. They control how attribute access works on instances. When you see user.name, a descriptor can decide what happens. It can fetch data, validate it, or log the access. In an ORM, fields like IntegerField or CharField are perfect candidates for descriptors. They don’t just store a value; they know how to talk to a database column. Let’s build a simple descriptor to see this in action.

class Field:
    def __init__(self, column_type=None):
        self.column_type = column_type
        self.name = None  # Will be set later

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        # Get the value from the instance's internal storage
        return instance._data.get(self.name)

    def __set__(self, instance, value):
        # Simple type checking; None is allowed so unset fields can stay empty
        if (value is not None and self.column_type
                and not isinstance(value, self.column_type)):
            raise TypeError(
                f"{self.name}: expected {self.column_type.__name__}, "
                f"got {type(value).__name__}"
            )
        instance._data[self.name] = value

This Field class is a data descriptor, meaning it defines both __get__ and __set__. Python calls __set_name__ automatically when the owning class is created, so when you assign name = Field(str) on a model, the descriptor knows it’s called “name”. The __set__ method validates the data before storing it. Where do you suppose the actual data should live on the instance? We keep it in a per-instance _data dictionary, so the values stay separate from the descriptor objects, which live on the class and are shared by every instance.
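To make the storage question concrete, here is a minimal sketch of the same descriptor without type checking. The Point class is a hypothetical stand-in for a model and exists only for this demonstration.

```python
class Field:
    """Minimal data descriptor, mirroring the Field class above."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class-level access returns the descriptor itself
        return instance._data.get(self.name)

    def __set__(self, instance, value):
        instance._data[self.name] = value


class Point:  # hypothetical class, used only for this demonstration
    x = Field()

    def __init__(self):
        self._data = {}


p, q = Point(), Point()
p.x = 10
q.x = 20

print(Point.x.name)  # x  (set by __set_name__)
print(p.x, q.x)      # 10 20  (values are stored per instance)
print(p._data)       # {'x': 10}

# Because Field defines __set__, it is a data descriptor and takes
# precedence over an entry with the same name in the instance __dict__.
p.__dict__["x"] = "shadow"
print(p.x)           # 10
```

One descriptor object serves every instance, which is why the values themselves have to live somewhere on the instance.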

Now, how do we gather all these Field descriptors from a class? This is where a metaclass becomes essential. A metaclass can run code when a class is created. It can scan the class attributes, find all our Field instances, and register them. This allows the model to “know” about its database schema.

class ModelMeta(type):
    def __new__(mcs, name, bases, namespace, **kwargs):
        # Create the new class
        cls = super().__new__(mcs, name, bases, namespace, **kwargs)
        # Start from the fields of parent models so subclasses inherit them
        fields = {}
        for base in reversed(bases):
            fields.update(getattr(base, "_fields", {}))
        # Add the Field descriptors declared directly on this class
        for key, value in namespace.items():
            if isinstance(value, Field):
                fields[key] = value
        cls._fields = fields
        return cls

Our ModelMeta metaclass runs when a class like User is defined. It collects all the Field objects into a _fields dictionary. What do you think is the next step? We need a base Model class that uses this metaclass so all our models inherit this automatic registration.
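The timing is worth seeing once. In the sketch below, Field is reduced to an empty stand-in and Article is a hypothetical model; the print call inside the metaclass is only there to show when registration happens.

```python
class Field:
    """Stand-in for the Field descriptor; only its type matters here."""


class ModelMeta(type):
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        cls._fields = {
            key: value for key, value in namespace.items()
            if isinstance(value, Field)
        }
        print(f"{name}: registered {list(cls._fields)}")
        return cls


# The message is printed as soon as the class statement finishes,
# before any instance is created.
class Article(metaclass=ModelMeta):  # hypothetical model
    title = Field()
    body = Field()
    max_length = 500  # not a Field, so it is ignored

print(list(Article._fields))  # ['title', 'body']
```

Registration happens once per class, at definition time, so instances pay no extra cost for it.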

class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        # Each instance gets its own data store
        self._data = {}
        for key, field in self._fields.items():
            # Take the value from the keyword arguments; missing fields become None
            value = kwargs.get(key)
            setattr(self, key, value)

    def save(self):
        # A placeholder for the save logic
        field_data = {name: self._data.get(name) for name in self._fields}
        print(f"Pretending to save: {field_data}")

Now we can define a model in a very clean way. The metaclass handles the setup behind the scenes, and the descriptors handle the data on each instance. Let’s create a User model.

class User(Model):
    id = Field(int)
    name = Field(str)
    email = Field(str)

# Let's use it
john = User(id=1, name="John", email="john@example.com")
john.save()  # Output: Pretending to save: {'id': 1, 'name': 'John', 'email': 'john@example.com'}

This is the core pattern. The User class is created by ModelMeta, which registers the id, name, and email fields. When we create an instance, the __init__ method uses setattr, which triggers the Field.__set__ descriptor to validate and store the data in _data. Can you see how the separation of concerns makes this elegant?
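To watch the validation path fire, the following self-contained sketch repeats the three classes in compact form and then assigns a value of the wrong type. Treating None as "not set" and the exact error message are assumptions of this sketch.

```python
class Field:
    def __init__(self, column_type=None):
        self.column_type = column_type

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance._data.get(self.name)

    def __set__(self, instance, value):
        # Assumption: None means "not set" and skips the type check.
        if (value is not None and self.column_type
                and not isinstance(value, self.column_type)):
            raise TypeError(f"{self.name}: expected {self.column_type.__name__}")
        instance._data[self.name] = value


class ModelMeta(type):
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        cls._fields = {k: v for k, v in namespace.items() if isinstance(v, Field)}
        return cls


class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        self._data = {}
        for key in self._fields:
            setattr(self, key, kwargs.get(key))


class User(Model):
    id = Field(int)
    name = Field(str)


user = User(id=1, name="John")
print(user.name)  # John

try:
    user.id = "not a number"  # goes through Field.__set__
except TypeError as exc:
    error = str(exc)
    print(error)  # id: expected int

print(user.id)  # 1  (the rejected value was never stored)
```

The check runs on every assignment, not only in __init__, because both paths go through the same descriptor.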

But a real ORM needs to generate SQL. How might we extend our Field descriptor to know its SQL type? And how could the Model class generate a CREATE TABLE statement? We can enhance the Field’s __init__ to accept a SQL string and give the Model a class method to generate the SQL.

class Field:
    def __init__(self, column_type=None, sql_type="TEXT"):
        self.column_type = column_type
        self.sql_type = sql_type
        self.name = None

    # ... __set_name__, __get__, __set__ remain the same ...

class Model(metaclass=ModelMeta):
    # ... __init__ and save ...
    
    @classmethod
    def create_table_sql(cls):
        columns = []
        for name, field in cls._fields.items():
            col_def = f"{name} {field.sql_type}"
            columns.append(col_def)
        columns_sql = ", ".join(columns)
        return f"CREATE TABLE {cls.__name__.lower()} ({columns_sql});"

class Product(Model):
    product_id = Field(int, "INTEGER PRIMARY KEY")
    title = Field(str, "VARCHAR(255)")
    price = Field(float, "REAL")

print(Product.create_table_sql())
# Output: CREATE TABLE product (product_id INTEGER PRIMARY KEY, title VARCHAR(255), price REAL);
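One way to check that the generated statement is usable is to run it against an in-memory SQLite database from the standard library. The sketch below is self-contained and omits type checking for brevity; the insert_sql method is a hypothetical helper that is not part of the classes above, shown here to illustrate how save could pass values as bound parameters.

```python
import sqlite3


class Field:
    def __init__(self, column_type=None, sql_type="TEXT"):
        self.column_type = column_type
        self.sql_type = sql_type

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance._data.get(self.name)

    def __set__(self, instance, value):
        instance._data[self.name] = value  # type checking omitted for brevity


class ModelMeta(type):
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        cls._fields = {k: v for k, v in namespace.items() if isinstance(v, Field)}
        return cls


class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        self._data = {}
        for key in self._fields:
            setattr(self, key, kwargs.get(key))

    @classmethod
    def create_table_sql(cls):
        columns = ", ".join(f"{n} {f.sql_type}" for n, f in cls._fields.items())
        return f"CREATE TABLE {cls.__name__.lower()} ({columns});"

    def insert_sql(self):
        # Hypothetical helper: column names come from the class definition,
        # values are returned separately so the driver can bind them.
        names = list(self._fields)
        placeholders = ", ".join("?" for _ in names)
        sql = (f"INSERT INTO {type(self).__name__.lower()} "
               f"({', '.join(names)}) VALUES ({placeholders})")
        return sql, [self._data[n] for n in names]


class Product(Model):
    product_id = Field(int, "INTEGER PRIMARY KEY")
    title = Field(str, "VARCHAR(255)")
    price = Field(float, "REAL")


conn = sqlite3.connect(":memory:")
conn.execute(Product.create_table_sql())

sql, params = Product(product_id=1, title="Keyboard", price=49.5).insert_sql()
conn.execute(sql, params)

rows = conn.execute("SELECT product_id, title, price FROM product").fetchall()
print(rows)  # [(1, 'Keyboard', 49.5)]
conn.close()
```

Binding values with ? placeholders keeps user data out of the SQL string. Table and column names are still interpolated, which is acceptable here only because they come from the class definition rather than from user input.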

Now our simple framework can define its schema. The descriptor holds the SQL type, and the model can introspect it. What about queries? We could add another layer—a Query class that uses the model’s fields to build WHERE clauses. This often involves overriding operators like == on the descriptor to return a query object instead of a simple boolean.
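As a rough sketch of that idea, the comparison operators on the descriptor can return small objects that carry SQL text and parameters. Everything here is hypothetical: the Condition class, the select_sql helper, and the choice of operators are one possible design, not how any particular library does it.

```python
class Condition:
    """Hypothetical query fragment: SQL text plus its bound parameters."""

    def __init__(self, sql, params):
        self.sql = sql
        self.params = params

    def __and__(self, other):
        return Condition(f"({self.sql} AND {other.sql})", self.params + other.params)


class Field:
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class access yields the descriptor itself
        return instance.__dict__.get(self.name)

    # Comparisons build Condition objects instead of returning booleans.
    def __eq__(self, value):
        return Condition(f"{self.name} = ?", [value])

    def __gt__(self, value):
        return Condition(f"{self.name} > ?", [value])

    # Defining __eq__ removes the default __hash__, so restore identity hashing.
    __hash__ = object.__hash__


class User:  # plain class; only class-level access matters for this sketch
    name = Field()
    age = Field()


def select_sql(model, condition):
    # Hypothetical helper that renders a SELECT statement and its parameters.
    return f"SELECT * FROM {model.__name__.lower()} WHERE {condition.sql}", condition.params


cond = (User.name == "John") & (User.age > 30)
sql, params = select_sql(User, cond)
print(sql)     # SELECT * FROM user WHERE (name = ? AND age > ?)
print(params)  # ['John', 30]
```

This works because class-level access such as User.name returns the descriptor itself, so the comparison runs on the Field rather than on a stored value. The parentheses around each comparison are required, since & binds more tightly than == and > in Python.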

Building this step-by-step demystifies the tools you use every day. You start to see the design choices behind libraries like SQLAlchemy. The power comes from composing simple mechanisms: metaclasses that configure classes, and descriptors that manage instance data.

This journey from a basic descriptor to a model that can generate SQL shows the essence of metaprogramming. It’s not magic; it’s just Python’s object model used deliberately. You gain fine-grained control over behavior, which leads to cleaner and more maintainable code for complex data layers.

I hope walking through this process has been as enlightening for you as building it was for me. The ability to craft your own abstractions is one of the most satisfying parts of programming in Python. If you enjoyed this exploration of Python’s internals, please share this article with a fellow developer. Have you ever built a custom descriptor for a project? What was your use case? Let me know in the comments below.

