Getting Started with Structured Models

In this tutorial, you will build a structured data model from scratch using the core building blocks of pydantic-core. You will learn how to define individual fields, group them into a field-set, and finally bind them to a Python class for full validation and instantiation.

By the end of this guide, you will have a working User model that validates input data and returns a typed Python object.

Prerequisites

To follow this tutorial, you need pydantic-core installed in your environment. You should be familiar with basic Python classes and dictionary structures.

Step 1: Define the Target Python Class

ModelSchema works by populating an instance of a Python class with validated data. For performance and compatibility with pydantic-core, the class should declare the four __slots__ shown below.

Create a file named user_model.py and add the following:

class User:
    # __slots__ are recommended for performance and to define
    # where pydantic-core should store validated data.
    __slots__ = '__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__'

    def __init__(self, **kwargs):
        # This is a simple init; ModelSchema can also handle
        # instantiation without a custom __init__.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __repr__(self):
        return f"User(name={self.name!r}, age={self.age!r})"

The __pydantic_fields_set__ slot is particularly important as pydantic-core uses it to track which fields were explicitly provided during validation.

Step 2: Define Individual Model Fields

Next, you need to define the validation rules for each attribute using model_field. Each field wraps a CoreSchema (like a string or integer schema).

from pydantic_core import core_schema

# Define a field for the 'name' attribute
name_field = core_schema.model_field(
    schema=core_schema.str_schema(),
    validation_alias='username',  # Look for 'username' in the input data
)

# Define a field for the 'age' attribute
age_field = core_schema.model_field(
    schema=core_schema.int_schema(),
)

By using model_field, you can add metadata like validation_alias, which allows the input data keys to differ from your class attribute names.
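
The effect of validation_alias can be seen in a minimal sketch. A model_field cannot be validated on its own, so the sketch wraps it in a model_fields_schema (covered in the next step) purely to make it runnable. It assumes the default configuration, under which only the alias is looked up in the input, not the field name.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# A single aliased field, wrapped in a fields schema so it can be validated.
schema = core_schema.model_fields_schema(
    fields={
        'name': core_schema.model_field(
            schema=core_schema.str_schema(),
            validation_alias='username',
        ),
    }
)
v = SchemaValidator(schema)

# The value is read from the 'username' key and stored under the field name.
validated, _extra, _fields_set = v.validate_python({'username': 'Alice'})
print(validated)

# Assumption: default config. The field name itself is not consulted,
# so passing 'name' instead of 'username' reports a missing field.
try:
    v.validate_python({'name': 'Alice'})
    error_types = []
except ValidationError as exc:
    error_types = [err['type'] for err in exc.errors()]
print(error_types)
```

The first call stores the value under 'name'; the second is expected to fail because the 'username' key is absent.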

Step 3: Group Fields into a ModelFieldsSchema

Now, group these fields into a ModelFieldsSchema. This schema is responsible for validating a dictionary of input data against your field definitions.

fields_schema = core_schema.model_fields_schema(
    fields={
        'name': name_field,
        'age': age_field,
    }
)

Note on Behavior: If you were to validate data against fields_schema directly, it would return a 3-tuple: (validated_dict, extra_data, fields_set). This is the raw output used internally by models.
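
To make the raw output concrete, here is a small sketch that validates against the fields schema directly and unpacks the tuple. The variable names (raw, validated_dict, extra_data, fields_set) are illustrative only; the schema is rebuilt inline so the snippet runs on its own.

```python
from pydantic_core import SchemaValidator, core_schema

fields_schema = core_schema.model_fields_schema(
    fields={
        'name': core_schema.model_field(core_schema.str_schema(), validation_alias='username'),
        'age': core_schema.model_field(core_schema.int_schema()),
    }
)

# Validating against the fields schema yields a tuple, not a class instance.
raw = SchemaValidator(fields_schema).validate_python({'username': 'Alice', 'age': '30'})
validated_dict, extra_data, fields_set = raw

print(validated_dict)  # field values keyed by field name, with 'age' coerced to int
print(extra_data)      # None here, because extra keys are ignored by default
print(fields_set)      # names of the fields that were present in the input
```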

Step 4: Bind the Class and Fields with ModelSchema

To get a Python class instance instead of a raw tuple, you must wrap the fields_schema in a ModelSchema. This "bridge" tells pydantic-core which class to instantiate.

user_schema = core_schema.model_schema(
    cls=User,
    schema=fields_schema,
)

The model_schema function takes your User class and the fields_schema you just created. By default it creates the instance without calling __init__ and sets __dict__, __pydantic_fields_set__, and __pydantic_extra__ on it directly from the validated data.
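
The following sketch probes how the instance is populated. Probe and init_calls are hypothetical names introduced only for this illustration. It assumes the default model_schema settings (no custom_init), under which validation is expected to bypass __init__ and write the attributes directly.

```python
from pydantic_core import SchemaValidator, core_schema

init_calls = []  # records every call to Probe.__init__


class Probe:
    __slots__ = '__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__'

    def __init__(self, **kwargs):
        init_calls.append(kwargs)


schema = core_schema.model_schema(
    cls=Probe,
    schema=core_schema.model_fields_schema(
        fields={'age': core_schema.model_field(core_schema.int_schema())}
    ),
)
probe = SchemaValidator(schema).validate_python({'age': 1})

print(type(probe).__name__)           # the bound class
print(probe.__dict__)                 # validated field values
print(probe.__pydantic_fields_set__)  # fields present in the input
print(init_calls)                     # stays empty if __init__ was bypassed
```

This is why the custom __init__ in Step 1 is optional: it only matters when you construct User objects by hand.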

Step 5: Validate Data with SchemaValidator

Finally, use the SchemaValidator to run your schema against real data.

from pydantic_core import SchemaValidator

# Initialize the validator with our model schema
v = SchemaValidator(user_schema)

# Validate a dictionary
# Note that we use 'username' because of the validation_alias defined in Step 2
input_data = {'username': 'Alice', 'age': '30'}
user = v.validate_python(input_data)

print(user)
# Output: User(name='Alice', age=30)

print(type(user))
# Output: <class '__main__.User'>

print(user.__pydantic_fields_set__)
# Output: {'name', 'age'}  (set ordering may vary)
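
Validation can also fail, so it helps to see what invalid input looks like. The sketch below rebuilds the same schema so it runs standalone, feeds it an age that cannot be parsed as an integer, and inspects the resulting ValidationError. It also shows validate_json, which accepts a JSON string instead of a dictionary.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema


class User:
    __slots__ = '__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__'


user_schema = core_schema.model_schema(
    cls=User,
    schema=core_schema.model_fields_schema(
        fields={
            'name': core_schema.model_field(core_schema.str_schema(), validation_alias='username'),
            'age': core_schema.model_field(core_schema.int_schema()),
        }
    ),
)
v = SchemaValidator(user_schema)

# Invalid input raises ValidationError; errors() returns a list of dicts.
try:
    v.validate_python({'username': 'Alice', 'age': 'not a number'})
    errors = []
except ValidationError as exc:
    errors = exc.errors()

for err in errors:
    print(err['loc'], err['type'], err['msg'])

# The same validator can parse and validate a JSON document in one step.
json_user = v.validate_json('{"username": "Alice", "age": 30}')
print(json_user.name, json_user.age)
```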

Complete Working Example

Here is the full code combining all the steps above:

from pydantic_core import SchemaValidator, core_schema

class User:
    __slots__ = '__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__'

    def __repr__(self):
        return f"User(name={getattr(self, 'name', 'N/A')!r}, age={getattr(self, 'age', 'N/A')!r})"

# 1. Define the fields
fields = {
    'name': core_schema.model_field(core_schema.str_schema(), validation_alias='username'),
    'age': core_schema.model_field(core_schema.int_schema()),
}

# 2. Create the fields schema
fields_schema = core_schema.model_fields_schema(fields=fields)

# 3. Create the model schema binding the class
user_schema = core_schema.model_schema(cls=User, schema=fields_schema)

# 4. Validate
v = SchemaValidator(user_schema)
user = v.validate_python({'username': 'Bob', 'age': 25})

assert isinstance(user, User)
assert user.name == 'Bob'
assert user.age == 25
print("Validation successful:", user)

Next Steps

Now that you have a basic model, you can explore:

  • Extra Behavior: Use the extra_behavior argument in model_fields_schema to allow or forbid extra fields (e.g., extra_behavior='allow').
  • Computed Fields: Add computed_fields to your ModelFieldsSchema to include properties that are calculated during serialization.
  • Strict Mode: Set strict=True in your CoreConfig or individual schemas to enforce exact type matching without coercion.
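
As a starting point for two of these options, here is a brief sketch of extra_behavior='allow' and strict=True. It validates against the fields schema directly to keep the output easy to inspect. The 'nickname' key and the variable names are made up for illustration.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# extra_behavior='allow': unknown keys are kept in the second tuple element.
allow_schema = core_schema.model_fields_schema(
    fields={'age': core_schema.model_field(core_schema.int_schema())},
    extra_behavior='allow',
)
_, extra, _ = SchemaValidator(allow_schema).validate_python({'age': 30, 'nickname': 'Al'})
print(extra)

# strict=True on an individual schema: the string '30' is no longer coerced.
strict_schema = core_schema.model_fields_schema(
    fields={'age': core_schema.model_field(core_schema.int_schema(strict=True))}
)
try:
    SchemaValidator(strict_schema).validate_python({'age': '30'})
    strict_errors = []
except ValidationError as exc:
    strict_errors = [err['type'] for err in exc.errors()]
print(strict_errors)
```

When the fields schema is wrapped in a model_schema, the extra values end up on the instance rather than in a tuple.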