Recursive Schemas and Definitions
In pydantic-core, recursive structures such as trees, and schemas that are reused in several places, are expressed through a system of definitions and references. This approach allows a schema to be defined once and referenced multiple times, or even to reference itself, without creating infinite loops during schema construction.
The core of this system consists of two primary schema types:
DefinitionsSchema: A container that holds a main schema and a list of shared definitions.
DefinitionReferenceSchema: A pointer that refers to one of the schemas in the definitions list.
The Definitions Container
The DefinitionsSchema acts as the root for any schema requiring shared or recursive components. It is typically created using the definitions_schema() helper function found in pydantic_core.core_schema.
from pydantic_core import core_schema

schema = core_schema.definitions_schema(
    # The main entry point schema
    schema=core_schema.definition_reference_schema('my-shared-int'),
    # A list of schemas that can be referenced
    definitions=[
        core_schema.int_schema(ref='my-shared-int')
    ],
)
In this structure:
schema: The "entry point" validator.
definitions: A list of CoreSchema objects. Each schema in this list that needs to be referenced must have a ref string.
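As a minimal sketch of how the container behaves once built, the example below passes a definitions schema to SchemaValidator and references the same shared definition from inside a list. The variable names and sample inputs are illustrative assumptions, not taken from the library's documentation.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# Entry point is a bare reference to the shared int schema.
single = SchemaValidator(
    core_schema.definitions_schema(
        schema=core_schema.definition_reference_schema('my-shared-int'),
        definitions=[core_schema.int_schema(ref='my-shared-int')],
    )
)
single_result = single.validate_python(123)

# The same definition can also be referenced from inside another schema.
many = SchemaValidator(
    core_schema.definitions_schema(
        schema=core_schema.list_schema(
            core_schema.definition_reference_schema('my-shared-int')
        ),
        definitions=[core_schema.int_schema(ref='my-shared-int')],
    )
)
many_result = many.validate_python([1, 2, 3])

# Input that the referenced schema rejects fails validation as usual.
try:
    single.validate_python('not an int')
    rejected = False
except ValidationError:
    rejected = True
```

The reference itself carries no validation logic; every check is performed by the schema it points to.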
Referencing Definitions
To use a schema from the definitions list, you use a DefinitionReferenceSchema. This is created via definition_reference_schema(schema_ref=...). The schema_ref must match the ref attribute of a schema within the DefinitionsSchema container.
If a schema_ref is provided that does not exist in the definitions list, pydantic-core will raise a SchemaError during validator construction (e.g., Definitions error: definition '...' was never filled).
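The failure mode described above can be sketched as follows. The name 'missing-ref' is a hypothetical reference chosen for illustration, and the exact wording of the error message is not asserted here because it may differ between versions.

```python
from pydantic_core import SchemaError, SchemaValidator, core_schema

# 'missing-ref' is a hypothetical name that no definition in the list provides.
broken_schema = core_schema.definitions_schema(
    schema=core_schema.definition_reference_schema('missing-ref'),
    definitions=[core_schema.int_schema(ref='my-shared-int')],
)

try:
    SchemaValidator(broken_schema)
    build_failed = False
    error_message = ''
except SchemaError as exc:
    # The error is raised while the validator is built, before any data is seen.
    build_failed = True
    error_message = str(exc)
```

Because the check happens at construction time, a dangling reference surfaces as soon as the validator is created rather than on the first call to validate_python.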
Recursive Data Structures
Recursion is achieved by having a schema in the definitions list reference itself via a definition_reference_schema.
A common example is a tree-like structure in which each branch can contain an optional sub-branch of the same type. This is demonstrated in pydantic-core/tests/validators/test_definitions_recursive.py:
from pydantic_core import SchemaValidator, core_schema

v = SchemaValidator(
    core_schema.definitions_schema(
        core_schema.definition_reference_schema('Branch'),
        [
            core_schema.typed_dict_schema(
                {
                    'name': core_schema.typed_dict_field(core_schema.str_schema()),
                    'sub_branch': core_schema.typed_dict_field(
                        core_schema.with_default_schema(
                            core_schema.nullable_schema(
                                core_schema.definition_reference_schema('Branch')
                            ),
                            default=None,
                        )
                    ),
                },
                ref='Branch',
            )
        ],
    )
)

# Validating a nested structure
data = {'name': 'root', 'sub_branch': {'name': 'child'}}
assert v.validate_python(data) == {
    'name': 'root',
    'sub_branch': {'name': 'child', 'sub_branch': None}
}
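The same technique extends to nodes that hold a list of children rather than a single optional branch. The following is a sketch under stated assumptions: the 'Node' reference, the 'children' field, and the use of default_factory=list are illustrative choices, not code taken from the test suite.

```python
from pydantic_core import SchemaValidator, core_schema

# Hypothetical 'Node' definition: each node has a name and a list of child nodes.
node_schema = core_schema.typed_dict_schema(
    {
        'name': core_schema.typed_dict_field(core_schema.str_schema()),
        'children': core_schema.typed_dict_field(
            core_schema.with_default_schema(
                core_schema.list_schema(
                    core_schema.definition_reference_schema('Node')
                ),
                # A fresh empty list is used whenever 'children' is omitted.
                default_factory=list,
            )
        ),
    },
    ref='Node',
)

tree_validator = SchemaValidator(
    core_schema.definitions_schema(
        core_schema.definition_reference_schema('Node'),
        [node_schema],
    )
)

tree = tree_validator.validate_python(
    {
        'name': 'root',
        'children': [
            {'name': 'a'},
            {'name': 'b', 'children': [{'name': 'c'}]},
        ],
    }
)
```

Here an empty list plays the role that None plays in the sub_branch example: it marks a leaf and ends the recursion.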
Mutual Recursion
The system also supports mutual recursion, where two or more definitions reference each other. For example, a Foo definition might contain a Bar, which in turn contains a reference back to Foo.
from pydantic_core import core_schema

# Foo requires a Bar; Bar holds either another Foo or None.
mutual_schema = core_schema.definitions_schema(
    core_schema.definition_reference_schema('Foo'),
    [
        core_schema.typed_dict_schema(
            {
                'height': core_schema.typed_dict_field(core_schema.int_schema()),
                'bar': core_schema.typed_dict_field(core_schema.definition_reference_schema('Bar')),
            },
            ref='Foo',
        ),
        core_schema.typed_dict_schema(
            {
                'width': core_schema.typed_dict_field(core_schema.int_schema()),
                'foo': core_schema.typed_dict_field(
                    core_schema.nullable_schema(core_schema.definition_reference_schema('Foo'))
                ),
            },
            ref='Bar',
        ),
    ],
)
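To illustrate how such a schema behaves at validation time, the sketch below rebuilds the Foo/Bar definitions and validates a chain that alternates between the two. The sample data and variable names are assumptions made for illustration. Note that 'foo' is nullable but has no default, so it must be present (possibly as None) at every level.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

foo_bar_validator = SchemaValidator(
    core_schema.definitions_schema(
        core_schema.definition_reference_schema('Foo'),
        [
            core_schema.typed_dict_schema(
                {
                    'height': core_schema.typed_dict_field(core_schema.int_schema()),
                    'bar': core_schema.typed_dict_field(
                        core_schema.definition_reference_schema('Bar')
                    ),
                },
                ref='Foo',
            ),
            core_schema.typed_dict_schema(
                {
                    'width': core_schema.typed_dict_field(core_schema.int_schema()),
                    'foo': core_schema.typed_dict_field(
                        core_schema.nullable_schema(
                            core_schema.definition_reference_schema('Foo')
                        )
                    ),
                },
                ref='Bar',
            ),
        ],
    )
)

# Foo -> Bar -> Foo -> Bar, ended by an explicit None.
chain = {
    'height': 1,
    'bar': {'width': 2, 'foo': {'height': 3, 'bar': {'width': 4, 'foo': None}}},
}
validated_chain = foo_bar_validator.validate_python(chain)

# Omitting 'foo' is an error because the field is required, only nullable.
try:
    foo_bar_validator.validate_python({'height': 1, 'bar': {'width': 2}})
    missing_errors = []
except ValidationError as exc:
    missing_errors = exc.errors()
```

Wrapping the nullable schema in with_default_schema(..., default=None), as in the Branch example, would make the trailing None optional.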
Runtime Recursion Detection
While the schema itself can be recursive, actual data provided at runtime might contain cycles (e.g., a list that contains itself). pydantic-core includes built-in protection against these infinite loops.
When a cyclic reference is detected in the input data, the validator raises a ValidationError with a recursion_loop error type.
from pydantic_core import SchemaValidator, core_schema, ValidationError

v = SchemaValidator(
    core_schema.definitions_schema(
        core_schema.definition_reference_schema('the-list'),
        [core_schema.list_schema(core_schema.definition_reference_schema('the-list'), ref='the-list')],
    )
)

data = []
data.append(data)  # Create a cyclic reference

try:
    v.validate_python(data)
except ValidationError as exc:
    # The error will indicate a 'recursion_loop'
    assert exc.errors()[0]['type'] == 'recursion_loop'
Termination Patterns
A recursive schema must give the data a way to stop nesting; otherwise no finite input could ever satisfy it. In this codebase, this is typically handled in two ways:
nullable_schema: Allowing a field to be None to signal the end of the branch.
with_default_schema: Providing a default value (like None or an empty list) so the field is not required in every nested level.
These patterns let finite data satisfy a schema that, in principle, allows unbounded depth.
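The difference a termination pattern makes can be sketched by comparing two variants of the same recursive definition. The helper linked_node, the 'Node' reference, and the field names are hypothetical and exist only for this illustration.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema


def linked_node(next_schema):
    """Build a hypothetical 'Node' typed dict whose 'next' field uses next_schema."""
    return core_schema.definitions_schema(
        core_schema.definition_reference_schema('Node'),
        [
            core_schema.typed_dict_schema(
                {
                    'value': core_schema.typed_dict_field(core_schema.int_schema()),
                    'next': core_schema.typed_dict_field(next_schema),
                },
                ref='Node',
            )
        ],
    )


node_ref = core_schema.definition_reference_schema('Node')

# No termination: 'next' is required and must always be another Node.
endless = SchemaValidator(linked_node(node_ref))

# Termination: 'next' may be None and defaults to None when omitted.
terminating = SchemaValidator(
    linked_node(
        core_schema.with_default_schema(
            core_schema.nullable_schema(node_ref), default=None
        )
    )
)

data = {'value': 1, 'next': {'value': 2}}

# The innermost node has no 'next', so the endless variant reports it as missing.
try:
    endless.validate_python(data)
    endless_errors = []
except ValidationError as exc:
    endless_errors = exc.errors()

validated = terminating.validate_python(data)
```

With the endless variant, every finite input eventually reaches a node whose 'next' is absent, so validation always fails; the terminating variant fills in None at that point instead.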