Recursive and Referenced Schemas
In pydantic-core, complex schema topologies, such as logic shared across multiple fields or self-referential data structures, are handled through a reference-based system. This system decouples the definition of a schema from its use, so a schema can be defined once and referenced from several places, and recursive models like trees or linked lists can be represented.
The implementation relies on two primary components in pydantic_core.core_schema: the DefinitionsSchema container and the DefinitionReferenceSchema pointer.
The Definitions Container
The DefinitionsSchema acts as a registry for schemas that need to be reused or referenced. It is a TypedDict that wraps a main schema and provides a list of auxiliary definitions.
class DefinitionsSchema(TypedDict, total=False):
    type: Required[Literal['definitions']]
    schema: Required[CoreSchema]
    definitions: Required[list[CoreSchema]]
    metadata: dict[str, Any]
    serialization: SerSchema
When the SchemaValidator or SchemaSerializer is initialized with a DefinitionsSchema, it processes the definitions list first, making each schema with a ref attribute available for lookup by other parts of the schema.
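The lookup step can be pictured with a small standard-library model. This is a sketch under simplifying assumptions, not pydantic-core's implementation: schemas are treated as plain dicts, and `build_registry` and `resolve` are hypothetical helper names introduced only for illustration.

```python
from typing import Any

Schema = dict[str, Any]


def build_registry(definitions: list[Schema]) -> dict[str, Schema]:
    """Index every definition that carries a 'ref' by that ref (hypothetical helper)."""
    return {d['ref']: d for d in definitions if 'ref' in d}


def resolve(schema: Schema, registry: dict[str, Schema]) -> Schema:
    """Follow a 'definition-ref' pointer to its target; return other schemas unchanged."""
    if schema.get('type') == 'definition-ref':
        try:
            return registry[schema['schema_ref']]
        except KeyError:
            raise LookupError(f"unknown schema_ref {schema['schema_ref']!r}") from None
    return schema


definitions_schema: Schema = {
    'type': 'definitions',
    'schema': {'type': 'definition-ref', 'schema_ref': 'my-int'},
    'definitions': [{'type': 'int', 'ref': 'my-int'}],
}

# The definitions list is indexed first, then the main schema is resolved against it
registry = build_registry(definitions_schema['definitions'])
target = resolve(definitions_schema['schema'], registry)
```

In this toy model the pointer resolves to the `int` definition, and a `schema_ref` with no matching `ref` fails at lookup time rather than during validation.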
Referencing Schemas
To use a schema defined in the definitions list, you use a DefinitionReferenceSchema. This schema does not contain the validation logic itself; instead, it contains a schema_ref string that matches the ref attribute of a schema in the registry.
class DefinitionReferenceSchema(TypedDict, total=False):
    type: Required[Literal['definition-ref']]
    schema_ref: Required[str]
    ref: str
    metadata: dict[str, Any]
    serialization: SerSchema
Shared Definitions
One common use case is sharing a complex validator across multiple fields to reduce the size of the generated schema. Instead of duplicating the validator, you define it once in the definitions and reference it multiple times.
from pydantic_core import SchemaValidator, core_schema

# Define a shared integer validator with a specific reference
shared_int = core_schema.int_schema(ref='my-shared-int')

schema = core_schema.definitions_schema(
    schema=core_schema.typed_dict_schema({
        'a': core_schema.typed_dict_field(core_schema.definition_reference_schema('my-shared-int')),
        'b': core_schema.typed_dict_field(core_schema.definition_reference_schema('my-shared-int')),
    }),
    definitions=[shared_int],
)

v = SchemaValidator(schema)
assert v.validate_python({'a': 1, 'b': '2'}) == {'a': 1, 'b': 2}
Recursive Schemas
Recursion is the primary driver for the reference system. A nested dictionary literal cannot contain itself, so a schema that embedded its own definition would have to be infinitely deep. Instead, pydantic-core uses schema_ref to let a schema point back to itself or to a parent structure by name.
A classic example is a "Branch" or "Tree" structure where each node can contain other nodes of the same type.
from pydantic_core import SchemaValidator, core_schema

v = SchemaValidator(
    core_schema.definitions_schema(
        # The entry point is a reference to the 'Branch' definition
        core_schema.definition_reference_schema('Branch'),
        [
            core_schema.typed_dict_schema(
                {
                    'name': core_schema.typed_dict_field(core_schema.str_schema()),
                    'sub_branch': core_schema.typed_dict_field(
                        core_schema.with_default_schema(
                            core_schema.nullable_schema(
                                # Recursive reference back to 'Branch'
                                core_schema.definition_reference_schema('Branch')
                            ),
                            default=None,
                        )
                    ),
                },
                ref='Branch',
            )
        ],
    )
)

# Validating nested data
data = {'name': 'root', 'sub_branch': {'name': 'child', 'sub_branch': None}}
assert v.validate_python(data) == data
In this example, the typed-dict schema is assigned ref='Branch'. Inside its own fields, it uses a definition-ref pointing back to 'Branch', so data nested to any finite depth can be validated.
Constraints and Safety
Reference Resolution
Every schema_ref used in a DefinitionReferenceSchema must correspond to a ref defined within the same DefinitionsSchema (or a parent DefinitionsSchema if nested). If a reference cannot be resolved during initialization, pydantic-core raises a SchemaError indicating that the definition was never filled.
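A minimal sketch of this failure mode follows. The ref names 'Missing' and 'Present' are made up for illustration, and the example only checks that a SchemaError is raised; it makes no assumption about the exact wording of the message.

```python
from pydantic_core import SchemaError, SchemaValidator, core_schema

# 'Missing' is never defined, so the reference cannot be resolved
dangling = core_schema.definitions_schema(
    schema=core_schema.definition_reference_schema('Missing'),
    definitions=[core_schema.int_schema(ref='Present')],
)

try:
    SchemaValidator(dangling)
except SchemaError as exc:
    # The failure happens while building the validator, before any data is validated
    error_message = str(exc)
else:
    error_message = None
```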
Cyclic Data Detection
While the schema itself can be recursive, the input data must not contain cycles. If pydantic-core encounters the same object instance again while it is still validating that object within a recursive schema, it raises a ValidationError with the type recursion_loop.
As seen in pydantic-core/tests/validators/test_definitions_recursive.py:
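import pytest
from pydantic_core import ValidationError

# `v` is assumed to be the validator built in that test, whose recursive
# field is named 'branch' (not 'sub_branch' as in the example above)
b = {'name': 'recursive'}
b['branch'] = b  # Create a circular reference in data
with pytest.raises(ValidationError) as exc_info:
    v.validate_python(b)
assert exc_info.value.errors()[0]['type'] == 'recursion_loop'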
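The idea behind this guard can be illustrated without pydantic-core. The following is a simplified standard-library sketch, not the library's actual implementation: it assumes cycles are detected by tracking the `id()` of the objects currently being visited, and `RecursionLoop` and `walk` are hypothetical names.

```python
from typing import Optional


class RecursionLoop(Exception):
    """Raised when an object is revisited while it is still being processed."""


def walk(node: dict, in_progress: Optional[set] = None) -> int:
    """Count the nodes in a chain of 'branch' links, refusing cyclic input."""
    in_progress = set() if in_progress is None else in_progress
    if id(node) in in_progress:
        raise RecursionLoop('cyclic reference detected')
    in_progress.add(id(node))
    try:
        child = node.get('branch')
        return 1 + (walk(child, in_progress) if child is not None else 0)
    finally:
        # Remove the marker on the way out so shared (non-cyclic) objects are allowed
        in_progress.discard(id(node))


finite = {'name': 'root', 'branch': {'name': 'leaf', 'branch': None}}
depth = walk(finite)

cyclic = {'name': 'recursive'}
cyclic['branch'] = cyclic  # Same construction as the test data above
try:
    walk(cyclic)
    loop_detected = False
except RecursionLoop:
    loop_detected = True
```

Tracking only the objects that are currently in progress, rather than every object ever seen, means that the same object appearing twice in separate branches is not reported as a loop.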
Serialization
References are also respected during serialization. The DefinitionReferenceSchema can include its own serialization configuration, which allows you to customize how a referenced object is serialized specifically when accessed through that reference, even if the base definition has different serialization logic. This is useful for controlling the depth or format of recursive structures during export.
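As a minimal sketch, the recursive 'Branch' schema from earlier can be passed to SchemaSerializer in the same way it is passed to SchemaValidator. This example only shows that references are resolved during serialization; it does not exercise the per-reference serialization option, and the default wrapper on sub_branch is omitted here for brevity.

```python
from pydantic_core import SchemaSerializer, core_schema

s = SchemaSerializer(
    core_schema.definitions_schema(
        core_schema.definition_reference_schema('Branch'),
        [
            core_schema.typed_dict_schema(
                {
                    'name': core_schema.typed_dict_field(core_schema.str_schema()),
                    'sub_branch': core_schema.typed_dict_field(
                        # Recursive reference back to 'Branch'
                        core_schema.nullable_schema(core_schema.definition_reference_schema('Branch'))
                    ),
                },
                ref='Branch',
            )
        ],
    )
)

data = {'name': 'root', 'sub_branch': {'name': 'child', 'sub_branch': None}}
as_python = s.to_python(data)  # nested dicts, serialized level by level
as_json = s.to_json(data)      # JSON returned as bytes
```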