Core Schema Architecture

The Core Schema architecture in pydantic-core provides a low-level, high-performance definition language for data validation and serialization. Instead of working directly with Python classes, the engine operates on a CoreSchema: a tree of TypedDict objects that describes exactly how data should be validated and serialized.
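To make the "tree of TypedDict objects" idea concrete, here is a minimal sketch, assuming pydantic_core is installed. The variable names are made up for illustration. It suggests that the helper functions in core_schema return ordinary dictionaries that nest inside one another, and that a validator is built from that dictionary tree.

```python
from pydantic_core import SchemaValidator, core_schema

# Helper functions return plain dicts; options left unset are omitted.
int_node = core_schema.int_schema()
list_node = core_schema.list_schema(int_node)

# The list node simply embeds the int node under 'items_schema'.
assert int_node['type'] == 'int'
assert list_node['type'] == 'list'
assert list_node['items_schema'] == int_node

# The validator is built from the dict tree, not from a Python class.
validator = SchemaValidator(list_node)
result = validator.validate_python(['1', 2])  # lax mode coerces '1' to 1
```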

Permissive Validation with AnySchema

The most basic building block in the schema architecture is AnySchema. It represents a "pass-through" validation where any input is considered valid. This is frequently used as a fallback for collection items or when a field's type is truly dynamic.

In pydantic_core/core_schema.py, AnySchema is defined as:

class AnySchema(TypedDict, total=False):
    type: Required[Literal['any']]
    ref: str
    metadata: dict[str, Any]
    serialization: SerSchema

Although AnySchema is permissive, it still accepts the standard optional keys: ref, metadata for custom extensions, and serialization for defining how the "any" value should be converted back to a serializable format.
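The pass-through behaviour can be sketched as follows, assuming pydantic_core is installed. The sample values and variable names are illustrative only. The first validator accepts every value unchanged, and the second uses AnySchema as the item schema of a list, which is the fallback use mentioned above.

```python
from pydantic_core import SchemaValidator, core_schema

any_validator = SchemaValidator(core_schema.any_schema())

# Values of unrelated types all pass validation and come back unchanged.
samples = [1, 'text', None, [1, {'k': 2}], object()]
passed = [any_validator.validate_python(value) == value for value in samples]

# AnySchema as a fallback for collection items: the list itself is
# validated, but its items are not constrained.
loose_list = SchemaValidator(core_schema.list_schema(core_schema.any_schema()))
mixed = loose_list.validate_python([1, 'two', None])
```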

Definitions and References

For complex data structures, especially those involving shared components or recursion, pydantic-core uses a registry-based approach. This prevents infinite recursion during the schema construction phase and allows multiple parts of a schema to point to the same validation logic.

The Registry: DefinitionsSchema

The DefinitionsSchema acts as a container. It holds a primary "entry point" schema and a list of "definitions" that can be referenced by name.

class DefinitionsSchema(TypedDict, total=False):
    type: Required[Literal['definitions']]
    schema: Required[CoreSchema]
    definitions: Required[list[CoreSchema]]
    metadata: dict[str, Any]
    serialization: SerSchema

The Pointer: DefinitionReferenceSchema

To use a schema defined in the definitions list, you use a DefinitionReferenceSchema. This schema doesn't contain validation logic itself; instead, its schema_ref field points to the ref attribute of a schema inside the DefinitionsSchema.

class DefinitionReferenceSchema(TypedDict, total=False):
    type: Required[Literal['definition-ref']]
    schema_ref: Required[str]
    ref: str
    metadata: dict[str, Any]
    serialization: SerSchema

Example: Shared Definitions

In this example, a list of integers is defined where the integer validation logic is shared via a reference:

from pydantic_core import SchemaValidator, core_schema

schema = core_schema.definitions_schema(
    # The entry point: a list of references to 'foobar'
    core_schema.list_schema(core_schema.definition_reference_schema('foobar')),
    # The registry: defines what 'foobar' is
    [core_schema.int_schema(ref='foobar')],
)

v = SchemaValidator(schema)
assert v.validate_python([1, 2, '3']) == [1, 2, 3]
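Because the helpers only build dictionaries, the same shared-definition schema can also be written by hand. The sketch below assumes pydantic_core is installed, and the names helper_schema and raw_schema are illustrative. It compares the two forms and shows that an invalid item is reported at its list index, with the integer error coming from the referenced definition.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

helper_schema = core_schema.definitions_schema(
    core_schema.list_schema(core_schema.definition_reference_schema('foobar')),
    [core_schema.int_schema(ref='foobar')],
)

# Hand-written equivalent: schema_ref in the pointer matches ref in the registry.
raw_schema = {
    'type': 'definitions',
    'schema': {
        'type': 'list',
        'items_schema': {'type': 'definition-ref', 'schema_ref': 'foobar'},
    },
    'definitions': [{'type': 'int', 'ref': 'foobar'}],
}

validator = SchemaValidator(raw_schema)

errors = []
try:
    validator.validate_python([1, 'not a number'])
except ValidationError as exc:
    # Each error records the location of the offending item.
    errors = exc.errors()
```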

Implementing Recursive Structures

Recursion is a primary use case for the definitions/references pattern. Without this separation, a recursive model (like a tree) would require an infinitely nested dictionary to define.

By using DefinitionReferenceSchema, the schema can refer to itself by name. The following example from pydantic-core/tests/validators/test_definitions_recursive.py demonstrates a recursive "Branch" structure:

from pydantic_core import SchemaValidator, core_schema

v = SchemaValidator(
    core_schema.definitions_schema(
        # Entry point: a reference to the 'Branch' definition
        core_schema.definition_reference_schema('Branch'),
        [
            core_schema.typed_dict_schema(
                {
                    'name': core_schema.typed_dict_field(core_schema.str_schema()),
                    'sub_branch': core_schema.typed_dict_field(
                        core_schema.with_default_schema(
                            core_schema.nullable_schema(
                                # Recursion: sub_branch is also a 'Branch'
                                core_schema.definition_reference_schema('Branch')
                            ),
                            default=None,
                        )
                    ),
                },
                ref='Branch',  # The name that the 'Branch' references resolve to
            )
        ],
    )
)

# Validating a nested structure
assert v.validate_python({'name': 'root', 'sub_branch': {'name': 'b1'}}) == (
    {'name': 'root', 'sub_branch': {'name': 'b1', 'sub_branch': None}}
)

Schema Sentinels and Safety

InvalidSchema

The InvalidSchema class serves as a sentinel within the CoreSchema union. It is used to explicitly mark a schema as broken or unconstructible.

class InvalidSchema(TypedDict, total=False):
    type: Required[Literal['invalid']]
    ref: str
    metadata: dict[str, Any]
    serialization: SerSchema

Attempting to instantiate a SchemaValidator with an InvalidSchema will result in a SchemaError. This is primarily used internally during schema generation when a valid schema cannot be produced for a given Python type.
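A minimal sketch of this failure mode, assuming pydantic_core is installed. The schema is written as a plain dictionary matching the TypedDict above, and the variable names are illustrative. The exact error message is not shown because it may vary between versions.

```python
from pydantic_core import SchemaError, SchemaValidator

# Plain-dict form of an InvalidSchema node.
invalid = {'type': 'invalid'}

build_error = None
try:
    SchemaValidator(invalid)
except SchemaError as exc:
    # The validator cannot be constructed from the sentinel schema.
    build_error = exc
```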

Runtime Recursion Guards

While DefinitionsSchema handles recursion at the schema level, pydantic-core also protects against recursion at the data level. If the input itself is cyclic (e.g., a list that contains itself), the engine's internal recursion guard detects the loop and validation fails with an error of type recursion_loop, even though the recursive schema is correctly defined. This ensures that validation terminates instead of following the cycle forever.
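The data-level guard can be illustrated with a self-referencing dictionary. This is a sketch under stated assumptions: pydantic_core is installed, the recursive "Branch" schema from earlier is rebuilt here so the block is self-contained, and the names branch_validator and cyclic are made up for the example.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

branch_validator = SchemaValidator(
    core_schema.definitions_schema(
        core_schema.definition_reference_schema('Branch'),
        [
            core_schema.typed_dict_schema(
                {
                    'name': core_schema.typed_dict_field(core_schema.str_schema()),
                    'sub_branch': core_schema.typed_dict_field(
                        core_schema.with_default_schema(
                            core_schema.nullable_schema(
                                core_schema.definition_reference_schema('Branch')
                            ),
                            default=None,
                        )
                    ),
                },
                ref='Branch',
            )
        ],
    )
)

# Cyclic input: the dict is its own sub_branch.
cyclic = {'name': 'root'}
cyclic['sub_branch'] = cyclic

error_types = []
try:
    branch_validator.validate_python(cyclic)
except ValidationError as exc:
    # Validation stops at the cycle instead of recursing forever.
    error_types = [error['type'] for error in exc.errors()]
```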