Core Schema Architecture
The Core Schema architecture in pydantic-core provides a low-level, high-performance definition language for data validation and serialization. Instead of working directly with Python classes, the engine operates on a CoreSchema: a tree of dictionaries, typed as TypedDict classes, that describes exactly how data should be processed.
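To make the idea of a schema tree concrete, the following minimal sketch builds a small schema with the core_schema helper functions and compares it with the equivalent hand-written dictionary. It assumes pydantic_core is installed; the variable names (built, by_hand, same_shape) are illustrative and do not come from the library.

```python
from pydantic_core import SchemaValidator, core_schema

# Helper functions return plain dicts that conform to the TypedDict definitions.
built = core_schema.list_schema(core_schema.int_schema())

# The same schema written out by hand as nested dictionaries.
by_hand = {'type': 'list', 'items_schema': {'type': 'int'}}

# True when the helper output and the hand-written tree are identical.
same_shape = built == by_hand

# Either form can be passed to SchemaValidator.
validator = SchemaValidator(by_hand)
result = validator.validate_python(['1', 2])
```

The helpers are mostly a convenience for type checking and discoverability; the validator itself only sees the resulting dictionary tree.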
Permissive Validation with AnySchema
The most basic building block in the schema architecture is AnySchema. It represents a "pass-through" validation where any input is considered valid. This is frequently used as a fallback for collection items or when a field's type is truly dynamic.
In pydantic_core/core_schema.py, AnySchema is defined as:
class AnySchema(TypedDict, total=False):
    type: Required[Literal['any']]
    ref: str
    metadata: dict[str, Any]
    serialization: SerSchema
While it is permissive, it still supports the standard schema features like metadata for custom extensions and serialization rules for defining how the "any" value should be converted back to a serializable format.
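The pass-through behaviour described above can be sketched as follows. This is an illustrative example, assuming pydantic_core is installed; the names marker and outputs are made up for the demonstration.

```python
from pydantic_core import SchemaValidator, core_schema

# any_schema() builds the minimal AnySchema dictionary.
schema = core_schema.any_schema()
validator = SchemaValidator(schema)

# Every input validates, whatever its type; nothing is coerced or rejected.
marker = object()
outputs = [validator.validate_python(value) for value in (1, 'text', None, marker)]
```

Because no coercion takes place, the validated value for an arbitrary Python object is that same object.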
Definitions and References
For complex data structures, especially those involving shared components or recursion, pydantic-core uses a registry-based approach. This prevents infinite recursion during the schema construction phase and allows multiple parts of a schema to point to the same validation logic.
The Registry: DefinitionsSchema
The DefinitionsSchema acts as a container. It holds a primary "entry point" schema and a list of definitions, each of which can be referenced by the name given in its ref field.
class DefinitionsSchema(TypedDict, total=False):
    type: Required[Literal['definitions']]
    schema: Required[CoreSchema]
    definitions: Required[list[CoreSchema]]
    metadata: dict[str, Any]
    serialization: SerSchema
The Pointer: DefinitionReferenceSchema
To use a schema defined in the definitions list, you use a DefinitionReferenceSchema. This schema doesn't contain validation logic itself; instead, its schema_ref field points to the ref attribute of a schema inside the DefinitionsSchema.
class DefinitionReferenceSchema(TypedDict, total=False):
    type: Required[Literal['definition-ref']]
    schema_ref: Required[str]
    ref: str
    metadata: dict[str, Any]
    serialization: SerSchema
Example: Shared Definitions
In this example, a list of integers is defined where the integer validation logic is shared via a reference:
from pydantic_core import SchemaValidator, core_schema

schema = core_schema.definitions_schema(
    # The entry point: a list of references to 'foobar'
    core_schema.list_schema(core_schema.definition_reference_schema('foobar')),
    # The registry: defines what 'foobar' is
    [core_schema.int_schema(ref='foobar')],
)

v = SchemaValidator(schema)
assert v.validate_python([1, 2, '3']) == [1, 2, 3]
Implementing Recursive Structures
Recursion is a primary use case for the definitions/references pattern. Without this separation, a recursive model (like a tree) would require an infinitely nested dictionary to define.
By using DefinitionReferenceSchema, the schema can refer to itself by name. The following example from pydantic-core/tests/validators/test_definitions_recursive.py demonstrates a recursive "Branch" structure:
from pydantic_core import SchemaValidator, core_schema

v = SchemaValidator(
    core_schema.definitions_schema(
        # Entry point: a reference to the 'Branch' definition
        core_schema.definition_reference_schema('Branch'),
        [
            core_schema.typed_dict_schema(
                {
                    'name': core_schema.typed_dict_field(core_schema.str_schema()),
                    'sub_branch': core_schema.typed_dict_field(
                        core_schema.with_default_schema(
                            core_schema.nullable_schema(
                                # Recursion: sub_branch is also a 'Branch'
                                core_schema.definition_reference_schema('Branch')
                            ),
                            default=None,
                        )
                    ),
                },
                ref='Branch',  # The name that references to 'Branch' resolve to
            )
        ],
    )
)

# Validating a nested structure; the missing sub_branch is filled with the default
assert v.validate_python({'name': 'root', 'sub_branch': {'name': 'b1'}}) == (
    {'name': 'root', 'sub_branch': {'name': 'b1', 'sub_branch': None}}
)
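As a further illustration of how the single 'Branch' definition applies at every depth, the sketch below rebuilds the same schema, validates a three-level structure, and then inspects where an error is reported when a nested name is not a string. It assumes pydantic_core is installed; branch, deep, and errors are names chosen for this example only.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# The recursive 'Branch' definition, same shape as in the example above.
branch = core_schema.typed_dict_schema(
    {
        'name': core_schema.typed_dict_field(core_schema.str_schema()),
        'sub_branch': core_schema.typed_dict_field(
            core_schema.with_default_schema(
                core_schema.nullable_schema(core_schema.definition_reference_schema('Branch')),
                default=None,
            )
        ),
    },
    ref='Branch',
)
validator = SchemaValidator(
    core_schema.definitions_schema(core_schema.definition_reference_schema('Branch'), [branch])
)

# The one definition is reused for each nesting level.
deep = validator.validate_python(
    {'name': 'root', 'sub_branch': {'name': 'b1', 'sub_branch': {'name': 'b2'}}}
)

# An invalid nested value is reported with its path through the recursion.
try:
    validator.validate_python({'name': 'root', 'sub_branch': {'name': 123}})
except ValidationError as exc:
    errors = exc.errors(include_url=False)
else:
    errors = []
```

The loc entry of each error gives the path from the root of the input to the offending value, which makes failures in deeply nested branches easier to trace.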
Schema Sentinels and Safety
InvalidSchema
The InvalidSchema class serves as a sentinel within the CoreSchema union. It is used to explicitly mark a schema as broken or unconstructible.
class InvalidSchema(TypedDict, total=False):
    type: Required[Literal['invalid']]
    ref: str
    metadata: dict[str, Any]
    serialization: SerSchema
Attempting to instantiate a SchemaValidator with an InvalidSchema will result in a SchemaError. This is primarily used internally during schema generation when a valid schema cannot be produced for a given Python type.
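The failure mode described above can be sketched as follows. The example writes the schema as a raw dictionary and assumes pydantic_core is installed; build_error is a name introduced here for illustration.

```python
from pydantic_core import SchemaError, SchemaValidator

# A schema whose type is 'invalid' cannot be turned into a validator.
invalid = {'type': 'invalid'}

try:
    SchemaValidator(invalid)
except SchemaError as exc:
    # The failure happens at construction time, before any data is validated.
    build_error = exc
else:
    build_error = None
```

Because the error is raised while the validator is being built, a broken schema is caught before any data reaches it.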
Runtime Recursion Guards
While DefinitionsSchema handles recursion at the definition level, pydantic-core also protects against recursion at the data level. If a user provides cyclic data (e.g., a list that contains itself), the engine's internal recursion guard detects the loop and reports an error of type recursion_loop, raised as a ValidationError, even if the schema itself is correctly defined as recursive. This ensures that the validation engine does not loop forever on self-referential input.
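A minimal sketch of the data-level guard, assuming pydantic_core is installed: it reuses a recursive 'Branch' schema and feeds it a dictionary that contains itself. The names cyclic and error_types are illustrative.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# A valid recursive schema: each branch may contain another branch.
branch = core_schema.typed_dict_schema(
    {
        'name': core_schema.typed_dict_field(core_schema.str_schema()),
        'sub_branch': core_schema.typed_dict_field(
            core_schema.with_default_schema(
                core_schema.nullable_schema(core_schema.definition_reference_schema('Branch')),
                default=None,
            )
        ),
    },
    ref='Branch',
)
validator = SchemaValidator(
    core_schema.definitions_schema(core_schema.definition_reference_schema('Branch'), [branch])
)

# Cyclic input: the dictionary is its own sub_branch.
cyclic = {'name': 'root'}
cyclic['sub_branch'] = cyclic

try:
    validator.validate_python(cyclic)
except ValidationError as exc:
    # Collect the error type codes reported for the cyclic input.
    error_types = [error['type'] for error in exc.errors(include_url=False)]
else:
    error_types = []
```

The schema here is well-formed; the error comes entirely from the shape of the input data.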