Validation Pipelines and Chaining

Validation pipelines in this codebase are implemented with ChainSchema, which runs multiple validation steps sequentially. The output of each validator is passed as the input to the next, so a single chain can both transform data and verify it in several stages.

The Chain Schema Structure

The core of this functionality is the ChainSchema defined in pydantic_core/core_schema.py. It is a TypedDict that requires a list of CoreSchema objects:

class ChainSchema(TypedDict, total=False):
    type: Required[Literal['chain']]
    steps: Required[list[CoreSchema]]
    ref: str
    metadata: dict[str, Any]
    serialization: SerSchema

To simplify the creation of these schemas, the codebase provides a chain_schema helper function. This function is the standard way to compose validation steps.
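As a minimal sketch of what the helper produces (assuming pydantic_core is installed), the value returned by chain_schema is a plain dictionary with the same shape as the TypedDict above. The handwritten name below is only illustrative.

```python
from pydantic_core import core_schema as cs

# Build a two-step chain with the helper.
schema = cs.chain_schema([cs.str_schema(), cs.int_schema()])

# The helper returns an ordinary dict. Optional keys that were not passed
# (ref, metadata, serialization) are left out rather than stored as None.
handwritten = {
    'type': 'chain',
    'steps': [{'type': 'str'}, {'type': 'int'}],
}
assert schema == handwritten
```

Using the helper rather than a dict literal keeps the required keys in place and lets a type checker verify the step list.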

Data Flow and Transformation

The primary purpose of a chain is to pipe data through a series of transformations. Each step in the steps list is executed in order. If a step succeeds, its return value is used as the input for the subsequent step.

A common pattern found in pydantic-core/tests/validators/test_chain.py involves chaining a type validator with a transformation function:

from decimal import Decimal
from pydantic_core import SchemaValidator, core_schema as cs

# Chain a string validator with a function that converts the string to a Decimal
validator = SchemaValidator(
    cs.chain_schema(
        steps=[
            cs.str_schema(),
            cs.with_info_plain_validator_function(lambda v, info: Decimal(v)),
        ]
    )
)

# '1.44' (str) -> passes str_schema -> passed to lambda -> returns Decimal('1.44')
assert validator.validate_python('1.44') == Decimal('1.44')
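The same idea extends to more than two steps. The following sketch is not taken from the test suite; it assumes pydantic_core is installed and uses a hypothetical comma-separated input to show each step consuming the previous step's output.

```python
from pydantic_core import SchemaValidator, core_schema as cs

# Three steps, each receiving the previous step's return value:
#   1. validate as str and strip surrounding whitespace
#   2. split the cleaned string on commas (plain function step)
#   3. validate the resulting list, coercing each item to int (lax mode)
csv_ints = SchemaValidator(
    cs.chain_schema(
        [
            cs.str_schema(strip_whitespace=True),
            cs.no_info_plain_validator_function(lambda v: v.split(',')),
            cs.list_schema(cs.int_schema()),
        ]
    )
)

result = csv_ints.validate_python(' 1,2,3 ')
assert result == [1, 2, 3]
```

Reordering the steps would change the outcome: placing the list step first would reject the raw string before it is ever split.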

Internal Validation Pipelines

The Pydantic library uses chain_schema internally to implement complex validation logic, such as enforcing strict type checks before applying laxer validation rules.

In pydantic/_internal/_generate_schema.py, the _deque_schema method uses a chain to implement strict validation for collections.deque. It first checks if the input is an instance of deque before proceeding to validate its contents:

# Simplified excerpt from pydantic/_internal/_generate_schema.py
check_instance = core_schema.json_or_python_schema(
    json_schema=list_schema,
    python_schema=core_schema.is_instance_schema(collections.deque, cls_repr='Deque'),
)

lax_schema = core_schema.no_info_wrap_validator_function(deque_validator, list_schema)

# The strict schema chains the instance check with the actual validator
strict_schema = core_schema.chain_schema([check_instance, lax_schema])

This ensures that in strict mode, the validator does not attempt to process types that are not explicitly deques, even if they could otherwise be converted.
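The gatekeeping effect can be reproduced with pydantic_core alone. This is a simplified sketch, not Pydantic's actual implementation: to_int_deque is a hypothetical stand-in for the real content validator, and the instance check is used directly without json_or_python_schema.

```python
import collections

from pydantic_core import SchemaValidator, ValidationError, core_schema as cs


def to_int_deque(value):
    # Hypothetical content step: rebuild the deque with int items.
    return collections.deque(int(item) for item in value)


strict_like = SchemaValidator(
    cs.chain_schema(
        [
            # Step 1: reject anything that is not already a deque.
            cs.is_instance_schema(collections.deque),
            # Step 2: only reached when step 1 succeeded.
            cs.no_info_plain_validator_function(to_int_deque),
        ]
    )
)

accepted = strict_like.validate_python(collections.deque(['1', 2]))
assert accepted == collections.deque([1, 2])

try:
    strict_like.validate_python(['1', 2])  # a list, not a deque
except ValidationError:
    rejected = True
else:
    rejected = False
assert rejected
```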

Experimental Pipeline API

For higher-level model definitions, the pydantic.experimental.pipeline module provides a more readable "fluent" API that generates ChainSchema under the hood. Methods like validate_as and transform compose these chains:

# Example of high-level pipeline usage from tests/test_pipeline.py
from typing import Annotated

from pydantic import TypeAdapter
from pydantic.experimental.pipeline import validate_as

ta = TypeAdapter[int](Annotated[int, validate_as(str).str_strip().validate_as(int)])

# This creates a chain:
# 1. Validate as string
# 2. Strip whitespace
# 3. Validate/convert result as integer
assert ta.validate_python(' 1 ') == 1
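The example above does not exercise transform, so here is a hedged sketch of how it might be combined with validate_as. It assumes a Pydantic release that ships pydantic.experimental.pipeline and Python 3.9 or later; because the module is experimental, the API may change, and importing it can emit an experimental-feature warning.

```python
from typing import Annotated

from pydantic import TypeAdapter
from pydantic.experimental.pipeline import validate_as

# Pipeline: validate as str -> strip whitespace -> split on commas
# (arbitrary function via transform) -> validate the pieces as list[int].
csv_adapter = TypeAdapter(
    Annotated[
        list[int],
        validate_as(str)
        .str_strip()
        .transform(lambda s: s.split(','))
        .validate_as(list[int]),
    ]
)

parsed = csv_adapter.validate_python(' 1,2,3 ')
assert parsed == [1, 2, 3]
```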

Technical Constraints and Behavior

Error Short-Circuiting

Validation in a chain short-circuits. If any step in the pipeline fails, processing stops immediately and the error from that step is raised. Subsequent steps are never executed.
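A small sketch makes the short-circuit observable, assuming pydantic_core is installed. The record function and calls list are hypothetical helpers used only to track whether the second step ran.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

calls = []


def record(value):
    # Second step: remember every value that reaches it, then double it.
    calls.append(value)
    return value * 2


doubler = SchemaValidator(
    cs.chain_schema(
        [cs.int_schema(), cs.no_info_plain_validator_function(record)]
    )
)

# Valid input flows through both steps: '21' -> 21 -> 42.
assert doubler.validate_python('21') == 42

# Invalid input fails in the first step, so record() is never called.
try:
    doubler.validate_python('not a number')
except ValidationError as exc:
    error_types = [error['type'] for error in exc.errors()]
else:
    error_types = []

assert error_types == ['int_parsing']
assert calls == [21]
```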

Schema Requirements

A ChainSchema must contain at least one step. Attempting to initialize a SchemaValidator with an empty list of steps will result in a SchemaError.
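A quick way to confirm this, assuming pydantic_core is installed, is to build a validator from an empty chain and catch the resulting exception. The helper itself only constructs a dict; the failure happens when SchemaValidator builds the validator.

```python
from pydantic_core import SchemaError, SchemaValidator, core_schema as cs

# chain_schema([]) succeeds because it only builds a dict.
empty_chain = cs.chain_schema([])

try:
    SchemaValidator(empty_chain)
except SchemaError as exc:
    build_error = exc
else:
    build_error = None

assert build_error is not None
```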

Automatic Flattening

For performance optimization, pydantic-core automatically flattens nested chain schemas. If a ChainSchema contains another ChainSchema as one of its steps, they are merged into a single flat list of validation steps during the validator construction phase.
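Flattening is not directly visible from Python, but its effect can be checked indirectly. The sketch below, which assumes pydantic_core is installed, compares a nested chain against an equivalent flat one and expects identical titles and results.

```python
from pydantic_core import SchemaValidator, core_schema as cs

# A chain whose second step is itself a chain.
nested = SchemaValidator(
    cs.chain_schema(
        [
            cs.str_schema(),
            cs.chain_schema([cs.int_schema(), cs.float_schema()]),
        ]
    )
)

# The same three steps written as one flat list.
flat = SchemaValidator(
    cs.chain_schema([cs.str_schema(), cs.int_schema(), cs.float_schema()])
)

# After flattening, both validators describe the same list of steps.
assert nested.title == flat.title

# '7' -> '7' (str) -> 7 (int) -> 7.0 (float) in both cases.
assert nested.validate_python('7') == flat.validate_python('7') == 7.0
```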

Title Generation

The default title for a chain validator is automatically generated from its components, typically following the pattern chain[step1,step2,...]. If the chain contains only a single step, it inherits the title of that specific step.
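The title can be read from the validator's title attribute. A brief sketch, assuming pydantic_core is installed and that the built-in str and int validators report the names str and int:

```python
from pydantic_core import SchemaValidator, core_schema as cs

# Two steps: the title lists each step's name inside chain[...].
two_steps = SchemaValidator(
    cs.chain_schema([cs.str_schema(), cs.int_schema()])
)
assert two_steps.title == 'chain[str,int]'

# One step: the chain collapses to that step, so its title is used directly.
one_step = SchemaValidator(cs.chain_schema([cs.int_schema()]))
assert one_step.title == 'int'
```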