Collections and Sequences
Validation and serialization of collections in pydantic-core are handled by a set of specialized schema types. These schemas define how Python's built-in collection types—such as lists, dictionaries, sets, and tuples—are validated, coerced, and transformed.
Sequences: Lists and Sets
The ListSchema, SetSchema, and FrozenSetSchema classes in pydantic_core.core_schema provide validation for sequence-like collections. They share a common structure for defining item types and length constraints.
Core Configuration
These schemas typically include:
items_schema: A CoreSchema used to validate each element in the collection.
min_length / max_length: Constraints on the number of items.
strict: When True, the validator will not attempt to coerce other iterables into the target type (e.g., it will reject a tuple if a list is expected).
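The length constraints listed above can be sketched as follows. This is a minimal illustration, not taken from the test suite; the variable names (bounded, within_bounds, rejected) are chosen only for this example.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# A list of ints that must contain between 1 and 3 items.
bounded = SchemaValidator(
    cs.list_schema(items_schema=cs.int_schema(), min_length=1, max_length=3)
)

# Each element is validated by items_schema; in lax mode '2' is coerced to 2.
within_bounds = bounded.validate_python([1, '2'])

# Inputs that violate min_length or max_length raise a ValidationError.
rejected = []
for candidate in ([], [1, 2, 3, 4]):
    try:
        bounded.validate_python(candidate)
    except ValidationError:
        rejected.append(candidate)
```

Here the empty list fails min_length and the four-item list fails max_length, so both end up in rejected.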
In pydantic-core/tests/validators/test_list.py, the behavior of strict mode is demonstrated:
from pydantic_core import SchemaValidator, core_schema as cs
# Strict mode prevents coercion from tuples to lists
v = SchemaValidator(cs.list_schema(items_schema=cs.int_schema(), strict=True))
assert v.validate_python([1, 2, '33']) == [1, 2, 33]
# This would raise a ValidationError because the input is a tuple
# v.validate_python((1, 2, 3))
By default, list_schema is permissive and can validate various iterables (like deque, set, or generators) into a list, as seen in the test suite's test_list_int parameterization.
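A short sketch of that default (lax) behavior, assuming a validator built without strict=True. The inputs mirror the kinds of iterables mentioned above; the variable names are illustrative only.

```python
from collections import deque

from pydantic_core import SchemaValidator, core_schema as cs

# No strict flag: the validator runs in lax mode.
lax = SchemaValidator(cs.list_schema(items_schema=cs.int_schema()))

# Each of these non-list iterables is accepted and returned as a list,
# with items coerced by int_schema along the way.
from_tuple = lax.validate_python((1, 2, '3'))
from_deque = lax.validate_python(deque((1, 2, '3')))
from_generator = lax.validate_python(x for x in [1, 2, '3'])
```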
Dictionaries and Mappings
The DictSchema handles validation for mapping types. Unlike sequences, it allows for separate validation of keys and values through keys_schema and values_schema.
from pydantic_core import SchemaValidator, core_schema as cs
v = SchemaValidator(cs.dict_schema(
    keys_schema=cs.str_schema(),
    values_schema=cs.int_schema(),
))
assert v.validate_python({'a': 1, 'b': '2'}) == {'a': 1, 'b': 2}
If keys_schema or values_schema are omitted, they default to AnySchema, allowing any type for that component of the dictionary.
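To illustrate the default, the following sketch omits values_schema so that only keys are constrained. It is an assumption-labeled example rather than code from the repository; the names keys_only, mixed, and bad_key_rejected are hypothetical.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# values_schema is omitted, so values are accepted as-is.
keys_only = SchemaValidator(cs.dict_schema(keys_schema=cs.str_schema()))

# Values of any type pass through unchanged.
mixed = keys_only.validate_python({'a': 1, 'b': 'two', 'c': [3]})

# Keys are still checked against str_schema, so an int key is rejected.
try:
    keys_only.validate_python({1: 'x'})
    bad_key_rejected = False
except ValidationError:
    bad_key_rejected = True
```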
Tuples: Positional and Variadic
TupleSchema supports two primary validation patterns: positional (fixed-length with specific types for each slot) and variadic (variable length with a repeating type).
Positional Tuples
A positional tuple is defined by providing a list of schemas to items_schema. The input must match the length of this list unless a variadic index is specified.
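A minimal sketch of a purely positional tuple, assuming the same tuple_schema function used elsewhere in this section and no variadic_item_index. The names pair, matched, and wrong_length are illustrative.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# Slot 0 must be an int, slot 1 must be a str.
pair = SchemaValidator(cs.tuple_schema([cs.int_schema(), cs.str_schema()]))

# An input with exactly two items of the right types validates.
matched = pair.validate_python((1, 'a'))

# Inputs that are too short or too long are rejected.
wrong_length = []
for candidate in ((1,), (1, 'a', 'extra')):
    try:
        pair.validate_python(candidate)
    except ValidationError:
        wrong_length.append(candidate)
```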
Variadic Tuples
Using the variadic_item_index field, you can implement PEP 646 style variadic tuples. This index points to the schema within items_schema that should be used for all "extra" items.
Example from pydantic-core/python/pydantic_core/core_schema.py:
from pydantic_core import SchemaValidator, core_schema
schema = core_schema.tuple_schema(
    [core_schema.int_schema(), core_schema.str_schema(), core_schema.float_schema()],
    variadic_item_index=1,
)
v = SchemaValidator(schema)
# The second schema (str_schema) is variadic
assert v.validate_python((1, 'hello', 'world', 1.5)) == (1, 'hello', 'world', 1.5)
Lazy Validation with Generators
The GeneratorSchema differs from other collection schemas because validation is lazy. When you validate an iterable against a GeneratorSchema, pydantic-core returns a ValidatorIterator.
Validation errors are not raised when validate_python is called; instead, they are raised item-by-item as the generator is consumed.
Example from pydantic-core/tests/validators/test_generator.py:
from pydantic_core import SchemaValidator
v = SchemaValidator({'type': 'generator', 'items_schema': {'type': 'int'}})
gen = v.validate_python([1, 'wrong', 3])
assert next(gen) == 1
# The error is raised only when we reach the invalid item
# next(gen) -> ValidationError
The ValidatorIterator also maintains an index attribute that records the iterator's current position, so you can tell how far consumption had progressed when an error occurred.
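The following sketch shows one way to read index while consuming the iterator. It only inspects index after successfully validated items; the lists valid_items and positions are hypothetical helpers for this example.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

v = SchemaValidator(cs.generator_schema(items_schema=cs.int_schema()))
gen = v.validate_python([1, 2, 'wrong'])

# Nothing has been consumed yet.
start_index = gen.index

valid_items = []
positions = []
failed = False
try:
    for item in gen:
        valid_items.append(item)
        # Record the iterator position after each valid item.
        positions.append(gen.index)
except ValidationError:
    # Raised lazily, only once the invalid third item is reached.
    failed = True
```

Collecting items this way keeps everything validated before the failure, which a one-shot list validation would not return.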
Common Validation Controls
Fail Fast
The fail_fast parameter (available on ListSchema, DictSchema, SetSchema, TupleSchema, and FrozenSetSchema) determines whether validation should stop immediately after the first error is encountered.
In pydantic-core/tests/validators/test_list.py, this is tested by checking the number of errors returned:
from pydantic_core import SchemaValidator, ValidationError, core_schema
s = core_schema.list_schema(core_schema.int_schema(), fail_fast=True)
v = SchemaValidator(s)
try:
    v.validate_python([1, 'not-num', 'again'])
except ValidationError as exc:
    # With fail_fast=True, only the first error ('not-num') is reported
    assert len(exc.errors()) == 1
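For contrast, this sketch runs the same input through a validator with and without fail_fast. The count_errors helper is hypothetical and exists only to make the comparison compact.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

collect_all = SchemaValidator(cs.list_schema(cs.int_schema()))
stop_early = SchemaValidator(cs.list_schema(cs.int_schema(), fail_fast=True))


def count_errors(validator, data):
    """Return how many errors the validator reports for data (0 if valid)."""
    try:
        validator.validate_python(data)
    except ValidationError as exc:
        return exc.error_count()
    return 0


data = [1, 'not-num', 'again']
all_errors = count_errors(collect_all, data)    # both bad items reported
first_only = count_errors(stop_early, data)     # stops at 'not-num'
```

Failing fast trades complete error reports for less work on large, badly malformed inputs.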
Serialization Overrides
Most collection schemas support a serialization field (using types like IncExSeqOrElseSerSchema or IncExDictOrElseSerSchema). This allows you to define specific serialization behavior, such as including or excluding certain items or keys when converting the collection back to a serialized format like JSON.
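As a hedged illustration, the sketch below attaches an include filter to a list schema through the serialization field. It assumes core_schema.filter_seq_schema is the helper that builds the include/exclude serialization schema for sequences, and that the filter selects items by index; verify both against the pydantic-core version you have installed.

```python
from pydantic_core import SchemaSerializer, core_schema as cs

# Assumption: filter_seq_schema(include=...) keeps only the listed indexes.
schema = cs.list_schema(
    cs.int_schema(),
    serialization=cs.filter_seq_schema(include={0, 2}),
)
serializer = SchemaSerializer(schema)

# Only the items at indexes 0 and 2 should appear in the output.
as_python = serializer.to_python([10, 20, 30])
as_json = serializer.to_json([10, 20, 30])
```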