
Collection and Sequence Validation

Collection and sequence validation in pydantic-core is handled through a set of specialized schemas that define how Python's built-in data structures are validated, coerced, and constrained. These schemas allow for deep validation of nested items, length constraints, and performance optimizations like early exit on failure.

Sequence Validation

Sequences like lists and tuples are validated using ListSchema and TupleSchema. While they share common traits like length constraints, their validation logic differs significantly regarding item positioning and type coercion.

Lists

The ListSchema (created via core_schema.list_schema) validates that an input is a list-like structure. In "lax" mode, pydantic-core will coerce other iterables (such as tuples, sets, or generators) into a list. In "strict" mode, only actual list objects are accepted.

Key features include:

  • items_schema: A CoreSchema applied to every element in the list.
  • min_length / max_length: Constraints on the number of elements.
  • fail_fast: If True, validation stops immediately after the first item fails validation, rather than collecting all errors.

from pydantic_core import SchemaValidator, core_schema as cs

# List requiring at least 2 integers
v = SchemaValidator(cs.list_schema(
    items_schema=cs.int_schema(),
    min_length=2,
))

# Coerces strings to ints in lax mode
assert v.validate_python(['1', 2]) == [1, 2]
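
To illustrate how the length constraint and the item schema surface as errors, here is a minimal sketch. It assumes the `too_short` and `int_parsing` error types and the index-based `loc` reported by current pydantic-core releases; the `error_summary` helper is hypothetical and exists only for this example.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

v = SchemaValidator(cs.list_schema(items_schema=cs.int_schema(), min_length=2))


def error_summary(validator, value):
    """Hypothetical helper: return (type, loc) pairs for each validation error."""
    try:
        validator.validate_python(value)
    except ValidationError as exc:
        return [(err['type'], err['loc']) for err in exc.errors()]
    return []


# Too few items: the length constraint fails for the list as a whole
short_errors = error_summary(v, [1])

# A bad item: the error location is the index of the offending element
item_errors = error_summary(v, [1, 'x', 3])
```

The empty `loc` on the length error indicates that it belongs to the list itself, while item errors carry the index of the element that failed.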

Tuples

TupleSchema supports two primary validation modes: fixed-position (positional) and variable-length (variadic).

  1. Positional Tuples: Validates each element against a specific schema based on its index.
  2. Variadic Tuples: Uses variadic_item_index to mark which entry in the items list may repeat zero or more times. This is used to implement types like tuple[int, ...] and PEP 646 variadic tuples such as tuple[int, str, *tuple[float, ...]].

# Positional and variadic tuple: (int, str, *float)
# variadic_item_index=2 marks the schema at index 2 (float) as the repeating one,
# so every item from index 2 onward is validated as a float
v = SchemaValidator(cs.tuple_schema(
    [cs.int_schema(), cs.str_schema(), cs.float_schema()],
    variadic_item_index=2,
))

assert v.validate_python((1, 'a', 1.1, 2.2, 3.3)) == (1, 'a', 1.1, 2.2, 3.3)
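
The two modes can be contrasted with a short sketch. It assumes that a purely positional tuple rejects extra items and that the variadic entry may sit between fixed entries, PEP 646 style; the variable names are illustrative only.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# Purely positional: exactly (int, str)
pair = SchemaValidator(cs.tuple_schema([cs.int_schema(), cs.str_schema()]))
pair_ok = pair.validate_python((1, 'a'))

# An extra trailing item does not match any positional schema
try:
    pair.validate_python((1, 'a', 'extra'))
    pair_rejects_extra = False
except ValidationError:
    pair_rejects_extra = True

# Variadic item in the middle: (int, *str, int)
middle = SchemaValidator(cs.tuple_schema(
    [cs.int_schema(), cs.str_schema(), cs.int_schema()],
    variadic_item_index=1,
))
middle_ok = middle.validate_python((1, 'a', 'b', 2))
```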

Set and FrozenSet Validation

SetSchema and FrozenSetSchema validate unordered collections of unique items. A critical requirement for these schemas is that all validated items must be hashable. If the input contains unhashable items (for example, a list of lists), validation will fail with a set_item_not_hashable error.

Like lists, sets support min_length, max_length, and fail_fast.

# Set of integers with length constraints
v = SchemaValidator(cs.set_schema(
    items_schema=cs.int_schema(),
    min_length=1,
))

assert v.validate_python({1, '2', 2}) == {1, 2}
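
FrozenSetSchema behaves the same way but returns a frozenset. The following is a minimal sketch, assuming lax mode (the default) and the `too_short` error type for a violated min_length:

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

fs = SchemaValidator(cs.frozenset_schema(items_schema=cs.int_schema(), min_length=1))

# Lax mode accepts a list; duplicates collapse after item coercion
result = fs.validate_python(['1', 1, 2])

# An empty input violates min_length
try:
    fs.validate_python([])
    empty_rejected = False
except ValidationError as exc:
    empty_rejected = exc.errors()[0]['type'] == 'too_short'
```

Note that uniqueness is determined after item validation, so '1' and 1 count as the same element here.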

Mapping Validation

The DictSchema (via core_schema.dict_schema) validates dictionary structures by applying separate schemas to keys and values.

  • keys_schema: Validates the dictionary keys (defaults to AnySchema).
  • values_schema: Validates the dictionary values (defaults to AnySchema).
  • Constraints: min_length and max_length apply to the number of key-value pairs.

# Dictionary with string keys and integer values
v = SchemaValidator(cs.dict_schema(
    keys_schema=cs.str_schema(),
    values_schema=cs.int_schema(),
))

# Values are coerced to int in lax mode; keys stay strings
assert v.validate_python({'id': '123', 'count': 1}) == {'id': 123, 'count': 1}
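
When a value fails, the error location identifies the offending key. The sketch below assumes the `loc` layout and the `too_short` error type of current pydantic-core releases:

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

v = SchemaValidator(cs.dict_schema(
    keys_schema=cs.str_schema(),
    values_schema=cs.int_schema(),
    min_length=1,
))

# One good pair and one bad value: only the bad value is reported
try:
    v.validate_python({'ok': '1', 'bad': 'x'})
    value_error_locs = []
except ValidationError as exc:
    value_error_locs = [err['loc'] for err in exc.errors()]

# An empty dict violates min_length
try:
    v.validate_python({})
    empty_error_type = None
except ValidationError as exc:
    empty_error_type = exc.errors()[0]['type']
```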

Lazy Generator Validation

GeneratorSchema provides a unique "lazy" validation mechanism. Unlike other collections where the entire structure is validated when validate_python is called, a generator is validated item-by-item as it is consumed.

This ensures that the performance benefits of Python generators (lazy evaluation) are preserved. If an item fails validation, the ValidationError is raised during the next() call or during iteration, not at the initial validation step.

def my_generator():
    yield 1
    yield 'not an int'

v = SchemaValidator(cs.generator_schema(cs.int_schema()))
gen = v.validate_python(my_generator())

next(gen) # Returns 1
# next(gen) # Raises ValidationError here
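
In practice the error usually appears inside a loop rather than on an explicit next() call. A minimal sketch of consuming a validated generator and keeping the items that were validated before the failure (the `numbers` generator is illustrative only):

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs


def numbers():
    yield '1'
    yield 2
    yield 'three'


v = SchemaValidator(cs.generator_schema(cs.int_schema()))
lazy = v.validate_python(numbers())  # no item has been validated yet

collected = []
failed = False
try:
    for item in lazy:
        collected.append(item)  # each item is validated as it is pulled
except ValidationError:
    failed = True  # raised when the third item is reached
```

Because validation is deferred, code that consumes the generator has to be prepared to handle ValidationError at the point of iteration.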

Advanced Configuration

Fail Fast

The fail_fast parameter is available on ListSchema, SetSchema, FrozenSetSchema, DictSchema, and TupleSchema. When enabled, validation stops at the first failing item and the resulting ValidationError contains only that error. This is particularly useful for large collections where identifying every single error is computationally expensive or unnecessary.
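
The difference is easiest to see by counting errors. This sketch assumes a pydantic-core release recent enough to support fail_fast on list_schema; the `error_count` helper is hypothetical.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

collect_all = SchemaValidator(cs.list_schema(items_schema=cs.int_schema()))
stop_early = SchemaValidator(cs.list_schema(items_schema=cs.int_schema(), fail_fast=True))

bad_input = ['x', 1, 'y']  # two invalid items


def error_count(validator, value):
    """Hypothetical helper: number of errors reported for value (0 if valid)."""
    try:
        validator.validate_python(value)
    except ValidationError as exc:
        return exc.error_count()
    return 0


all_errors = error_count(collect_all, bad_input)
first_only = error_count(stop_early, bad_input)
```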

Strict Mode

When strict=True is set on a collection schema:

  • Lists: Only accept list types (no coercion from tuples/sets).
  • Sets: Only accept set types.
  • Dicts: Only accept dict types.
  • Tuples: Only accept tuple types.

In lax mode (default), pydantic-core attempts to convert compatible types, such as converting a deque or a generator to a list when using ListSchema.
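
The contrast between the two modes can be sketched as follows, assuming the `list_type` error type that current pydantic-core releases report when a strict list schema receives a non-list:

```python
from collections import deque

from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

lax = SchemaValidator(cs.list_schema(items_schema=cs.int_schema()))
strict = SchemaValidator(cs.list_schema(items_schema=cs.int_schema(), strict=True))

# Lax mode converts compatible iterables to a list
from_tuple = lax.validate_python((1, 2))
from_deque = lax.validate_python(deque([3, 4]))

# Strict mode accepts real lists only
from_list = strict.validate_python([5, 6])
try:
    strict.validate_python((1, 2))
    strict_error_type = None
except ValidationError as exc:
    strict_error_type = exc.errors()[0]['type']
```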