
Union and Conditional Logic

Handling multiple possible types and conditional validation logic is a core requirement of the validation engine. The engine provides several specialized schemas to manage these scenarios, ranging from simple type branching to complex, context-aware validation pipelines.

Untagged Unions

The UnionSchema allows a value to be validated against multiple potential schemas. It is defined by a list of choices. The engine supports two primary modes for resolving which choice to use: smart and left_to_right.

Smart vs. Left-to-Right Matching

By default, UnionSchema uses mode='smart'. In this mode, the engine attempts to find the "best" match among the choices rather than simply the first one that succeeds. This is particularly important when choices have overlapping valid inputs (e.g., int and float).

In contrast, mode='left_to_right' stops at the first schema that successfully validates the input.

from pydantic_core import SchemaValidator, core_schema

# Smart union (default)
choices = [core_schema.int_schema(), core_schema.float_schema()]
v_smart = SchemaValidator(core_schema.union_schema(choices, mode='smart'))
# Prefers float for 1.0 even though int could match in lax mode
assert isinstance(v_smart.validate_python(1.0), float)

# Left-to-right union
v_ltr = SchemaValidator(core_schema.union_schema(choices, mode='left_to_right'))
# Selects int for 1.0 because int_schema(lax) accepts 1.0 and comes first
assert isinstance(v_ltr.validate_python(1.0), int)

The auto_collapse attribute (defaulting to True) allows the engine to optimize unions containing only a single element by collapsing them into the inner validator, reducing overhead.
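To make the effect of choice order more concrete, the following sketch builds both orderings of the same int/float union in each mode. It relies only on the calls shown above; the helper name `build` is introduced here purely for illustration and is not part of the library.

```python
from pydantic_core import SchemaValidator, core_schema


def build(choices, mode):
    """Hypothetical helper: wrap a list of choices in a union validator."""
    return SchemaValidator(core_schema.union_schema(choices, mode=mode))


int_s, float_s = core_schema.int_schema(), core_schema.float_schema()

# left_to_right: the first choice that accepts the input wins, so order matters.
ltr_int_first = build([int_s, float_s], 'left_to_right')
ltr_float_first = build([float_s, int_s], 'left_to_right')
int_first_result = ltr_int_first.validate_python(1.0)      # lax int accepts 1.0
float_first_result = ltr_float_first.validate_python(1.0)  # float is tried first

# A value the first choice rejects falls through to the next choice.
fallthrough_result = ltr_int_first.validate_python(1.5)

# smart: the exact float match is preferred in either ordering.
smart_results = [
    build(order, 'smart').validate_python(1.0)
    for order in ([int_s, float_s], [float_s, int_s])
]
```

With left_to_right, reordering the choices changes the result type for the same input, whereas the smart results for 1.0 do not depend on the ordering.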

Discriminated Unions

For performance-critical applications or complex polymorphic data structures, TaggedUnionSchema provides a more efficient alternative to untagged unions. Instead of trying every choice, it uses a discriminator to look up the correct schema directly.

The discriminator can be:

  • A string representing a key in a dictionary.
  • A list of strings/ints representing a path to a nested value.
  • A callable that returns the tag from the input data.

from pydantic_core import SchemaValidator, core_schema

apple_schema = core_schema.typed_dict_schema({'type': core_schema.typed_dict_field(core_schema.str_schema())})
banana_schema = core_schema.typed_dict_schema({'type': core_schema.typed_dict_field(core_schema.str_schema())})

schema = core_schema.tagged_union_schema(
    choices={
        'apple': apple_schema,
        'banana': banana_schema,
    },
    discriminator='type',
)
v = SchemaValidator(schema)
# The engine looks up 'apple' in the choices dict immediately
assert v.validate_python({'type': 'apple'}) == {'type': 'apple'}

This direct lookup avoids the O(N) cost of an untagged union, which may have to try each of its N choices in turn, making it the preferred option for large sets of possible types.
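The other two discriminator forms listed above, a path and a callable, can be sketched as follows. The payload shapes, the `cat`/`dog` and `text`/`number` tags, and the `tag_of` function are assumptions made up for this example.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# Hypothetical choices: each is a typed dict with a single 'name' field.
cat_schema = cs.typed_dict_schema({'name': cs.typed_dict_field(cs.str_schema())})
dog_schema = cs.typed_dict_schema({'name': cs.typed_dict_field(cs.str_schema())})

# Path discriminator: the tag is read from input['meta']['kind'].
by_path = SchemaValidator(
    cs.tagged_union_schema(
        choices={'cat': cat_schema, 'dog': dog_schema},
        discriminator=['meta', 'kind'],
    )
)
path_result = by_path.validate_python({'meta': {'kind': 'dog'}, 'name': 'Rex'})


def tag_of(value):
    """Illustrative callable discriminator: derive the tag from the raw input."""
    if isinstance(value, str):
        return 'text'
    if isinstance(value, int):
        return 'number'
    return None  # no tag could be determined


by_callable = SchemaValidator(
    cs.tagged_union_schema(
        choices={'text': cs.str_schema(), 'number': cs.int_schema()},
        discriminator=tag_of,
    )
)
text_result = by_callable.validate_python('hello')
number_result = by_callable.validate_python(42)

# When the callable returns None, validation fails instead of guessing a choice.
try:
    by_callable.validate_python(3.5)
    untagged_rejected = False
except ValidationError:
    untagged_rejected = True
```

A callable is useful when the inputs are not dictionaries at all, as in the second validator, where the tag is derived from the Python type of the value.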

Context-Aware Validation

The engine provides schemas that change their behavior based on the validation context, such as whether the input is JSON or whether "strict" mode is enabled.

Lax or Strict Validation

LaxOrStrictSchema allows defining different validation paths depending on the strictness setting. This is frequently used to allow data coercion (like string-to-int) in lax mode while requiring exact types in strict mode.

from pydantic_core import SchemaValidator, core_schema

v = SchemaValidator(
    core_schema.lax_or_strict_schema(
        lax_schema=core_schema.str_schema(),
        strict_schema=core_schema.int_schema(),
        strict=True,
    )
)
# Uses int_schema because strict=True in the schema definition
assert v.validate_python(123) == 123

# Overriding to strict=False at runtime switches to the lax_schema (str)
assert v.validate_python('aaa', strict=False) == 'aaa'
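For comparison, here is a sketch of the same pair of schemas without a schema-level strict setting, so the validator starts in lax mode and the strict branch is selected per call. The variable names are illustrative only.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# No schema-level `strict` is given, so the lax branch is the default.
v = SchemaValidator(
    cs.lax_or_strict_schema(
        lax_schema=cs.str_schema(),
        strict_schema=cs.int_schema(),
    )
)

lax_result = v.validate_python('aaa')                # lax branch: str_schema
strict_result = v.validate_python(123, strict=True)  # strict branch: int_schema

# Each branch rejects input meant for the other one.
lax_error_type = None
try:
    v.validate_python(123)  # lax branch expects a string
except ValidationError as exc:
    lax_error_type = exc.errors()[0]['type']
```

The call-level `strict` argument takes precedence over the schema-level value, which is why the same validator can serve both paths.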

JSON vs. Python Input

JsonOrPythonSchema branches logic based on the entry point used: validate_json or validate_python. This is useful when data coming from JSON (which has limited types) needs different preprocessing than data already in Python objects.

from pydantic_core import SchemaValidator, core_schema as cs

s = cs.json_or_python_schema(
    json_schema=cs.chain_schema([cs.str_schema(), cs.int_schema()]),
    python_schema=cs.int_schema(),
)
v = SchemaValidator(s)

# Uses python_schema directly
assert v.validate_python(123) == 123

# Uses json_schema (which in this case chains str -> int)
assert v.validate_json('"123"') == 123
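A common reason for this split is a Python type that has no native JSON representation. The sketch below assumes a made-up `Tag` class: the JSON path parses a string and constructs a `Tag`, while the Python path only accepts existing `Tag` instances.

```python
from dataclasses import dataclass

from pydantic_core import SchemaValidator, ValidationError, core_schema as cs


@dataclass(frozen=True)
class Tag:
    """Hypothetical domain type with no native JSON representation."""
    name: str


schema = cs.json_or_python_schema(
    # JSON input: validate a string, then build a Tag from it.
    json_schema=cs.chain_schema(
        [cs.str_schema(), cs.no_info_plain_validator_function(Tag)]
    ),
    # Python input: require an object that is already a Tag.
    python_schema=cs.is_instance_schema(Tag),
)
v = SchemaValidator(schema)

from_json = v.validate_json('"urgent"')
from_python = v.validate_python(Tag('urgent'))

# A plain string is not accepted on the Python path.
try:
    v.validate_python('urgent')
    python_str_rejected = False
except ValidationError:
    python_str_rejected = True
```

Both entry points yield a `Tag`, but only the JSON path performs the string-to-object conversion.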

Sequential Pipelines

The ChainSchema executes a sequence of validation steps in order. The output of one step becomes the input to the next. This is the primary mechanism for combining standard type validation with custom transformation logic or additional constraints.

from decimal import Decimal
from pydantic_core import SchemaValidator, core_schema as cs

validator = SchemaValidator(
    cs.chain_schema(
        steps=[
            cs.str_schema(),
            cs.with_info_plain_validator_function(lambda v, info: Decimal(v)),
        ]
    )
)

# Step 1: Validates input is a string (or coerces to string)
# Step 2: Passes that string to the lambda to create a Decimal
assert validator.validate_python('1.44') == Decimal('1.44')

If any step in the chain fails, validation stops immediately and the error from that step is reported; later steps do not run. The engine also applies a flattening optimization: if a ChainSchema contains another ChainSchema, the two are merged into a single sequence of steps to reduce recursion depth.
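The failure and nesting behavior described above can be observed from the outside with a small sketch. The step functions `require_positive` and `double` and the helper `first_error_type` are hypothetical names used only for this example.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs


def require_positive(value: int) -> int:
    """Hypothetical step: reject non-positive integers."""
    if value <= 0:
        raise ValueError('must be positive')
    return value


def double(value: int) -> int:
    """Hypothetical step: transform the validated value."""
    return value * 2


positive = cs.no_info_plain_validator_function(require_positive)
doubled = cs.no_info_plain_validator_function(double)

flat = SchemaValidator(cs.chain_schema([cs.int_schema(), positive, doubled]))
# The same steps, with the first two wrapped in an inner chain.
nested = SchemaValidator(
    cs.chain_schema([cs.chain_schema([cs.int_schema(), positive]), doubled])
)


def first_error_type(validator, value):
    """Return the type of the first reported error, or None on success."""
    try:
        validator.validate_python(value)
    except ValidationError as exc:
        return exc.errors()[0]['type']
    return None


step1_failure = first_error_type(flat, 'abc')  # rejected by int_schema
step2_failure = first_error_type(flat, -5)     # rejected by require_positive
flat_result = flat.validate_python('21')
nested_result = nested.validate_python('21')
```

The reported error type identifies the step that failed, and the nested chain produces the same result as the flat one.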

Design Decisions and Constraints

The implementation of these logic schemas reflects a balance between flexibility and performance:

  1. Union Complexity: Untagged unions with many choices can be slow because the engine may need to attempt multiple validations and track multiple error states. The smart mode adds further overhead by comparing the "quality" of matches.
  2. Discriminator Flexibility: TaggedUnionSchema supports complex paths and callables for discriminators, allowing it to handle data structures where the tag isn't a simple top-level field.
  3. Strictness Propagation: Strictness settings in LaxOrStrictSchema can be defined at the schema level, the validator level, or the call level, providing granular control over coercion behavior.
  4. Chain Requirements: A ChainSchema must contain at least one step. An empty steps list will result in a SchemaError during validator initialization, as seen in pydantic_core/tests/validators/test_chain.py.
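Two of the points above, the empty-chain constraint and the per-choice error tracking of untagged unions, can be checked with a brief sketch. The variable names are illustrative.

```python
from pydantic_core import SchemaError, SchemaValidator, ValidationError, core_schema as cs

# Chain requirements: an empty chain is rejected when the validator is built.
try:
    SchemaValidator(cs.chain_schema([]))
    empty_chain_rejected = False
except SchemaError:
    empty_chain_rejected = True

# Union complexity: when no choice matches, an error is reported for each choice.
union = SchemaValidator(cs.union_schema([cs.int_schema(), cs.float_schema()]))
try:
    union.validate_python('not a number')
    union_error_count = 0
except ValidationError as exc:
    union_error_count = exc.error_count()
```

The union error count grows with the number of choices, which is one reason a tagged union tends to produce shorter, more targeted error reports.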