Strict and Lax Validation Modes
Validation in this codebase operates in two primary modes: lax and strict. Lax mode allows data coercion (e.g., converting the string "123" to the integer 123), while strict mode requires the input to already be of the target type. This design lets developers balance the flexibility needed for messy real-world data against the performance and predictability of strict type enforcement.
Global Configuration with CoreConfig
The default validation behavior for a SchemaValidator is controlled by the CoreConfig class. This TypedDict defines global settings that apply to all fields within the schema unless overridden.
The strict attribute in CoreConfig sets the baseline for the entire validator. When strict is True, the validator will generally reject any input that does not exactly match the expected Python type.
from pydantic_core import CoreConfig, SchemaValidator, core_schema as cs
# Global strict mode enabled via CoreConfig
config = CoreConfig(strict=True)
v = SchemaValidator(cs.int_schema(), config=config)
# This will fail in strict mode because '123' is a string
# v.validate_python('123') -> ValidationError
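To see the difference at runtime rather than in a comment, the following sketch validates the same inputs against a lax and a strict integer validator. The helper try_validate is a hypothetical name introduced here for illustration; it is not part of pydantic_core.

```python
from pydantic_core import CoreConfig, SchemaValidator, ValidationError, core_schema as cs

# Same schema, two validators: one with the default (lax) config, one globally strict.
lax_int = SchemaValidator(cs.int_schema())
strict_int = SchemaValidator(cs.int_schema(), config=CoreConfig(strict=True))


def try_validate(validator, value):
    """Hypothetical helper: return ('ok', result) or ('error', first error type)."""
    try:
        return ("ok", validator.validate_python(value))
    except ValidationError as exc:
        return ("error", exc.errors()[0]["type"])


print(try_validate(lax_int, "123"))     # lax mode coerces the string
print(try_validate(strict_int, "123"))  # strict mode rejects the string
print(try_validate(strict_int, 123))    # an actual int passes in both modes
```

Catching ValidationError and reading errors() is usually more useful than checking only that an exception was raised, because the error type indicates whether the failure was a type mismatch or a parsing problem.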
Another important configuration option is coerce_numbers_to_str. By default, Pydantic does not coerce numbers to strings in lax mode to avoid accidental data loss. However, this can be enabled globally:
# Enable number-to-string coercion in lax mode
config = CoreConfig(coerce_numbers_to_str=True)
v = SchemaValidator(cs.str_schema(), config=config)
assert v.validate_python(123) == "123"
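For contrast, this sketch puts the default behavior next to the opt-in coercion. The variable names are illustrative only.

```python
from pydantic_core import CoreConfig, SchemaValidator, ValidationError, core_schema as cs

default_str = SchemaValidator(cs.str_schema())
coercing_str = SchemaValidator(
    cs.str_schema(),
    config=CoreConfig(coerce_numbers_to_str=True),
)

# Without the flag, an int is not accepted as a string, even in lax mode.
try:
    default_str.validate_python(123)
    default_rejected = False
except ValidationError:
    default_rejected = True

print(default_rejected)
print(coercing_str.validate_python(123))
```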
Schema-Level Overrides
Individual schemas can override the global configuration. Most primitive schemas, such as IntSchema, StrSchema, and BoolSchema, include a strict field. If this field is set, it takes precedence over the CoreConfig.
# Global config is lax, but this specific field is strict
v = SchemaValidator(
    cs.int_schema(strict=True),
    config=CoreConfig(strict=False),
)
# This fails despite the global lax setting
# v.validate_python('123') -> ValidationError
This allows for granular control, where a model might be generally lax but require strictness for specific sensitive fields like identifiers or flags.
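A minimal sketch of that pattern, assuming a typed-dict schema with two hypothetical fields (user_id and page are names made up for this example): the identifier is strict while the other field keeps the lax default.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

record_schema = cs.typed_dict_schema(
    {
        # Sensitive identifier: must already be an int.
        "user_id": cs.typed_dict_field(cs.int_schema(strict=True)),
        # Ordinary field: inherits the lax default, so "2" is coerced to 2.
        "page": cs.typed_dict_field(cs.int_schema()),
    }
)
record_validator = SchemaValidator(record_schema)

accepted = record_validator.validate_python({"user_id": 7, "page": "2"})
print(accepted)

# A string identifier is rejected, and the error points at the strict field.
try:
    record_validator.validate_python({"user_id": "7", "page": 2})
    failing_fields = []
except ValidationError as exc:
    failing_fields = [err["loc"] for err in exc.errors()]
print(failing_fields)
```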
Advanced Branching with LaxOrStrictSchema
For complex types where validation logic must diverge significantly between modes, the codebase provides LaxOrStrictSchema. This schema explicitly defines two different paths: a lax_schema and a strict_schema.
This is particularly useful for types that have a "natural" Python representation but are often received as strings in serialized formats (like JSON). For example, an IP address might be validated as a string in lax mode but required to be an ipaddress.IPv4Address instance in strict mode.
# str and int are used as stand-in schemas so the chosen branch is easy to see
v = SchemaValidator(cs.lax_or_strict_schema(
    lax_schema=cs.str_schema(),     # Used in lax mode
    strict_schema=cs.int_schema(),  # Used in strict mode
))
# Default (lax) uses str_schema
assert v.validate_python('aaa') == 'aaa'
# Runtime strict=True uses strict_schema (int_schema)
assert v.validate_python(123, strict=True) == 123
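The IP address case mentioned above could be sketched roughly as follows. This is one possible construction, not necessarily how any real IP address type is wired up: parse_ipv4 is a hypothetical helper, and the choice of a plain validator function for the lax branch and an instance check for the strict branch is an assumption made for the example.

```python
import ipaddress

from pydantic_core import SchemaValidator, ValidationError, core_schema as cs


def parse_ipv4(value):
    """Hypothetical lax-branch parser: build an IPv4Address from the input."""
    return ipaddress.IPv4Address(value)


ip_validator = SchemaValidator(
    cs.lax_or_strict_schema(
        # Lax: accept anything IPv4Address can parse, such as a dotted string.
        lax_schema=cs.no_info_plain_validator_function(parse_ipv4),
        # Strict: require an existing IPv4Address instance.
        strict_schema=cs.is_instance_schema(ipaddress.IPv4Address),
    )
)

parsed = ip_validator.validate_python("192.0.2.1")
print(repr(parsed))

already_typed = ipaddress.IPv4Address("192.0.2.1")
print(ip_validator.validate_python(already_typed, strict=True) == already_typed)

# The same string is rejected once strict mode is requested at runtime.
try:
    ip_validator.validate_python("192.0.2.1", strict=True)
    strict_rejected_string = False
except ValidationError:
    strict_rejected_string = True
print(strict_rejected_string)
```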
The LaxOrStrictSchema itself can also have a strict property, which acts as a toggle for which sub-schema to prefer by default.
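As a rough illustration of that toggle, the sketch below reuses the str/int stand-in schemas and sets strict=True on the lax_or_strict_schema itself, so the strict branch becomes the default. A runtime strict=False argument should still switch back to the lax branch.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

strict_by_default = SchemaValidator(
    cs.lax_or_strict_schema(
        lax_schema=cs.str_schema(),
        strict_schema=cs.int_schema(),
        strict=True,  # prefer strict_schema unless told otherwise at runtime
    )
)

# With no runtime argument, the strict branch (int_schema) is used.
print(strict_by_default.validate_python(123))

# A non-numeric string therefore fails by default...
try:
    strict_by_default.validate_python("aaa")
    default_rejected = False
except ValidationError:
    default_rejected = True
print(default_rejected)

# ...but a runtime strict=False selects the lax branch (str_schema).
print(strict_by_default.validate_python("aaa", strict=False))
```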
Precedence Hierarchy
The effective strictness of a validation operation is determined by a clear hierarchy. A setting higher in the list overrides those below it:
- Runtime Argument: The strict parameter passed to validation methods like validate_python(data, strict=True).
- Schema Field: The strict attribute defined within a specific CoreSchema (e.g., IntSchema(strict=True)).
- Global Config: The strict attribute defined in the CoreConfig passed to the SchemaValidator.
If none of these are set, the system defaults to Lax mode.
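The resolution order can be modeled in a few lines of plain Python. This is only an illustrative model of the hierarchy described above, not the library's actual implementation; effective_strict and its parameter names are hypothetical.

```python
from typing import Optional


def effective_strict(
    runtime: Optional[bool] = None,
    schema: Optional[bool] = None,
    config: Optional[bool] = None,
) -> bool:
    """Return the first explicitly set value, checking runtime, schema, then config."""
    for setting in (runtime, schema, config):
        if setting is not None:
            return setting
    return False  # nothing set anywhere: lax mode


print(effective_strict())                             # nothing set
print(effective_strict(config=True))                  # global config only
print(effective_strict(schema=False, config=True))    # schema overrides config
print(effective_strict(runtime=True, schema=False))   # runtime overrides schema
```

Note that None means "not set", which is why an explicit False at a higher level can still turn strictness off for a validator whose lower-level setting is True.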
Trade-offs and Design Decisions
The implementation of these modes reflects a trade-off between performance and flexibility:
- Lax Mode (Flexibility): Designed for interoperability. It handles the common case where data types are "flattened" during transport (e.g., everything becoming a string in a URL query parameter). The cost is additional logic in the Rust-based validators to check for and perform coercions.
- Strict Mode (Performance): When strict=True is used, the underlying Rust implementation can often perform faster "exact match" checks. It avoids the overhead of coercion logic and provides stronger guarantees about the resulting Python objects, making it ideal for internal API boundaries where type integrity is paramount.
By separating these concerns into CoreConfig for defaults and LaxOrStrictSchema for structural differences, the codebase maintains a high-performance core while remaining adaptable to various data ingestion requirements.