Dual-Mode Validation: JSON vs. Python
In pydantic-core, validation is not a monolithic process. The library distinguishes between data sources (JSON strings vs. Python objects) and validation modes (lax vs. strict). This separation allows for high-performance parsing while maintaining the flexibility required for complex Python type hierarchies.
Branching by Entry Point: JsonOrPythonSchema
The JsonOrPythonSchema is the primary mechanism for defining different validation paths based on how the data enters the system. It explicitly separates the logic for validate_json and validate_python.
This design choice addresses a fundamental challenge in Pydantic: a Python-native object might already be in its desired form (e.g., an instance of a class), whereas a JSON input is always a raw primitive that requires parsing and instantiation.
from pydantic_core import core_schema as cs
from pydantic_core import SchemaValidator
class Foo:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Foo) and self.value == other.value

# Define a schema that behaves differently for Python vs JSON
s = cs.json_or_python_schema(
    json_schema=cs.no_info_after_validator_function(Foo, cs.str_schema()),
    python_schema=cs.is_instance_schema(Foo),
)
v = SchemaValidator(s)
# In Python mode, we expect an instance of Foo
assert v.validate_python(Foo('abc')) == Foo('abc')
# In JSON mode, we parse the string and then create a Foo instance
assert v.validate_json('"abc"') == Foo('abc')
With JsonOrPythonSchema, each entry point runs only the branch written for it: validate_python never attempts to parse its input as JSON, and validate_json skips an isinstance check that a JSON primitive could never pass. Pydantic relies on this pattern for its own types, such as SecretStr or the IP address types, where the Python representation is a specialized class but the JSON representation is a simple string.
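The separation also means that each branch rejects input meant for the other one. The following is a minimal sketch of that behavior, reusing the Foo schema from above; the error_types helper is a hypothetical convenience written for this illustration, not part of pydantic-core.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs


class Foo:
    def __init__(self, value):
        self.value = value


v = SchemaValidator(cs.json_or_python_schema(
    json_schema=cs.no_info_after_validator_function(Foo, cs.str_schema()),
    python_schema=cs.is_instance_schema(Foo),
))


def error_types(validate, data):
    """Hypothetical helper: return the error type codes raised by a call."""
    try:
        validate(data)
    except ValidationError as exc:
        return [err['type'] for err in exc.errors()]
    return []


# A plain str is not a Foo instance, so the Python branch rejects it.
python_errors = error_types(v.validate_python, 'abc')

# A JSON number is not a string, so the JSON branch rejects it
# before Foo is ever constructed.
json_errors = error_types(v.validate_json, '123')

print(python_errors, json_errors)
```

Neither call falls back to the other branch: the entry point alone decides which sub-schema runs.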
Branching by Validation Mode: LaxOrStrictSchema
While JsonOrPythonSchema branches based on the source of the data, LaxOrStrictSchema branches based on the strictness configuration. This allows a single schema to support both "lax" validation (which allows type coercion) and "strict" validation (which requires exact type matches).
The schema's type definition in pydantic_core/core_schema.py reads:
class LaxOrStrictSchema(TypedDict, total=False):
    type: Required[Literal['lax-or-strict']]
    lax_schema: Required[CoreSchema]
    strict_schema: Required[CoreSchema]
    strict: bool
    # ...
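This structure is frequently used to wrap complex types where coercion logic is only acceptable in lax mode. For example, converting a string to an integer is a "lax" operation, while requiring an actual int is "strict". The example below uses deliberately unrelated schemas for the two branches (str for lax, int for strict) so that the result shows which branch ran.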
v = SchemaValidator(cs.lax_or_strict_schema(
    lax_schema=cs.str_schema(),
    strict_schema=cs.int_schema(),
))
# Lax mode (default) uses lax_schema (str)
assert v.validate_python('aaa') == 'aaa'
# Strict mode uses strict_schema (int)
# This will fail if the input is 'aaa' but pass for 123
assert v.validate_python(123, strict=True) == 123
The strict boolean in the schema definition provides a default, but this can be overridden at runtime via the strict argument in the validation methods. This design allows developers to toggle validation rigor without rebuilding the underlying SchemaValidator.
Embedded JSON Parsing: JsonSchema
The JsonSchema validator serves a specialized purpose: it treats a string or bytes input as a JSON-encoded payload that must be parsed before further validation. This is distinct from validate_json because it can be embedded anywhere within a larger Python object schema.
This is particularly useful for handling "JSON-in-string" scenarios, such as JSON data submitted within a multipart form field or a database column that stores JSON as a string.
# A schema that expects a JSON string representing a list of integers
v = SchemaValidator(cs.json_schema(
    cs.list_schema(cs.int_schema()),
))
# Input must be a JSON string, even when calling validate_python
assert v.validate_python('[1, 2, 3, "4"]') == [1, 2, 3, 4]
Unlike JsonOrPythonSchema, which chooses a path, JsonSchema is a transformation step. It forces the input to be parsed as JSON regardless of the entry point, then passes the resulting Python object to the inner schema.
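To illustrate the embedding, here is a small sketch in which only one field of a larger Python structure is JSON-encoded. The typed-dict layout and the field names name and scores are assumptions made up for this example.

```python
from pydantic_core import SchemaValidator, core_schema as cs

# Hypothetical record: 'name' is a normal string, while 'scores' arrives
# as a JSON-encoded string (e.g. from a form field or a text column).
v = SchemaValidator(cs.typed_dict_schema({
    'name': cs.typed_dict_field(cs.str_schema()),
    'scores': cs.typed_dict_field(
        cs.json_schema(cs.list_schema(cs.int_schema()))
    ),
}))

record = v.validate_python({'name': 'report', 'scores': '[1, 2, "3"]'})

# 'scores' is parsed from JSON and then validated as a list of ints.
print(record)
```

Only the scores field goes through the JSON parsing step; the rest of the dictionary is validated as ordinary Python data.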
Design Tradeoffs and Constraints
The implementation of these dual-mode schemas reflects a trade-off between schema complexity and runtime performance:
- Schema Complexity: Using these schemas increases the depth and branching of the CoreSchema tree. This makes the initial schema generation (usually handled by pydantic) more complex but results in a highly optimized SchemaValidator in Rust.
- Strictness Inheritance: LaxOrStrictSchema interacts with the global strict setting. If a parent schema is set to strict: True, this preference propagates down the tree unless explicitly overridden.
- Input Type Constraints: JsonSchema strictly requires string or bytes input. If it receives a pre-parsed dictionary during validate_python, it will fail. This enforces a clear boundary: JsonSchema is for parsing, not just for validating structure.
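The input type constraint can be checked directly. The sketch below is illustrative only: it feeds the same JsonSchema validator a bytes payload and an already-parsed list, and records the error type for the latter.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

v = SchemaValidator(cs.json_schema(cs.list_schema(cs.int_schema())))

# Bytes are accepted just like str: the payload is parsed as JSON first.
from_bytes = v.validate_python(b'[1, 2, 3]')

# An already-parsed list is rejected, even though its contents would
# satisfy the inner list-of-int schema.
try:
    v.validate_python([1, 2, 3])
    preparsed_error = None
except ValidationError as exc:
    preparsed_error = exc.errors()[0]['type']

print(from_bytes, preparsed_error)
```

If a field may arrive either pre-parsed or JSON-encoded, the schema has to say so explicitly, for example by wrapping both alternatives in a union rather than relying on JsonSchema alone.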
By combining these three schema types, pydantic-core provides a robust framework for handling the ambiguity of modern data validation, where the same logical data type may appear in different formats depending on the transport layer.