Input-Specific Validation Logic
In Pydantic V2, validation logic often needs to distinguish between data arriving as a JSON string and data already present as Python objects. The pydantic-core library provides two primary schemas to handle these scenarios: JsonOrPythonSchema for branching logic based on the input source, and JsonSchema for forcing the parsing of strings as JSON.
Branching Logic with JsonOrPythonSchema
The JsonOrPythonSchema (created via core_schema.json_or_python_schema) allows you to define two separate validation paths. The path chosen depends on which SchemaValidator method is called:
- validate_json: Executes the json_schema branch.
- validate_python: Executes the python_schema branch.
This is essential for types that have no direct representation in JSON, such as IP addresses, datetime objects, or custom Python classes.
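As a minimal sketch of this branching, the validator below accepts a string from JSON but an integer from Python. The string/integer pairing is an assumption chosen only to make the two branches easy to tell apart; it is not taken from the Pydantic codebase, and the helper is_rejected is hypothetical.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# Illustrative schema: the two branches deliberately expect different types.
schema = core_schema.json_or_python_schema(
    json_schema=core_schema.str_schema(),
    python_schema=core_schema.int_schema(),
)
validator = SchemaValidator(schema)

# validate_json runs the json_schema branch (str_schema).
json_result = validator.validate_json('"abc"')

# validate_python runs the python_schema branch (int_schema).
python_result = validator.validate_python(123)


def is_rejected(validate, value) -> bool:
    """Hypothetical helper: True if the validation call raises ValidationError."""
    try:
        validate(value)
    except ValidationError:
        return True
    return False


# Each entry point rejects input that only the other branch would accept.
json_rejects_int = is_rejected(validator.validate_json, '123')
python_rejects_str = is_rejected(validator.validate_python, 'abc')
```

The same validator object therefore behaves differently depending on the method called, not on the runtime type of the value alone.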
The "Strict Python, Lax JSON" Pattern
A common pattern in the Pydantic codebase is to allow string-to-type conversion when parsing JSON (since JSON is text-based) while requiring the exact type when validating Python objects in strict mode.
In pydantic/_internal/_generate_schema.py, the schema for IP addresses (like IPv4Address) uses this pattern:
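from ipaddress import IPv4Address

from pydantic_core import core_schema

# Simplified representation of IP address schema generation
core_schema.json_or_python_schema(
    # JSON input: accept a string, then pass it to the IPv4Address constructor
    json_schema=core_schema.no_info_after_validator_function(
        IPv4Address,
        core_schema.str_schema()
    ),
    # Python input: accept only an existing IPv4Address instance
    python_schema=core_schema.is_instance_schema(IPv4Address),
)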
In this configuration:
- JSON Mode: A string like "127.0.0.1" is accepted and passed to the IPv4Address constructor.
- Python Mode: Only an actual IPv4Address instance is accepted; the string "127.0.0.1" fails this branch, which is the behavior expected when strict mode is enabled.
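The behavior described above can be checked by wrapping the simplified schema in a SchemaValidator. This is a sketch of the simplified schema only, not of the full schema Pydantic generates, and the variable names are illustrative.

```python
from ipaddress import IPv4Address

from pydantic_core import SchemaValidator, ValidationError, core_schema

validator = SchemaValidator(
    core_schema.json_or_python_schema(
        json_schema=core_schema.no_info_after_validator_function(
            IPv4Address, core_schema.str_schema()
        ),
        python_schema=core_schema.is_instance_schema(IPv4Address),
    )
)

# JSON mode: the string is validated, then converted by the constructor.
from_json = validator.validate_json('"127.0.0.1"')

# Python mode: an existing instance passes the isinstance check.
from_python = validator.validate_python(IPv4Address('127.0.0.1'))

# Python mode: a plain string is not an IPv4Address instance, so it is rejected.
try:
    validator.validate_python('127.0.0.1')
    string_rejected = False
except ValidationError:
    string_rejected = True
```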
Handling Custom Classes
JsonOrPythonSchema is also used to ensure that custom classes are instantiated correctly from JSON while existing instances pass through unchanged in Python mode.
As seen in pydantic-core/tests/validators/test_json_or_python.py:
from pydantic_core import SchemaValidator, core_schema as cs


class Foo:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Foo) and self.value == other.value


s = cs.json_or_python_schema(
    json_schema=cs.no_info_after_validator_function(Foo, cs.str_schema()),
    python_schema=cs.is_instance_schema(Foo)
)
v = SchemaValidator(s)

# validate_python requires an instance
assert v.validate_python(Foo('abc')) == Foo('abc')

# validate_json allows a string and converts it
assert v.validate_json('"abc"') == Foo('abc')
Forced Parsing with JsonSchema
While JsonOrPythonSchema branches based on the source of the data, JsonSchema (created via core_schema.json_schema) forces the content of the data to be treated as a JSON string, regardless of the entry point.
This is useful for "double-encoded" JSON or fields that are explicitly defined as JSON strings within a larger structure.
Validating JSON-in-Strings
If a field is expected to contain a JSON-encoded list of integers, JsonSchema can be used to parse that string before applying the inner validation logic. This is demonstrated in pydantic-core/tests/validators/test_json.py:
from pydantic_core import SchemaValidator, core_schema

v = SchemaValidator(
    core_schema.json_schema(
        core_schema.list_schema(core_schema.int_schema())
    )
)

# The input is a string that is itself valid JSON
assert v.validate_python('[1, 2, 3, "4"]') == [1, 2, 3, 4]
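To illustrate the "regardless of the entry point" behavior, the sketch below feeds the same kind of validator through both methods. It assumes a double-encoded payload built with json.dumps; the variable names are illustrative and not part of the test suite cited above.

```python
import json

from pydantic_core import SchemaValidator, ValidationError, core_schema

validator = SchemaValidator(
    core_schema.json_schema(core_schema.list_schema(core_schema.int_schema()))
)

inner = '[1, 2, 3]'                 # a JSON document held in a Python string
double_encoded = json.dumps(inner)  # the same string encoded as JSON once more

# Python entry point: the str is parsed as JSON, then validated as list[int].
via_python = validator.validate_python(inner)

# JSON entry point: the outer layer yields a string, which is parsed again.
via_json = validator.validate_json(double_encoded)

# A string that is not valid JSON is rejected before inner validation runs.
try:
    validator.validate_python('[1, 2,')
    malformed_rejected = False
except ValidationError:
    malformed_rejected = True
```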
JSON Keys in Dictionaries
Another advanced use case is validating dictionary keys that are encoded as JSON strings. This is common when complex objects (like tuples) need to be used as keys in a JSON object, which only supports string keys.
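from pydantic_core import SchemaValidator, core_schema

# Note: recent pydantic-core releases deprecate tuple_positional_schema
# in favor of tuple_schema.
v = SchemaValidator(
    core_schema.dict_schema(
        # Keys: JSON strings that decode to a one-element tuple of int
        core_schema.json_schema(
            core_schema.tuple_positional_schema([core_schema.int_schema()])
        ),
        # Values: plain integers
        core_schema.int_schema(),
    )
)

# The key '[1]' is parsed as JSON into the tuple (1,)
assert v.validate_python({'[1]': 4}) == {(1,): 4}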
Interaction with Serialization
When using these schemas, a serialization argument is often supplied so that validated data can be converted back to its expected format. For example, an IP address validated via JsonOrPythonSchema might use a plain_serializer_function_ser_schema so that it is serialized as a string by to_json() but remains an object when to_python() is called.
In pydantic/_internal/_generate_schema.py, the IP schema includes:
serialization=core_schema.plain_serializer_function_ser_schema(
    ser_ip,
    info_arg=True,
    when_used='always'
)
This ensures that the symmetry between validation and serialization is maintained, even when the validation logic is split across different input types.
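A rough sketch of that symmetry follows. The function ser_ip_sketch is a hypothetical stand-in written for this example, not Pydantic's own ser_ip, and the schema is the simplified IP schema from earlier rather than the one Pydantic actually generates.

```python
from ipaddress import IPv4Address

from pydantic_core import SchemaSerializer, SchemaValidator, core_schema


def ser_ip_sketch(ip, info):
    """Hypothetical serializer: keep the object in Python mode, stringify for JSON."""
    return str(ip) if info.mode == 'json' else ip


schema = core_schema.json_or_python_schema(
    json_schema=core_schema.no_info_after_validator_function(
        IPv4Address, core_schema.str_schema()
    ),
    python_schema=core_schema.is_instance_schema(IPv4Address),
    serialization=core_schema.plain_serializer_function_ser_schema(
        ser_ip_sketch, info_arg=True, when_used='always'
    ),
)
validator = SchemaValidator(schema)
serializer = SchemaSerializer(schema)

ip = IPv4Address('127.0.0.1')

as_python = serializer.to_python(ip)  # stays an IPv4Address
as_json = serializer.to_json(ip)      # bytes containing a JSON string

# Feeding the JSON output back through validate_json recovers the object.
round_tripped = validator.validate_json(as_json)
```

Because the JSON branch of the validator expects exactly the string form that the serializer emits, a value survives a to_json / validate_json round trip.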