Type Checking and Specialized Schemas
Pydantic-core is designed to bridge the gap between the dynamic, object-oriented nature of Python and the structured, serialized nature of JSON. While many schemas focus on standard data types (like strings or integers), specialized schemas like IsInstanceSchema, JsonSchema, and MissingSentinelSchema handle the "impedance mismatch" that occurs when validating complex Python objects or stringified data.
Runtime Type Checking
The IsInstanceSchema and IsSubclassSchema provide a mechanism for performing standard Python type checks within the validation pipeline. These are essential for validating custom class instances or class types that cannot be natively represented in JSON.
Design and Limitations
These schemas are strictly intended for Python-side validation. Because JSON has no concept of Python classes, attempting to use these schemas with validate_json() will result in a ValidationError with the type needs_python_object.
As seen in pydantic-core/python/pydantic_core/core_schema.py, the is_instance_schema helper allows for a custom cls_repr. This is a design choice that enables developers to provide more readable error messages than the default Python class repr.
from pydantic_core import SchemaValidator, core_schema as cs
class MyCustomClass:
    pass
# Using IsInstanceSchema via the helper
schema = cs.is_instance_schema(cls=MyCustomClass, cls_repr='MyObject')
v = SchemaValidator(schema)
# Success in Python validation
obj = MyCustomClass()
assert v.validate_python(obj) == obj
# Failure in Python validation reports the custom repr:
# v.validate_python('not an instance') -> "Input should be an instance of MyObject"
# Failure in JSON validation: JSON has no concept of Python objects
# v.validate_json('{"some": "json"}') -> Raises ValidationError: needs_python_object
Similarly, IsSubclassSchema (created via is_subclass_schema) ensures that the input is a Python class that inherits from a specified base. This is useful for factory patterns or registry systems where a class itself is the expected input.
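A minimal sketch of that pattern (the Base/Child class names here are illustrative, not from the source):

```python
from pydantic_core import SchemaValidator, core_schema as cs

class Base:
    pass

class Child(Base):
    pass

# is_subclass_schema validates that the input is a class object inheriting from Base
v = SchemaValidator(cs.is_subclass_schema(cls=Base))

assert v.validate_python(Child) is Child  # a subclass passes
assert v.validate_python(Base) is Base    # the base class itself also satisfies issubclass
# v.validate_python(Child()) would fail: instances are not classes
```

Note that the validator returns the class object itself, which is exactly what a registry or factory would then store or call.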
Handling Stringified JSON
In many real-world scenarios, data is received as a string that contains a JSON-encoded object—often referred to as "JSON-in-JSON". The JsonSchema is specifically designed to handle this by acting as a wrapper.
The Parsing Pipeline
When a JsonSchema is encountered, the validator first parses the input string (or bytes) into a Python object. It then passes that object to an inner schema for further validation. This allows for deep validation of stringified fields.
The following example from pydantic-core/python/pydantic_core/core_schema.py demonstrates how JsonSchema wraps a model-like structure:
from pydantic_core import SchemaValidator, core_schema

# Define the internal structure expected inside the JSON string
dict_schema = core_schema.typed_dict_schema(
    {
        'field_a': core_schema.typed_dict_field(core_schema.str_schema()),
        'field_b': core_schema.typed_dict_field(core_schema.bool_schema()),
    },
)
# Wrap it in a JsonSchema
json_schema = core_schema.json_schema(schema=dict_schema)
v = SchemaValidator(json_schema)
# The input is a string, but the output is a validated dictionary
input_str = '{"field_a": "hello", "field_b": true}'
assert v.validate_python(input_str) == {'field_a': 'hello', 'field_b': True}
This approach decouples the transport format (a string) from the data structure, allowing the same validation logic to be reused regardless of whether the data arrived as a native Python dict or a JSON string.
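One way to realize that reuse with a single validator is to place the inner schema in a union alongside its JsonSchema wrapper. This is a hedged sketch (the field name and schema shape are my own, not from the source):

```python
from pydantic_core import SchemaValidator, core_schema as cs

# The inner structure we ultimately want to validate
inner = cs.typed_dict_schema(
    {'field_a': cs.typed_dict_field(cs.str_schema())},
)
# Accept either a native dict or a JSON-encoded string of the same shape
v = SchemaValidator(cs.union_schema([inner, cs.json_schema(schema=inner)]))

assert v.validate_python({'field_a': 'hi'}) == {'field_a': 'hi'}
assert v.validate_python('{"field_a": "hi"}') == {'field_a': 'hi'}
```

Either way, downstream code only ever sees the validated dictionary.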
Specialized Sentinels and Internal States
Pydantic-core includes schemas for handling states that are neither "valid data" nor "null".
The MISSING Sentinel
The MissingSentinelSchema is used to handle the MISSING object (imported from pydantic_core). In Python, None is often used to represent "no value," but this creates ambiguity when None is itself a valid value. The MISSING sentinel represents a field that was truly not provided.
As demonstrated in tests/test_missing_sentinel.py, fields using this sentinel are typically omitted during serialization:
from pydantic_core import MISSING
from pydantic import BaseModel
class Model(BaseModel):
    f: int | MISSING = MISSING
m = Model()
# The field 'f' is MISSING, so it is omitted from the dump
assert m.model_dump() == {}
assert m.f is MISSING
The Invalid Schema
The InvalidSchema serves as a sentinel for broken or unsupported states. While it exists in the core_schema definitions for type-system completeness, it is not intended for use in production models. In fact, the Pydantic JSON schema generator in pydantic/json_schema.py explicitly raises a RuntimeError if it encounters this schema, treating it as a bug in the schema construction logic.
# From pydantic/json_schema.py
def invalid_schema(self, schema: core_schema.InvalidSchema) -> JsonSchemaValue:
    """Placeholder - should never be called."""
    raise RuntimeError('Cannot generate schema for invalid_schema. This is a bug! Please report it.')
This design ensures that developers are alerted immediately if a schema reaches an inconsistent state during the generation of OpenAPI or JSON Schema documents.