
Object-Oriented Validation

Object-oriented validation in pydantic-core is centered around three primary structures: Pydantic Models, Python Dataclasses, and TypedDicts. The engine distinguishes between the "outer" schema, which defines the container type and class-level behavior, and the "inner" schema, which defines how input data is mapped to fields or constructor arguments.

Pydantic Models

Validation for Pydantic models is split between ModelSchema and ModelFieldsSchema. This separation allows the engine to handle the class instantiation and metadata separately from the field-level validation logic.

Model Container (ModelSchema)

The ModelSchema (defined in pydantic_core/core_schema.py) acts as the entry point for validating a class as a Pydantic model. It requires a reference to the Python class (cls) and an inner schema (typically a ModelFieldsSchema).

Key features of ModelSchema include:

  • revalidate_instances: Controls whether existing instances of the class should be re-validated or returned as-is.
  • root_model: Indicates if the model is a "RootModel" (wrapping a single value).
  • post_init: Allows specifying a method name to be called after the model is initialized.
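
The options above can be sketched as follows. This is a minimal illustration, not the library's canonical usage: the class names Point and Tags and the hook name after_validation are hypothetical, and the four-entry __slots__ tuple is an assumption borrowed from common pydantic-core examples.

```python
from pydantic_core import SchemaValidator, core_schema

# Assumed slot layout; keeps pydantic-core's bookkeeping out of __dict__.
SLOTS = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')


class Point:
    __slots__ = SLOTS

    def after_validation(self, context):
        # Hypothetical hook name; it receives the validation context.
        self.checked = True


class Tags:
    __slots__ = SLOTS


point_validator = SchemaValidator(
    core_schema.model_schema(
        cls=Point,
        schema=core_schema.model_fields_schema(
            fields={'x': core_schema.model_field(core_schema.int_schema())}
        ),
        post_init='after_validation',   # method name, called after initialization
        revalidate_instances='never',   # existing instances are returned as-is
    )
)
point = point_validator.validate_python({'x': '1'})
same_point = point_validator.validate_python(point)

# With root_model=True the inner schema validates the whole input,
# and the result is stored on the `root` attribute.
tags_validator = SchemaValidator(
    core_schema.model_schema(
        cls=Tags,
        schema=core_schema.list_schema(core_schema.str_schema()),
        root_model=True,
    )
)
tags = tags_validator.validate_python(['a', 'b'])
```

Here point.checked is set by the hook, same_point is the very object that was passed in, and tags.root holds the validated list.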

Model Fields (ModelFieldsSchema)

The ModelFieldsSchema defines the actual structure of the data. It maps field names to ModelField objects.

from pydantic_core import core_schema, SchemaValidator

class MyModel:
    # Declaring these slots lets pydantic-core store its bookkeeping
    # attributes outside the instance __dict__
    __slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__')

schema = core_schema.model_schema(
    cls=MyModel,
    schema=core_schema.model_fields_schema(
        fields={
            'name': core_schema.model_field(core_schema.str_schema()),
            'age': core_schema.model_field(core_schema.int_schema()),
        }
    ),
)
v = SchemaValidator(schema)
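
To show what such a validator produces, the following self-contained sketch builds a comparable schema, validates a dictionary, and inspects a failure. The class name Person is hypothetical, and the four-entry __slots__ tuple is an assumption rather than a requirement stated in this section.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema


class Person:
    # Assumed slot layout for pydantic-core's bookkeeping attributes.
    __slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')


validator = SchemaValidator(
    core_schema.model_schema(
        cls=Person,
        schema=core_schema.model_fields_schema(
            fields={
                'name': core_schema.model_field(core_schema.str_schema()),
                'age': core_schema.model_field(core_schema.int_schema()),
            }
        ),
    )
)

# The outer schema creates the instance; the inner schema validates the fields.
person = validator.validate_python({'name': 'Ada', 'age': '36'})

# A missing field is reported with its location in the input.
try:
    validator.validate_python({'name': 'Ada'})
except ValidationError as exc:
    errors = exc.errors()
```

In lax mode the string '36' is coerced to the integer 36, and person.__pydantic_fields_set__ records which fields were supplied.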

Individual Fields (ModelField)

Each ModelField manages metadata for a specific attribute, such as:

  • validation_alias: Allows the field to be populated from a different key in the input data (e.g., a JSON key that differs from the Python attribute name).
  • frozen: If true, the field cannot be modified after initialization; assignment validation rejects any attempt to change it.
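
A short sketch of both options, assuming a hypothetical Account class and an input key named userId. It relies on validate_assignment to show the frozen check, since a plain class has no attribute guard of its own.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema


class Account:
    # Assumed slot layout for pydantic-core's bookkeeping attributes.
    __slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')


validator = SchemaValidator(
    core_schema.model_schema(
        cls=Account,
        schema=core_schema.model_fields_schema(
            fields={
                # Populated from the 'userId' key and locked after creation.
                'user_id': core_schema.model_field(
                    core_schema.int_schema(), validation_alias='userId', frozen=True
                ),
                'email': core_schema.model_field(core_schema.str_schema()),
            }
        ),
    )
)
account = validator.validate_python({'userId': 7, 'email': 'a@example.com'})

# Non-frozen fields can be reassigned through the validator.
validator.validate_assignment(account, 'email', 'b@example.com')

# Assigning to a frozen field raises a ValidationError.
try:
    validator.validate_assignment(account, 'user_id', 8)
except ValidationError as exc:
    frozen_error_type = exc.errors()[0]['type']
```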

Python Dataclasses

Dataclass validation follows a similar pattern to models but uses DataclassSchema and DataclassArgsSchema. The primary difference is that dataclasses are often initialized via positional or keyword arguments rather than direct attribute assignment.

Dataclass Container (DataclassSchema)

The DataclassSchema wraps the dataclass and manages class-specific settings such as slots and post_init. For dataclasses, post_init is a boolean flag that controls whether __post_init__ is called after validation; it is off unless set explicitly.
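
As a minimal sketch of the post_init flag, the hypothetical Greeting dataclass below normalizes a field in __post_init__, which runs only because post_init=True is passed to the outer schema.

```python
import dataclasses

from pydantic_core import SchemaValidator, core_schema


@dataclasses.dataclass
class Greeting:
    text: str
    shout: bool

    def __post_init__(self):
        # Runs after validation because the schema sets post_init=True.
        if self.shout:
            self.text = self.text.upper()


schema = core_schema.dataclass_schema(
    Greeting,
    core_schema.dataclass_args_schema(
        'Greeting',
        [
            core_schema.dataclass_field(name='text', schema=core_schema.str_schema()),
            core_schema.dataclass_field(name='shout', schema=core_schema.bool_schema()),
        ],
    ),
    fields=['text', 'shout'],
    post_init=True,
)
greeting = SchemaValidator(schema).validate_python({'text': 'hello', 'shout': True})
```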

Dataclass Arguments (DataclassArgsSchema)

Unlike models, dataclasses use DataclassArgsSchema to define how input data is transformed into arguments for the dataclass __init__ method.

  • collect_init_only: If enabled, the validator collects fields marked as init_only separately from the regular fields so they can be passed to __post_init__, which is useful for dataclasses that use InitVar.
  • DataclassField: Includes dataclass-specific flags like kw_only and init.

import dataclasses
from pydantic_core import core_schema, SchemaValidator

@dataclasses.dataclass
class User:
    id: int
    username: str

schema = core_schema.dataclass_schema(
    User,
    core_schema.dataclass_args_schema(
        'User',
        [
            core_schema.dataclass_field(name='id', schema=core_schema.int_schema()),
            core_schema.dataclass_field(name='username', schema=core_schema.str_schema()),
        ],
    ),
    fields=['id', 'username'],
)
# Build a validator that turns input data into User instances
v = SchemaValidator(schema)
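
The init_only and collect_init_only options can be sketched with an InitVar. The Account dataclass and its password_length attribute are hypothetical; the example assumes the collected init-only value is handed to __post_init__ as its argument.

```python
import dataclasses

from pydantic_core import SchemaValidator, core_schema


@dataclasses.dataclass
class Account:
    username: str
    password: dataclasses.InitVar[str]

    def __post_init__(self, password):
        # Derive a stored value from the init-only argument.
        self.password_length = len(password)


schema = core_schema.dataclass_schema(
    Account,
    core_schema.dataclass_args_schema(
        'Account',
        [
            core_schema.dataclass_field(name='username', schema=core_schema.str_schema()),
            core_schema.dataclass_field(
                name='password', schema=core_schema.str_schema(), init_only=True
            ),
        ],
        collect_init_only=True,
    ),
    fields=['username'],  # only real fields, not the InitVar
    post_init=True,
)
account = SchemaValidator(schema).validate_python({'username': 'ada', 'password': 'secret'})
```

The password is validated like any other field but reaches the instance only through __post_init__.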

TypedDicts

TypedDictSchema is used for validating plain Python dictionaries against a fixed set of keys. Unlike models or dataclasses, it does not result in a class instance but returns a validated dictionary.

  • total: Determines if all keys defined in the fields dictionary are required by default.
  • TypedDictField: Includes a required flag to override the global total setting for specific keys.
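
from pydantic_core import core_schema, SchemaValidator

v = SchemaValidator(
    core_schema.typed_dict_schema(
        fields={
            'key': core_schema.typed_dict_field(schema=core_schema.str_schema()),
            'count': core_schema.typed_dict_field(
                schema=core_schema.with_default_schema(schema=core_schema.int_schema(), default=0)
            ),
        }
    )
)
# The missing 'count' key is filled in from its default
result = v.validate_python({'key': 'val'})
# Returns a dict: {'key': 'val', 'count': 0}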
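
The interaction between total and a per-field required flag can be sketched like this. The field names id and note are made up for the example.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

validator = SchemaValidator(
    core_schema.typed_dict_schema(
        fields={
            # Explicitly required, overriding total=False.
            'id': core_schema.typed_dict_field(core_schema.int_schema(), required=True),
            # Inherits the schema-level setting, so it is optional here.
            'note': core_schema.typed_dict_field(core_schema.str_schema()),
        },
        total=False,
    )
)

# Optional keys that are absent are simply left out of the result.
only_id = validator.validate_python({'id': 1})

# Omitting the required key fails validation.
try:
    validator.validate_python({'note': 'hello'})
except ValidationError as exc:
    missing = exc.errors()[0]
```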

Computed Fields

ComputedField (defined in pydantic_core/core_schema.py) represents properties or methods that are not part of the validation input but should be included during serialization.

Computed fields can be added to ModelFieldsSchema, DataclassArgsSchema, or TypedDictSchema. They require:

  • property_name: The name of the property/method on the object.
  • return_schema: The schema used to serialize the return value of the property.

During serialization, the engine calls the property named by property_name and uses the return_schema to format the output. This is commonly used for derived values that need to appear in the final JSON output.
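
A minimal sketch of a computed field on a model, assuming a hypothetical Rectangle class with an area property. The same schema is used for validation and for SchemaSerializer.

```python
from pydantic_core import SchemaSerializer, SchemaValidator, core_schema


class Rectangle:
    # Assumed slot layout for pydantic-core's bookkeeping attributes.
    __slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')

    @property
    def area(self):
        # Not part of the input; derived from validated fields.
        return self.width * self.height


schema = core_schema.model_schema(
    cls=Rectangle,
    schema=core_schema.model_fields_schema(
        fields={
            'width': core_schema.model_field(core_schema.int_schema()),
            'height': core_schema.model_field(core_schema.int_schema()),
        },
        computed_fields=[core_schema.computed_field('area', core_schema.int_schema())],
    ),
)

rect = SchemaValidator(schema).validate_python({'width': 2, 'height': 3})
serializer = SchemaSerializer(schema)
as_python = serializer.to_python(rect)
as_json = serializer.to_json(rect)
```

The area key appears only in the serialized output; passing it in the input has no effect on the property.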

Extra Fields and Attributes

Both ModelFieldsSchema and TypedDictSchema support extra_behavior, which defines how to handle keys in the input data that are not defined in the schema:

  • 'ignore': Extra fields are dropped.
  • 'allow': Extra fields are kept (and validated against extras_schema if provided).
  • 'forbid': Validation fails if extra fields are present.
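
The three behaviors can be compared side by side on a TypedDictSchema. The helper build_validator and the field names are hypothetical, and the 'allow' case assumes extras_schema is applied to each extra value.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema


def build_validator(extra_behavior, extras_schema=None):
    """Hypothetical helper: one known key 'a', configurable handling of extras."""
    return SchemaValidator(
        core_schema.typed_dict_schema(
            fields={'a': core_schema.typed_dict_field(core_schema.int_schema())},
            extra_behavior=extra_behavior,
            extras_schema=extras_schema,
        )
    )


data = {'a': 1, 'b': '2'}

ignored = build_validator('ignore').validate_python(data)

# Extras are kept and, with extras_schema, validated as integers.
allowed = build_validator('allow', core_schema.int_schema()).validate_python(data)

try:
    build_validator('forbid').validate_python(data)
except ValidationError as exc:
    forbidden = exc.errors()[0]
```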

For models, from_attributes (in ModelFieldsSchema) allows the validator to extract data from object attributes instead of just dictionary keys, enabling validation of ORM objects or other class instances.
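
A sketch of from_attributes, using types.SimpleNamespace as a stand-in for an ORM row. The Person class and the attribute names are hypothetical.

```python
from types import SimpleNamespace

from pydantic_core import SchemaValidator, core_schema


class Person:
    # Assumed slot layout for pydantic-core's bookkeeping attributes.
    __slots__ = ('__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__')


validator = SchemaValidator(
    core_schema.model_schema(
        cls=Person,
        schema=core_schema.model_fields_schema(
            fields={
                'name': core_schema.model_field(core_schema.str_schema()),
                'age': core_schema.model_field(core_schema.int_schema()),
            },
            from_attributes=True,
        ),
    )
)

# Stand-in for an ORM object: values live on attributes, not dict keys.
row = SimpleNamespace(name='Ada', age='36', internal_flag=True)
person = validator.validate_python(row)

# Dictionaries are still accepted alongside attribute-bearing objects.
from_dict = validator.validate_python({'name': 'Bob', 'age': 40})
```

Attributes that are not declared as fields, such as internal_flag here, are dropped under the default extra behavior.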