Core Schema and Validator Data Model

The data model of pydantic-core centers on translating a CoreSchema (a Python-side definition) into high-performance validators and serializers implemented in Rust.

Key Components:

  • CoreSchema: A recursive data structure (a union of TypedDicts in Python) that defines the validation and serialization logic for a specific type. Each schema carries a type discriminator and may include fields such as ref and metadata.
  • SchemaValidator: The primary engine for validation. It is built from a CoreSchema and an optional CoreConfig. Internally, it holds a CombinedValidator, an enum of specialized validator implementations (e.g., IntValidator, ModelValidator).
  • SchemaSerializer: The engine for converting Python objects into JSON (to_json) or plain Python objects (to_python). Like the validator, it is built from a CoreSchema and uses a CombinedSerializer enum.
  • Input: A Rust trait that abstracts over different input formats, primarily PythonInput (wrapping PyAny) and JsonInput (wrapping jiter::JsonValue).
  • ValidationError: An exception raised when validation fails. It encapsulates one or more PyLineError objects, each detailing the ErrorType, the Location (path) of the error, and the input_value that caused it.
  • SchemaError: An exception raised during the construction of a SchemaValidator or SchemaSerializer if the provided CoreSchema is invalid or inconsistent.
  • CoreConfig: A TypedDict of schema-wide settings such as strict, extra_fields_behavior, and string constraints (e.g., str_max_length).

The diagram illustrates how the CoreSchema acts as the source of truth for both validation and serialization, and how these processes interact with input data and error reporting.

Key Architectural Findings:

  • CoreSchema is a recursive TypedDict union that serves as the blueprint for both validation and serialization.
  • SchemaValidator and SchemaSerializer are the core Rust-implemented classes exposed to Python.
  • The Input trait allows the same validation logic to be applied to both Python objects and raw JSON data.
  • ValidationError is a structured container for PyLineError, which provides detailed context (type, location, input) for each failure.
  • SchemaError handles errors in the schema definition itself, separate from data validation errors.
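
The two error paths can be seen side by side in a short sketch, again assuming a recent pydantic-core 2.x release. The helper line_errors and the schema type "not-a-real-type" are hypothetical names introduced only for this example.

```python
from pydantic_core import (
    SchemaError,
    SchemaValidator,
    ValidationError,
    core_schema,
)

validator = SchemaValidator(core_schema.list_schema(core_schema.int_schema()))


def line_errors(validate, data):
    """Hypothetical helper: return (type, loc, input) for each line error."""
    try:
        validate(data)
    except ValidationError as exc:
        return [(e["type"], e["loc"], e["input"]) for e in exc.errors()]
    return []


# Data errors: the same validator reports the same failure for both inputs.
py_errors = line_errors(validator.validate_python, [1, "x", 3])
json_errors = line_errors(validator.validate_json, '[1, "x", 3]')

# Schema errors: raised while building the validator, before any data is seen.
try:
    SchemaValidator({"type": "not-a-real-type"})
    schema_error_raised = False
except SchemaError:
    schema_error_raised = True
```

Each entry returned by errors() is a dictionary, so the error type, location tuple, and offending input can be inspected without parsing the exception message.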