Architecture Overview
This section contains architecture diagrams and documentation for pydantic.
Available Diagrams
Pydantic Core Internal Architecture
The pydantic-core library is structured as a high-performance Rust core with a thin Python wrapper.
The architecture is centered around two main entry points: SchemaValidator and SchemaSerializer. These are exposed to Python via the _pydantic_core extension module.
Key components include:
- Python Layer: The pydantic_core package provides the public API and re-exports the Rust-implemented classes. The core_schema module defines the structure of schemas that the core can process, primarily using Python TypedDicts.
- Validation Engine: Implemented in Rust, it builds a tree of specialized validators from the input schema. It uses an Input Abstraction to uniformly process different data formats like Python objects and JSON (via the jiter crate).
- Serialization Engine: Also in Rust, it builds a tree of serializers. It handles converting complex Python objects back into JSON or plain Python types.
- Shared Infrastructure: Both engines rely on a Definitions system to handle recursive models and shared references, and a robust Error Handling system to produce detailed ValidationErrors.
The diagram illustrates how data flows from the Python user through the validator/serializer entry points into the specialized Rust logic, and how the Rust core abstracts away the differences between input formats.
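As a rough illustration of the two entry points, the sketch below builds a validator and a serializer from one schema, then feeds the validator a Python object and a JSON string. The field names (id, tags) are invented for the example, and it assumes pydantic-core 2.x is installed.

```python
from pydantic_core import SchemaSerializer, SchemaValidator, core_schema

# A CoreSchema describing a dict with two typed fields.
# The field names "id" and "tags" are made up for this example.
schema = core_schema.typed_dict_schema(
    {
        "id": core_schema.typed_dict_field(core_schema.int_schema()),
        "tags": core_schema.typed_dict_field(
            core_schema.list_schema(core_schema.str_schema())
        ),
    }
)

# Both entry points are built from the same schema.
validator = SchemaValidator(schema)
serializer = SchemaSerializer(schema)

# The same validator tree handles Python objects and raw JSON.
# In the default (lax) mode the string "1" is coerced to the int 1.
from_python = validator.validate_python({"id": "1", "tags": ["a", "b"]})
from_json = validator.validate_json('{"id": 1, "tags": ["a", "b"]}')

# Serialization goes the other way: validated data back to JSON bytes.
json_bytes = serializer.to_json(from_python)
```

Both calls go through the same validator, so from_python and from_json should compare equal here even though one input was a Python dict and the other a JSON document.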
Key Architectural Findings:
- pydantic-core uses a Rust extension (_pydantic_core) to provide high-performance validation and serialization.
- The SchemaValidator and SchemaSerializer are the primary interfaces between Python and Rust.
- An 'Input' trait in Rust abstracts over Python objects and JSON data, allowing validators to be format-agnostic.
- The 'core_schema' Python module defines the contract for schemas using TypedDicts, which are then parsed by Rust to build validator/serializer trees.
- A shared 'Definitions' system manages recursion and cross-references within schemas.
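The Definitions system can be seen from the Python side through the definitions_schema and definition_reference_schema helpers in core_schema. The following is a minimal sketch of a recursive schema (a list of lists, to any depth). The reference name "nested-list" is arbitrary and chosen only for this example.

```python
from pydantic_core import SchemaValidator, core_schema

# A reference to a definition named "nested-list" (the name is arbitrary).
nested_ref = core_schema.definition_reference_schema("nested-list")

schema = core_schema.definitions_schema(
    schema=nested_ref,
    definitions=[
        # A list whose items are again "nested-list": a recursive schema.
        core_schema.list_schema(items_schema=nested_ref, ref="nested-list"),
    ],
)

validator = SchemaValidator(schema)

# Arbitrarily nested lists validate against the recursive definition.
nested = validator.validate_python([[], [[]]])
```

Without the named reference, the schema would have to contain itself, which a plain nested dict cannot express. The ref/definition pair is what lets the core build a finite validator tree for it.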
Core Schema and Validator Data Model
The data model of pydantic-core revolves around the translation of a CoreSchema (a Python-side definition built with pydantic_core.core_schema) into high-performance Rust-based validators and serializers.
Key Components:
- CoreSchema: A recursive, union-based data structure (implemented as TypedDict in Python) that defines the validation and serialization logic for a specific type. It includes fields like type, ref, and metadata.
- SchemaValidator: The primary engine for validation. It is initialized from a CoreSchema and a CoreConfig. Internally, it holds a CombinedValidator, which is an enum of specialized validator implementations (e.g., IntValidator, ModelValidator).
- SchemaSerializer: The engine for converting Python objects into JSON or other Python formats. Like the validator, it is built from a CoreSchema and uses a CombinedSerializer enum.
- Input: A Rust trait that abstracts over different input formats, primarily PythonInput (wrapping PyAny) and JsonInput (wrapping jiter::JsonValue).
- ValidationError: An exception raised when validation fails. It encapsulates one or more PyLineError objects, each detailing the ErrorType, the Location (path) of the error, and the input_value that caused it.
- SchemaError: An exception raised during the construction of a SchemaValidator or SchemaSerializer if the provided CoreSchema is invalid or inconsistent.
- CoreConfig: A configuration object that provides global settings such as strict mode, extra_fields_behavior, and string constraints.
The diagram illustrates how the CoreSchema acts as the source of truth for both validation and serialization, and how these processes interact with input data and error reporting.
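To make the "schema in, validator tree out" idea concrete, here is a toy pure-Python analogy. It is not pydantic-core's Rust code: the names ToyIntValidator, ToyListValidator, ToyValidator, and build_validator are hypothetical, and only the dict shape (a type key, plus items_schema for lists) follows the CoreSchema convention.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class ToyIntValidator:
    """Leaf validator: accepts ints but rejects bools."""

    def validate(self, value: Any) -> int:
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"expected int, got {value!r}")
        return value


@dataclass
class ToyListValidator:
    """Branch validator: holds a child validator and applies it to each item."""

    items: "ToyValidator"

    def validate(self, value: Any) -> list:
        if not isinstance(value, list):
            raise ValueError(f"expected list, got {value!r}")
        return [self.items.validate(item) for item in value]


# Hypothetical stand-in for an enum of validators: a closed set of variants.
ToyValidator = Union[ToyIntValidator, ToyListValidator]


def build_validator(schema: dict) -> ToyValidator:
    """Recursively turn a schema dict into a validator tree, keyed on 'type'."""
    kind = schema["type"]
    if kind == "int":
        return ToyIntValidator()
    if kind == "list":
        return ToyListValidator(items=build_validator(schema["items_schema"]))
    raise ValueError(f"unknown schema type: {kind!r}")


# The schema is walked once at build time; validation then reuses the tree.
tree = build_validator({"type": "list", "items_schema": {"type": "int"}})
result = tree.validate([1, 2, 3])
```

The point of the design is that dispatch on the schema's type happens once, when the tree is built, so each later validation call only runs the already-selected validators.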
Key Architectural Findings:
- CoreSchema is a recursive TypedDict union that serves as the blueprint for both validation and serialization.
- SchemaValidator and SchemaSerializer are the core Rust-implemented classes exposed to Python.
- The Input trait allows the same validation logic to be applied to both Python objects and raw JSON data.
- ValidationError is a structured container for PyLineError, which provides detailed context (type, location, input) for each failure.
- SchemaError handles errors in the schema definition itself, separate from data validation errors.
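The difference between the two error types can be sketched from Python as follows, assuming pydantic-core 2.x. The invalid schema type "no-such-type" is deliberately made up, and the example does not rely on the exact wording of either error message.

```python
from pydantic_core import SchemaError, SchemaValidator, ValidationError, core_schema

validator = SchemaValidator(core_schema.list_schema(core_schema.int_schema()))

# Data problem: the schema is fine, but the second item is not an integer.
line_errors = []
try:
    validator.validate_python([1, "not a number", 3])
except ValidationError as exc:
    # Each entry is a dict with (at least) "type", "loc", "msg" and "input".
    line_errors = exc.errors()

# Schema problem: the schema itself is rejected before any data is seen.
# "no-such-type" is an intentionally invalid, made-up schema type.
schema_failed = False
try:
    SchemaValidator({"type": "no-such-type"})
except SchemaError:
    schema_failed = True
```

Here line_errors holds one entry whose loc is (1,), the index of the offending list item, while the SchemaError is raised at construction time and never produces per-item details.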