Overview

Elevator Pitch

Pydantic is Python's most popular data validation and serialization library. This repository houses pydantic-core, the high-performance engine written in Rust that powers Pydantic V2, offering validation speeds up to 17x faster than previous versions.

The Problem

Python is dynamically typed, which makes it easy to write but difficult to ensure data integrity when receiving input from external sources (like APIs or configuration files). Traditional validation is often slow, verbose, and disconnected from Python's type hints. Pydantic solves this by providing a unified way to define data structures that are automatically validated and serialized with minimal overhead.

Core Concepts

  • CoreSchema: A dictionary-based definition of how data should be validated. It describes types, constraints (like ge=18), and structure (like list or dict).
  • SchemaValidator: The heavy-lifter. It takes a CoreSchema and compiles it into a highly optimized validator (implemented in Rust) that can process Python objects or raw JSON.
  • SchemaSerializer: The counterpart to the validator. It takes a CoreSchema and provides methods to convert Python objects back into JSON or plain Python dictionaries.
  • Strict vs. Lax Mode: Pydantic can either strictly enforce types (e.g., "no strings allowed for integers") or attempt to "coerce" data into the correct type (e.g., converting the string "123" to the integer 123).
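The strict/lax distinction above can be toggled per schema via a strict flag on the schema functions. A minimal sketch of both behaviors:

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# Lax (default): the string '123' is coerced to the integer 123
lax = SchemaValidator(core_schema.int_schema())
print(lax.validate_python('123')) # 123

# Strict: the same input is rejected with a ValidationError
strict = SchemaValidator(core_schema.int_schema(strict=True))
try:
    strict.validate_python('123')
except ValidationError as exc:
    print(exc.errors()[0]['type'])
```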

How it Works

  1. Schema Definition: You define a CoreSchema using helper functions from pydantic_core.core_schema.
  2. Compilation: You pass this schema to SchemaValidator. Behind the scenes, the Rust engine builds a tree of validators.
  3. Validation: You call validate_python() or validate_json(). The engine traverses the input data, applying the validation logic at each node.
  4. Result: If successful, you get back a "cleaned" Python object. If it fails, a ValidationError is raised with a detailed list of what went wrong and where.

Use Cases

Validating a Simple Dictionary

Use pydantic-core to validate structured data with high performance.

from pydantic_core import SchemaValidator, core_schema

# Define a schema for a user
schema = core_schema.typed_dict_schema({
    'name': core_schema.typed_dict_field(core_schema.str_schema()),
    'age': core_schema.typed_dict_field(core_schema.int_schema(ge=18)),
})

v = SchemaValidator(schema)

# Valid input
user = v.validate_python({'name': 'Alice', 'age': 30})
print(user) # {'name': 'Alice', 'age': 30}
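Invalid input against the same schema raises a ValidationError whose .errors() list pinpoints the failing field. A short sketch of the failure path:

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

schema = core_schema.typed_dict_schema({
    'name': core_schema.typed_dict_field(core_schema.str_schema()),
    'age': core_schema.typed_dict_field(core_schema.int_schema(ge=18)),
})
v = SchemaValidator(schema)

try:
    v.validate_python({'name': 'Bob', 'age': 10})
except ValidationError as exc:
    for err in exc.errors():
        print(err['loc'], err['msg']) # ('age',) plus a human-readable message
```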

Direct JSON Validation

Skip json.loads() and validate raw JSON bytes directly for maximum speed.

from pydantic_core import SchemaValidator, core_schema

v = SchemaValidator(core_schema.list_schema(core_schema.int_schema()))

# Validates JSON bytes directly in Rust
result = v.validate_json(b'[1, 2, "3"]')
print(result) # [1, 2, 3] (coerced to int)
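Malformed JSON surfaces the same way: there is no separate JSON decode exception to handle, because parse failures are also reported as a ValidationError. A small sketch:

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

v = SchemaValidator(core_schema.list_schema(core_schema.int_schema()))

# The parse error is caught in Rust before any Python objects are built
try:
    v.validate_json(b'[1, 2, oops]')
except ValidationError as exc:
    print(exc.errors()[0]['type'])
```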

Serialization

Convert complex objects back to JSON-compatible formats.

from datetime import datetime

from pydantic_core import SchemaSerializer, core_schema

schema = core_schema.datetime_schema()
s = SchemaSerializer(schema)

now = datetime.now()

# Serialize to a JSON-compatible ISO 8601 string
print(s.to_python(now, mode='json')) # e.g. 2023-10-27T10:00:00
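For wire formats, SchemaSerializer can also emit UTF-8 JSON bytes directly via to_json(), skipping the intermediate Python string. A sketch using a fixed timestamp so the output is deterministic:

```python
from datetime import datetime

from pydantic_core import SchemaSerializer, core_schema

s = SchemaSerializer(core_schema.datetime_schema())

# to_json serializes in Rust and returns JSON-encoded bytes
data = s.to_json(datetime(2023, 10, 27, 10, 0, 0))
print(data) # b'"2023-10-27T10:00:00"'
```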

When to Use

  • Use Pydantic (high-level): For 99% of applications. It provides the BaseModel API, integrates with IDEs, and uses pydantic-core under the hood.
  • Use pydantic-core directly: If you are building a framework, a high-performance data pipeline, or another library where every microsecond counts and you don't need the BaseModel abstraction.

Integration / Stack Compatibility

  • Python: Requires Python 3.10+.
  • Rust: Built with PyO3; binaries are distributed as wheels for most platforms.
  • Ecosystem: Powers FastAPI, Logfire, and thousands of other packages.

Getting Started Pointers

  • [[LINK: core_schema.py]]: Explore all available schema types (int, str, list, union, etc.).
  • [[LINK: _pydantic_core.pyi]]: See the full Python API for SchemaValidator and SchemaSerializer.

Limitations / Assumptions

  • Schema Immutability: Once a SchemaValidator is created, the schema cannot be changed. You must create a new validator for a new schema.
  • Rust Dependency: While users just pip install it, developers contributing to the core logic will need a Rust toolchain.
  • Low-level API: pydantic-core uses raw dictionaries for schemas, which is less ergonomic than the high-level Pydantic classes.

FAQ

  • Is pydantic-core faster than standard json? Yes, validate_json is often faster than json.loads because it validates while parsing, avoiding the creation of intermediate Python objects for invalid data.
  • Can I use it without the main Pydantic package? Absolutely. It is a standalone package.
  • What happens on validation failure? It raises a pydantic_core.ValidationError which contains a .errors() method returning a list of detailed error dictionaries.
  • Does it support custom validators? Yes, via core_schema.with_info_plain_validator_function and similar "function" schema types.
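The custom-validator hook mentioned in the last answer can be sketched as follows (the normalize function and its behavior are illustrative, not part of the library):

```python
from pydantic_core import SchemaValidator, core_schema

# Hypothetical custom logic: strip whitespace and lowercase a username
def normalize(value, info):
    return str(value).strip().lower()

v = SchemaValidator(core_schema.with_info_plain_validator_function(normalize))
print(v.validate_python('  Alice ')) # alice
```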