Specialized and Recursive Types
Support for specialized formats like URLs and UUIDs, as well as recursive schema definitions, is provided through specific schema types in pydantic_core.core_schema. These types ensure that input data is not only validated but also parsed into useful Python objects.
Validating and Parsing URLs
To validate strings as URLs and parse them into specialized Url objects, use url_schema or multi_host_url_schema.
from pydantic_core import SchemaValidator, core_schema, Url
# Define a schema for a standard URL
v = SchemaValidator(core_schema.url_schema(
    allowed_schemes=['http', 'https'],
    max_length=100,
))
# Validate and parse
url = v.validate_python('https://example.com/foo/bar?baz=qux#quux')
assert isinstance(url, Url)
assert url.scheme == 'https'
assert url.host == 'example.com'
assert url.port == 443
assert url.path == '/foo/bar'
assert url.query == 'baz=qux'
assert url.fragment == 'quux'
assert url.query_params() == [('baz', 'qux')]
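When an input violates one of these constraints, validate_python raises a ValidationError instead of returning a Url. The following is a minimal sketch of how the failures might be inspected; the helper url_error_types and the sample URLs are illustrative assumptions, not part of pydantic_core.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

url_validator = SchemaValidator(
    core_schema.url_schema(allowed_schemes=['http', 'https'], max_length=100)
)


def url_error_types(value):
    """Return the error types reported for `value`, or [] if it validates (hypothetical helper)."""
    try:
        url_validator.validate_python(value)
    except ValidationError as exc:
        return [err['type'] for err in exc.errors()]
    return []


ok = url_error_types('https://example.com/')
# Scheme is not listed in allowed_schemes
bad_scheme = url_error_types('ftp://example.com/')
# Input is longer than max_length
too_long = url_error_types('https://example.com/' + 'a' * 100)
# Input cannot be parsed as a URL at all
not_a_url = url_error_types('not a url')
```

With current pydantic_core releases these three failures are expected to be reported as url_scheme, url_too_long, and url_parsing respectively, which lets calling code branch on the error type rather than on the message text.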
Multi-Host URLs
For connection strings that support multiple hosts (e.g., Redis or MongoDB), use multi_host_url_schema. The resulting MultiHostUrl object provides a hosts() method to access all defined hosts.
from pydantic_core import SchemaValidator, core_schema, MultiHostUrl
v = SchemaValidator(core_schema.multi_host_url_schema())
url = v.validate_python('redis://host1:6379,host2:6380/0')
assert isinstance(url, MultiHostUrl)
assert url.scheme == 'redis'
assert url.hosts() == [
    {'host': 'host1', 'port': 6379, 'username': None, 'password': None},
    {'host': 'host2', 'port': 6380, 'username': None, 'password': None},
]
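Credentials are parsed per host as well. The sketch below assumes a made-up PostgreSQL-style connection string in which only the first host carries a username and password; the host names, ports, and database name are placeholders.

```python
from pydantic_core import SchemaValidator, core_schema

multi_validator = SchemaValidator(core_schema.multi_host_url_schema())

# Only the first host in this illustrative DSN has credentials
dsn = multi_validator.validate_python('postgres://user:pass@host1:5432,host2:5433/app')

hosts = dsn.hosts()
first, second = hosts
host_ports = [(h['host'], h['port']) for h in hosts]
```

Each entry returned by hosts() is a plain dict, so the usual dict access is enough to build a client configuration from it.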
Validating UUIDs
The uuid_schema validates input as a UUID and returns a standard Python uuid.UUID object. You can enforce specific UUID versions or use strict mode to require an actual UUID instance.
import uuid
from pydantic_core import SchemaValidator, core_schema
# Require a version 4 UUID
v = SchemaValidator(core_schema.uuid_schema(version=4))
# Validates strings, bytes, or UUID objects
res = v.validate_python('0e7ac198-9acd-4c0c-b4b4-761974bf71d7')
assert isinstance(res, uuid.UUID)
assert res.version == 4
# Strict mode requires an actual UUID instance
v_strict = SchemaValidator(core_schema.uuid_schema(strict=True))
# This would raise a ValidationError:
# v_strict.validate_python('0e7ac198-9acd-4c0c-b4b4-761974bf71d7')
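To see how the version and strict constraints reject input, the sketch below feeds each validator a value it should refuse. The helper uuid_error_type is a hypothetical convenience, and the nil UUID is used as an example of a value for which Python's uuid module reports version=None.

```python
import uuid

from pydantic_core import SchemaValidator, ValidationError, core_schema

v4_only = SchemaValidator(core_schema.uuid_schema(version=4))
strict_uuid = SchemaValidator(core_schema.uuid_schema(strict=True))

NIL_UUID = '00000000-0000-0000-0000-000000000000'


def uuid_error_type(validator, value):
    """Return the first error type for `value`, or None if it validates (hypothetical helper)."""
    try:
        validator.validate_python(value)
    except ValidationError as exc:
        return exc.errors()[0]['type']
    return None


# A version 1 UUID does not satisfy version=4
v1_result = uuid_error_type(v4_only, uuid.uuid1())
# The nil UUID has no RFC 4122 version, so it also fails the version check
nil_result = uuid_error_type(v4_only, NIL_UUID)
# Strict mode refuses a string even when it is a well-formed UUID
strict_str_result = uuid_error_type(strict_uuid, str(uuid.uuid4()))
# Strict mode accepts a real uuid.UUID instance
strict_obj_result = uuid_error_type(strict_uuid, uuid.uuid4())
```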
Defining Recursive Schemas
Recursive structures (like trees or linked lists) are defined using definitions_schema and definition_reference_schema. This approach prevents infinite recursion during schema construction by using named references.
from pydantic_core import SchemaValidator, core_schema
# Define a recursive "Branch" structure
# Each branch has a name and an optional sub-branch of the same type
schema = core_schema.definitions_schema(
    # The main schema is a reference to the definition
    core_schema.definition_reference_schema('Branch'),
    [
        # The actual definition of 'Branch'
        core_schema.typed_dict_schema(
            {
                'name': core_schema.typed_dict_field(core_schema.str_schema()),
                'sub_branch': core_schema.typed_dict_field(
                    core_schema.with_default_schema(
                        core_schema.nullable_schema(
                            core_schema.definition_reference_schema('Branch')
                        ),
                        default=None,
                    )
                ),
            },
            ref='Branch',
        )
    ],
)
v = SchemaValidator(schema)
# Validating a nested structure
data = {
    'name': 'root',
    'sub_branch': {
        'name': 'child',
        'sub_branch': {'name': 'grandchild'},
    },
}
result = v.validate_python(data)
assert result['sub_branch']['sub_branch']['name'] == 'grandchild'
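Named references only work when the referenced definition is actually supplied. As a rough sketch, the example below builds a schema that points at a ref nobody defines, then repairs it; the ref name 'Leaf' and the variable names are hypothetical.

```python
from pydantic_core import SchemaError, SchemaValidator, core_schema

# 'Leaf' is a made-up ref name: nothing in this schema declares ref='Leaf'
dangling = core_schema.list_schema(core_schema.definition_reference_schema('Leaf'))

try:
    SchemaValidator(dangling)
    build_error = None
except SchemaError as exc:
    # The problem surfaces when the validator is built, before any data is validated
    build_error = exc

# Supplying a definition with the matching ref makes the same schema usable
fixed = core_schema.definitions_schema(
    dangling,
    [core_schema.str_schema(ref='Leaf')],
)
leaf_validator = SchemaValidator(fixed)
leaves = leaf_validator.validate_python(['a', 'b'])
```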
Handling Cyclic Data
While pydantic-core supports recursive schemas, it detects and prevents infinite loops in the input data itself. If a data structure references itself (a cycle), validation will fail with a recursion_loop error.
# Create cyclic data
cyclic_data = {'name': 'loop'}
cyclic_data['sub_branch'] = cyclic_data
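from pydantic_core import ValidationError

try:
    v.validate_python(cyclic_data)
except ValidationError as e:
    # The reported error has type 'recursion_loop'
    assert 'recursion_loop' in str(e)
else:
    raise AssertionError('expected a ValidationError for cyclic input')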
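Matching on the string form of the exception works, but the structured output of errors() is usually easier to test against. The sketch below shows the same cycle detection on a smaller, self-contained schema: a list whose items are lists of the same kind. The ref name 'Nested' and the variable names are assumptions made for this example.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# A list whose items are themselves lists of the same kind
nested_schema = core_schema.definitions_schema(
    core_schema.definition_reference_schema('Nested'),
    [
        core_schema.list_schema(
            core_schema.definition_reference_schema('Nested'),
            ref='Nested',
        )
    ],
)
nested_validator = SchemaValidator(nested_schema)

# Finite nesting validates normally
finite = nested_validator.validate_python([[], [[]]])

# A list that contains itself forms a cycle
cyclic_list = []
cyclic_list.append(cyclic_list)

try:
    nested_validator.validate_python(cyclic_list)
    error_types = []
except ValidationError as exc:
    # Each entry of errors() is a dict with keys such as 'type' and 'loc'
    error_types = [err['type'] for err in exc.errors()]
```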
Troubleshooting and Gotchas
- URL Path Preservation: By default, url_schema may normalize empty paths. Use preserve_empty_path=True in the schema or url_preserve_empty_path=True in CoreConfig if you need to distinguish between http://example.com and http://example.com/.
- UUID Versioning: uuid_schema(version=X) only works for RFC 4122 UUIDs. If you are using non-standard UUIDs to which Python's uuid module assigns version=None, validation will fail if a version is specified.
- Missing Definitions: Every definition_reference_schema must have a corresponding schema with a matching ref inside the definitions_schema list. Failure to provide the definition will result in a SchemaError during SchemaValidator initialization.
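The default path normalization described in the first item can be observed with a short check. This sketch only exercises the default behavior; it does not pass preserve_empty_path, since whether that option is available depends on the installed pydantic_core version.

```python
from pydantic_core import SchemaValidator, core_schema

default_validator = SchemaValidator(core_schema.url_schema())

bare = default_validator.validate_python('http://example.com')
slashed = default_validator.validate_python('http://example.com/')

# With default settings both inputs end up with the same string form and path
bare_str, slashed_str = str(bare), str(slashed)
bare_path = bare.path
```

Because the two inputs become indistinguishable after validation, any logic that depends on the trailing slash has to opt in to path preservation as described above.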