Specialized and Recursive Types

Support for specialized formats like URLs and UUIDs, as well as recursive schema definitions, is provided through specific schema types in pydantic_core.core_schema. These types ensure that input data is not only validated but also parsed into useful Python objects.

Validating and Parsing URLs

To validate strings as URLs and parse them into specialized Url objects, use url_schema or multi_host_url_schema.

from pydantic_core import SchemaValidator, core_schema, Url

# Define a schema for a standard URL
v = SchemaValidator(core_schema.url_schema(
    allowed_schemes=['http', 'https'],
    max_length=100,
))

# Validate and parse
url = v.validate_python('https://example.com/foo/bar?baz=qux#quux')

assert isinstance(url, Url)
assert url.scheme == 'https'
assert url.host == 'example.com'
assert url.port == 443
assert url.path == '/foo/bar'
assert url.query == 'baz=qux'
assert url.fragment == 'quux'
assert url.query_params() == [('baz', 'qux')]
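
The constraints passed to url_schema are checked at validation time. The following is a minimal sketch of what a rejected input looks like. The helper name error_type is hypothetical, and the error type strings url_scheme and url_too_long are the ones pydantic-core is assumed to report for these two constraints; check them against your installed version.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

v = SchemaValidator(core_schema.url_schema(
    allowed_schemes=['http', 'https'],
    max_length=100,
))


def error_type(value):
    """Return the first error type for a rejected value, or None if it validates.

    Hypothetical helper, used only for this illustration.
    """
    try:
        v.validate_python(value)
    except ValidationError as exc:
        return exc.errors()[0]['type']
    return None


# Scheme not in allowed_schemes
scheme_error = error_type('ftp://example.com/file.txt')
# Input longer than max_length
length_error = error_type('https://example.com/' + 'a' * 200)
# A valid URL produces no error
ok = error_type('https://example.com/')

print(scheme_error, length_error, ok)
```

Reading the type from errors() rather than matching on the message text keeps the check stable if the wording of the message changes.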

Multi-Host URLs

For connection strings that support multiple hosts (e.g., Redis or MongoDB), use multi_host_url_schema. The resulting MultiHostUrl object provides a hosts() method to access all defined hosts.

from pydantic_core import SchemaValidator, core_schema, MultiHostUrl

v = SchemaValidator(core_schema.multi_host_url_schema())
url = v.validate_python('redis://host1:6379,host2:6380/0')

assert isinstance(url, MultiHostUrl)
assert url.scheme == 'redis'
assert url.hosts() == [
    {'host': 'host1', 'port': 6379, 'username': None, 'password': None},
    {'host': 'host2', 'port': 6380, 'username': None, 'password': None},
]
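
Credentials are reported per host. As a hedged illustration (the connection string, user name, and password below are made up), the sketch shows that a username and password attached to the first host appear only in that host's entry:

```python
from pydantic_core import MultiHostUrl, SchemaValidator, core_schema

v = SchemaValidator(core_schema.multi_host_url_schema())

# Hypothetical connection string: only the first host carries credentials
url = v.validate_python('redis://user:secret@host1:6379,host2:6380/0')
assert isinstance(url, MultiHostUrl)

first, second = url.hosts()

for host in (first, second):
    print(host['host'], host['port'], host['username'])
```

Each entry is a plain dict, so it can be passed directly to whatever client code expects host and port pairs.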

Validating UUIDs

The uuid_schema validates input as a UUID and returns a standard Python uuid.UUID object. You can enforce specific UUID versions or use strict mode to require an actual UUID instance.

import uuid
from pydantic_core import SchemaValidator, core_schema

# Require a version 4 UUID
v = SchemaValidator(core_schema.uuid_schema(version=4))

# Validates strings, bytes, or UUID objects
res = v.validate_python('0e7ac198-9acd-4c0c-b4b4-761974bf71d7')
assert isinstance(res, uuid.UUID)
assert res.version == 4

# Strict mode requires an actual UUID instance
v_strict = SchemaValidator(core_schema.uuid_schema(strict=True))
# This would raise a ValidationError:
# v_strict.validate_python('0e7ac198-9acd-4c0c-b4b4-761974bf71d7')
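
To make the two failure modes concrete, here is a small sketch under stated assumptions: the version 1 UUID string is an arbitrary sample value, and uuid_version is the error type pydantic-core is assumed to report for a version mismatch.

```python
import uuid

from pydantic_core import SchemaValidator, ValidationError, core_schema

v4 = SchemaValidator(core_schema.uuid_schema(version=4))
v_strict = SchemaValidator(core_schema.uuid_schema(strict=True))

# A version 1 UUID (the third group starts with "1")
v1_string = 'a6cc5730-2261-11ee-9c43-2eb5a363657c'
assert uuid.UUID(v1_string).version == 1

# Version mismatch: a v1 UUID is rejected by a version=4 schema
try:
    v4.validate_python(v1_string)
    version_error = None
except ValidationError as exc:
    version_error = exc.errors()[0]['type']

# Strict mode: a string is rejected even though it is a well-formed UUID
try:
    v_strict.validate_python(str(uuid.uuid4()))
    strict_rejected = False
except ValidationError:
    strict_rejected = True

# Strict mode accepts an actual uuid.UUID instance
strict_ok = v_strict.validate_python(uuid.uuid4())

print(version_error, strict_rejected, type(strict_ok).__name__)
```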

Defining Recursive Schemas

Recursive structures (like trees or linked lists) are defined using definitions_schema and definition_reference_schema. This approach prevents infinite recursion during schema construction by using named references.

from pydantic_core import SchemaValidator, core_schema

# Define a recursive "Branch" structure
# Each branch has a name and an optional sub-branch of the same type
schema = core_schema.definitions_schema(
    # The main schema is a reference to the definition
    core_schema.definition_reference_schema('Branch'),
    [
        # The actual definition of 'Branch'
        core_schema.typed_dict_schema(
            {
                'name': core_schema.typed_dict_field(core_schema.str_schema()),
                'sub_branch': core_schema.typed_dict_field(
                    core_schema.with_default_schema(
                        core_schema.nullable_schema(
                            core_schema.definition_reference_schema('Branch')
                        ),
                        default=None,
                    )
                ),
            },
            ref='Branch',
        )
    ],
)

v = SchemaValidator(schema)

# Validating a nested structure
data = {
    'name': 'root',
    'sub_branch': {
        'name': 'child',
        'sub_branch': {'name': 'grandchild'},
    },
}
result = v.validate_python(data)
assert result['sub_branch']['sub_branch']['name'] == 'grandchild'
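
The same pattern extends from a linked chain to a tree. The sketch below is an illustration rather than part of the original example: the Node name and its value and children fields are hypothetical, and children defaults to an empty list through default_factory.

```python
from pydantic_core import SchemaValidator, core_schema

# Hypothetical tree node: an integer value plus a list of child nodes
tree_schema = core_schema.definitions_schema(
    core_schema.definition_reference_schema('Node'),
    [
        core_schema.typed_dict_schema(
            {
                'value': core_schema.typed_dict_field(core_schema.int_schema()),
                'children': core_schema.typed_dict_field(
                    core_schema.with_default_schema(
                        # Each child is again a 'Node'
                        core_schema.list_schema(
                            core_schema.definition_reference_schema('Node')
                        ),
                        default_factory=list,
                    )
                ),
            },
            ref='Node',
        )
    ],
)

tree_validator = SchemaValidator(tree_schema)

tree = tree_validator.validate_python({
    'value': 1,
    'children': [
        {'value': 2},
        {'value': 3, 'children': [{'value': 4}]},
    ],
})

# Leaves that omitted 'children' receive the default empty list
print(tree['children'][0])
```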

Handling Cyclic Data

A recursive schema does not make validation vulnerable to cycles in the input data. pydantic-core tracks the objects it is currently validating, and if a data structure references itself (a cycle), validation fails with a recursion_loop error instead of recursing forever.

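from pydantic_core import ValidationError

# Create cyclic data: the dict is its own sub_branch
cyclic_data = {'name': 'loop'}
cyclic_data['sub_branch'] = cyclic_data

try:
    v.validate_python(cyclic_data)
except ValidationError as e:
    # The error type identifies the cycle
    assert e.errors()[0]['type'] == 'recursion_loop'
else:
    raise AssertionError('expected a recursion_loop error')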
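
A cycle does not have to be a direct self-reference. The following self-contained sketch rebuilds the Branch schema and feeds it an indirect cycle through two dicts. The names first and second are made up for the illustration, and the exact loc reported for the error is printed rather than assumed.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# Same recursive 'Branch' shape as above, rebuilt so this block runs on its own
branch = core_schema.typed_dict_schema(
    {
        'name': core_schema.typed_dict_field(core_schema.str_schema()),
        'sub_branch': core_schema.typed_dict_field(
            core_schema.with_default_schema(
                core_schema.nullable_schema(
                    core_schema.definition_reference_schema('Branch')
                ),
                default=None,
            )
        ),
    },
    ref='Branch',
)
v = SchemaValidator(
    core_schema.definitions_schema(
        core_schema.definition_reference_schema('Branch'), [branch]
    )
)

# Build an indirect cycle: first -> second -> first
first = {'name': 'first'}
second = {'name': 'second', 'sub_branch': first}
first['sub_branch'] = second

try:
    v.validate_python(first)
    errors = []
except ValidationError as exc:
    errors = exc.errors()

error_types = [err['type'] for err in errors]

# Show where in the input the cycle was detected
for err in errors:
    print(err['type'], err['loc'])
```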

Troubleshooting and Gotchas

  • URL Path Preservation: By default, url_schema normalizes an empty path, so http://example.com is returned as http://example.com/. Use preserve_empty_path=True in the schema or url_preserve_empty_path=True in CoreConfig if you need to distinguish between the two forms.
  • UUID Versioning: uuid_schema(version=X) only works for RFC 4122 UUIDs. Non-standard UUIDs, for which Python's uuid module reports version as None, always fail validation when a version is specified.
  • Missing Definitions: Every definition_reference_schema must have a corresponding schema with a matching ref inside the definitions_schema list. Failure to provide the definition will result in a SchemaError during SchemaValidator initialization.
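
Two of the gotchas above can be reproduced in a few lines. This is a sketch under stated assumptions: the reference name Missing is made up, and the nil UUID stands in for any UUID whose version attribute is None.

```python
import uuid

from pydantic_core import SchemaError, SchemaValidator, ValidationError, core_schema

# Missing definition: the reference 'Missing' has no schema with ref='Missing'
try:
    SchemaValidator(
        core_schema.list_schema(core_schema.definition_reference_schema('Missing'))
    )
    schema_error = None
except SchemaError as exc:
    schema_error = exc

# UUID versioning: the nil UUID is not an RFC 4122 UUID, so it has no version
nil_uuid = uuid.UUID(int=0)
assert nil_uuid.version is None

v4 = SchemaValidator(core_schema.uuid_schema(version=4))
try:
    v4.validate_python(nil_uuid)
    nil_rejected = False
except ValidationError:
    nil_rejected = True

print(type(schema_error).__name__, nil_rejected)
```

Note that the missing definition surfaces when the validator is built, before any data is validated, while the version mismatch surfaces only when a value is checked.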