Specialized Identifier Types
This guide demonstrates how to use specialized identifier types like UUIDs and URLs within your validation schemas. These types provide built-in format validation and normalization beyond simple string matching.
Validating UUIDs
To validate UUID strings or objects, use core_schema.uuid_schema(). By default, it accepts a UUID of any version and returns a standard Python uuid.UUID object.
from pydantic_core import SchemaValidator, core_schema
from uuid import UUID
# Basic UUID validation
v = SchemaValidator(core_schema.uuid_schema())
# Validates strings
result = v.validate_python('a6cc5730-2261-11ee-9c43-2eb5a363657c')
assert isinstance(result, UUID)
# Validates existing UUID objects
result = v.validate_python(UUID('a6cc5730-2261-11ee-9c43-2eb5a363657c'))
assert result == UUID('a6cc5730-2261-11ee-9c43-2eb5a363657c')
Enforcing UUID Versions
You can restrict validation to a single UUID version (1, 3, 4, 5, 6, 7, or 8) using the version parameter. Versions 1, 3, 4, and 5 come from RFC 4122; versions 6, 7, and 8 were added by its successor, RFC 9562.
from pydantic_core import SchemaValidator, ValidationError, core_schema
# Enforce UUID version 4
v = SchemaValidator(core_schema.uuid_schema(version=4))
# This succeeds (valid v4)
v.validate_python('0e7ac198-9acd-4c0c-b4b4-761974bf71d7')
# This fails (valid v1, but v4 expected)
# Raises ValidationError: UUID version 4 expected
try:
    v.validate_python('a6cc5730-2261-11ee-9c43-2eb5a363657c')
except ValidationError as e:
    print(e)
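To see why the second input is rejected, it can help to read the version field directly. The sketch below uses only the standard library's uuid module rather than pydantic_core, and the helper name uuid_version is hypothetical.

```python
from typing import Optional
from uuid import UUID


def uuid_version(value: str) -> Optional[int]:
    """Return the version number encoded in a UUID string (hypothetical helper)."""
    return UUID(value).version


# The version is stored in the first hex digit of the third group.
v4_version = uuid_version('0e7ac198-9acd-4c0c-b4b4-761974bf71d7')  # third group: 4c0c
v1_version = uuid_version('a6cc5730-2261-11ee-9c43-2eb5a363657c')  # third group: 11ee
print(v4_version, v1_version)
```

A check like this is a quick way to confirm which version a given test fixture actually carries before wiring it into a schema.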
Strict vs. Lax Validation
In lax mode (default), uuid_schema accepts strings, bytes, or UUID objects. In strict mode, validate_python accepts only uuid.UUID instances.
from pydantic_core import SchemaValidator, ValidationError, core_schema
v = SchemaValidator(core_schema.uuid_schema(strict=True))
# Fails in strict mode even if the string is a valid UUID
# Raises ValidationError: Input should be an instance of UUID
try:
    v.validate_python('a6cc5730-2261-11ee-9c43-2eb5a363657c')
except ValidationError as e:
    print(e)
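Strictness does not have to be fixed in the schema. As a minimal sketch, assuming a pydantic_core version whose validate_python accepts a strict keyword argument, the same lax validator can be made strict for a single call. The variable names are illustrative only.

```python
from uuid import UUID

from pydantic_core import SchemaValidator, ValidationError, core_schema

lax = SchemaValidator(core_schema.uuid_schema())
text = 'a6cc5730-2261-11ee-9c43-2eb5a363657c'

# Default (lax) call: the string is parsed into a UUID
parsed = lax.validate_python(text)

# Per-call override: the same validator rejects the string
error_type = None
try:
    lax.validate_python(text, strict=True)
except ValidationError as exc:
    # errors() returns a list of dicts describing each failure
    error_type = exc.errors()[0]['type']

print(parsed, error_type)
```

This pattern suits code paths that receive already-typed data in some places and raw strings in others, without maintaining two validators.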
Validating URLs
url_schema validates web addresses and other URLs. It returns a Url object that exposes each component as an attribute. Pass allowed_schemes to restrict which schemes are accepted.
from pydantic_core import SchemaValidator, core_schema
v = SchemaValidator(core_schema.url_schema(allowed_schemes=['http', 'https']))
url = v.validate_python('https://user:pass@example.com:8080/path?query=1#frag')
assert url.scheme == 'https'
assert url.host == 'example.com'
assert url.port == 8080
assert url.username == 'user'
assert url.password == 'pass'
assert url.path == '/path'
assert url.query == 'query=1'
assert url.fragment == 'frag'
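The allowed_schemes restriction used above can be exercised with a rejected input as well. The following is a sketch only; the helper name first_error_type is hypothetical, and it reports the error type rather than the full message so that it does not depend on exact wording.

```python
from typing import Optional

from pydantic_core import SchemaValidator, ValidationError, core_schema

v = SchemaValidator(core_schema.url_schema(allowed_schemes=['http', 'https']))


def first_error_type(value: str) -> Optional[str]:
    """Return the type of the first validation error, or None on success (hypothetical helper)."""
    try:
        v.validate_python(value)
    except ValidationError as exc:
        return exc.errors()[0]['type']
    return None


ok = first_error_type('https://example.com')
bad = first_error_type('ftp://example.com/file.txt')
print(ok, bad)
```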
URL Constraints and Normalization
You can configure length limits, required hosts, and how empty paths are handled.
from pydantic_core import SchemaValidator, core_schema
v = SchemaValidator(core_schema.url_schema(
    max_length=30,
    host_required=True,
    preserve_empty_path=True,
))
# Normalization: by default, 'https://example.com' becomes 'https://example.com/'
# With preserve_empty_path=True, it remains 'https://example.com'
url = v.validate_python('https://example.com')
assert str(url) == 'https://example.com'
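For comparison, here is a sketch of the default behaviour and of a max_length violation. The over-long URL is made up for illustration, and only the error type is inspected.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema

# Default behaviour: an empty path is normalized to '/'
default_validator = SchemaValidator(core_schema.url_schema())
normalized = str(default_validator.validate_python('https://example.com'))

# max_length rejects inputs longer than the limit
limited = SchemaValidator(core_schema.url_schema(max_length=30))
error_type = None
try:
    limited.validate_python('https://example.com/a/path/well/over/thirty/characters')
except ValidationError as exc:
    error_type = exc.errors()[0]['type']

print(normalized, error_type)
```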
Handling Multi-Host URLs
For database connection strings or cluster addresses that support multiple hosts (e.g., Postgres or MongoDB), use multi_host_url_schema. This returns a MultiHostUrl object.
from pydantic_core import SchemaValidator, core_schema
v = SchemaValidator(core_schema.multi_host_url_schema())
# Typical database cluster string
dsn = 'postgres://user:pass@host1:5432,host2:6432/mydb'
url = v.validate_python(dsn)
assert url.scheme == 'postgres'
assert url.path == '/mydb'
# Access individual host details
hosts = url.hosts()
assert len(hosts) == 2
assert hosts[0] == {'username': 'user', 'password': 'pass', 'host': 'host1', 'port': 5432}
assert hosts[1] == {'username': None, 'password': None, 'host': 'host2', 'port': 6432}
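Because hosts() returns plain dictionaries, downstream code can process them without depending on pydantic_core. As an illustration, the sketch below turns the host list shown above into host:port strings. The helper host_port_pairs and its default_port fallback are assumptions for this example, not part of the library.

```python
from typing import Dict, List, Optional, Union

HostInfo = Dict[str, Optional[Union[str, int]]]


def host_port_pairs(hosts: List[HostInfo], default_port: int = 5432) -> List[str]:
    """Format each host entry as 'host:port', using default_port when none is set (hypothetical helper)."""
    pairs = []
    for entry in hosts:
        port = entry['port'] if entry['port'] is not None else default_port
        pairs.append(f"{entry['host']}:{port}")
    return pairs


# Same structure as the hosts() result in the example above
hosts = [
    {'username': 'user', 'password': 'pass', 'host': 'host1', 'port': 5432},
    {'username': None, 'password': None, 'host': 'host2', 'port': 6432},
]
pairs = host_port_pairs(hosts)
print(pairs)
```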
Troubleshooting and Gotchas
UUID Version Mismatches
If you specify a version in uuid_schema, the input must strictly follow the RFC 4122 bit patterns for that version. Python's uuid.UUID implementation may return None for the .version attribute if the UUID does not follow RFC 4122 (e.g., all zeros or custom formats). These will fail validation if a specific version is required.
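The standard-library behaviour mentioned here can be checked directly. This sketch uses only the uuid module and does not call pydantic_core.

```python
from uuid import RESERVED_NCS, RFC_4122, UUID

nil = UUID('00000000-0000-0000-0000-000000000000')
v4 = UUID('0e7ac198-9acd-4c0c-b4b4-761974bf71d7')

# The nil UUID does not use the RFC 4122 variant, so .version is None
nil_variant, nil_version = nil.variant, nil.version

# A regular v4 UUID reports both the variant and the version
v4_variant, v4_version = v4.variant, v4.version

print(nil_variant, nil_version, v4_variant, v4_version)
```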
URL Path Normalization
By default, url_schema converts empty paths to /. If your application distinguishes between https://example.com and https://example.com/, ensure you set preserve_empty_path=True in your schema.
Multi-Host vs. Standard URLs
A standard url_schema treats the authority section as a single host and never splits a comma-separated list into separate hosts; a list that includes per-host ports fails to parse. If you expect cluster-style URLs, always use multi_host_url_schema. Conversely, multi_host_url_schema also handles single-host URLs, making it a safer choice for database DSNs.
from pydantic_core import SchemaValidator, ValidationError, core_schema
# Standard URL schema fails on multiple hosts with ports
v_single = SchemaValidator(core_schema.url_schema())
# Raises ValidationError: the text after the first ':' is not a valid port number
try:
    v_single.validate_python('postgres://host1:5432,host2:6432/db')
except ValidationError as e:
    print(e)
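The single-host case for multi_host_url_schema can be sketched as follows. The DSN is made up for illustration.

```python
from pydantic_core import SchemaValidator, core_schema

v_multi = SchemaValidator(core_schema.multi_host_url_schema())

# A DSN with only one host still validates and yields a one-element host list
single = v_multi.validate_python('postgres://user:pass@host1:5432/mydb')
single_hosts = single.hosts()

print(single.scheme, single.path, single_hosts)
```

Using one schema for both shapes means calling code can always iterate over hosts() instead of branching on the URL type.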