Global Schema Configuration

CoreConfig is the central configuration object for both SchemaValidator and SchemaSerializer. It defines global behaviors for validation and serialization that apply across the entire schema unless overridden by specific field-level settings.

In this codebase, CoreConfig is implemented as a TypedDict in pydantic_core.core_schema. It is typically passed as the config argument when initializing a validator or serializer.
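Because CoreConfig is a TypedDict, calling it builds an ordinary dict at runtime, so a plain dict literal with the same keys should be interchangeable with it. The sketch below illustrates this; the str_to_upper key and the variable names are chosen only for illustration.

```python
from pydantic_core import CoreConfig, SchemaValidator, core_schema as cs

# Calling a TypedDict class produces a plain dict at runtime
typed_config = CoreConfig(str_to_upper=True)
plain_config = {'str_to_upper': True}
assert typed_config == plain_config

# Either form can be passed as the config argument
v_typed = SchemaValidator(cs.str_schema(), config=typed_config)
v_plain = SchemaValidator(cs.str_schema(), config=plain_config)

upper_typed = v_typed.validate_python('abc')
upper_plain = v_plain.validate_python('abc')
assert upper_typed == upper_plain == 'ABC'
```

Using CoreConfig(...) rather than a bare dict mainly helps type checkers catch misspelled keys.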

String Handling

CoreConfig provides several options to control how strings are validated and transformed globally. These settings are particularly useful for enforcing consistent data hygiene across an entire application.

Constraints and Transformations

You can set global limits on string length and apply automatic transformations like whitespace stripping or case conversion.

from pydantic_core import CoreConfig, SchemaValidator, ValidationError, core_schema as cs

# Global configuration for string handling
config = CoreConfig(
    str_max_length=10,
    str_strip_whitespace=True,
    str_to_lower=True,
)

v = SchemaValidator(cs.str_schema(), config=config)

# Whitespace is stripped and the string is lowercased
assert v.validate_python(' EXAMPLE ') == 'example'

# The global length constraint is enforced
try:
    v.validate_python('this string is too long')
except ValidationError as e:
    print(e)  # Message includes: String should have at most 10 characters

Regex Engine

The regex_engine setting determines which engine is used for pattern validation. By default, pydantic-core uses rust-regex, which is highly performant but does not support features like backreferences. If your patterns require advanced features, you can switch to python-re.

# Using python-re to support backreferences
pattern = r'r(#*)".*?"\1'
config = CoreConfig(regex_engine='python-re')

v = SchemaValidator(
    schema=cs.str_schema(pattern=pattern),
    config=config,
)
assert v.validate_python('r"foo"') == 'r"foo"'

Numeric Coercion and Constraints

Configuration options in CoreConfig also manage how numeric types are handled during validation, specifically regarding coercion and special float values.

Coercing Numbers to Strings

The coerce_numbers_to_str option allows types like int, float, and Decimal to be automatically converted to strings. Note that this coercion is ignored in strict mode.

config = CoreConfig(coerce_numbers_to_str=True)
v = SchemaValidator(cs.str_schema(), config=config)

assert v.validate_python(42) == '42'
assert v.validate_python(42.0) == '42.0'

Handling Infinity and NaN

By default, allow_inf_nan is True, allowing inf, -inf, and NaN values for float fields. Setting this to False forces validation to fail for these values.

config = CoreConfig(allow_inf_nan=False)
v = SchemaValidator(cs.float_schema(), config=config)

# This will raise a ValidationError: Input should be a finite number
# v.validate_python(float('nan'))

Serialization Behavior

CoreConfig controls how data is transformed into JSON-compatible formats. These settings are primarily used by SchemaSerializer.

  • ser_json_bytes: Controls how bytes are serialized. Options include 'utf8', 'base64', and 'hex'.
  • ser_json_inf_nan: Controls how non-finite floats are represented in JSON. Options include 'null', 'constants' (e.g., NaN), and 'strings'.
  • ser_json_temporal: Controls serialization for datetime, date, time, and timedelta. Options include 'iso8601', 'seconds', and 'milliseconds'.

Note: ser_json_temporal takes precedence over ser_json_timedelta if both are provided.
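As a rough sketch of how two of these options affect output, the example below builds one SchemaSerializer per setting and serializes a single value with to_json. The variable names are illustrative. Note that the 'constants' mode emits tokens such as NaN, which strict JSON parsers may reject.

```python
from pydantic_core import CoreConfig, SchemaSerializer, core_schema as cs

# Bytes: request base64 text instead of the default UTF-8 decoding
bytes_ser = SchemaSerializer(cs.bytes_schema(), CoreConfig(ser_json_bytes='base64'))
bytes_json = bytes_ser.to_json(b'foobar')

# Non-finite floats: emit the bare constant instead of null
float_ser = SchemaSerializer(cs.float_schema(), CoreConfig(ser_json_inf_nan='constants'))
nan_json = float_ser.to_json(float('nan'))

# to_json returns bytes containing the JSON document
print(bytes_json, nan_json)
```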

Model and Data Structure Behavior

Several settings in CoreConfig dictate how complex structures like models, dataclasses, and typed dicts behave.

Extra Fields

The extra_fields_behavior (of type ExtraBehavior) determines what happens when a dictionary contains keys not defined in the schema:

  • 'ignore': Extra fields are silently dropped (default).
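  • 'allow': Extra fields are kept. They are validated only if the schema supplies a validator for extras (for example, via extras_schema); otherwise they are passed through unchanged.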
  • 'forbid': Validation fails if extra fields are present.
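The three behaviors can be compared side by side. This is a minimal sketch that assumes a model_fields_schema passed directly to SchemaValidator picks up the validator-level config; schemas that carry their own config key may behave differently. The validate_with helper and the sample data are hypothetical.

```python
from pydantic_core import CoreConfig, SchemaValidator, ValidationError, core_schema as cs

# One declared field 'f'; any other key counts as extra
schema = cs.model_fields_schema({'f': cs.model_field(cs.str_schema())})
data = {'f': 'x', 'unexpected': 1}

def validate_with(behavior):
    """Validate the sample data under the given extra_fields_behavior."""
    v = SchemaValidator(schema, config=CoreConfig(extra_fields_behavior=behavior))
    # model-fields validation returns a (fields, extras, fields_set) tuple
    return v.validate_python(data)

ignored_fields, _, _ = validate_with('ignore')
allowed_fields, allowed_extra, _ = validate_with('allow')

try:
    validate_with('forbid')
    forbid_error = None
except ValidationError as e:
    forbid_error = e.errors()[0]['type']

print(ignored_fields, allowed_extra, forbid_error)
```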

Instance Revalidation

The revalidate_instances setting controls whether existing model or dataclass instances should be re-validated when passed to a validator.

  • 'never': Instances are returned as-is (default).
  • 'always': Instances are always re-validated.
  • 'subclass-instances': Only instances of subclasses are re-validated.

Precedence and Strict Mode

Understanding how CoreConfig interacts with other settings is crucial for predictable behavior:

  1. Field Priority: Settings defined directly on a schema (e.g., max_length inside cs.str_schema()) always take precedence over the global CoreConfig.
  2. Strict Mode: When strict=True is set (either in CoreConfig or during a validate_python call), many coercion behaviors like coerce_numbers_to_str are disabled, even if explicitly enabled in the config.
  3. Error Representation: The hide_input_in_errors setting can be used to prevent sensitive input data from appearing in ValidationError messages, which is useful for production logging.

from pydantic_core import ValidationError

config = CoreConfig(hide_input_in_errors=True)
v = SchemaValidator(cs.str_schema(), config=config)

# The error message omits the rejected input (the integer 123)
try:
    v.validate_python(123)
except ValidationError as e:
    print(e)  # Message includes: Input should be a valid string [type=string_type]
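The first two rules in the list above can be checked with a short experiment. This is a sketch under the assumptions stated in the comments; the is_valid helper and the sample values are hypothetical.

```python
from pydantic_core import CoreConfig, SchemaValidator, ValidationError, core_schema as cs

config = CoreConfig(str_max_length=5, coerce_numbers_to_str=True)

# Rule 1: a schema-level max_length overrides the global str_max_length
global_only = SchemaValidator(cs.str_schema(), config=config)
field_level = SchemaValidator(cs.str_schema(max_length=20), config=config)

def is_valid(validator, value, **kwargs):
    """Return True if validation succeeds, False on ValidationError."""
    try:
        validator.validate_python(value, **kwargs)
        return True
    except ValidationError:
        return False

ten_chars = 'abcdefghij'
global_result = is_valid(global_only, ten_chars)  # global limit of 5 applies
field_result = is_valid(field_level, ten_chars)   # schema limit of 20 applies

# Rule 2: per-call strict mode switches off number-to-string coercion
lax_value = global_only.validate_python(42)
strict_result = is_valid(global_only, 42, strict=True)

print(global_result, field_result, lax_value, strict_result)
```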