Scalar & Primitive Types

Scalar and primitive types in pydantic-core provide the foundation for data validation. These types are defined using the pydantic_core.core_schema module and validated against Python objects or JSON strings using the SchemaValidator.

Numeric Types

Numeric schemas support standard comparison constraints (ge, gt, le, lt) and divisibility checks (multiple_of).

Integers

The int_schema matches integer values. In lax mode (the default), it also accepts strings that parse as integers, and floats that have no fractional part.

from pydantic_core import SchemaValidator, core_schema as cs

# Integer with constraints
v = SchemaValidator(cs.int_schema(multiple_of=5, ge=0, le=100))

assert v.validate_python(15) == 15
assert v.validate_python('10') == 10
assert v.validate_python(20.0) == 20
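
When a constraint fails, the validator raises a ValidationError whose entries carry a machine-readable type code. The sketch below shows one way to inspect those codes; error_types is a hypothetical helper written for this example, not part of pydantic-core, and each input is chosen to break exactly one rule.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

v = SchemaValidator(cs.int_schema(multiple_of=5, ge=0, le=100))


def error_types(validator, value):
    """Hypothetical helper: return the error type codes for `value` (empty if valid)."""
    try:
        validator.validate_python(value)
    except ValidationError as exc:
        return [err['type'] for err in exc.errors()]
    return []


# Each input below violates a single rule of the schema.
assert error_types(v, 15) == []
assert error_types(v, 7) == ['multiple_of']
assert error_types(v, 105) == ['less_than_equal']
assert error_types(v, -5) == ['greater_than_equal']
assert error_types(v, 42.1) == ['int_from_float']
assert error_types(v, 'abc') == ['int_parsing']
```

Checking the type code rather than the message text tends to be more robust, since messages are meant for humans and may be reworded.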

Gotcha: Integer Parsing Limit. To prevent potential DoS attacks, pydantic-core enforces a maximum size for parsing integers from strings (default 4,300 characters). Exceeding this results in an int_parsing_size error.

# Found in pydantic-core/tests/validators/test_int.py
v = SchemaValidator(cs.int_schema())
# This will raise a ValidationError for exceeding the 4300 character limit
# v.validate_python('1' * 4301)
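
A runnable version of the check above, as a minimal sketch: it catches the ValidationError instead of letting it propagate and records the reported type code. The variable name size_error is only illustrative.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

v = SchemaValidator(cs.int_schema())

# A 4,301-digit string is one character over the limit described above.
try:
    v.validate_python('1' * 4301)
except ValidationError as exc:
    size_error = exc.errors()[0]['type']
else:
    size_error = None

assert size_error == 'int_parsing_size'
```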

Floats

The float_schema handles floating-point numbers. It includes an allow_inf_nan option (default True) to control the acceptance of NaN, Infinity, and -Infinity.

v = SchemaValidator(cs.float_schema(gt=0, allow_inf_nan=False))

assert v.validate_python(42.5) == 42.5
assert v.validate_json('12.3') == 12.3
# v.validate_python(float('nan')) # Raises ValidationError: finite_number
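
To make the effect of allow_inf_nan concrete, the following sketch compares a default float validator with one that sets allow_inf_nan=False. The names lenient, finite_only and rejected are chosen for this example only.

```python
import math

from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

lenient = SchemaValidator(cs.float_schema())
finite_only = SchemaValidator(cs.float_schema(allow_inf_nan=False))

# With the default setting, non-finite values pass through unchanged.
assert lenient.validate_python(float('inf')) == float('inf')
assert math.isnan(lenient.validate_python(float('nan')))

# With allow_inf_nan=False, each non-finite value is rejected.
rejected = []
for value in (float('nan'), float('inf'), float('-inf')):
    try:
        finite_only.validate_python(value)
    except ValidationError as exc:
        rejected.append(exc.errors()[0]['type'])

assert rejected == ['finite_number'] * 3
```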

Decimals

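The decimal_schema validates input into decimal.Decimal objects; in lax mode it also accepts numeric strings. It supports two precision constraints: max_digits (total number of digits) and decimal_places (digits after the decimal point).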

from decimal import Decimal

v = SchemaValidator(cs.decimal_schema(max_digits=5, decimal_places=2))

assert v.validate_python(Decimal('123.45')) == Decimal('123.45')
assert v.validate_python('12.3') == Decimal('12.3')
# v.validate_python('1.234') # Raises ValidationError: decimal_max_places
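
The two precision constraints report different error types. The sketch below uses one validator per constraint so that each failure is unambiguous; first_error is a hypothetical helper defined only for this example.

```python
from decimal import Decimal

from pydantic_core import SchemaValidator, ValidationError, core_schema as cs


def first_error(validator, value):
    """Hypothetical helper: return the first error type code, or None if valid."""
    try:
        validator.validate_python(value)
    except ValidationError as exc:
        return exc.errors()[0]['type']
    return None


places = SchemaValidator(cs.decimal_schema(decimal_places=2))
digits = SchemaValidator(cs.decimal_schema(max_digits=3))

assert places.validate_python('1.23') == Decimal('1.23')
assert first_error(places, '1.234') == 'decimal_max_places'  # three places, two allowed
assert first_error(digits, '12.34') == 'decimal_max_digits'  # four digits, three allowed
assert first_error(digits, 'abc') == 'decimal_parsing'       # not a number at all
```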

Text Types

The str_schema is used for string validation and supports various transformations and pattern matching.

v = SchemaValidator(cs.str_schema(
    min_length=3,
    max_length=10,
    pattern=r'^[a-z]+$',
    to_upper=True,
))

assert v.validate_python('abc') == 'ABC'

Key Constraints and Transformations:

  • strip_whitespace: Removes leading and trailing whitespace.
  • to_lower / to_upper: Transforms the case of the validated string.
  • pattern: Validates the string against a regular expression.

Gotcha: Constraint Order. strip_whitespace is applied first, so the length constraints (min_length, max_length) are checked against the stripped string. The pattern check then runs before the to_upper or to_lower transformation, so the pattern must match the input's original casing.

# Found in pydantic-core/tests/validators/test_string.py
v = SchemaValidator(cs.str_schema(max_length=5, strip_whitespace=True))
# The test case ({'max_length': 5, 'strip_whitespace': True}, '1234 ', '1234')
# shows the trailing space being stripped from an otherwise valid input.
assert v.validate_python('1234 ') == '1234'
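
The order in which stripping, length checks, pattern matching and case conversion run can be checked empirically. The following is a minimal sketch, assuming a recent pydantic-core release; the variable names are illustrative only.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# The padded input is 8 characters raw but 4 after stripping.
stripped = SchemaValidator(cs.str_schema(max_length=5, strip_whitespace=True))
result = stripped.validate_python('  1234  ')
assert result == '1234'

# The pattern only allows lowercase, yet the output is uppercase:
# the pattern is matched against the input before to_upper is applied.
upper = SchemaValidator(cs.str_schema(pattern=r'^[a-z]+$', to_upper=True))
shouted = upper.validate_python('abc')
assert shouted == 'ABC'

# Input that is already uppercase does not match the pattern.
try:
    upper.validate_python('ABC')
except ValidationError as exc:
    pattern_error = exc.errors()[0]['type']
else:
    pattern_error = None

assert pattern_error == 'string_pattern_mismatch'
```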

Regex Engines: By default, pydantic-core uses the rust-regex engine, which does not support backreferences. If you need advanced regex features, you can switch to the python-re engine.

# Using python-re for backreferences
v = SchemaValidator(cs.str_schema(pattern=r'r(#*)".*?"\1', regex_engine='python-re'))
assert v.validate_python('r#""#') == 'r#""#'
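
For comparison, the sketch below builds the same pattern with both engines. It assumes that an unsupported pattern is rejected when the validator is constructed, which surfaces as a SchemaError; the flag name rust_accepts_backref is illustrative.

```python
from pydantic_core import SchemaError, SchemaValidator, core_schema as cs

backref = r'r(#*)".*?"\1'

# Default engine (rust-regex): the backreference \1 is not supported.
try:
    SchemaValidator(cs.str_schema(pattern=backref))
except SchemaError:
    rust_accepts_backref = False
else:
    rust_accepts_backref = True

assert not rust_accepts_backref

# python-re engine: the same pattern compiles and matches.
py = SchemaValidator(cs.str_schema(pattern=backref, regex_engine='python-re'))
assert py.validate_python('r#""#') == 'r#""#'
```

Because the failure happens at construction time, a bad pattern is caught when the schema is built rather than on the first validation call.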

Booleans and Nulls

Booleans

The bool_schema validates boolean values. In lax mode, it performs broad coercion from strings and numbers.

v = SchemaValidator(cs.bool_schema())

# Lax coercions
assert v.validate_python('yes') is True
assert v.validate_python('0') is False
assert v.validate_python(1) is True

# Strict mode
v_strict = SchemaValidator(cs.bool_schema(strict=True))
# v_strict.validate_python('true') # Raises ValidationError: bool_type
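
Lax coercion is broad but not open-ended. The sketch below lists some string spellings accepted in lax mode and shows that values outside the recognised set are rejected. The lists are a sample chosen for this example and may not be exhaustive.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

v = SchemaValidator(cs.bool_schema())

truthy = ['true', 'yes', 'on', '1', 'y', 't']
falsy = ['false', 'no', 'off', '0', 'n', 'f']

assert all(v.validate_python(s) is True for s in truthy)
assert all(v.validate_python(s) is False for s in falsy)

# Values outside the recognised set are rejected rather than guessed at.
unrecognised = []
for value in ('maybe', 2):
    try:
        v.validate_python(value)
    except ValidationError as exc:
        unrecognised.append(exc.errors()[0]['type'])

assert unrecognised == ['bool_parsing', 'bool_parsing']
```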

Null Values

The none_schema ensures the value is exactly None (in Python) or null (in JSON).

v = SchemaValidator(cs.none_schema())

assert v.validate_python(None) is None
assert v.validate_json('null') is None

Global Configuration

You can influence the behavior of primitive types globally by passing a CoreConfig to the SchemaValidator.

Option                   Description
str_max_length           Sets a global maximum length for all strings.
coerce_numbers_to_str    Enables coercion of numeric types to strings in lax mode.
allow_inf_nan            Globally allows or disallows NaN and Infinity for floats.
regex_engine             Sets the default regex engine (rust-regex or python-re).

from pydantic_core import CoreConfig

config = CoreConfig(str_max_length=10, coerce_numbers_to_str=True)
v = SchemaValidator(cs.str_schema(), config=config)

assert v.validate_python(123) == '123'
# v.validate_python('a' * 11) # Raises ValidationError: string_too_long
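
The other two options in the table work the same way. The sketch below applies allow_inf_nan and regex_engine through the config rather than on the individual schemas. It assumes a recent pydantic-core release; the variable names are illustrative.

```python
from pydantic_core import CoreConfig, SchemaValidator, ValidationError, core_schema as cs

# allow_inf_nan set in the config applies to a plain float_schema().
floats = SchemaValidator(cs.float_schema(), config=CoreConfig(allow_inf_nan=False))
assert floats.validate_python(1.5) == 1.5

try:
    floats.validate_python(float('inf'))
except ValidationError as exc:
    inf_error = exc.errors()[0]['type']
else:
    inf_error = None

assert inf_error == 'finite_number'

# regex_engine set in the config lets a backreference pattern compile
# without repeating the option on each str_schema.
strings = SchemaValidator(
    cs.str_schema(pattern=r'r(#*)".*?"\1'),
    config=CoreConfig(regex_engine='python-re'),
)
assert strings.validate_python('r#""#') == 'r#""#'
```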

Troubleshooting

  • Lax Integer Validation: Integers allow floats in lax mode only if they have no fractional part (e.g., 42.0 is valid, but 42.1 is not).
  • Strict Mode Precedence: If strict=True is set on a schema or during validation, coerce_numbers_to_str is ignored even if enabled in the config.
  • Unicode Errors: str_schema will raise a string_unicode error if the input contains unpaired surrogates that cannot be converted to a valid UTF-8 string.
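
The strict-mode precedence note can be reproduced with a short sketch. It shows both ways of enabling strict mode, on the schema and per call, and the names lax, strict and errors are illustrative.

```python
from pydantic_core import CoreConfig, SchemaValidator, ValidationError, core_schema as cs

config = CoreConfig(coerce_numbers_to_str=True)
lax = SchemaValidator(cs.str_schema(), config=config)
strict = SchemaValidator(cs.str_schema(strict=True), config=config)

# In lax mode the config option takes effect.
assert lax.validate_python(123) == '123'

# In strict mode the number is rejected, whether strictness comes from
# the schema or from the validate_python call.
errors = []
for call in (
    lambda: strict.validate_python(123),
    lambda: lax.validate_python(123, strict=True),
):
    try:
        call()
    except ValidationError as exc:
        errors.append(exc.errors()[0]['type'])

assert errors == ['string_type', 'string_type']
```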