Scalar and Primitive Types

Scalar and primitive types in pydantic-core are defined using specialized schema classes in the pydantic_core.core_schema module. These schemas provide the foundation for validating basic Python types like integers, strings, and booleans, offering both "lax" (coercive) and "strict" validation modes.

Numeric Types

The codebase provides four primary numeric schemas: IntSchema, FloatSchema, DecimalSchema, and ComplexSchema. IntSchema, FloatSchema, and DecimalSchema support the standard comparison constraints: le (less than or equal), ge (greater than or equal), lt (less than), and gt (greater than). ComplexSchema does not take these constraints, since complex numbers have no natural ordering.
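As a minimal sketch of how these bounds behave, the validator below accepts integers in the half-open range [0, 10). The helper error_type is a hypothetical name introduced only for this illustration; it is not part of pydantic-core.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# Accept integers in the half-open range [0, 10)
bounded = SchemaValidator(cs.int_schema(ge=0, lt=10))


def error_type(validator, value):
    """Hypothetical helper: return the first error type, or None if valid."""
    try:
        validator.validate_python(value)
    except ValidationError as exc:
        return exc.errors()[0]['type']
    return None


assert bounded.validate_python(0) == 0                   # ge is inclusive
assert error_type(bounded, 10) == 'less_than'            # lt is exclusive
assert error_type(bounded, -1) == 'greater_than_equal'
```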

Integer Validation

IntSchema (created via core_schema.int_schema()) validates integers. In its default lax mode it is permissive, accepting inputs that convert to an integer without loss, such as numeric strings, booleans, and whole-number floats.

from pydantic_core import SchemaValidator, core_schema as cs

# Lax mode (default)
v = SchemaValidator(cs.int_schema(multiple_of=5))
assert v.validate_python('15') == 15 # String coercion
assert v.validate_python(20.0) == 20 # Whole-number float coercion

# Booleans coerce too; a separate validator is used because 1 is not a multiple of 5
assert SchemaValidator(cs.int_schema()).validate_python(True) == 1 # Boolean coercion

# Fails on floats with fractional parts
# v.validate_python(12.5) -> ValidationError

As seen in pydantic-core/tests/validators/test_int.py, the integer validator specifically rejects floats with fractional parts even in lax mode to prevent silent data loss.
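To make the rejection concrete, the following sketch catches the ValidationError and inspects its error type. The variable name error_types is illustrative only.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

v = SchemaValidator(cs.int_schema())

error_types = []
try:
    v.validate_python(12.5)
except ValidationError as exc:
    # errors() returns a list of dicts; 'type' identifies the failure
    error_types = [err['type'] for err in exc.errors()]

assert error_types == ['int_from_float']
```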

Floating Point and Decimals

FloatSchema and DecimalSchema handle real numbers. FloatSchema includes an allow_inf_nan flag (defaulting to True) to control the acceptance of NaN, +inf, and -inf.

DecimalSchema provides additional precision constraints:

  • max_digits: Total number of allowed digits.
  • decimal_places: Maximum number of digits after the decimal point.

from decimal import Decimal
from pydantic_core import SchemaValidator, core_schema as cs

v = SchemaValidator(cs.decimal_schema(max_digits=5, decimal_places=2))
assert v.validate_python('123.45') == Decimal('123.45')

Note that in pydantic-core/python/pydantic_core/core_schema.py, DecimalSchema defaults allow_inf_nan to False, unlike FloatSchema.
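The differing defaults can be checked with a short sketch like the one below. The helper is_valid is a hypothetical convenience written for this example, not a pydantic-core function.

```python
import math
from decimal import Decimal
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs


def is_valid(validator, value):
    """Hypothetical helper: True if the value validates, False otherwise."""
    try:
        validator.validate_python(value)
    except ValidationError:
        return False
    return True


float_default = SchemaValidator(cs.float_schema())
float_finite = SchemaValidator(cs.float_schema(allow_inf_nan=False))
decimal_default = SchemaValidator(cs.decimal_schema())
precise = SchemaValidator(cs.decimal_schema(max_digits=5, decimal_places=2))

assert is_valid(float_default, math.inf)              # allowed by default
assert not is_valid(float_finite, math.inf)           # rejected when disabled
assert not is_valid(float_finite, math.nan)
assert not is_valid(decimal_default, Decimal('NaN'))  # rejected by default
assert not is_valid(precise, '123.456')               # three decimal places
```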

Text and Binary Types

Textual data is handled by StringSchema (core_schema.str_schema()) and binary data by BytesSchema (core_schema.bytes_schema()).

String Transformations and Constraints

StringSchema supports several transformations that are applied during validation:

  • strip_whitespace: Removes leading and trailing whitespace.
  • to_lower: Converts the string to lowercase.
  • to_upper: Converts the string to uppercase.

A critical detail of the implementation, exercised in pydantic-core/tests/validators/test_string.py, is the order of operations: strip_whitespace is applied before the length and pattern checks, while to_lower and to_upper are applied after them. The constraints therefore see the stripped string, but not the case-converted one.

from pydantic_core import SchemaValidator, core_schema as cs

v = SchemaValidator(cs.str_schema(max_length=5, strip_whitespace=True))
# '1234 ' is stripped to '1234' (length 4), so it passes max_length=5
assert v.validate_python('1234 ') == '1234'

v_regex = SchemaValidator(cs.str_schema(pattern=r'^abc$', to_upper=True))
# 'abc' matches the pattern, then is transformed to 'ABC'
assert v_regex.validate_python('abc') == 'ABC'
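The flip side of this ordering is that a pattern is never matched against the case-converted result. A small sketch, assuming an uppercase-only pattern combined with to_lower:

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# The pattern only admits uppercase letters; to_lower runs after the check
v = SchemaValidator(cs.str_schema(pattern=r'^[A-Z]+$', to_lower=True))

assert v.validate_python('ABC') == 'abc'  # checked as 'ABC', returned lowercased

try:
    v.validate_python('abc')  # already lowercase, so the pattern check fails
    rejected = False
except ValidationError:
    rejected = True

assert rejected
```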

Regex Engines

The StringSchema allows selecting between two regex engines via the regex_engine field:

  1. rust-regex (default): Uses the Rust regex crate. It is highly performant and resistant to ReDoS (Regular Expression Denial of Service) but does not support features like backreferences or lookarounds.
  2. python-re: Uses Python's standard re module, supporting the full range of Python regex features at the cost of potential performance and security trade-offs.
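As an illustration of when python-re is needed, the sketch below uses a backreference, which the Rust regex crate does not support. The pattern is anchored at both ends so that it behaves the same way whether the engine searches or matches from the start of the string.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# A backreference (\1) requires the python-re engine
doubled = SchemaValidator(
    cs.str_schema(pattern=r'^(\w)\1$', regex_engine='python-re')
)

assert doubled.validate_python('aa') == 'aa'

try:
    doubled.validate_python('ab')
    rejected = False
except ValidationError:
    rejected = True

assert rejected
```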

Boolean and Special Types

Boolean Permissiveness

BoolSchema is notably permissive in lax mode. It accepts not just True/False, but also strings and numbers that represent truthiness in common data formats.

According to pydantic-core/tests/validators/test_bool.py, inputs accepted in lax mode include:

  • True values: True, 1, 1.0, 'true', 'yes', 'on'.
  • False values: False, 0, 0.0, 'false', 'no', 'off'.
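The values listed above can be exercised directly. This sketch also contrasts them with a strict validator, which rejects the string form:

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

lax = SchemaValidator(cs.bool_schema())
strict = SchemaValidator(cs.bool_schema(strict=True))

for truthy in (True, 1, 1.0, 'true', 'yes', 'on'):
    assert lax.validate_python(truthy) is True
for falsy in (False, 0, 0.0, 'false', 'no', 'off'):
    assert lax.validate_python(falsy) is False

# Strict mode only accepts real booleans from Python input
try:
    strict.validate_python('true')
    strict_rejected = False
except ValidationError:
    strict_rejected = True

assert strict_rejected
```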

Any and None

  • AnySchema: Matches any Python object without modification. It is the most basic schema and performs no validation logic.
  • NoneSchema: Specifically matches the Python None value (or null in JSON).
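A brief sketch of both schemas, using a throwaway object (marker, a name chosen for this example) to show that AnySchema returns its input unchanged:

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

any_v = SchemaValidator(cs.any_schema())
none_v = SchemaValidator(cs.none_schema())

marker = object()
assert any_v.validate_python(marker) is marker  # returned unchanged
assert none_v.validate_python(None) is None
assert none_v.validate_json('null') is None

try:
    none_v.validate_python(0)  # falsy, but not None
    none_rejected = False
except ValidationError:
    none_rejected = True

assert none_rejected
```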

Enumerations

EnumSchema bridges Python's enum.Enum classes with pydantic-core validation. It requires the original class (cls) and a list of its members.

from enum import Enum
from pydantic_core import SchemaValidator, core_schema as cs

class Status(Enum):
    ACTIVE = 'active'
    INACTIVE = 'inactive'

schema = cs.enum_schema(Status, list(Status.__members__.values()))
v = SchemaValidator(schema)

assert v.validate_python('active') is Status.ACTIVE

The EnumSchema also supports a sub_type (one of 'str', 'int', or 'float'). When a sub_type is provided, the validator will attempt to coerce the input to that type before looking up the enum member, allowing for more flexible input (e.g., accepting the string '1' for an IntEnum member with value 1).
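The wiring for a sub_type looks like the sketch below. Priority is a hypothetical enum defined only for this example, and the assertions cover just the plain integer and member inputs.

```python
from enum import IntEnum
from pydantic_core import SchemaValidator, core_schema as cs


class Priority(IntEnum):
    LOW = 1
    HIGH = 2


schema = cs.enum_schema(
    Priority,
    list(Priority.__members__.values()),
    sub_type='int',  # tells the validator the members are integer-valued
)
v = SchemaValidator(schema)

assert v.validate_python(1) is Priority.LOW
assert v.validate_python(Priority.HIGH) is Priority.HIGH
```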

Validation Modes: Lax vs Strict

Most scalar schemas include a strict field.

  • Lax Mode (default): The validator attempts to coerce the input to the target type (e.g., string to integer, integer to boolean).
  • Strict Mode: The validator requires the input to be of the exact expected type.

You can enable strict mode globally in the CoreConfig or on a per-schema basis:

from pydantic_core import SchemaValidator, core_schema as cs

# Per-schema strict mode
v = SchemaValidator(cs.int_schema(strict=True))
# v.validate_python('123') -> Raises ValidationError
assert v.validate_python(123) == 123
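The global variant can be sketched by passing a CoreConfig to SchemaValidator. Going slightly beyond the text above, the example also shows the strict argument of validate_python, which applies strictness to a single call.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs

# Strictness set once in the config applies to schemas that do not override it
strict_v = SchemaValidator(cs.int_schema(), config=cs.CoreConfig(strict=True))
assert strict_v.validate_python(123) == 123

try:
    strict_v.validate_python('123')
    config_rejected = False
except ValidationError:
    config_rejected = True
assert config_rejected

# A lax validator can also be made strict for a single call
lax_v = SchemaValidator(cs.int_schema())
assert lax_v.validate_python('123') == 123

try:
    lax_v.validate_python('123', strict=True)
    call_rejected = False
except ValidationError:
    call_rejected = True
assert call_rejected
```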

In pydantic-core/tests/validators/test_enums.py, strict mode for enums is shown to require an actual instance of the Enum class unless validating from JSON, where it allows the raw value if it matches a member.
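A sketch of that behavior, reusing the shape of the Status enum from the Enumerations section (redefined here so the block is self-contained):

```python
from enum import Enum
from pydantic_core import SchemaValidator, ValidationError, core_schema as cs


class Status(Enum):
    ACTIVE = 'active'
    INACTIVE = 'inactive'


v = SchemaValidator(
    cs.enum_schema(Status, list(Status.__members__.values()), strict=True)
)

# Python input must already be a Status member
assert v.validate_python(Status.ACTIVE) is Status.ACTIVE

try:
    v.validate_python('active')
    raw_rejected = False
except ValidationError:
    raw_rejected = True
assert raw_rejected

# JSON has no enum type, so the raw value is accepted when it matches a member
assert v.validate_json('"active"') is Status.ACTIVE
```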