Instance Revalidation Strategies

In pydantic-core, the revalidate_instances setting determines how the validator handles input that is already an instance of the target model or dataclass. This configuration is critical for balancing performance with data integrity, especially in systems where objects may be mutated after their initial creation.

The strategy can be configured globally via CoreConfig or specifically for individual models and dataclasses using the revalidate_instances field in ModelSchema and DataclassSchema.

Revalidation Strategies

The setting accepts exactly three values, typed as Literal['always', 'never', 'subclass-instances']. Each value selects one revalidation strategy.

The Default Strategy: never

By default, pydantic-core uses the 'never' strategy. In this mode, if the input to validate_python is already an instance of the expected class (or a subclass), the validator trusts the object and returns it immediately without performing any internal validation.

  • Performance: This is the most performant option as it avoids the overhead of checking fields and creating a new object.
  • Identity: The original object identity is preserved (output is input).
  • Risk: If an instance was mutated after its initial validation (e.g., my_model.some_field = 'invalid_value'), the 'never' strategy will not catch this inconsistency.

The Safety Strategy: always

The 'always' strategy forces the validator to re-examine the internal state of every instance passed to it, whether the input is an exact instance of the target class or an instance of a subclass.

When 'always' is enabled, the validator extracts the data from the existing instance (typically using __dict__ and __pydantic_fields_set__) and runs it through the validation logic again.

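# Example of 'always' revalidation in ModelSchema
from pydantic_core import SchemaValidator, core_schema

class MyModel:
    # Minimal stand-in class (an assumption for this example); the slots hold
    # the attributes the validator reads and writes on model instances.
    __slots__ = '__dict__', '__pydantic_fields_set__', '__pydantic_extra__', '__pydantic_private__'

    def __init__(self, field_a, field_b):
        self.field_a = field_a
        self.field_b = field_b
        self.__pydantic_extra__ = {}
        self.__pydantic_fields_set__ = {'field_a', 'field_b'}

v = SchemaValidator(
    core_schema.model_schema(
        cls=MyModel,
        revalidate_instances='always',
        schema=core_schema.model_fields_schema(
            fields={
                'field_a': core_schema.model_field(schema=core_schema.str_schema()),
                'field_b': core_schema.model_field(schema=core_schema.int_schema()),
            }
        ),
    )
)

m2 = MyModel(field_a='x', field_b=42)
m3 = v.validate_python(m2)

assert m3 is not m2  # A new instance is created
assert m3.field_a == 'x'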

As seen in pydantic-core/tests/validators/test_model.py, this strategy ensures that even if m2 was manually altered to an invalid state, v.validate_python(m2) would raise a ValidationError.

The Hybrid Strategy: subclass-instances

The 'subclass-instances' strategy provides a middle ground. It trusts instances that are exactly the class defined in the schema but re-validates any instances that are subclasses.

This is particularly useful for enforcing "type narrowing" or ensuring that a subclass doesn't carry extra state or behavior when a base class is expected.

# Example of 'subclass-instances' behavior
from pydantic_core import SchemaValidator, core_schema

class MyModel: ...
class MySubModel(MyModel): ...

v = SchemaValidator(
    core_schema.model_schema(
        cls=MyModel,
        revalidate_instances='subclass-instances',
        schema=...
    )
)

m1 = MyModel()
assert v.validate_python(m1) is m1  # Exact class is trusted

m3 = MySubModel()
m4 = v.validate_python(m3)
assert m4 is not m3
assert type(m4) is MyModel  # Subclass is coerced to base class

In this mode, as demonstrated in the codebase's tests, the subclass instance is re-validated and then coerced back into the base class type.

Implementation Details

The revalidation logic is integrated into the core validation loop for models and dataclasses.
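
The per-strategy decision described in the sections above fits in a few lines. The following is a plain-Python sketch for illustration only: should_revalidate is a hypothetical helper name, not part of the pydantic-core API, and it assumes the input has already been confirmed to be an instance of the target class.

```python
from typing import Literal

Strategy = Literal['always', 'never', 'subclass-instances']


def should_revalidate(strategy: Strategy, instance: object, cls: type) -> bool:
    """Hypothetical helper: decide whether an existing instance is revalidated.

    Assumes `instance` already passed an isinstance(instance, cls) check.
    """
    if strategy == 'always':
        return True
    if strategy == 'never':
        return False
    # 'subclass-instances': trust exact instances, revalidate subclasses
    return type(instance) is not cls


class Base: ...
class Child(Base): ...


print(should_revalidate('subclass-instances', Base(), Base))   # exact class
print(should_revalidate('subclass-instances', Child(), Base))  # subclass
```

Only the 'subclass-instances' branch inspects the concrete type; the other two strategies decide without looking at the instance.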

Configuration Hierarchy

The setting is resolved in the following order:

  1. The revalidate_instances field on the specific ModelSchema or DataclassSchema.
  2. The revalidate_instances field in the CoreConfig passed to the schema.
  3. The default value of 'never'.
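
The lookup order above can be sketched as a small function. This is a minimal illustration, not the library's code: resolve_revalidate_instances is a hypothetical name, and None stands in for "not set" at a given level.

```python
from typing import Literal, Optional

Strategy = Literal['always', 'never', 'subclass-instances']


def resolve_revalidate_instances(
    schema_value: Optional[Strategy],
    config_value: Optional[Strategy],
) -> Strategy:
    """Hypothetical helper mirroring the lookup order listed above."""
    if schema_value is not None:  # 1. value set on the model/dataclass schema
        return schema_value
    if config_value is not None:  # 2. value set in CoreConfig
        return config_value
    return 'never'                # 3. built-in default


print(resolve_revalidate_instances(None, 'always'))
print(resolve_revalidate_instances('subclass-instances', 'always'))
```

A value set on the schema wins over a value in the config, so a single model can opt out of a stricter global setting.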

Technical Requirements

For revalidation to work (especially for models), the input instance must provide access to its internal state. The validator typically looks for:

  • __dict__: To extract the field values.
  • __pydantic_fields_set__: To determine which fields were explicitly set.
  • __pydantic_extra__: For models that allow extra fields.

If these attributes are missing or malformed, the revalidation process may fail or behave unexpectedly.
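
To make the role of these attributes concrete, here is a rough plain-Python approximation of gathering an instance's state before it is validated again. extract_model_state and FakeModel are hypothetical names introduced for this sketch, and how the library actually reads these attributes may differ in detail.

```python
from typing import Any, Dict, Optional, Set, Tuple


def extract_model_state(instance: Any) -> Tuple[Dict[str, Any], Optional[Set[str]]]:
    """Hypothetical helper: gather the data a revalidation pass would re-check.

    Raises AttributeError if the instance lacks one of the expected attributes.
    """
    data = dict(instance.__dict__)                 # field values
    fields_set = instance.__pydantic_fields_set__  # explicitly set fields
    extra = instance.__pydantic_extra__            # None unless extras are kept
    if extra is not None:
        data.update(extra)                         # fold extras into the input
    return data, fields_set


class FakeModel:
    """Stand-in class for the demo; not a real pydantic model."""
    __slots__ = '__dict__', '__pydantic_fields_set__', '__pydantic_extra__'


m = FakeModel()
m.field_a = 'x'
m.__pydantic_fields_set__ = {'field_a'}
m.__pydantic_extra__ = {'note': 'hi'}

state = extract_model_state(m)
print(state)
```

In this sketch, an instance that never had __pydantic_extra__ assigned raises AttributeError, which is one way a missing attribute can surface.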

Trade-offs and Constraints

Object Identity

A critical side effect of 'always' (for every instance) and 'subclass-instances' (for subclass instances) is the loss of object identity. Because the validator creates a new instance of the class after revalidation, the check output is input evaluates to False. This can be a "gotcha" if your application logic relies on holding a reference to one specific object.

Performance Overhead

Revalidation is not free. It involves:

  1. Iterating over the object's attributes.
  2. Running each attribute through its respective validator.
  3. Instantiating a new Python object.

In high-throughput systems where objects are immutable, or are simply never mutated after creation, the default 'never' strategy skips all three steps and returns the input unchanged.

Coercion in Subclasses

The 'subclass-instances' strategy effectively strips away any subclass-specific data or methods by returning a base class instance. This ensures that the resulting object strictly adheres to the schema of the base class, which is a form of safety when passing data across boundaries where only the base contract is guaranteed.
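
The effect of this coercion can be imitated in plain Python. The sketch below skips the validation step entirely and only shows the rebuild: rebuild_as_base, Base, and Child are hypothetical names, and the field list is passed in by hand as a stand-in for the schema.

```python
from typing import Any, Iterable


def rebuild_as_base(base_cls: type, instance: Any, field_names: Iterable[str]) -> Any:
    """Hypothetical helper: build a fresh `base_cls` object from schema fields only."""
    new_obj = base_cls.__new__(base_cls)  # create the object without calling __init__
    for name in field_names:
        setattr(new_obj, name, getattr(instance, name))
    return new_obj


class Base:
    pass


class Child(Base):
    def shout(self) -> str:
        return self.name.upper()


child = Child()
child.name = 'ada'
child.debug_token = 'secret'  # subclass-only state, not part of the base schema

result = rebuild_as_base(Base, child, ['name'])
print(type(result).__name__, result.name, hasattr(result, 'debug_token'))
```

The result carries only the fields named in the list, has the base type, and is a different object from the input, which matches the identity caveat described earlier.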