Serialization
Serialization in this codebase is the process of converting a validated BaseModel instance into a Python dictionary or a JSON string. Where validation ensures that input data conforms to a schema, serialization controls the output: its format, which fields appear, and how values are transformed for downstream consumption.
The serialization system is powered by pydantic-core, providing high-performance conversion with extensive customization hooks at the model, field, and type levels.
Primary Serialization Methods
The BaseModel class in pydantic/main.py provides two primary methods for serialization: model_dump and model_dump_json.
model_dump
The model_dump method converts a model instance into a Python dictionary. It supports two primary modes:
mode='python' (default): Returns a dictionary where values are Python objects (e.g., datetime objects remain as datetime).
mode='json': Returns a dictionary where values are converted to JSON-compatible types (e.g., datetime objects become ISO-8601 strings).
from pydantic import BaseModel
from datetime import datetime
class User(BaseModel):
    name: str
    created_at: datetime
user = User(name="Alice", created_at=datetime(2024, 1, 1))
# Python mode (default)
print(user.model_dump())
#> {'name': 'Alice', 'created_at': datetime.datetime(2024, 1, 1, 0, 0)}
# JSON mode
print(user.model_dump(mode='json'))
#> {'name': 'Alice', 'created_at': '2024-01-01T00:00:00'}
model_dump_json
The model_dump_json method returns a JSON string representation of the model. It is more efficient than calling json.dumps(model.model_dump(mode='json')) because it uses the optimized pydantic-core serializer directly.
print(user.model_dump_json(indent=2))
# {
#   "name": "Alice",
#   "created_at": "2024-01-01T00:00:00"
# }
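As a quick check of the relationship described above, the following sketch compares the direct JSON output with the two-step route through json.dumps. The Event model and its field names are hypothetical and chosen only for illustration; the comparison is made on the parsed data because the two strings may differ in whitespace.

```python
import json
from datetime import datetime

from pydantic import BaseModel


class Event(BaseModel):  # hypothetical model for illustration
    title: str
    starts_at: datetime


event = Event(title="Launch", starts_at=datetime(2024, 1, 1, 9, 30))

# One step: pydantic-core writes the JSON string directly.
direct = event.model_dump_json()

# Two steps: build a JSON-compatible dict, then encode it with the stdlib.
via_dict = json.dumps(event.model_dump(mode="json"))

# Both routes describe the same data once parsed.
assert json.loads(direct) == json.loads(via_dict)
```

Both routes yield the same data; the one-step call simply avoids building the intermediate dictionary.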
Functional Serializers
For custom logic that applies to specific fields or the entire model, Pydantic provides decorators in pydantic/functional_serializers.py.
Field Serializers
The @field_serializer decorator allows you to customize how specific fields are serialized. It supports two modes:
plain: The decorated function completely replaces the default serialization logic.
wrap: The decorated function receives a handler to call the default logic, allowing you to transform the result.
from pydantic import BaseModel, field_serializer
class StudentModel(BaseModel):
    name: str = 'Jane'
    courses: set[str]

    @field_serializer('courses', when_used='json')
    def serialize_courses_in_order(self, courses: set[str]):
        # Convert set to a sorted list specifically for JSON output
        return sorted(courses)
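The example above uses the default plain mode. A minimal sketch of wrap mode might look like the following, assuming a hypothetical Invoice model whose amount should be rounded after the default serialization has run. The handler argument is the callable that performs the default logic.

```python
from pydantic import BaseModel, SerializerFunctionWrapHandler, field_serializer


class Invoice(BaseModel):  # hypothetical model for illustration
    amount: float

    @field_serializer('amount', mode='wrap')
    def round_amount(self, value: float, handler: SerializerFunctionWrapHandler):
        # Run the default float serialization first, then post-process the result.
        return round(handler(value), 2)


invoice = Invoice(amount=19.98765)
print(invoice.model_dump())
#> {'amount': 19.99}
```

Wrap mode is a reasonable choice when the default output is nearly right and only needs adjusting; plain mode fits better when the default logic is not wanted at all.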
Model Serializers
The @model_serializer decorator provides total control over the serialization of the entire model. This is useful when the output structure needs to differ significantly from the model's internal field structure.
from typing import Literal
from pydantic import BaseModel, model_serializer
class TemperatureModel(BaseModel):
    unit: Literal['C', 'F']
    value: int

    @model_serializer()
    def serialize_model(self):
        # Always serialize to Celsius regardless of internal state
        if self.unit == 'F':
            return {'unit': 'C', 'value': int((self.value - 32) / 1.8)}
        return {'unit': self.unit, 'value': self.value}
Annotation-Based Serialization
You can also define serialization logic directly within type annotations using Annotated. This is ideal for creating reusable custom types.
PlainSerializer and WrapSerializer
These classes in pydantic/functional_serializers.py mirror the behavior of the functional decorators but are used within type hints.
from typing import Annotated
from pydantic import BaseModel, PlainSerializer
# A reusable type that serializes a list into a space-separated string
CustomStr = Annotated[
    list[str],
    PlainSerializer(lambda x: ' '.join(x), return_type=str)
]

class StudentModel(BaseModel):
    courses: CustomStr
student = StudentModel(courses=['Math', 'Chemistry'])
print(student.model_dump())
#> {'courses': 'Math Chemistry'}
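The example above covers PlainSerializer only. A comparable sketch for WrapSerializer is shown below; the Percent type, the tag_percent function, and the Report model are hypothetical names introduced for illustration. As with wrap-mode field serializers, the function receives a handler that runs the default serialization.

```python
from typing import Annotated, Any

from pydantic import BaseModel, SerializerFunctionWrapHandler, WrapSerializer


def tag_percent(value: Any, handler: SerializerFunctionWrapHandler) -> str:
    # Let the default int serialization run, then append a percent sign.
    return f'{handler(value)}%'


# A reusable type that serializes an int as a percentage string
Percent = Annotated[int, WrapSerializer(tag_percent, return_type=str)]


class Report(BaseModel):  # hypothetical model for illustration
    coverage: Percent


report = Report(coverage=87)
print(report.model_dump())
#> {'coverage': '87%'}
```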
SerializeAsAny
By default, Pydantic serializes fields based on their annotated type. If a field contains a subclass instance, the extra fields of the subclass are typically lost during serialization. SerializeAsAny forces "duck-typing" serialization, preserving the actual runtime type of the object.
from pydantic import BaseModel, SerializeAsAny
class Parent(BaseModel):
    x: int

class Child(Parent):
    y: int

class Model(BaseModel):
    # Without SerializeAsAny, 'y' would be lost if a Child is passed
    data: SerializeAsAny[Parent]
m = Model(data=Child(x=1, y=2))
print(m.model_dump())
#> {'data': {'x': 1, 'y': 2}}
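To make the contrast concrete, the sketch below places the default behavior next to the SerializeAsAny variant. The Animal, Dog, StrictShelter, and DuckShelter names are hypothetical and exist only for this illustration.

```python
from pydantic import BaseModel, SerializeAsAny


class Animal(BaseModel):
    name: str


class Dog(Animal):
    breed: str


class StrictShelter(BaseModel):
    # Serialized according to the annotated type: Animal's fields only
    resident: Animal


class DuckShelter(BaseModel):
    # Serialized according to the runtime type of the value
    resident: SerializeAsAny[Animal]


dog = Dog(name='Rex', breed='Collie')

strict = StrictShelter(resident=dog).model_dump()
duck = DuckShelter(resident=dog).model_dump()

print(strict)
#> {'resident': {'name': 'Rex'}}
print(duck)
#> {'resident': {'name': 'Rex', 'breed': 'Collie'}}
```

In both models the stored value is still a Dog instance; only the serialized output differs.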
Global Configuration
The ConfigDict in pydantic/config.py contains several settings that influence serialization behavior globally for a model:
ser_json_temporal: Controls the format of datetime, date, time, and timedelta in JSON. Options include 'iso8601', 'seconds', and 'milliseconds'.
ser_json_bytes: Determines how bytes are encoded in JSON (e.g., 'utf8', 'base64', or 'hex').
ser_json_inf_nan: Controls how infinity and NaN float values are represented in JSON ('null', 'constants', or 'strings').
use_enum_values: When True, enum fields are populated with the enum member's value during validation, so serialized output contains raw values rather than enum objects.
from datetime import datetime
from pydantic import BaseModel, ConfigDict

class MyModel(BaseModel):
    model_config = ConfigDict(
        ser_json_bytes='base64',
        ser_json_temporal='milliseconds'
    )

    data: bytes
    timestamp: datetime
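To show the effect of one of these settings on actual output, the sketch below compares the default bytes encoding with ser_json_bytes='base64'. The DefaultPayload and Base64Payload models are hypothetical; only the bytes setting is exercised here, and the other options are assumed to be used in the same way.

```python
from pydantic import BaseModel, ConfigDict


class DefaultPayload(BaseModel):
    # No configuration: bytes are written as UTF-8 text in JSON
    data: bytes


class Base64Payload(BaseModel):
    model_config = ConfigDict(ser_json_bytes='base64')

    data: bytes


default_json = DefaultPayload(data=b'hi').model_dump_json()
base64_json = Base64Payload(data=b'hi').model_dump_json()

print(default_json)
#> {"data":"hi"}
print(base64_json)
#> {"data":"aGk="}
```

The setting belongs to the model class, so every JSON dump of Base64Payload uses it without any extra arguments at the call site.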
Filtering and Aliases
Both model_dump and model_dump_json accept parameters to control the output:
include/exclude: Specify exactly which fields should be present in the output.
by_alias: If True, uses the field's alias (defined in Field(alias=...)) as the key in the resulting dictionary/JSON.
exclude_unset: Only includes fields that were explicitly set during model instantiation.
exclude_none: Removes fields that have a value of None.
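The parameters above can be sketched on a single instance as follows. The Account model and its fields are hypothetical; because user_name declares an alias and no other configuration is set, the instance is created using the alias.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Account(BaseModel):  # hypothetical model for illustration
    user_name: str = Field(alias='userName')
    email: Optional[str] = None
    active: bool = True


# Only user_name is set explicitly; email and active keep their defaults.
account = Account(userName='alice')

print(account.model_dump())
#> {'user_name': 'alice', 'email': None, 'active': True}
print(account.model_dump(by_alias=True))
#> {'userName': 'alice', 'email': None, 'active': True}
print(account.model_dump(exclude_unset=True))
#> {'user_name': 'alice'}
print(account.model_dump(exclude_none=True))
#> {'user_name': 'alice', 'active': True}
print(account.model_dump(include={'user_name', 'active'}))
#> {'user_name': 'alice', 'active': True}
print(account.model_dump(exclude={'email'}))
#> {'user_name': 'alice', 'active': True}
```

Note that exclude_unset and exclude_none answer different questions: the first drops active because it was never passed in, while the second keeps it because its value is not None.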