Serialization

Serialization in this codebase is the process of converting a validated BaseModel instance back into a Python dictionary or a JSON-compatible string. Unlike validation, which ensures data conforms to a schema, serialization allows you to control the output format, filter fields, and transform values for downstream consumption.

The serialization system is powered by pydantic-core, providing high-performance conversion with extensive customization hooks at the model, field, and type levels.

Primary Serialization Methods

The BaseModel class in pydantic/main.py provides two primary methods for serialization: model_dump and model_dump_json.

model_dump

The model_dump method converts a model instance into a Python dictionary. It supports two primary modes:

  • mode='python' (default): Returns a dictionary where values are Python objects (e.g., datetime objects remain as datetime).
  • mode='json': Returns a dictionary where values are converted to JSON-compatible types (e.g., datetime objects become ISO-8601 strings).

from pydantic import BaseModel
from datetime import datetime

class User(BaseModel):
    name: str
    created_at: datetime

user = User(name="Alice", created_at=datetime(2024, 1, 1))

# Python mode (default)
print(user.model_dump())
#> {'name': 'Alice', 'created_at': datetime.datetime(2024, 1, 1, 0, 0)}

# JSON mode
print(user.model_dump(mode='json'))
#> {'name': 'Alice', 'created_at': '2024-01-01T00:00:00'}

model_dump_json

The model_dump_json method returns a JSON string representation of the model. It is more efficient than calling json.dumps(model.model_dump(mode='json')) because it uses the optimized pydantic-core serializer directly.

print(user.model_dump_json(indent=2))
# {
#   "name": "Alice",
#   "created_at": "2024-01-01T00:00:00"
# }
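As a rough check of the equivalence described above, the sketch below serializes the same instance by both routes and compares the parsed results. It redefines the User model from the earlier example so that it runs on its own; the names via_dict and direct are illustrative only.

```python
import json
from datetime import datetime

from pydantic import BaseModel


class User(BaseModel):
    name: str
    created_at: datetime


user = User(name="Alice", created_at=datetime(2024, 1, 1))

# Two-step route: build a JSON-compatible dict, then encode it with the stdlib.
via_dict = json.dumps(user.model_dump(mode="json"))

# One-step route: the pydantic-core serializer writes the JSON string directly.
direct = user.model_dump_json()

# Whitespace can differ between the two strings, so compare the parsed data.
print(json.loads(via_dict) == json.loads(direct))
#> True
```

The two strings are not guaranteed to be byte-identical, which is why the comparison is made on the decoded data rather than on the raw text.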

Functional Serializers

For custom logic that applies to specific fields or the entire model, Pydantic provides decorators in pydantic/functional_serializers.py.

Field Serializers

The @field_serializer decorator allows you to customize how specific fields are serialized. It supports two modes:

  • plain: The decorated function completely replaces the default serialization logic.
  • wrap: The decorated function receives a handler to call the default logic, allowing you to transform the result.

from pydantic import BaseModel, field_serializer

class StudentModel(BaseModel):
    name: str = 'Jane'
    courses: set[str]

    @field_serializer('courses', when_used='json')
    def serialize_courses_in_order(self, courses: set[str]):
        # Convert set to a sorted list specifically for JSON output
        return sorted(courses)
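The example above uses the default plain mode. A minimal sketch of wrap mode might look like the following; the Order model, its prices field, and the rounding rule are hypothetical and chosen only to show the handler being called before the result is post-processed.

```python
from pydantic import BaseModel, SerializerFunctionWrapHandler, field_serializer


class Order(BaseModel):
    prices: list[float]

    @field_serializer("prices", mode="wrap")
    def round_prices(self, value: list[float], handler: SerializerFunctionWrapHandler):
        # Run the default serialization first, then transform its result.
        serialized = handler(value)
        return [round(price, 2) for price in serialized]


order = Order(prices=[1.239, 2.3456])
print(order.model_dump())
#> {'prices': [1.24, 2.35]}
```

Calling the handler keeps the standard behaviour for the field's type, so the custom function only has to express the extra step.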

Model Serializers

The @model_serializer decorator provides total control over the serialization of the entire model. This is useful when the output structure needs to differ significantly from the model's internal field structure.

from typing import Literal
from pydantic import BaseModel, model_serializer

class TemperatureModel(BaseModel):
    unit: Literal['C', 'F']
    value: int

    @model_serializer()
    def serialize_model(self):
        # Always serialize to Celsius regardless of internal state
        if self.unit == 'F':
            return {'unit': 'C', 'value': int((self.value - 32) / 1.8)}
        return {'unit': self.unit, 'value': self.value}
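When only a small adjustment to the default output is needed, @model_serializer can also be used with mode='wrap', which passes in a handler that produces the standard dictionary. The sketch below is illustrative: the Account model and the masking rule are assumptions, not part of this codebase.

```python
from pydantic import BaseModel, SerializerFunctionWrapHandler, model_serializer


class Account(BaseModel):
    username: str
    password: str

    @model_serializer(mode="wrap")
    def mask_password(self, handler: SerializerFunctionWrapHandler) -> dict[str, object]:
        # Let the default serializer build the dict, then adjust one entry.
        data = handler(self)
        data["password"] = "***"
        return data


account = Account(username="alice", password="s3cret")
print(account.model_dump())
#> {'username': 'alice', 'password': '***'}
```

The model instance itself is untouched; only the serialized output carries the masked value.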

Annotation-Based Serialization

You can also define serialization logic directly within type annotations using Annotated. This is ideal for creating reusable custom types.

PlainSerializer and WrapSerializer

These classes in pydantic/functional_serializers.py mirror the behavior of the functional decorators but are used within type hints.

from typing import Annotated
from pydantic import BaseModel, PlainSerializer

# A reusable type that serializes a list into a space-separated string
CustomStr = Annotated[
    list[str],
    PlainSerializer(lambda x: ' '.join(x), return_type=str)
]

class StudentModel(BaseModel):
    courses: CustomStr

student = StudentModel(courses=['Math', 'Chemistry'])
print(student.model_dump())
#> {'courses': 'Math Chemistry'}
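The example above covers PlainSerializer. A comparable sketch for WrapSerializer follows; the ShoutedStr alias, the Banner model, and the upper-casing rule are hypothetical and serve only to show the handler being invoked inside an annotation-based serializer.

```python
from typing import Annotated, Any

from pydantic import BaseModel, SerializerFunctionWrapHandler, WrapSerializer


def upper_after_default(value: Any, handler: SerializerFunctionWrapHandler) -> str:
    # Apply the default serialization for the type, then post-process the result.
    return handler(value).upper()


# A reusable type whose serialized form is always upper-cased.
ShoutedStr = Annotated[str, WrapSerializer(upper_after_default)]


class Banner(BaseModel):
    text: ShoutedStr


banner = Banner(text="hello")
print(banner.model_dump())
#> {'text': 'HELLO'}
```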

SerializeAsAny

By default, Pydantic serializes fields based on their annotated type. If a field contains a subclass instance, the extra fields of the subclass are typically lost during serialization. SerializeAsAny forces "duck-typing" serialization, preserving the actual runtime type of the object.

from pydantic import BaseModel, SerializeAsAny

class Parent(BaseModel):
    x: int

class Child(Parent):
    y: int

class Model(BaseModel):
    # Without SerializeAsAny, 'y' would be lost if a Child is passed
    data: SerializeAsAny[Parent]

m = Model(data=Child(x=1, y=2))
print(m.model_dump())
#> {'data': {'x': 1, 'y': 2}}
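To make the difference concrete, the sketch below places the two annotations side by side. The model names StrictOutput and DuckOutput are hypothetical; Parent and Child mirror the classes above.

```python
from pydantic import BaseModel, SerializeAsAny


class Parent(BaseModel):
    x: int


class Child(Parent):
    y: int


class StrictOutput(BaseModel):
    # Serialized according to the annotated type, Parent.
    data: Parent


class DuckOutput(BaseModel):
    # Serialized according to the runtime type of the value.
    data: SerializeAsAny[Parent]


child = Child(x=1, y=2)

print(StrictOutput(data=child).model_dump())
#> {'data': {'x': 1}}

print(DuckOutput(data=child).model_dump())
#> {'data': {'x': 1, 'y': 2}}
```

In both cases the stored value is still a Child instance; only the serialized output differs.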

Global Configuration

The ConfigDict in pydantic/config.py contains several settings that influence serialization behavior globally for a model:

  • ser_json_temporal: Controls the format of datetime, date, time, and timedelta in JSON. Options include 'iso8601', 'seconds', and 'milliseconds'.
  • ser_json_bytes: Determines how bytes are encoded in JSON (e.g., 'utf8', 'base64', or 'hex').
  • ser_json_inf_nan: Controls how infinity and NaN float values are represented in JSON ('null', 'constants', or 'strings').
  • use_enum_values: When True, fields are populated with each enum member's value rather than the enum object itself. This is applied during validation, so the raw values are what later get serialized.

from datetime import datetime

from pydantic import BaseModel, ConfigDict

class MyModel(BaseModel):
    model_config = ConfigDict(
        ser_json_bytes='base64',
        ser_json_temporal='milliseconds'
    )
    data: bytes
    timestamp: datetime
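A small sketch of how two of these settings show up in the output is given below. The Payload model is hypothetical, and the example assumes a Pydantic version that accepts 'strings' for ser_json_inf_nan. The encoded bytes are decoded again rather than printed, so the example does not depend on the exact Base64 alphabet in use.

```python
import base64
import json

from pydantic import BaseModel, ConfigDict


class Payload(BaseModel):
    model_config = ConfigDict(ser_json_bytes="base64", ser_json_inf_nan="strings")

    data: bytes
    ratio: float


payload = Payload(data=b"hi", ratio=float("inf"))

# Parse the JSON string back into plain Python values for inspection.
decoded = json.loads(payload.model_dump_json())

# The bytes field was emitted as Base64 text and round-trips to the original.
print(base64.urlsafe_b64decode(decoded["data"]))
#> b'hi'

# Infinity was emitted as a JSON string instead of null.
print(decoded["ratio"])
#> Infinity
```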

Filtering and Aliases

Both model_dump and model_dump_json accept parameters to control the output:

  • include/exclude: Specify exactly which fields should be present in the output.
  • by_alias: If True, uses the field's alias (defined in Field(alias=...)) as the key in the resulting dictionary/JSON.
  • exclude_unset: Only includes fields that were explicitly set during model instantiation.
  • exclude_none: Removes fields that have a value of None.
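The parameters above can be compared side by side on a single instance. The Profile model and its userName alias below are hypothetical and used only for illustration.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Profile(BaseModel):
    user_name: str = Field(alias="userName")
    email: Optional[str] = None
    age: int = 0


# Only user_name is set explicitly; email and age fall back to their defaults.
profile = Profile(userName="alice")

print(profile.model_dump())
#> {'user_name': 'alice', 'email': None, 'age': 0}

print(profile.model_dump(by_alias=True))
#> {'userName': 'alice', 'email': None, 'age': 0}

print(profile.model_dump(exclude_unset=True))
#> {'user_name': 'alice'}

print(profile.model_dump(exclude_none=True))
#> {'user_name': 'alice', 'age': 0}

print(profile.model_dump(include={"user_name", "age"}))
#> {'user_name': 'alice', 'age': 0}

print(profile.model_dump(exclude={"email"}))
#> {'user_name': 'alice', 'age': 0}
```

Note that exclude_unset drops age even though it has a value, because that value came from the default rather than from the caller.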