Filtering Collection Serialization

To control which items are emitted when serializing collections such as lists, tuples, or dictionaries, set the serialization parameter on the collection's core schema. Use filter_seq_schema for sequences and filter_dict_schema for dictionaries.

Filtering Sequences

You can filter sequences such as lists and tuples by specifying a set of integer indices to include or exclude. Index-based filtering is only meaningful for ordered collections. The filter is an IncExSeqSerSchema, built with the filter_seq_schema helper.

from pydantic_core import SchemaSerializer, core_schema

# Only include elements at indices 1, 3, and 5
v = SchemaSerializer(
    core_schema.list_schema(
        core_schema.any_schema(),
        serialization=core_schema.filter_seq_schema(include={1, 3, 5})
    )
)

assert v.to_python([0, 1, 2, 3, 4, 5, 6]) == [1, 3, 5]
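
The schema-level filter should apply to JSON output as well. The following is a minimal sketch, assuming to_json returns compact JSON bytes; the variable names are illustrative. It also shows an exclude-only filter, which keeps every index that is not listed.

```python
from pydantic_core import SchemaSerializer, core_schema

# Include-only filter: keep indices 1, 3, and 5 where they exist
include_ser = SchemaSerializer(
    core_schema.list_schema(
        core_schema.any_schema(),
        serialization=core_schema.filter_seq_schema(include={1, 3, 5}),
    )
)

# Exclude-only filter: keep every index except 0 and 2
exclude_ser = SchemaSerializer(
    core_schema.list_schema(
        core_schema.any_schema(),
        serialization=core_schema.filter_seq_schema(exclude={0, 2}),
    )
)

data = ['a', 'b', 'c', 'd']

# Index 5 is out of range for this input, so only indices 1 and 3 are emitted
json_included = include_ser.to_json(data)

# Indices 0 and 2 are dropped
python_excluded = exclude_ser.to_python(data)
```

Indices listed in include that do not exist in the input are simply not emitted.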

Combining Include and Exclude

When both include and exclude are provided, the serializer first filters for included indices and then removes any that are also in the exclude set.

from pydantic_core import SchemaSerializer, core_schema

v = SchemaSerializer(
    core_schema.list_schema(
        core_schema.any_schema(),
        serialization=core_schema.filter_seq_schema(
            include={1, 3, 5},
            exclude={5, 6}
        )
    )
)

# Index 5 is included but then excluded
assert v.to_python([0, 1, 2, 3, 4, 5, 6, 7]) == [1, 3]
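
The schema-level rule described above can be modelled in plain Python. This is a simplified sketch for reasoning about results, not the library's implementation; apply_schema_filter is a hypothetical helper name.

```python
from typing import Any, List, Optional, Set


def apply_schema_filter(
    items: List[Any],
    include: Optional[Set[int]] = None,
    exclude: Optional[Set[int]] = None,
) -> List[Any]:
    """Hypothetical model: keep an index if it passes include, then exclude."""
    kept = []
    for index, item in enumerate(items):
        # With an include set, only listed indices are candidates
        if include is not None and index not in include:
            continue
        # An exclude set removes candidates, even ones that were included
        if exclude is not None and index in exclude:
            continue
        kept.append(item)
    return kept


both = apply_schema_filter(list(range(8)), include={1, 3, 5}, exclude={5, 6})
only_exclude = apply_schema_filter(list(range(8)), exclude={5, 6})
```

With both sets present, an index must be in include and absent from exclude to survive, which is why index 5 is dropped in the example above.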

Filtering Dictionaries

For dictionaries, you filter by keys (strings or integers) using IncExDictSerSchema via the filter_dict_schema helper.

from pydantic_core import SchemaSerializer, core_schema

# Only include keys 'a' and 'c'
s = SchemaSerializer(
    core_schema.dict_schema(
        serialization=core_schema.filter_dict_schema(include={'a', 'c'})
    )
)

assert s.to_python({'a': 1, 'b': 2, 'c': 3, 'd': 4}) == {'a': 1, 'c': 3}
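
An exclude-only dictionary filter works the same way in reverse. The sketch below assumes that key order is preserved and that to_json emits compact JSON bytes; the variable names are illustrative.

```python
from pydantic_core import SchemaSerializer, core_schema

# Drop key 'b', keep everything else
s = SchemaSerializer(
    core_schema.dict_schema(
        serialization=core_schema.filter_dict_schema(exclude={'b'})
    )
)

data = {'a': 1, 'b': 2, 'c': 3}

as_python = s.to_python(data)  # filtered dict
as_json = s.to_json(data)      # filtered JSON bytes
```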

Schema vs. Runtime Filtering

Filters defined in the CoreSchema provide default behavior, but they can be overridden or extended during the call to to_python or to_json.

Union Behavior

When a runtime include argument is provided, it can "bring back" items that were excluded by the schema-level filter. A runtime include is combined with the schema-level include as a union, and an index or key named in the runtime include is emitted even if the schema-level filter excludes it. In that sense the runtime arguments take precedence over the schema defaults.

from pydantic_core import SchemaSerializer, core_schema

v = SchemaSerializer(
    core_schema.list_schema(
        core_schema.any_schema(),
        serialization=core_schema.filter_seq_schema(exclude={0, 1})
    )
)

# Default behavior: indices 0 and 1 are excluded
assert v.to_python([0, 1, 2, 3]) == [2, 3]

# Runtime 'include' overrides the schema-level 'exclude'
assert v.to_python([0, 1, 2, 3], include={1, 2}) == [1, 2]
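
The union behavior is easiest to see when the schema itself defines an include set. The sketch below assumes the runtime include and exclude keyword arguments of to_python behave as described in this section; the variable names are illustrative.

```python
from pydantic_core import SchemaSerializer, core_schema

v = SchemaSerializer(
    core_schema.list_schema(
        core_schema.any_schema(),
        serialization=core_schema.filter_seq_schema(include={1, 3, 5}),
    )
)

data = list(range(10))

# Schema default: only indices 1, 3, and 5
default_result = v.to_python(data)

# Runtime include is unioned with the schema include, adding index 6
widened = v.to_python(data, include={6})

# Runtime exclude removes index 3 from the schema-level selection
narrowed = v.to_python(data, exclude={3})
```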

Advanced Patterns

Using the __all__ Keyword

The special key '__all__' applies a filter to every item of a collection. Placed in an exclude set, it drops every item. Used as a key in a nested include or exclude dictionary, it applies the nested filter to each item.

from pydantic_core import SchemaSerializer, core_schema

s = SchemaSerializer(core_schema.dict_schema())

# Exclude everything using the runtime argument
assert s.to_python({'a': 1, 'b': 2}, exclude={'__all__'}) == {}

Nested Filtering

Runtime arguments allow for complex nested filtering that goes beyond simple sets of indices or keys. You can pass dictionaries to include or exclude to filter deep into the data structure.

from pydantic_core import SchemaSerializer, core_schema

# A list of dictionaries
v = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
s = SchemaSerializer(core_schema.list_schema(core_schema.dict_schema()))

# Exclude key 'a' from all elements in the list
assert s.to_python(v, exclude={'__all__': {'a'}}) == [{'b': 2}, {'b': 4}]
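
Nested filters can also target a single element by its index instead of '__all__'. This is a minimal sketch, assuming integer keys in the runtime dictionary select list positions; the variable names are illustrative.

```python
from pydantic_core import SchemaSerializer, core_schema

rows = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
s = SchemaSerializer(core_schema.list_schema(core_schema.dict_schema()))

# Exclude key 'a' from the first element only; the second is untouched
first_without_a = s.to_python(rows, exclude={0: {'a'}})

# Include only key 'b' of the second element; other elements are omitted
second_b_only = s.to_python(rows, include={1: {'b'}})
```

Note the asymmetry: a nested exclude leaves unlisted elements intact, while a nested include drops every element it does not name.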

Troubleshooting

Negative Indexing

Runtime include and exclude arguments for sequences support negative indexing (e.g., -1 for the last element). The schema-level filter_seq_schema does not: it requires absolute, non-negative integer indices.
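
A short sketch of runtime negative indexing, assuming negative values are resolved against the length of the sequence being serialized; the variable names are illustrative.

```python
from pydantic_core import SchemaSerializer, core_schema

s = SchemaSerializer(core_schema.list_schema(core_schema.any_schema()))

data = [0, 1, 2, 3]

# -1 refers to the last element, so index 3 is dropped
without_last = s.to_python(data, exclude={-1})

# -3 and -2 refer to indices 1 and 2
middle_two = s.to_python(data, include={-3, -2})
```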

Type Consistency

Ensure that the types passed to the filter schemas match the collection type:

  • filter_seq_schema expects a set[int].
  • filter_dict_schema expects a set[int | str].

Passing incorrect types (like a string key to a sequence filter) will result in validation errors during schema construction or unexpected serialization results.