ceil_char_boundary

def ceil_char_boundary(
    raw_bytes: bytes,
    offset: int
) -> int

Finds the nearest character boundary at or after a given byte offset in a UTF-8 encoded byte string. Slicing at the returned index does not split a multi-byte character, which prevents decoding errors on the resulting slices.
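The short example below illustrates the problem this function addresses. It uses only the standard library; the helper name `decodes_cleanly` and the sample string are assumptions chosen for illustration and are not part of this API.

```python
# "é" occupies two bytes in UTF-8 (0xC3 0xA9), so byte offset 2 falls inside it.
data = "héllo".encode("utf-8")  # b'h\xc3\xa9llo'


def decodes_cleanly(chunk: bytes) -> bool:
    """Hypothetical helper: return True if chunk is valid UTF-8 on its own."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Slicing at offset 2 cuts "é" in half, so the slice cannot be decoded.
split_inside = decodes_cleanly(data[:2])

# Offset 3 is the first boundary at or after 2, so this slice decodes fine.
split_on_boundary = decodes_cleanly(data[:3])
```

For this input, offset 3 is the boundary that a ceiling search starting at offset 2 would be expected to land on.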

Parameters

Name       Type   Description
raw_bytes  bytes  The UTF-8 encoded byte sequence to be indexed.
offset     int    The starting byte position from which to search for the next valid character boundary.

Returns

Type  Description
int   The byte index of the first character boundary that is greater than or equal to the provided offset.
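One way such a search could work is sketched below. This is an illustrative re-implementation under stated assumptions, not the documented function's actual source: the name `ceil_char_boundary_sketch` is hypothetical, and the handling of negative offsets (rejected) and offsets past the end (clamped to the length) is assumed, since the reference above does not specify either case. The sketch relies on the fact that UTF-8 continuation bytes always match the bit pattern `10xxxxxx`.

```python
def ceil_char_boundary_sketch(raw_bytes: bytes, offset: int) -> int:
    """Illustrative sketch only; edge-case behavior here is an assumption."""
    if offset < 0:
        # Assumption: negative offsets are rejected rather than wrapped.
        raise ValueError("offset must be non-negative")

    length = len(raw_bytes)
    if offset >= length:
        # Assumption: offsets at or past the end clamp to the end,
        # which is always a character boundary.
        return length

    # Skip forward over continuation bytes (0b10xxxxxx). The first byte
    # that is not a continuation byte starts a new character.
    while offset < length and (raw_bytes[offset] & 0xC0) == 0x80:
        offset += 1
    return offset


# Usage: "é" is two bytes, so offset 2 lies inside it and rounds up to 3.
sample = "héllo".encode("utf-8")
boundary = ceil_char_boundary_sketch(sample, 2)
prefix = sample[:boundary].decode("utf-8")
```

Because a UTF-8 character is at most four bytes long, the loop advances at most three positions on valid input.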