
get_heading_text

Extracts the text content from a BeautifulSoup Tag object, removing paragraph symbols, stripping whitespace, and replacing newlines with spaces.

def get_heading_text(heading: Tag) -> str

Extracts and cleans the text content from a BeautifulSoup Tag object by removing paragraph symbols, stripping whitespace, and normalizing newlines.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| heading | Tag | The BeautifulSoup Tag object representing an HTML heading element to be processed. |

Returns

| Type | Description |
| --- | --- |
| str | The sanitized heading text with paragraph markers removed and internal newlines replaced by spaces. |
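
The function body is not shown in this reference, so the following is only a minimal sketch of the documented behavior, not the actual implementation. It assumes that "paragraph symbols" means the pilcrow character (¶) that documentation generators often append to headings as a permalink marker, and that the marker is removed before whitespace is stripped. The helper `clean_heading_text` and the constant `PILCROW` are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type checking only; at runtime any object
    # that provides get_text() will work with this sketch.
    from bs4 import Tag

# Assumption: the "paragraph symbol" is the pilcrow character.
PILCROW = "\u00b6"


def clean_heading_text(raw: str) -> str:
    """Hypothetical helper: apply the three documented cleaning steps."""
    # 1. Remove paragraph symbols.
    without_marker = raw.replace(PILCROW, "")
    # 2. Strip leading and trailing whitespace.
    # 3. Replace the remaining (internal) newlines with spaces.
    return without_marker.strip().replace("\n", " ")


def get_heading_text(heading: Tag) -> str:
    """Sketch of the documented function; the real body may differ."""
    return clean_heading_text(heading.get_text())
```

With BeautifulSoup installed, a heading parsed from `<h2>Getting\nstarted<a href="#x">¶</a></h2>` would yield `"Getting started"` under these assumptions. Keeping the string cleanup in a separate helper is a design choice of this sketch that lets the steps be checked without parsing any HTML.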