on_page_content
Processes and cleans HTML page content for search indexing when running in a CI environment. It modifies the DOM to ensure proper heading structures, removes UI-specific elements like source code embeds, and populates a global records list with structured data for Algolia indexing.
def on_page_content(
html: string,
page: Page,
config: Config,
files: Files
) -> string
Processes and cleans up page HTML for search indexing during CI builds, ensuring pages have proper heading structures and simplified code blocks. It extracts content sections into Algolia search records while stripping UI elements like line numbers and source code embeds.
Parameters
| Name | Type | Description |
|---|---|---|
| html | string | The raw HTML content of the page to be processed for indexing. |
| page | Page | The MkDocs page object containing metadata such as the title and absolute URL used for record identification. |
| config | Config | The global MkDocs configuration object. |
| files | Files | The collection of files being processed in the current documentation build. |
Returns
| Type | Description |
|---|---|
| string | The original HTML string, returned unmodified. The function's internal transformations exist primarily to generate search records as a side effect. |
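
The behavior described above (CI gating, record generation as a side effect, and returning the input unchanged) can be illustrated with a minimal sketch. This is not the actual implementation: it uses only the standard library's `html.parser` rather than whatever DOM library the real hook relies on, and several details are assumptions made for illustration. The `CI` environment variable name, the `StubPage` class, the `SectionParser` helper, the record field names, and the rule of prepending an `<h1>` built from the page title when the page does not open with a heading are all hypothetical.

```python
import os
from dataclasses import dataclass
from html import escape
from html.parser import HTMLParser

# Module-level list that collects search records as a side effect.
records: list[dict[str, str]] = []


@dataclass
class StubPage:
    """Hypothetical stand-in for the MkDocs Page object."""
    title: str
    abs_url: str


class SectionParser(HTMLParser):
    """Hypothetical helper: gathers text under each h1/h2, skipping <details>."""

    def __init__(self) -> None:
        super().__init__()
        self.sections: list[dict[str, str]] = []
        self._in_heading = False
        self._details_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "details":
            self._details_depth += 1
        elif tag in ("h1", "h2") and not self._details_depth:
            self._in_heading = True
            self.sections.append({"heading": "", "content": ""})

    def handle_endtag(self, tag):
        if tag == "details" and self._details_depth:
            self._details_depth -= 1
        elif tag in ("h1", "h2"):
            self._in_heading = False

    def handle_data(self, data):
        # Ignore text inside <details> and text before the first heading.
        if self._details_depth or not self.sections:
            return
        key = "heading" if self._in_heading else "content"
        self.sections[-1][key] += data


def on_page_content(html, page, config=None, files=None):
    # Assumption: CI builds are detected through a "CI" environment variable.
    if not os.getenv("CI"):
        return html

    # Work on a copy so the returned HTML stays untouched.
    working = html
    if not html.lstrip().startswith(("<h1", "<h2", "<h3")):
        # Assumption: pages without a leading heading get one from the title.
        working = f"<h1>{escape(page.title)}</h1>" + html

    parser = SectionParser()
    parser.feed(working)
    parser.close()

    for section in parser.sections:
        records.append({
            "title": section["heading"].strip(),
            "content": " ".join(section["content"].split()),
            "abs_url": page.abs_url,
        })
    return html
```

In this sketch, calling `on_page_content` outside CI is a no-op, while inside CI it appends one record per `h1`/`h2` section to `records` and still returns the input string as-is, which mirrors the return value documented in the table above.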