Custom Templates

pandas-cat uses Jinja2 to render HTML reports. A single unified renderer handles all template types. The rendering mode (static or interactive) is auto-detected from a tag inside the template file — no separate renderer flag is needed.

All Jinja2 environments use autoescape=True. User-supplied strings (column names, dataset title) are escaped at render time, so do not apply the | safe filter to them unless you intend to inject raw HTML.

Template lookup

| template= | File loaded | Mode | Detected via |
| --- | --- | --- | --- |
| None / 'default' | templates/default.html.j2 | static | (no tag; static by default) |
| 'modern' | templates/modern.html.j2 | static | (no tag; static by default) |
| 'interactive' | templates/interactive/interactive.html.j2 | interactive | {# pandas-cat: mode=interactive #} |
| Any file-system path | that file | auto-detected | tag in file, or static if absent |

Mode declaration tag

Add this Jinja2 comment anywhere in your template to opt into interactive mode (raw data, no SVG pre-rendering):

{# pandas-cat: mode=interactive #}

Omitting the tag causes the engine to treat the template as static (SVG charts are pre-rendered and embedded as base64 strings before the template receives the context).
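As a rough illustration of the detection rule, the sketch below scans template source for the tag and falls back to static. The function name detect_mode and the whitespace-tolerant regular expression are assumptions made for this example; the actual matching logic inside pandas-cat may differ.

```python
import re

# Hypothetical pattern: the comment tag documented above, allowing
# flexible whitespace. pandas-cat's real matching rule may be stricter.
INTERACTIVE_TAG = re.compile(r"\{#\s*pandas-cat:\s*mode=interactive\s*#\}")


def detect_mode(template_source: str) -> str:
    """Return "interactive" if the tag appears anywhere, else "static"."""
    if INTERACTIVE_TAG.search(template_source):
        return "interactive"
    return "static"


tagged = "{# pandas-cat: mode=interactive #}\n<html>{{ title }}</html>"
untagged = "<html>{{ title }}</html>"
print(detect_mode(tagged))
print(detect_mode(untagged))
```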

Creating a custom template

  1. Write a .html.j2 file anywhere on the file system.

  2. Add {# pandas-cat: mode=interactive #} near the top if you want the interactive context (raw data); omit it for the static context (SVG charts).

  3. Pass the path to profile():

    pandas_cat.profile(df, template="/path/to/my_report.html.j2")
    

    A relative path is resolved from the current working directory. A ValueError is raised if the file does not exist.

  4. The template receives the full context for its mode as keyword arguments (see the reference sections below). Extra context keys are silently ignored by Jinja2, so your template only needs to reference the variables it uses.

Note

The FileSystemLoader root for a custom template is the directory that contains the template file. Relative {% include %} or {% extends %} paths inside the template resolve from that directory.
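The path handling described above can be sketched with the standard library alone. The helper split_template_path is hypothetical and only mirrors the documented behaviour (relative paths resolved from the current working directory, ValueError for a missing file, loader root equal to the containing directory); it is not the library's own code.

```python
import tempfile
from pathlib import Path


def split_template_path(template: str) -> tuple[str, str]:
    """Return (loader_root, template_name) for a custom template path."""
    path = Path(template)
    if not path.is_absolute():
        # Relative paths are resolved from the current working directory.
        path = Path.cwd() / path
    if not path.is_file():
        raise ValueError(f"Template file not found: {path}")
    # The directory becomes the loader root; includes resolve from it.
    return str(path.parent), path.name


with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "my_report.html.j2"
    report.write_text("<html>{{ title }}</html>", encoding="utf-8")
    root, name = split_template_path(str(report))
    print(name)
```

Under this assumption, a template at /reports/my_report.html.j2 that includes "partials/header.html.j2" would load /reports/partials/header.html.j2.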

Changing the module-level default

template_name is a public module-level variable that controls which built-in template is used when template=None or template='default':

import pandas_cat
pandas_cat.template_name = "myname.html.j2"
pandas_cat.profile(df)

The named file must live inside pandas_cat/templates/. For templates outside the package directory, pass a full path instead (see above).


Context reference

The context passed to a template depends on its mode (determined by the presence or absence of the {# pandas-cat: mode=interactive #} tag).

Common variables

Both modes receive these top-level variables:

| Variable | Type | Description |
| --- | --- | --- |
| title | str | Title shown in the report header. Defaults to "DataFrame" when the caller passes None. |
| version_string | str | pandas-cat version string, e.g. "0.1.5". |
| warning_info | list[dict] | One entry per user-visible warning (column excluded, limit exceeded). Each item has two string keys: type (Bootstrap alert class, e.g. "alert-warning") and text (human-readable message). |
| records_count | int | Number of rows in the report DataFrame. |
| attribute_count | int | Number of columns in the report (after filtering). |
| missing_count | int | Total count of missing cells across the whole DataFrame. |
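To make the shape of these variables concrete, here is a small sketch that builds a context by hand and turns it into plain header lines. Every value in the sample dict is invented, and header_lines is a hypothetical helper, not part of pandas-cat.

```python
# Hypothetical sample of the common context; all values are made up.
context = {
    "title": "DataFrame",
    "version_string": "0.1.5",
    "warning_info": [
        {"type": "alert-warning", "text": "Column 'id' excluded."},
    ],
    "records_count": 1000,
    "attribute_count": 12,
    "missing_count": 37,
}


def header_lines(ctx: dict) -> list[str]:
    """Summarise the common variables as plain-text lines."""
    lines = [
        f"{ctx['title']} (pandas-cat {ctx['version_string']})",
        f"{ctx['records_count']} rows, {ctx['attribute_count']} columns, "
        f"{ctx['missing_count']} missing cells",
    ]
    # Each warning carries a Bootstrap alert class and a message.
    for warning in ctx["warning_info"]:
        lines.append(f"[{warning['type']}] {warning['text']}")
    return lines


for line in header_lines(context):
    print(line)
```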

attribute_profiles — fields common to both modes

Every item in attribute_profiles (regardless of mode) includes:

| Key | Type | Description |
| --- | --- | --- |
| attribute | str | Column name. |
| is_continuous | bool | True for numeric (non-categorical) columns. |
| is_ordered | bool | True when the column has pandas.CategoricalDtype(ordered=True). Always False for continuous columns. |
| detected | list[str] | Missing-value sentinel strings replaced with pd.NA (from handle_missing_values()). |
| replaced | list[int] | Replacement counts, parallel to detected. |
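Because detected and replaced are parallel lists, pairing them by position gives the number of replacements per sentinel. The sample profile below is invented for illustration, and replacement_summary is a hypothetical helper.

```python
# Hypothetical attribute_profiles item with only the common fields.
profile = {
    "attribute": "city",
    "is_continuous": False,
    "is_ordered": False,
    "detected": ["N/A", "?"],
    "replaced": [4, 1],
}


def replacement_summary(item: dict) -> dict[str, int]:
    """Map each detected sentinel string to its replacement count."""
    return dict(zip(item["detected"], item["replaced"]))


print(replacement_summary(profile))
```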


Static mode

Templates without the mode tag receive pre-rendered SVG charts. Additional top-level variables beyond the common set:

| Variable | Type | Description |
| --- | --- | --- |
| df_summary | dict | Dataset-level summary. See df_summary. |
| attribute_profiles | list[dict] | One entry per column. Common fields above plus static-specific fields. |
| corr | dict | Correlation charts and raw data. See corr. |

df_summary

| Key | Type | Description |
| --- | --- | --- |
| overall_table | dict | Three string entries: "Records" (formatted int), "Columns" (formatted int), "Memory usage" (human-readable). |
| Profiles | list[dict] | One entry per column. See df_summary.Profiles item. |
| mem_usg_svg | str | Base64-encoded SVG of the stacked memory-usage bar chart. Empty string "" when the DataFrame has no columns. |
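One common way to display a base64-encoded SVG is a data URI. The sketch below assumes the context strings are bare base64 with no "data:" prefix, which this page does not state explicitly; svg_data_uri is a hypothetical helper and the SVG is a stand-in.

```python
import base64


def svg_data_uri(b64_svg: str) -> str:
    """Wrap a base64-encoded SVG in a data URI; "" stays "" (no chart)."""
    if not b64_svg:
        return ""
    return f"data:image/svg+xml;base64,{b64_svg}"


# Stand-in for a chart string such as df_summary["mem_usg_svg"].
svg = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"/>'
encoded = base64.b64encode(svg.encode("utf-8")).decode("ascii")
uri = svg_data_uri(encoded)
print(uri[:26])
```

Checking for the empty string first avoids emitting a broken image when the DataFrame has no columns.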

df_summary.Profiles item

| Key | Type | Description |
| --- | --- | --- |
| Attribute | str | Column name. |
| Categories | str | Number of distinct categories as a string, or "Continuous" for numeric columns. |
| Categories_list | str | Comma-separated list of category values (categorical), or "mean=…, std=…" (continuous). |
| Memory_usage | int | Raw byte count for this column. |
| Memory_usage_hr | str | Human-readable, e.g. "1.23 KB". |

attribute_profiles — static additional fields

Each item also contains:

| Key | Type | Description |
| --- | --- | --- |
| fcont | str | Base64-encoded SVG: histogram with KDE for continuous columns, frequency bar chart for categorical columns. |
| summary_tbl | dict | Pre-formatted statistics card. Keys differ by column type; see below. |
| freq_tbl | list[dict] | Frequency table rows (categorical only; empty list for continuous). See freq_tbl item. |

summary_tbl keys — categorical column:

"Categories", "Most frequent", "Least frequent", "Missings", "Memory"

summary_tbl keys — continuous column:

"Mean", "Median", "Std Dev", "Min", "Max", "Q1 (25%)", "Q3 (75%)", "Missings", "Memory"

freq_tbl item (categorical columns only)

| Key | Type | Description |
| --- | --- | --- |
| name | any | Category value (may be a number, string, or pd.NA). |
| count | int | Number of rows with this value. |
| pct | str | Percentage of total rows as a string, e.g. "12.34%". |
| pct_num | float | Raw percentage value (0–100). |
| fmt_width | str | Width of the frequency bar relative to the most-frequent category, as a CSS % string. Use in style="width: {{ row.fmt_width }}" to draw proportional bars without extra calculation. |
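The relationship between count, pct, pct_num, and fmt_width can be illustrated by rebuilding rows from raw counts. This is a sketch under assumptions: build_freq_rows is hypothetical, the sort order is simply descending by count, and the one-decimal formatting of fmt_width is a guess (the page only says it is a CSS % string).

```python
def build_freq_rows(counts: dict, total_rows: int) -> list[dict]:
    """Build freq_tbl-like rows, most frequent category first."""
    top = max(counts.values(), default=0)
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    rows = []
    for name, count in ordered:
        pct_num = 100.0 * count / total_rows if total_rows else 0.0
        # Bar width is relative to the most frequent category, not the total.
        width = 100.0 * count / top if top else 0.0
        rows.append({
            "name": name,
            "count": count,
            "pct": f"{pct_num:.2f}%",
            "pct_num": pct_num,
            "fmt_width": f"{width:.1f}%",
        })
    return rows


rows = build_freq_rows({"red": 6, "blue": 3, "green": 1}, total_rows=10)
for row in rows:
    print(row["name"], row["pct"], row["fmt_width"])
```

Scaling widths against the top category means the longest bar always fills the available space, which is why fmt_width and pct differ.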

corr

| Key | Type | Description |
| --- | --- | --- |
| overall_corr | str or None | Base64-encoded SVG of the Cramér’s V heatmap (all categorical column pairs). None when the DataFrame contains no categorical columns. |
| pearson_corr | str or None | Base64-encoded SVG of the Pearson correlation heatmap (continuous columns only). None when fewer than two continuous columns are present. |
| indiv_corr | dict | Per-column-pair crosstab heatmaps. See corr.indiv_corr. |
| spearman_rank | list[dict] | Flat {x, y, v} point list of Spearman rank correlations for all column pairs. Range −1 to 1. Available for custom templates; not rendered by the built-in static templates. |
| theils_u | list[dict] | Flat {x, y, v} point list of Theil’s U for categorical–categorical pairs; 0.0 for pairs involving a continuous column. Range 0 to 1. Available for custom templates; not rendered by the built-in static templates. |

corr.indiv_corr

A nested dict keyed by the row column name:

corr.indiv_corr = {
  "ColumnA": {
    "attribute": "ColumnA",        # str — row column name
    "vars": {
      "ColumnA": "<base64 SVG>",   # self-vs-self
      "ColumnB": "<base64 SVG>",   # ColumnA × ColumnB crosstab
      ...
    }
  },
  ...
}

Only categorical columns appear as keys. Continuous columns are absent from indiv_corr entirely.
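A short sketch of walking this nested structure, for example to list every pair that has a crosstab chart while skipping the self-vs-self entries. The sample dict uses placeholder strings in place of real chart data, and off_diagonal_pairs is a hypothetical helper.

```python
# Placeholder data in the documented shape; the SVG strings are dummies.
indiv_corr = {
    "ColumnA": {
        "attribute": "ColumnA",
        "vars": {"ColumnA": "<base64 SVG>", "ColumnB": "<base64 SVG>"},
    },
    "ColumnB": {
        "attribute": "ColumnB",
        "vars": {"ColumnA": "<base64 SVG>", "ColumnB": "<base64 SVG>"},
    },
}


def off_diagonal_pairs(data: dict) -> list[tuple[str, str]]:
    """List (row, column) pairs that have a chart, excluding self-pairs."""
    pairs = []
    for row_name, entry in data.items():
        for col_name in entry["vars"]:
            if col_name != row_name:
                pairs.append((row_name, col_name))
    return pairs


print(off_diagonal_pairs(indiv_corr))
```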


Interactive mode

Templates with {# pandas-cat: mode=interactive #} receive raw data instead of pre-rendered SVGs. Additional top-level variables beyond the common set:

| Variable | Type | Description |
| --- | --- | --- |
| total_ram | str | Human-readable total memory usage of the report DataFrame. |
| excluded_attributes | list[dict] | Columns removed because they exceeded cat_limit. See excluded_attributes item. |
| attribute_profiles | list[dict] | One entry per column. Common fields above plus interactive-specific fields. |
| correlations_data | dict | Correlation matrices as flat {x, y, v} point lists. See correlations_data. |

excluded_attributes item

| Key | Type | Description |
| --- | --- | --- |
| attribute | str | Column name. |
| categories | int | Number of distinct categories found (exceeds cat_limit). |

attribute_profiles — interactive additional fields

Each item also contains these fields for every column:

| Key | Type | Description |
| --- | --- | --- |
| missing | int | Count of missing values in this column. |
| ram | str | Human-readable memory usage of this column. |

Additional fields for categorical columns (is_continuous == False):

| Key | Type | Description |
| --- | --- | --- |
| categories | list | Category values, in display order (natural for ordered CategoricalDtype; by frequency otherwise). |
| counts | list[int] | Row count per category, parallel to categories. |
| percentages | list[float] | Percentage of total rows (including missing) per category, rounded to 2 decimal places. |
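The three parallel lists can be reproduced from raw values as follows. This sketch uses None as a stand-in for pd.NA, orders categories by frequency (the unordered case), and divides by the total row count including missing values, as described above. The function categorical_fields is hypothetical.

```python
from collections import Counter


def categorical_fields(values: list) -> dict:
    """Compute categories, counts, percentages, and missing for one column."""
    total = len(values)  # denominator includes missing rows
    freq = Counter(v for v in values if v is not None)
    ordered = freq.most_common()  # by frequency, highest first
    return {
        "categories": [cat for cat, _ in ordered],
        "counts": [n for _, n in ordered],
        "percentages": [
            round(100.0 * n / total, 2) if total else 0.0 for _, n in ordered
        ],
        "missing": total - sum(freq.values()),
    }


fields = categorical_fields(["a", "b", "a", None, "a", "b"])
print(fields)
```

Because missing rows stay in the denominator, the percentages sum to less than 100 whenever the column has missing values.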

Additional fields for continuous columns (is_continuous == True):

| Key | Type | Description |
| --- | --- | --- |
| categories | list | Always an empty list. |
| counts | list | Always an empty list. |
| percentages | list | Always an empty list. |
| histogram_bins | list[float] | Bin midpoints, derived from the bin edges returned by numpy.histogram(bins="auto"). |
| histogram_counts | list[int] | Row count per bin, parallel to histogram_bins. |
| histogram_percentages | list[float] | Percentage of total rows (including missing) per bin, rounded to 2 decimal places. |
| stats | dict | Descriptive statistics; see stats dict. |
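numpy.histogram returns bin edges (one more edge than there are bins), so midpoints have to be derived from adjacent edge pairs. The sketch below shows that step and the percentage calculation with plain lists; the edges and counts are invented, and both helper names are hypothetical.

```python
def bin_midpoints(edges: list[float]) -> list[float]:
    """Midpoint of each bin, given len(counts) + 1 bin edges."""
    return [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]


def bin_percentages(counts: list[int], total_rows: int) -> list[float]:
    """Share of all rows (missing included) per bin, to 2 decimals."""
    return [round(100.0 * c / total_rows, 2) for c in counts]


# Invented example: three bins over ten rows, one of which is missing.
edges = [0.0, 2.0, 4.0, 6.0]
counts = [3, 5, 1]
midpoints = bin_midpoints(edges)
percentages = bin_percentages(counts, total_rows=10)
print(midpoints, percentages)
```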

stats dict (continuous columns only)

All values are float, rounded to 6 decimal places.

| Key | Description |
| --- | --- |
| mean | Arithmetic mean. |
| std | Standard deviation (ddof=1). |
| min | Minimum value. |
| max | Maximum value. |
| median | Median (50th percentile). |
| q1 | 25th percentile. |
| q3 | 75th percentile. |
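For reference, a dict with the same keys can be computed with the standard library. This is only a sketch: describe is a hypothetical helper, and the "inclusive" quantile method is chosen here because it interpolates linearly between data points; the interpolation rule pandas-cat uses is not stated on this page. statistics.stdev divides by n − 1, which corresponds to ddof=1.

```python
import statistics


def describe(values: list[float]) -> dict[str, float]:
    """Return a stats-like dict, every value rounded to 6 decimal places."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    raw = {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample std, i.e. ddof=1
        "min": min(values),
        "max": max(values),
        "median": median,
        "q1": q1,
        "q3": q3,
    }
    return {key: round(float(value), 6) for key, value in raw.items()}


stats = describe([1, 2, 3, 4, 5])
print(stats)
```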

correlations_data

A dict whose values are flat lists of {x, y, v} point dicts:

correlations_data = {
  "Cramers V":    [{"x": col_a, "y": col_b, "v": float}, ...],
  "Spearman Rank":[{"x": col_a, "y": col_b, "v": float}, ...],
  "Theils U":     [{"x": col_a, "y": col_b, "v": float}, ...],
  "<A> x <B>":    [{"x": cat_val_a, "y": cat_val_b, "v": float}, ...],
  ...
}

The three summary keys cover every column in the report. Individual "<A> x <B>" keys are added only for categorical–categorical pairs.

| Key pattern | Notes |
| --- | --- |
| "Cramers V" | Bergsma-Wicher corrected. Range 0 to 1. 0.0 for pairs involving a continuous column. |
| "Spearman Rank" | Range −1 to 1. Categoricals encoded as integer codes before ranking. Computed for all column pairs including continuous. |
| "Theils U" | Asymmetric uncertainty coefficient. Range 0 to 1. 0.0 for pairs involving a continuous column. |
| "<A> x <B>" | Raw crosstab counts for categorical–categorical pairs only. v is a float count (converted from the pandas crosstab). |
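A flat {x, y, v} point list is convenient for charting libraries but awkward for lookups. The sketch below regroups one list into a nested dict so a value can be read as matrix[x][y]. The sample values are invented and points_to_matrix is a hypothetical helper.

```python
def points_to_matrix(points: list[dict]) -> dict:
    """Regroup flat {x, y, v} points into matrix[x][y] = v."""
    matrix: dict = {}
    for point in points:
        matrix.setdefault(point["x"], {})[point["y"]] = point["v"]
    return matrix


# Invented sample in the documented shape.
correlations_data = {
    "Cramers V": [
        {"x": "a", "y": "a", "v": 1.0},
        {"x": "a", "y": "b", "v": 0.25},
        {"x": "b", "y": "a", "v": 0.25},
        {"x": "b", "y": "b", "v": 1.0},
    ],
}

matrix = points_to_matrix(correlations_data["Cramers V"])
print(matrix["a"]["b"])
```

For "Theils U" the measure is asymmetric, so matrix[x][y] and matrix[y][x] should not be assumed equal.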