Group-level evaluation

This section documents evaluation utilities for computing metrics and diagnostics across grouped entities.

Group-level evaluation supports aggregation and comparison across cohorts, segments, or experimental groups.

`eb_evaluation.dataframe.group`

Group-level evaluation (DataFrame utilities).

This module provides helpers for evaluating forecasts on grouped subsets of a DataFrame (e.g., by store, item, daypart, region). It orchestrates grouping, parameter handling, and tabular output while delegating metric definitions to `eb_metrics.metrics`.

The primary entry point is `evaluate_groups_df`, which computes the Electric Barometer metric suite (CWSL, NSL, UD, HR@tau, FRS) plus common symmetric diagnostics (wMAPE, MAE, RMSE, MAPE) for each group.

`evaluate_groups_df(df, group_cols, *, actual_col='actual_qty', forecast_col='forecast_qty', cu=2.0, co=1.0, tau=2.0, sample_weight_col=None)`

Evaluate core EB metrics per group from a DataFrame.

For each group defined by `group_cols`, this helper computes:

CWSL
NSL
UD
wMAPE
HR@tau
FRS
MAE
RMSE
MAPE

The cost parameters `cu` and `co` can each be given either as a scalar applied to every row or as the name of a column holding per-row costs.
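The scalar-or-column convention can be illustrated with a minimal sketch. The helper `resolve_cost` below is hypothetical and is not part of `eb_evaluation`; it uses a plain dict of column lists in place of a DataFrame so that it runs without dependencies, and the column name `cu_row` is made up for the example.

```python
from typing import Union


def resolve_cost(columns: dict, cost: Union[float, str], n_rows: int) -> list:
    """Return one cost value per row (hypothetical helper, for illustration only)."""
    if isinstance(cost, str):
        # A string is treated as the name of a column holding per-row costs.
        if cost not in columns:
            raise KeyError(f"cost column {cost!r} not found")
        return [float(v) for v in columns[cost]]
    # A number is applied uniformly to every row.
    return [float(cost)] * n_rows


# Stand-in for a DataFrame: column name -> list of values.
columns = {
    "actual_qty": [10.0, 12.0, 8.0],
    "forecast_qty": [9.0, 12.0, 11.0],
    "cu_row": [2.0, 3.0, 2.5],  # hypothetical per-row underbuild costs
}
n = len(columns["actual_qty"])

scalar_cu = resolve_cost(columns, 2.0, n)        # same cost for every row
per_row_cu = resolve_cost(columns, "cu_row", n)  # cost read from a column
```

With the real function, the same choice would presumably be made by passing either `cu=2.0` or `cu="cu_row"`; how the library resolves the value internally is not documented here.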

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `df` | `DataFrame` | Input data containing actuals, forecasts, and grouping columns. | required |
| `group_cols` | `list[str]` | Column names used to define groups (e.g., `["store_id", "item_id"]`). | required |
| `actual_col` | `str` | Name of the column containing actual demand values. | `"actual_qty"` |
| `forecast_col` | `str` | Name of the column containing forecast values. | `"forecast_qty"` |
| `cu` | `float \| str` | Underbuild (shortfall) cost coefficient. If `float`: scalar cost applied uniformly across all rows/groups. If `str`: name of a column in `df` containing per-row underbuild costs. | `2.0` |
| `co` | `float \| str` | Overbuild (excess) cost coefficient. If `float`: scalar cost applied uniformly across all rows/groups. If `str`: name of a column in `df` containing per-row overbuild costs. | `1.0` |
| `tau` | `float` | Absolute-error tolerance parameter for the hit-rate metric HR@tau. | `2.0` |
| `sample_weight_col` | `str \| None` | Optional column name containing non-negative sample weights per row. If provided, weights are passed into metrics that accept a `sample_weight` argument. | `None` |
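To make the role of `tau` concrete, here is a minimal sketch of a hit-rate computation with an absolute-error tolerance. It is an illustration, not the `eb_metrics` implementation: the function name `hit_rate_at_tau` is hypothetical, and treating the boundary as inclusive (`<= tau`) is an assumption.

```python
def hit_rate_at_tau(actual, forecast, tau=2.0):
    """Fraction of rows whose absolute error is within tau.

    Assumption: a "hit" means |actual - forecast| <= tau (inclusive bound).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        return float("nan")  # undefined for an empty group
    hits = sum(1 for a, f in zip(actual, forecast) if abs(a - f) <= tau)
    return hits / len(actual)


actual = [10.0, 12.0, 8.0, 20.0]
forecast = [9.0, 15.0, 8.5, 18.0]

# Absolute errors are 1.0, 3.0, 0.5, 2.0, so three of four rows are hits at tau=2.0.
rate = hit_rate_at_tau(actual, forecast, tau=2.0)
tight_rate = hit_rate_at_tau(actual, forecast, tau=0.5)
```

A larger `tau` can only keep or raise the hit rate, so the tolerance should be chosen in the same units as the quantity being forecast.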

Returns:

| Type | Description |
| --- | --- |
| `DataFrame` | DataFrame with one row per group and columns `group_cols + ["CWSL", "NSL", "UD", "wMAPE", "HR@tau", "FRS", "MAE", "RMSE", "MAPE"]`. If a metric is undefined for a particular group (e.g., invalid values for that group), the corresponding value is returned as NaN rather than raising an error for the entire evaluation. |

Raises:

| Type | Description |
| --- | --- |
| `KeyError` | If required columns are missing from `df`. |
| `ValueError` | If `df` is empty, or if `group_cols` is empty. |
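The documented error conditions can be sketched as a small validation step. The helper `validate_inputs` is hypothetical, a dict of column lists stands in for the DataFrame, and the order in which the checks run is an assumption; the actual function may check in a different order.

```python
def validate_inputs(columns, group_cols,
                    actual_col="actual_qty", forecast_col="forecast_qty"):
    """Raise the documented error types for invalid input (illustrative only)."""
    if not group_cols:
        raise ValueError("group_cols must not be empty")
    required = list(group_cols) + [actual_col, forecast_col]
    missing = [c for c in required if c not in columns]
    if missing:
        raise KeyError(f"missing required columns: {missing}")
    if len(columns[actual_col]) == 0:
        raise ValueError("input data is empty")


def check(columns, group_cols):
    """Return "ok" or the name of the exception raised by validation."""
    try:
        validate_inputs(columns, group_cols)
        return "ok"
    except (KeyError, ValueError) as exc:
        return type(exc).__name__


good = {"store_id": ["A", "B"], "actual_qty": [1.0, 2.0], "forecast_qty": [1.5, 2.0]}
no_forecast = {"store_id": ["A"], "actual_qty": [1.0]}
empty = {"store_id": [], "actual_qty": [], "forecast_qty": []}

results = [
    check(good, ["store_id"]),         # valid input
    check(no_forecast, ["store_id"]),  # forecast column missing
    check(empty, ["store_id"]),        # no rows
    check(good, []),                   # no grouping columns
]
```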

Notes

`wmape` in `eb_metrics.metrics` does not accept `sample_weight`, so wMAPE is computed unweighted here.
Symmetric diagnostics (MAE, RMSE, MAPE) are computed unweighted to match the current `eb_metrics` signatures.
Metrics are evaluated group by group; a failure in one group does not prevent evaluation of other groups.
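The per-group isolation described in the last note can be sketched as follows. This is a simplified illustration, not the library's code: the names `mape` and `evaluate_by_group` are hypothetical, rows are plain dicts rather than a DataFrame, and both the choice to raise `ValueError` on zero actuals and the choice to catch only `ValueError` are assumptions.

```python
import math
from collections import defaultdict


def mape(actual, forecast):
    """Mean absolute percentage error; undefined when any actual is zero."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def evaluate_by_group(rows, group_key, metric):
    """Compute `metric` per group, recording NaN for groups where it fails."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    results = {}
    for key, members in groups.items():
        actual = [r["actual_qty"] for r in members]
        forecast = [r["forecast_qty"] for r in members]
        try:
            results[key] = metric(actual, forecast)
        except ValueError:
            # A failure in this group is recorded as NaN; other groups continue.
            results[key] = float("nan")
    return results


rows = [
    {"store_id": "A", "actual_qty": 10.0, "forecast_qty": 9.0},
    {"store_id": "A", "actual_qty": 20.0, "forecast_qty": 22.0},
    {"store_id": "B", "actual_qty": 0.0, "forecast_qty": 1.0},  # zero actual
    {"store_id": "B", "actual_qty": 5.0, "forecast_qty": 5.0},
]
per_group = evaluate_by_group(rows, "store_id", mape)
```

Here store `A` gets a finite MAPE while store `B` gets NaN, so downstream code should filter or flag NaN rows rather than assume every group has a value.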
