Group-level evaluation
This section documents evaluation utilities for computing metrics and diagnostics across grouped entities.
Group-level evaluation supports aggregation and comparison across cohorts, segments, or experimental groups.
eb_evaluation.dataframe.group
Group-level evaluation (DataFrame utilities).
This module provides helpers for evaluating forecasts on grouped subsets of a DataFrame
(e.g., by store, item, daypart, region). It orchestrates grouping, parameter handling, and
tabular output while delegating metric definitions to eb_metrics.metrics.
The primary entry point is evaluate_groups_df, which computes the Electric Barometer
metric suite (CWSL, NSL, UD, HR@tau, FRS) plus common symmetric diagnostics (wMAPE, MAE,
RMSE, MAPE) for each group.
evaluate_groups_df(df, group_cols, *, actual_col='actual_qty', forecast_col='forecast_qty', cu=2.0, co=1.0, tau=2.0, sample_weight_col=None)
Evaluate core EB metrics per group from a DataFrame.
For each group defined by group_cols, this helper computes:
- CWSL
- NSL
- UD
- wMAPE
- HR@tau
- FRS
- MAE
- RMSE
- MAPE
Cost parameters can be provided either globally (scalar) or per-row (column name).
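The scalar-or-column convention for cost parameters can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name resolve_cost and the column cu_row are hypothetical, and the rule "a string means a column name, anything else means a global scalar" is an assumption based on the float | str type of cu and co.

```python
import pandas as pd


def resolve_cost(df: pd.DataFrame, value) -> pd.Series:
    """Hypothetical helper: expand a cost parameter to one value per row."""
    if isinstance(value, str):
        # A string is treated as the name of a column holding per-row costs.
        return df[value].astype(float)
    # Anything else is treated as a single global cost applied to every row.
    return pd.Series(float(value), index=df.index)


df = pd.DataFrame(
    {
        "actual_qty": [10, 12, 8],
        "forecast_qty": [9, 15, 8],
        "cu_row": [2.0, 3.0, 2.5],  # illustrative per-row underbuild costs
    }
)

cu_global = resolve_cost(df, 2.0)         # same cost on every row
cu_per_row = resolve_cost(df, "cu_row")   # cost read from the column
```

Resolving both forms to a row-aligned Series lets the downstream metric code treat the two cases the same way.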
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Input data containing actuals, forecasts, and grouping columns. | required |
| group_cols | list[str] | Column names used to define groups (e.g., …). | required |
| actual_col | str | Name of the column containing actual demand values. | "actual_qty" |
| forecast_col | str | Name of the column containing forecast values. | "forecast_qty" |
| cu | float \| str | Underbuild (shortfall) cost coefficient. | 2.0 |
| co | float \| str | Overbuild (excess) cost coefficient. | 1.0 |
| tau | float | Absolute-error tolerance parameter for the hit-rate metric HR@tau. | 2.0 |
| sample_weight_col | str \| None | Optional column name containing non-negative sample weights per row. If provided, weights are passed into metrics that accept a … | None |
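To make the role of tau concrete, here is a small sketch of a hit rate with an absolute-error tolerance. It assumes a "hit" means |actual - forecast| <= tau and that the metric is the unweighted share of hits. The exact definition used by eb_metrics.metrics may differ (for example in the inequality or in weighting), and the function name hit_rate_at_tau is illustrative only.

```python
def hit_rate_at_tau(actual, forecast, tau=2.0):
    """Illustrative hit rate: share of rows whose absolute error is within tau."""
    # One boolean per row: True when the forecast lands within the tolerance.
    hits = [abs(a - f) <= tau for a, f in zip(actual, forecast)]
    return sum(hits) / len(hits)


# Absolute errors are 1, 3, 0, 3, so two of the four rows fall within tau=2.0.
hr = hit_rate_at_tau([10, 12, 8, 20], [9, 15, 8, 17], tau=2.0)
```

Under this reading, a larger tau can only keep the hit rate the same or raise it, since every row that was a hit stays a hit.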
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with one row per group and columns: … If a metric is undefined for a particular group (e.g., invalid values for that group), the corresponding value is returned as NaN rather than raising an error for the entire evaluation. |
Raises:
| Type | Description |
|---|---|
| KeyError | If required columns are missing from … |
| ValueError | If … |
Notes
- wmape in eb_metrics.metrics does not take sample_weight, so it is computed unweighted here.
- Symmetric diagnostics (MAE, RMSE, MAPE) are computed unweighted to match the current eb_metrics signatures.
- Metrics are evaluated group-by-group; a failure in one group does not prevent evaluation of other groups.
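The group-by-group behavior described in the notes can be sketched with plain pandas. This is an illustration of the pattern under stated assumptions, not the code behind evaluate_groups_df: the functions mape and evaluate_mape_by_group, the output column names n and MAPE, and the choice to raise ValueError on zero actuals are all hypothetical.

```python
import numpy as np
import pandas as pd


def mape(actual: pd.Series, forecast: pd.Series) -> float:
    """Plain MAPE; raises when any actual is zero, where it is undefined."""
    if (actual == 0).any():
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(((actual - forecast).abs() / actual.abs()).mean())


def evaluate_mape_by_group(df, group_cols, actual_col="actual_qty",
                           forecast_col="forecast_qty"):
    """Compute one metric per group, recording NaN where it cannot be computed."""
    rows = []
    for key, group in df.groupby(group_cols):
        try:
            value = mape(group[actual_col], group[forecast_col])
        except ValueError:
            # A failure is confined to this group; other groups still evaluate.
            value = np.nan
        # Group keys may be scalars or tuples depending on the pandas version.
        key = key if isinstance(key, tuple) else (key,)
        rows.append({**dict(zip(group_cols, key)), "n": len(group), "MAPE": value})
    return pd.DataFrame(rows)


df = pd.DataFrame(
    {
        "store": ["A", "A", "B", "B"],
        "actual_qty": [10, 20, 0, 5],   # store B contains a zero actual
        "forecast_qty": [12, 18, 1, 5],
    }
)
result = evaluate_mape_by_group(df, ["store"])
```

In this sketch store A gets a finite MAPE while store B gets NaN, and the result still has one row per group.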