Rolling Window Features

This section documents rolling-window feature construction utilities provided by eb-features.

Rolling features support several statistics (mean, std, min, max, sum, median) and are computed in a leakage-safe manner by default; this behavior can be disabled.

All content below is generated automatically from NumPy-style docstrings in the source code.

Rolling Feature API

eb_features.panel.rolling

Rolling window feature construction for panel time series.

This module provides stateless utilities to compute rolling window statistics of a target series within each entity of a panel (entity-by-timestamp) dataset.

Rolling windows are expressed in index steps (rows) at the input frequency rather than wall-clock units. This keeps the transformation frequency-agnostic.
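To illustrate what "index steps" means, here is a minimal pandas sketch that contrasts a step-based window with a time-based one. The series and variable names are made up for the example and are not part of eb-features.

```python
import pandas as pd

# Hourly and daily series carrying the same values.
values = [1.0, 2.0, 3.0, 4.0, 5.0]
hourly = pd.Series(values, index=pd.date_range("2024-01-01", periods=5, freq="h"))
daily = pd.Series(values, index=pd.date_range("2024-01-01", periods=5, freq="D"))

# Step-based window: always the last 3 rows, whatever the frequency.
hourly_steps = hourly.rolling(3).mean()
daily_steps = daily.rolling(3).mean()

# Time-based window, for contrast: "3h" spans 3 rows of hourly data
# but only a single row of daily data.
hourly_time = hourly.rolling("3h").mean()
daily_time = daily.rolling("3h").mean()

print(hourly_steps.tolist())
print(daily_time.tolist())
```

The step-based results are identical for both series, which is the sense in which the transformation is frequency-agnostic; the caller decides what a step means by choosing the input frequency.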

Definitions

For a target series y_t and a window length w, the rolling mean feature is:

\[ \mathrm{roll\_mean}_w(t) = \frac{1}{w}\sum_{j=1}^{w} y_{t-j} \]

This formulation explicitly uses past values only (y_{t-1} through y_{t-w}), which avoids target leakage when predicting at time t.
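As a worked check of the formula, the sketch below evaluates the rolling mean two ways on a toy series: once with pandas (shift by one step, then roll) and once by summing the past values directly. The data and names are illustrative only.

```python
import pandas as pd

y = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])
w = 3

# shift(1) moves y_{t-1} into row t, so the window ending at row t
# covers y_{t-1}, ..., y_{t-w} and never y_t itself.
roll_mean = y.shift(1).rolling(w).mean()

# Direct evaluation of the formula at t = 3: (y_2 + y_1 + y_0) / 3.
t = 3
by_hand = sum(y.iloc[t - j] for j in range(1, w + 1)) / w

print(roll_mean.iloc[t], by_hand)
```

Both values equal 20.0 at t = 3, and the value at t = 3 does not depend on y_3 = 40.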

The resulting feature columns are named roll_{w}_{stat}, e.g., roll_24_mean.

Notes
  • Rolling features are computed strictly within each entity.
  • By default this module computes rolling statistics on target.shift(1) within each entity, so that the current target value y_t is not included in the window.
  • The calling pipeline is responsible for sorting data by (entity, timestamp) and for deciding how to handle NaNs introduced by rolling windows (e.g., dropping rows).
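The notes above can be sketched with plain pandas. This is an illustration of the described behavior (sort, then shift and roll within each entity), not the module's actual implementation; the column names store, t, and sales are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "store": ["B", "A", "A", "B", "A", "B"],
        "t": [1, 1, 2, 2, 3, 3],
        "sales": [100.0, 1.0, 2.0, 200.0, 3.0, 300.0],
    }
)

# The calling pipeline is expected to sort by (entity, timestamp) first.
df = df.sort_values(["store", "t"]).reset_index(drop=True)

w = 2
# Shift and roll inside each entity so that store B's first rows never
# see store A's last rows.
df[f"roll_{w}_mean"] = df.groupby("store")["sales"].transform(
    lambda s: s.shift(1).rolling(w).mean()
)

print(df)
```

Each store's first two rows are NaN because no full window of past values exists yet; how to treat those rows is left to the caller.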

add_rolling_features(df, *, entity_col, target_col, rolling_windows, rolling_stats, min_periods=None, leakage_safe=True)

Add rolling window statistics features to a panel DataFrame.

Parameters:

  • df (DataFrame, required): Input DataFrame containing at least entity_col and target_col.
  • entity_col (str, required): Name of the entity identifier column.
  • target_col (str, required): Name of the numeric target column used to compute rolling statistics.
  • rolling_windows (Sequence[int] | None, required): Positive rolling window lengths (in steps). For each w in rolling_windows and each stat in rolling_stats, the feature roll_{w}_{stat} is added. If None or empty, no rolling features are added.
  • rolling_stats (Sequence[str], required): Rolling statistics to compute. Allowed values are: {"mean", "std", "min", "max", "sum", "median"}.
  • min_periods (int | None, default None): Minimum number of observations in the window required to produce a value. If None, defaults to w (full-window requirement).
  • leakage_safe (bool, default True): If True, compute rolling statistics on the lagged target (target.shift(1) within each entity), so that the current y_t is excluded from the window.
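The effect of min_periods and leakage_safe can be seen on a single-entity toy series. The sketch below uses plain pandas calls as analogues of the two options; it assumes the options map onto shift(1) and rolling(..., min_periods=...) as the descriptions suggest, and does not call eb-features itself.

```python
import pandas as pd

y = pd.Series([1.0, 2.0, 3.0, 4.0])
w = 3

# leakage_safe=True analogue with the default full-window requirement.
safe_full = y.shift(1).rolling(w, min_periods=w).mean()     # [NaN, NaN, NaN, 2.0]

# Relaxed min_periods: partial windows produce values earlier.
safe_partial = y.shift(1).rolling(w, min_periods=1).mean()  # [NaN, 1.0, 1.5, 2.0]

# leakage_safe=False analogue: the window includes the current y_t.
unsafe_full = y.rolling(w, min_periods=w).mean()            # [NaN, NaN, 2.0, 3.0]

print(safe_full.tolist(), safe_partial.tolist(), unsafe_full.tolist())
```

Even with min_periods=1, the first row stays NaN in the leakage-safe case because there is no past observation at all.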

Returns:

  • df_out (DataFrame): Copy of df with rolling feature columns added.
  • feature_cols (list[str]): Names of the rolling feature columns added.

Raises:

  • KeyError: If entity_col or target_col is missing from df.
  • ValueError: If any rolling window is non-positive, if min_periods is invalid, or if an unsupported stat is requested.
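To make the documented contract concrete, here is a hypothetical stand-in, add_rolling_features_sketch, written against the signature, return values, and exceptions listed above. It is not the eb-features source: the internal logic, the error messages, and the reading of an "invalid" min_periods as a non-positive value are all assumptions.

```python
from typing import List, Optional, Sequence, Tuple

import pandas as pd

ALLOWED_STATS = {"mean", "std", "min", "max", "sum", "median"}


def add_rolling_features_sketch(
    df: pd.DataFrame,
    *,
    entity_col: str,
    target_col: str,
    rolling_windows: Optional[Sequence[int]],
    rolling_stats: Sequence[str],
    min_periods: Optional[int] = None,
    leakage_safe: bool = True,
) -> Tuple[pd.DataFrame, List[str]]:
    """Hypothetical stand-in mirroring the documented contract."""
    for col in (entity_col, target_col):
        if col not in df.columns:
            raise KeyError(f"Column {col!r} not found in df")
    unsupported = set(rolling_stats) - ALLOWED_STATS
    if unsupported:
        raise ValueError(f"Unsupported rolling stats: {sorted(unsupported)}")
    windows = list(rolling_windows or [])
    if any(w <= 0 for w in windows):
        raise ValueError("Rolling windows must be positive")
    # Assumption: an "invalid" min_periods is a non-positive one.
    if min_periods is not None and min_periods <= 0:
        raise ValueError("min_periods must be positive")

    df_out = df.copy()
    entity_keys = df_out[entity_col]
    target = df_out[target_col]
    # shift(1) within each entity excludes y_t from its own window.
    base = target.groupby(entity_keys).shift(1) if leakage_safe else target

    feature_cols: List[str] = []
    for w in windows:
        mp = w if min_periods is None else min_periods
        for stat in rolling_stats:
            col = f"roll_{w}_{stat}"
            # Roll within each entity; transform keeps the original row order.
            df_out[col] = base.groupby(entity_keys).transform(
                lambda s, w=w, mp=mp, stat=stat: getattr(
                    s.rolling(w, min_periods=mp), stat
                )()
            )
            feature_cols.append(col)
    return df_out, feature_cols


panel = pd.DataFrame(
    {
        "entity": ["a", "a", "a", "b", "b", "b"],
        "y": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)
out, cols = add_rolling_features_sketch(
    panel,
    entity_col="entity",
    target_col="y",
    rolling_windows=[2],
    rolling_stats=["mean", "max"],
)
print(cols)
print(out)
```

The sketch assumes panel is already sorted by entity and time, as the module notes require of the calling pipeline.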

Notes

Rolling features introduce missing values at the beginning of each entity's series, because the earliest rows lack enough past observations to fill a window. These rows are typically removed downstream when dropna=True or handled via imputation.
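The two downstream options can be sketched as follows. The imputation shown (a constant fill plus a missing-value indicator column) is only one possible strategy, chosen here for illustration; the column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "entity": ["a", "a", "a", "a"],
        "y": [1.0, 2.0, 3.0, 4.0],
    }
)
df["roll_2_mean"] = df.groupby("entity")["y"].transform(
    lambda s: s.shift(1).rolling(2).mean()
)

# Option 1: drop the warm-up rows that lack a full window.
dropped = df.dropna(subset=["roll_2_mean"])

# Option 2: keep every row, flag the gaps, and fill them with a constant.
imputed = df.copy()
imputed["roll_2_mean_missing"] = imputed["roll_2_mean"].isna()
imputed["roll_2_mean"] = imputed["roll_2_mean"].fillna(0.0)

print(len(dropped), imputed["roll_2_mean"].tolist())
```

Dropping shortens each entity's history by the warm-up length, while imputing preserves the rows at the cost of placeholder values that the model has to learn to discount.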