Rolling Window Features
This section documents rolling-window feature construction utilities provided by eb-features.
Rolling features support multiple statistics and optional leakage-safe computation.
All content below is generated automatically from NumPy-style docstrings in the source code.
Rolling Feature API
eb_features.panel.rolling
Rolling window feature construction for panel time series.
This module provides stateless utilities to compute rolling window statistics of a target series within each entity of a panel (entity-by-timestamp) dataset.
Rolling windows are expressed in index steps (rows) at the input frequency rather than wall-clock units. This keeps the transformation frequency-agnostic.
Definitions
For a target series y_t and a window length w, the rolling mean feature is:

$$
\mathrm{roll\_mean}_t(w) = \frac{1}{w} \sum_{k=1}^{w} y_{t-k}
$$

This formulation uses past values only (y_{t-1} through y_{t-w}), which avoids target leakage when predicting at time t.
The resulting feature columns are named roll_{w}_{stat}, e.g., roll_24_mean.
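To make the definition concrete, here is a minimal pandas sketch of a past-only rolling mean within each entity. It is an illustration under stated assumptions, not the eb-features implementation: the column names `entity` and `y` and the toy values are made up, and the frame is assumed to be sorted by entity and time already.

```python
import pandas as pd

# Toy panel: two entities, already sorted by (entity, timestamp).
# Column names and values are illustrative assumptions.
df = pd.DataFrame(
    {
        "entity": ["a"] * 5 + ["b"] * 5,
        "y": [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0],
    }
)

w = 3  # window length in rows (index steps), not wall-clock units

# Lag the target by one step within each entity, so the window at row t
# covers y_{t-1}, ..., y_{t-w} and never y_t itself.
lagged = df.groupby("entity")["y"].shift(1)

# Rolling mean of the lagged series, computed separately per entity.
df[f"roll_{w}_mean"] = lagged.groupby(df["entity"]).transform(
    lambda s: s.rolling(window=w).mean()
)

print(df)
```

With w = 3, the first three rows of each entity have no complete past window and come out as NaN; the fourth row of entity `a` is the mean of 1, 2, and 3.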
Notes
- Rolling features are computed strictly within each entity.
- By default this module computes rolling statistics on target.shift(1) within each entity, so that the current target value y_t is not included in the window.
- The calling pipeline is responsible for sorting data by (entity, timestamp) and for deciding how to handle NaNs introduced by rolling windows (e.g., dropping rows).
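The notes above can be sketched as follows. The helper `rolling_mean_sketch` is hypothetical and only mirrors the described behaviour (shift by one step within each entity, then roll); the column names and data are assumptions. The caller sorts by (entity, timestamp) first, as the notes require.

```python
import pandas as pd


def rolling_mean_sketch(df, *, entity_col, target_col, window, leakage_safe=True):
    """Hypothetical helper for illustration; not the eb-features code."""
    target = df[target_col]
    if leakage_safe:
        # Exclude y_t by lagging one step inside each entity.
        target = df.groupby(entity_col)[target_col].shift(1)
    # Roll within each entity so windows never span two entities.
    return target.groupby(df[entity_col]).transform(
        lambda s: s.rolling(window).mean()
    )


# Unsorted toy panel (assumed column names).
panel = pd.DataFrame(
    {
        "entity": ["b", "a", "a", "b", "a", "b"],
        "timestamp": pd.to_datetime(
            ["2024-01-02", "2024-01-01", "2024-01-03",
             "2024-01-01", "2024-01-02", "2024-01-03"]
        ),
        "y": [30.0, 1.0, 5.0, 10.0, 3.0, 50.0],
    }
)

# Sorting is the caller's job.
panel = panel.sort_values(["entity", "timestamp"]).reset_index(drop=True)

safe = rolling_mean_sketch(
    panel, entity_col="entity", target_col="y", window=2, leakage_safe=True
)
unsafe = rolling_mean_sketch(
    panel, entity_col="entity", target_col="y", window=2, leakage_safe=False
)

print(panel.assign(safe=safe, unsafe=unsafe))
```

For the last row of entity `a` (y = 5), the leakage-safe value is the mean of the two earlier observations (1 and 3), while the unshifted variant averages 3 and 5 and therefore includes the value being predicted.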
add_rolling_features(df, *, entity_col, target_col, rolling_windows, rolling_stats, min_periods=None, leakage_safe=True)
Add rolling window statistics features to a panel DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Input DataFrame containing at least … | required |
| entity_col | str | Name of the entity identifier column. | required |
| target_col | str | Name of the numeric target column used to compute rolling statistics. | required |
| rolling_windows | Sequence[int] \| None | Positive rolling window lengths (in steps). For each … | required |
| rolling_stats | Sequence[str] | Rolling statistics to compute. Allowed values are: … | required |
| min_periods | int \| None | Minimum number of observations in the window required to produce a value. If None, defaults to … | None |
| leakage_safe | bool | If True, compute rolling statistics on the lagged target (… | True |
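As an illustration of what a minimum-observation threshold does, the sketch below uses the `min_periods` argument of plain pandas `rolling`. It shows pandas semantics only and makes no claim about the default that `add_rolling_features` applies when `min_periods` is None; the series values are made up.

```python
import pandas as pd

y = pd.Series([1.0, 2.0, 3.0, 4.0])
lagged = y.shift(1)  # NaN, 1, 2, 3

# Require a full window of 3 non-missing observations.
full = lagged.rolling(window=3, min_periods=3).mean()

# Accept any window with at least 1 non-missing observation.
partial = lagged.rolling(window=3, min_periods=1).mean()

print(pd.DataFrame({"y": y, "full": full, "partial": partial}))
```

A lower threshold trades fewer leading NaNs for early values that are averaged over fewer observations than the nominal window length.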
Returns:
| Name | Type | Description |
|---|---|---|
| df_out | DataFrame | Copy of … |
| feature_cols | list[str] | Names of the rolling feature columns added. |
Raises:
| Type | Description |
|---|---|
| KeyError | If … |
| ValueError | If any rolling window is non-positive, if … |
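The overall contract (return a copy plus the list of added column names, name columns roll_{w}_{stat}, reject non-positive windows) can be sketched with a small stand-in. `add_rolling_sketch` is a hypothetical function written for this illustration and is not the eb-features implementation; the statistic names `mean` and `max` are assumed here because they are pandas rolling methods, not because they are confirmed as allowed values of `rolling_stats`.

```python
import pandas as pd


def add_rolling_sketch(df, *, entity_col, target_col, windows, stats):
    """Illustrative stand-in; not the eb-features implementation."""
    if any(w <= 0 for w in windows):
        raise ValueError("rolling windows must be positive")

    out = df.copy()  # leave the caller's frame untouched
    lagged = out.groupby(entity_col)[target_col].shift(1)

    feature_cols = []
    for w in windows:
        for stat in stats:
            name = f"roll_{w}_{stat}"
            # Look up the pandas rolling method by name (e.g. .mean(), .max()).
            out[name] = lagged.groupby(out[entity_col]).transform(
                lambda s, w=w, stat=stat: getattr(s.rolling(w), stat)()
            )
            feature_cols.append(name)
    return out, feature_cols


panel = pd.DataFrame({"entity": ["a"] * 4, "y": [1.0, 2.0, 3.0, 4.0]})

df_out, feature_cols = add_rolling_sketch(
    panel, entity_col="entity", target_col="y", windows=[2, 3], stats=["mean", "max"]
)

print(feature_cols)
print(df_out)
```

Returning the column names alongside the frame lets downstream steps select or drop the new features without re-deriving the naming scheme.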
Notes
Rolling features introduce missing values at the beginning of each entity's series.
These are typically removed downstream when dropna=True or handled via imputation.
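The two options mentioned above might look like this in plain pandas. This is a sketch under assumptions: the frame stands in for the output of a rolling-feature step, the fill value of 0.0 is arbitrary, and the indicator column `roll_missing` is a hypothetical name.

```python
import pandas as pd

# Stand-in for a frame that already carries a leakage-safe window-2 mean.
df_out = pd.DataFrame(
    {
        "entity": ["a", "a", "a", "b", "b", "b"],
        "y": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
        "roll_2_mean": [float("nan"), float("nan"), 1.5,
                        float("nan"), float("nan"), 15.0],
    }
)
feature_cols = ["roll_2_mean"]

# Option 1: drop rows whose rolling features are not yet defined.
dropped = df_out.dropna(subset=feature_cols).reset_index(drop=True)

# Option 2: keep every row, flag the gaps, and fill with a constant.
missing = df_out[feature_cols].isna().any(axis=1)
imputed = df_out.assign(roll_missing=missing)
imputed[feature_cols] = imputed[feature_cols].fillna(0.0)

print(dropped)
print(imputed)
```

Dropping shortens every entity's history by the warm-up rows, whereas imputing keeps them at the cost of placeholder values, so an indicator column helps a model tell the two cases apart.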