Lag Features¶

This section documents lag-based feature construction utilities provided by eb-features.

Lag features are computed strictly within entity boundaries and expressed in index steps relative to the input frequency.

All content below is generated automatically from NumPy-style docstrings in the source code.

Lag Feature API¶

`eb_features.panel.lags` ¶

Lag feature construction for panel time series.

This module provides stateless utilities to construct lagged versions of a target series within each entity of a panel (entity-by-timestamp) dataset.

Lag features are expressed in index steps (rows) at the input frequency rather than wall-clock units. This makes the transformation frequency-agnostic (5-min, 30-min, hourly, daily, etc.), assuming the panel is sorted by timestamp within each entity.
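The sort-order assumption above can be illustrated with plain pandas. This is a minimal sketch, not the library's code: the column names `entity`, `timestamp`, and `y` and the toy data are assumptions made for the example.

```python
import pandas as pd

# Hypothetical 30-minute panel whose rows arrive out of order.
raw = pd.DataFrame(
    {
        "entity": ["b", "a", "a", "b"],
        "timestamp": pd.to_datetime(
            [
                "2024-01-01 00:30",
                "2024-01-01 00:30",
                "2024-01-01 00:00",
                "2024-01-01 00:00",
            ]
        ),
        "y": [20.0, 2.0, 1.0, 10.0],
    }
)

# Sort by entity, then timestamp, so that "one step back" means
# "the previous observation of the same entity".
ordered = raw.sort_values(["entity", "timestamp"]).reset_index(drop=True)

# A one-step lag is one row back within the entity, whatever the sampling interval.
ordered["lag_1"] = ordered.groupby("entity")["y"].shift(1)
```

Because the lag is counted in rows, the same call works unchanged for 5-minute, hourly, or daily data, provided the rows are ordered first.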

Definition

For a target series y_t and lag step k:

\[ \mathrm{lag}_k(t) = y_{t-k} \]

The resulting feature column is named `lag_{k}`; for example, k = 3 yields `lag_3`.

Notes

Lag features are computed strictly within each entity using grouped shifts.
The calling pipeline is responsible for handling missing values introduced by lagging (e.g., dropping rows or applying imputation).
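To show why the shift is grouped, the sketch below compares an ungrouped shift with a per-entity shift on a toy panel. The column names `entity` and `y` are assumptions for illustration; the sketch uses pandas directly rather than the module itself.

```python
import pandas as pd

# Toy panel, already sorted by time within each entity.
panel = pd.DataFrame(
    {
        "entity": ["a", "a", "a", "b", "b", "b"],
        "y": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)

# Ungrouped shift: the first row of entity "b" receives the last value of "a".
naive = panel["y"].shift(1)

# Grouped shift: each entity starts with a missing value instead.
panel["lag_1"] = panel.groupby("entity")["y"].shift(1)
```

In the ungrouped version, row 3 (the first observation of `b`) is filled with `3.0` from entity `a`; in the grouped version that cell is missing, which is the behaviour the notes above describe.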

`add_lag_features(df, *, entity_col, target_col, lag_steps)` ¶

Add target lag features to a panel DataFrame.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Input DataFrame containing at least `entity_col` and `target_col`.	required
`entity_col`	`str`	Name of the entity identifier column.	required
`target_col`	`str`	Name of the numeric target column to be lagged.	required
`lag_steps`	`Sequence[int] | None`	Positive lag offsets (in steps). For each `k` in `lag_steps`, the feature `lag_{k}` is added. If `None` or empty, no lag features are added.	required

Returns:

Name	Type	Description
`df_out`	`DataFrame`	Copy of `df` with lag feature columns added.
`feature_cols`	`list[str]`	Names of the lag feature columns added.

Raises:

Type	Description
`KeyError`	If `entity_col` or `target_col` is missing from `df`.
`ValueError`	If any lag step is non-positive.

Notes

Lagging by k steps leaves the first k observations of each entity without a value in `lag_{k}`. Downstream code typically either drops these rows (for example, when the calling pipeline is run with `dropna=True`) or fills them via imputation.
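The two downstream options mentioned in this note might look like the sketch below. The column names and the constant fill value are assumptions chosen for illustration; a real pipeline would pick an imputation strategy suited to its data.

```python
import pandas as pd

# Toy panel with a two-step lag computed per entity.
features = pd.DataFrame(
    {
        "entity": ["a", "a", "a", "a", "b", "b", "b"],
        "y": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0],
    }
)
features["lag_2"] = features.groupby("entity")["y"].shift(2)

# Option 1: drop rows where the lag is undefined.
dropped = features.dropna(subset=["lag_2"])

# Option 2: impute the gaps (a constant here, purely as a placeholder choice).
imputed = features.assign(lag_2=features["lag_2"].fillna(0.0))
```

Dropping removes the first two rows of every entity, so short entities can lose most of their history when large lag steps are used.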
