Input Validation

This section documents validation utilities provided by eb-features to enforce structural assumptions on panel time-series data.

These helpers verify column presence and per-entity timestamp ordering, which together protect feature integrity.

All content below is generated automatically from NumPy-style docstrings in the source code.

Validation API

eb_features.panel.validation

Validation utilities for panel feature engineering.

This module centralizes lightweight input validation helpers used throughout the panel feature engineering subpackage.

Key invariants
  • Required columns must exist.
  • Within each entity, timestamps must be strictly increasing in the given row order. (No sorting is performed inside validation; callers may sort afterward for deterministic computation, but validation should catch out-of-order input.)
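The second invariant can be illustrated with plain pandas. The snippet below is a minimal sketch, not eb-features code: the column names `entity_id` and `timestamp` and the helper `strictly_increasing_by_entity` are hypothetical. It shows why the check has to run on the rows as given: sorting first would hide the out-of-order input that validation is meant to catch.

```python
import pandas as pd

# Hypothetical panel: entity "a" arrives out of order, entity "b" is fine.
panel = pd.DataFrame(
    {
        "entity_id": ["a", "a", "b", "b"],
        "timestamp": pd.to_datetime(
            ["2024-01-02", "2024-01-01", "2024-01-01", "2024-01-02"]
        ),
    }
)


def strictly_increasing_by_entity(df: pd.DataFrame) -> pd.Series:
    """Return one boolean per entity (illustrative helper, not library API)."""
    # Strictly increasing means non-decreasing with no repeated values.
    return df.groupby("entity_id")["timestamp"].apply(
        lambda s: s.is_monotonic_increasing and s.is_unique
    )


# Checked in the given row order, entity "a" fails.
as_given = strictly_increasing_by_entity(panel)

# Sorting first masks the problem, which is why validation does not sort.
after_sort = strictly_increasing_by_entity(
    panel.sort_values(["entity_id", "timestamp"])
)
```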

validate_required_columns(df, *, required_cols)

Validate that required columns exist on a DataFrame.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Input DataFrame. | required |
| required_cols | Sequence[str] | Columns that must be present. | required |

Raises:

| Type | Description |
|---|---|
| KeyError | If any required columns are missing. |
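As a rough illustration of the documented behavior, a check of this kind could look like the sketch below. It is a stand-in written for this page, not the eb-features source: the function name `validate_required_columns_sketch`, the error message wording, and the example column names are all assumptions. Only the keyword-only `required_cols` parameter and the `KeyError` come from the documentation above.

```python
from typing import Sequence

import pandas as pd


def validate_required_columns_sketch(
    df: pd.DataFrame, *, required_cols: Sequence[str]
) -> None:
    """Hypothetical stand-in mirroring the documented signature."""
    # Preserve the caller's order so the error message is predictable.
    missing = [col for col in required_cols if col not in df.columns]
    if missing:
        # Message format is an assumption; the docs only promise a KeyError.
        raise KeyError(f"Missing required columns: {missing}")


panel = pd.DataFrame(
    {"entity_id": ["a", "a"], "timestamp": ["2024-01-01", "2024-01-02"]}
)

# All required columns exist, so this returns None without raising.
validate_required_columns_sketch(panel, required_cols=["entity_id", "timestamp"])

# "target" is absent, so a KeyError is raised and captured here.
try:
    validate_required_columns_sketch(panel, required_cols=["entity_id", "target"])
except KeyError as exc:
    missing_message = str(exc)
```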

ensure_columns_present(df, *, columns, label)

Ensure that a set of configured columns exists on a DataFrame.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Input DataFrame. | required |
| columns | Iterable[str] | Columns that must be present. | required |
| label | str | Human-readable label used in error messages (e.g., "Static", "Regressor"). | required |

Raises:

| Type | Description |
|---|---|
| KeyError | If any specified columns are missing. |
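The role of `label` is easiest to see in an error message. The following is a minimal sketch under stated assumptions: `ensure_columns_present_sketch` is a hypothetical stand-in, and the message format is invented for illustration, since the documentation only states that the label appears in error messages.

```python
from typing import Iterable

import pandas as pd


def ensure_columns_present_sketch(
    df: pd.DataFrame, *, columns: Iterable[str], label: str
) -> None:
    """Hypothetical stand-in mirroring the documented signature."""
    missing = [col for col in columns if col not in df.columns]
    if missing:
        # The label tells the user which configured group the columns belong to.
        raise KeyError(f"{label} columns missing from DataFrame: {missing}")


panel = pd.DataFrame({"entity_id": ["a"], "timestamp": ["2024-01-01"], "price": [9.5]})

# "price" exists, so the regressor configuration passes.
ensure_columns_present_sketch(panel, columns=["price"], label="Regressor")

# "store_size" does not exist; the label identifies the offending config group.
try:
    ensure_columns_present_sketch(panel, columns=["store_size"], label="Static")
except KeyError as exc:
    static_message = str(exc)
```

Passing an empty iterable for `columns` is a no-op in this sketch, which suits optional configuration groups.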

validate_monotonic_timestamps(df, *, entity_col, timestamp_col)

Validate that timestamps are strictly increasing within each entity in row order.

This function does not sort. It checks the DataFrame exactly as provided.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Input panel DataFrame. | required |
| entity_col | str | Entity identifier column. | required |
| timestamp_col | str | Timestamp column. | required |

Raises:

| Type | Description |
|---|---|
| KeyError | If required columns are missing. |
| ValueError | If any entity has non-strictly increasing timestamps (ties or out-of-order), or if timestamps cannot be parsed. |
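One way to implement a check with these semantics is a per-entity difference over the rows as given. The sketch below is an assumption about how such a check might be written, not the library's implementation: `validate_monotonic_timestamps_sketch`, the error messages, and the example column names are hypothetical. It uses `pd.to_datetime` for parsing and a grouped `diff` for ordering, and it does not sort.

```python
import pandas as pd


def validate_monotonic_timestamps_sketch(
    df: pd.DataFrame, *, entity_col: str, timestamp_col: str
) -> None:
    """Hypothetical stand-in mirroring the documented signature."""
    for col in (entity_col, timestamp_col):
        if col not in df.columns:
            raise KeyError(f"Missing required column: {col!r}")

    try:
        ts = pd.to_datetime(df[timestamp_col])
    except (ValueError, TypeError) as exc:
        raise ValueError(f"Could not parse timestamps in {timestamp_col!r}") from exc

    # Difference from the previous row of the same entity, in the given row order.
    # The first row of each entity yields NaT, which compares as False below.
    deltas = ts.groupby(df[entity_col], sort=False).diff()

    # A zero delta is a tie; a negative delta is out-of-order input.
    bad = deltas <= pd.Timedelta(0)
    if bad.any():
        offenders = df.loc[bad, entity_col].unique().tolist()
        raise ValueError(
            f"Timestamps are not strictly increasing for entities: {offenders}"
        )


ordered = pd.DataFrame(
    {
        "entity_id": ["a", "a", "b", "b"],
        "timestamp": ["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-03"],
    }
)
validate_monotonic_timestamps_sketch(
    ordered, entity_col="entity_id", timestamp_col="timestamp"
)

# A repeated timestamp within entity "a" counts as a violation.
tied = pd.DataFrame(
    {"entity_id": ["a", "a"], "timestamp": ["2024-01-01", "2024-01-01"]}
)
try:
    validate_monotonic_timestamps_sketch(
        tied, entity_col="entity_id", timestamp_col="timestamp"
    )
except ValueError as exc:
    tie_error = str(exc)
```

Because the difference is taken within each group, rows from different entities may be interleaved without triggering an error, as long as each entity's own rows appear in increasing time order.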