Validation utilities
This section documents validation utilities provided by eb-evaluation.
These utilities perform structural and semantic validation of inputs and outputs used during evaluation workflows, ensuring consistency and correctness before metrics, diagnostics, or model-selection logic is applied.
eb_evaluation.utils.validation
Lightweight DataFrame validation utilities.
This module provides small, explicit validation helpers for pandas DataFrames used throughout the Electric Barometer evaluation and model-selection stack.
The intent is to:
- fail fast with clear error messages
- keep validation logic centralized and reusable
- distinguish data validation errors from other ValueError instances
These helpers are intentionally minimal and do not attempt schema inference or coercion; they only assert required structural properties.
DataFrameValidationError
Bases: ValueError
Raised when a pandas DataFrame fails a validation check.
This is a thin subclass of ValueError that allows callers to explicitly
catch DataFrame-related validation issues and distinguish them from other
value errors (e.g., numerical domain errors).
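The benefit of the subclass can be sketched as follows. The class below is a local stand-in defined only so the snippet runs on its own (in practice it would be imported from eb_evaluation.utils.validation), and `classify` is a hypothetical helper used purely for illustration.

```python
class DataFrameValidationError(ValueError):
    """Local stand-in for the exception documented above (assumption)."""


def classify(exc: Exception) -> str:
    """Report which handler catches the given exception (hypothetical helper)."""
    try:
        raise exc
    except DataFrameValidationError:
        # The more specific handler must come first; otherwise the
        # ValueError handler below would also catch validation errors.
        return "validation"
    except ValueError:
        return "other"


validation_result = classify(DataFrameValidationError("missing columns: ['y']"))
domain_result = classify(ValueError("math domain error"))
print(validation_result, domain_result)
```

Because the class derives from ValueError, existing `except ValueError` handlers keep working, while callers that care about the distinction can catch the subclass first.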
ensure_columns_present(df, required, *, context=None)
Ensure that all required columns are present in a DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame to validate. | required |
| required | Sequence[str] | Column names that must be present in df. | required |
| context | str \| None | Optional context string (e.g., function or module name) included in the error message to aid debugging. | None |
Raises:
| Type | Description |
|---|---|
| DataFrameValidationError | If one or more required columns are missing. |
Notes
This function performs a presence-only check. It does not validate column dtypes or contents.
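A minimal sketch of a presence-only check with the documented signature. This is an illustrative re-implementation, not the library's source: the error message format, the `None` return value, and the column names in the usage part are all assumptions, and the exception class is a local stand-in.

```python
from __future__ import annotations

from collections.abc import Sequence

import pandas as pd


class DataFrameValidationError(ValueError):
    """Local stand-in for the documented exception (assumption)."""


def ensure_columns_present(
    df: pd.DataFrame, required: Sequence[str], *, context: str | None = None
) -> None:
    """Illustrative version; the message format is an assumption."""
    # Presence-only: dtypes and contents are deliberately not inspected.
    missing = [col for col in required if col not in df.columns]
    if missing:
        prefix = f"[{context}] " if context else ""
        raise DataFrameValidationError(f"{prefix}Missing required columns: {missing}")


df = pd.DataFrame({"y_true": [1.0, 2.0], "y_pred": [1.5, 1.8]})
ensure_columns_present(df, ["y_true", "y_pred"])  # all present, no exception

try:
    ensure_columns_present(df, ["y_true", "weight"], context="score_models")
except DataFrameValidationError as exc:
    message = str(exc)
    print(message)
```

Collecting every missing name before raising, rather than stopping at the first, lets the caller fix all of them in one pass.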
ensure_non_empty(df, *, context=None)
Ensure that a DataFrame is not empty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame to validate. | required |
| context | str \| None | Optional context string (e.g., function or module name) included in the error message to aid debugging. | None |
Raises:
| Type | Description |
|---|---|
| DataFrameValidationError | If the DataFrame has zero rows. |
Notes
This check is commonly used after filtering or grouping operations to ensure downstream computations have at least one observation.
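The post-filter use case might look like the sketch below. As before, the function body is an illustrative re-implementation with an assumed message format, the exception class is a local stand-in, and the `scores` data and the `select_best_model` context string are hypothetical.

```python
from __future__ import annotations

import pandas as pd


class DataFrameValidationError(ValueError):
    """Local stand-in for the documented exception (assumption)."""


def ensure_non_empty(df: pd.DataFrame, *, context: str | None = None) -> None:
    """Illustrative version; the message format is an assumption."""
    # len(df) counts rows, matching the documented "zero rows" condition.
    if len(df) == 0:
        prefix = f"[{context}] " if context else ""
        raise DataFrameValidationError(f"{prefix}DataFrame is empty (zero rows)")


scores = pd.DataFrame({"model": ["a", "b"], "mae": [0.4, 0.7]})
filtered = scores[scores["mae"] < 0.1]  # no row satisfies the filter

try:
    ensure_non_empty(filtered, context="select_best_model")
    outcome = "ok"
except DataFrameValidationError as exc:
    outcome = f"rejected: {exc}"

print(outcome)
```

The sketch tests the row count rather than `DataFrame.empty`, since `empty` is also true for a frame that has rows but no columns, which is a different condition from the one documented here.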