Electric Barometer model selection¶
This section documents Electric Barometer–specific model selection utilities.
These utilities integrate evaluation metrics, diagnostics, and readiness signals to support holistic model ranking and selection.
eb_evaluation.model_selection.electric_barometer ¶
Cost-aware model selection using the Electric Barometer workflow.
This module defines ElectricBarometer, a lightweight selector that evaluates a set of
candidate regressors using Cost-Weighted Service Loss (CWSL) as the primary objective and
selects the model that minimizes expected operational cost.
Selection preference is governed by asymmetric unit costs:

- cu: underbuild (shortfall) cost per unit
- co: overbuild (excess) cost per unit

A convenient summary is the cost ratio:

$$
R = \frac{c_u}{c_o}
$$

Notes
ElectricBarometer is intentionally a selector (not a trainer that optimizes CWSL directly). Candidate models are trained using their native objectives (e.g., squared error) and are selected using a chosen selection objective on validation data (holdout) or across folds (CV).
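The selector-not-trainer idea can be sketched with a toy example. The `cwsl` function below is an assumed asymmetric-cost stand-in for CWSL (the library's exact definition and normalization may differ), and the "candidates" are reduced to constant forecasts for brevity:

```python
import numpy as np

# Assumed asymmetric cost, a stand-in for CWSL (the library's exact
# normalization may differ): shortfall costs cu per unit, excess costs co.
def cwsl(y_true, y_pred, cu=2.0, co=1.0):
    shortfall = np.maximum(y_true - y_pred, 0.0)
    excess = np.maximum(y_pred - y_true, 0.0)
    return float(np.mean(cu * shortfall + co * excess))

# Two toy "candidates", each reduced to a constant forecast fitted on train:
y_train = np.array([8.0, 10.0, 12.0])
y_val = np.array([9.0, 11.0, 13.0])
candidates = {
    "mean": float(np.mean(y_train)),            # native squared-error optimum
    "q67": float(np.quantile(y_train, 2 / 3)),  # quantile matched to R = 2
}

# Selection: score each candidate on validation, keep the lowest CWSL.
scores = {n: cwsl(y_val, np.full_like(y_val, p)) for n, p in candidates.items()}
best = min(scores, key=scores.get)
```

With cu twice co, the higher-quantile forecast wins even though the mean minimizes squared error: each candidate is trained with its native objective, and only the selection step is cost-aware.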
ElectricBarometer ¶
Cost-aware selector that chooses the best model by minimizing a selection objective.
ElectricBarometer evaluates each candidate model on either:
- a provided train/validation split (selection_mode="holdout"), or
- K-fold cross-validation on the provided dataset (selection_mode="cv"),
and selects the model with the best (lowest) score under the chosen selection objective. For interpretability, it also reports reference diagnostics (CWSL, RMSE, wMAPE).
Operational preference is captured by asymmetric costs and the induced ratio:
$$
R = \frac{c_u}{c_o}
$$
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `models` | `dict[str, Any]` | Mapping of candidate model name to an unfitted estimator implementing the `fit`/`predict` interface. Models can be scikit-learn regressors/pipelines or EB adapters implementing the same interface. | required |
| `include` | `set[str] \| None` | Optional allowlist of model names to include from `models`. | `None` |
| `exclude` | `set[str] \| None` | Optional blocklist of model names to exclude from `models`. | `None` |
| `metric` | `('cwsl', 'rmse', 'wmape')` | Selection objective used to choose the winning model. All metrics are computed and reported; this parameter determines which column is optimized. | `"cwsl"` |
| `tie_tol` | `float` | Absolute tolerance applied to the selection metric when determining ties. Any model with score <= (best_score + tie_tol) is considered tied. | `0.0` |
| `tie_breaker` | `('metric', 'simpler', 'name')` | How to break ties among models within `tie_tol` of the best score. | `"metric"` |
| `validate_inputs` | `('strict', 'coerce', 'off')` | Input validation level. | `"strict"` |
| `error_policy` | `('raise', 'skip', 'warn_skip')` | Behavior when a candidate model fails to fit/predict or otherwise errors. | `"raise"` |
| `time_budget_s` | `float \| None` | Optional wall-clock time budget (seconds) for the full selection run. If exceeded, remaining models are not evaluated. Note: this cannot forcibly interrupt a model already running; it gates starting new candidates and can mark a candidate as timed out if it exceeds budgets. | `None` |
| `per_model_time_budget_s` | `float \| None` | Optional wall-clock time budget (seconds) per candidate model (across folds in CV). If exceeded, that model is marked as timed out and skipped (or raises under `error_policy="raise"`). | `None` |
| `cu` | `float` | Underbuild (shortfall) cost per unit. Must be strictly positive. | `2.0` |
| `co` | `float` | Overbuild (excess) cost per unit. Must be strictly positive. | `1.0` |
| `tau` | `float` | Reserved for downstream diagnostics (e.g., HR@τ) that may be integrated into selection reporting. Currently not used in the selection criterion. | `2.0` |
| `training_mode` | `'selection_only'` | Training behavior. In the current implementation, candidate models are trained using their native objectives and only selection is external. | `"selection_only"` |
| `refit_on_full` | `bool` | Refit behavior in holdout mode: if True, refit the winning model on the combined train and validation data. In CV mode, the selected model is always refit on the full dataset provided to `fit`. | `False` |
| `selection_mode` | `('holdout', 'cv')` | Selection strategy: `"holdout"` evaluates candidates on a provided train/validation split; `"cv"` uses K-fold cross-validation. | `"holdout"` |
| `cv` | `int` | Number of folds when `selection_mode="cv"`. | `3` |
| `random_state` | `int \| None` | Seed used for CV shuffling/splitting. | `None` |
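The interaction of `tie_tol` and `tie_breaker` can be illustrated with a small sketch. The scores are hypothetical, and deterministic alphabetical choice is shown here as one plausible reading of `tie_breaker="name"`:

```python
# Hypothetical selection-metric scores for three candidates (lower is better).
scores = {"gbm": 1.000, "ridge": 1.004, "naive": 1.250}
tie_tol = 0.005

best_score = min(scores.values())
# Any model with score <= best_score + tie_tol is considered tied.
tied = sorted(name for name, s in scores.items() if s <= best_score + tie_tol)
# A name-based tie-breaker would then pick deterministically among the tied set.
winner = tied[0]
```

Here "gbm" and "ridge" fall within the tolerance band, while "naive" does not.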
Attributes:

| Name | Type | Description |
|---|---|---|
| `best_name_` | `str \| None` | Name of the winning model after calling `fit`. |
| `best_model_` | `Any \| None` | Fitted estimator corresponding to `best_name_`. |
| `results_` | `DataFrame \| None` | Per-model comparison table. |
| `failures_` | `dict[str, str]` | Mapping of model name to a failure reason for models that errored or timed out. |
| `validation_cwsl_` | `float \| None` | CWSL of the winning model on validation (holdout) or mean across folds (CV). |
| `validation_rmse_` | `float \| None` | RMSE of the winning model on validation (holdout) or mean across folds (CV). |
| `validation_wmape_` | `float \| None` | wMAPE of the winning model on validation (holdout) or mean across folds (CV). |
| `candidate_names_` | `list[str]` | Names of candidate models remaining after include/exclude filtering. |
| `evaluated_names_` | `list[str]` | Names of models that were actually attempted during the most recent call to `fit`. |
| `stopped_early_` | `bool` | Whether evaluation stopped early due to the global time budget. |
| `stop_reason_` | `str \| None` | If `stopped_early_` is True, the reason evaluation stopped; otherwise `None`. |
r_ property ¶
Cost ratio.

Returns:

| Type | Description |
|---|---|
| `float` | The ratio: $$R = \frac{c_u}{c_o}$$ |
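With the default costs documented above, the ratio works out as:

```python
cu, co = 2.0, 1.0   # documented defaults
r = cu / co         # R = 2.0: a unit of shortfall costs twice a unit of excess
```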
fit(X_train, y_train, X_val, y_val, sample_weight_train=None, sample_weight_val=None, refit_on_full=None) ¶
Fit candidate models and select the best one by minimizing the chosen metric.
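In `selection_mode="cv"`, a candidate's selection score is the mean of the metric across folds. A self-contained sketch of that scoring loop (toy constant-forecast model; `cwsl` is an assumed asymmetric-cost form, not the library's implementation):

```python
import numpy as np

def cwsl(y_true, y_pred, cu=2.0, co=1.0):
    # Assumed asymmetric cost form; the library's normalization may differ.
    return float(np.mean(cu * np.maximum(y_true - y_pred, 0.0)
                         + co * np.maximum(y_pred - y_true, 0.0)))

# Score one toy candidate (a constant mean forecast) across 3 shuffled folds.
y = np.arange(12, dtype=float)
rng = np.random.default_rng(0)          # plays the role of random_state
folds = np.array_split(rng.permutation(len(y)), 3)

fold_scores = []
for k, val_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
    pred = np.full(len(val_idx), y[train_idx].mean())  # fit on K-1 folds
    fold_scores.append(cwsl(y[val_idx], pred))         # score on held-out fold

mean_score = float(np.mean(fold_scores))  # the candidate's CV selection score
```

After scoring every candidate this way, the winner (lowest mean score) is refit on the full dataset, as described for CV mode above.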
predict(X) ¶
Predict using the selected best model.
cwsl_score(y_true, y_pred, sample_weight=None, cu=None, co=None) ¶
Compute CWSL using this selector's costs (or overrides).
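A hypothetical reference form of the score, assuming CWSL is a weighted mean of asymmetric per-unit costs (the library's actual definition and normalization may differ):

```python
import numpy as np

def cwsl_score(y_true, y_pred, sample_weight=None, cu=2.0, co=1.0):
    """Hypothetical reference form: weighted mean of asymmetric unit costs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = (np.ones_like(y_true) if sample_weight is None
         else np.asarray(sample_weight, dtype=float))
    # Shortfall (underbuild) priced at cu per unit, excess (overbuild) at co.
    cost = (cu * np.maximum(y_true - y_pred, 0.0)
            + co * np.maximum(y_pred - y_true, 0.0))
    return float(np.sum(w * cost) / np.sum(w))
```

Under the default costs, a 1-unit shortfall scores 2.0 while a 1-unit excess scores 1.0, which is the asymmetry the selector optimizes for.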