Automated model selection
This section documents automated model selection utilities provided by eb-evaluation.
These utilities support automated evaluation and selection of candidate models based on performance metrics, diagnostics, and decision-oriented criteria.
eb_evaluation.model_selection.auto_engine
Auto model-zoo builder for Electric Barometer selection.
This module defines :class:`~eb_evaluation.model_selection.auto_engine.AutoEngine`, a
convenience factory that constructs an unfitted
:class:`~eb_evaluation.model_selection.electric_barometer.ElectricBarometer` with a curated
set of candidate regressors (a "model zoo") and asymmetric cost parameters.
The intent is to provide a batteries-included entry point:
- choose asymmetric costs (cu, co)
- choose a speed preset (fast, balanced, slow)
- optionally include additional engines (XGBoost, LightGBM, CatBoost) when installed
- get back an unfitted selector ready for cost-aware model selection
Model selection is performed using cost-aware criteria (e.g., CWSL) rather than symmetric error alone, enabling operationally aligned choices.
AutoEngine
Convenience factory for :class:`~eb_evaluation.model_selection.electric_barometer.ElectricBarometer`.
AutoEngine builds an ElectricBarometer with a curated set of candidate models chosen by a speed preset:
- speed="fast": small, inexpensive zoo; suitable for quick experiments and CI.
- speed="balanced" (default): trade-off between runtime and modeling power.
- speed="slow": heavier ensembles/boosting; use when wall-clock time is acceptable.
Asymmetric costs define the primary selection objective via the cost ratio R = cu / co.
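To make the role of the cost ratio concrete, here is a minimal sketch of a demand-normalized asymmetric cost. It assumes that a CWSL-style score charges `cu` per unit of shortfall and `co` per unit of excess, then divides by total actual demand. That formula and the helper name `asymmetric_cost` are illustrative assumptions, not the library's implementation.

```python
def asymmetric_cost(y_true, y_pred, cu=2.0, co=1.0):
    """Demand-normalized asymmetric cost (hypothetical helper, not the eb-evaluation API)."""
    if cu <= 0 or co <= 0:
        raise ValueError("cu and co must be strictly positive")
    total_cost = 0.0
    total_demand = 0.0
    for actual, forecast in zip(y_true, y_pred):
        shortfall = max(actual - forecast, 0.0)  # under-forecast, charged at cu
        excess = max(forecast - actual, 0.0)     # over-forecast, charged at co
        total_cost += cu * shortfall + co * excess
        total_demand += actual
    if total_demand <= 0:
        raise ValueError("total demand must be positive")
    return total_cost / total_demand


y_true = [10.0, 20.0, 30.0]
under = [8.0, 18.0, 28.0]   # 2 units short in every period
over = [12.0, 22.0, 32.0]   # 2 units over in every period

cost_under = asymmetric_cost(y_true, under, cu=2.0, co=1.0)
cost_over = asymmetric_cost(y_true, over, cu=2.0, co=1.0)
```

Both forecasts have the same absolute error, yet under this assumed formula the under-forecast costs R = cu / co = 2 times as much as the over-forecast. A symmetric error metric would not distinguish the two.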
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `cu` | `float` | Underbuild (shortfall) cost per unit. Must be strictly positive. | `2.0` |
| `co` | `float` | Overbuild (excess) cost per unit. Must be strictly positive. | `1.0` |
| `tau` | `float` | Tolerance parameter forwarded to ElectricBarometer for optional diagnostics (e.g., HR@tau). | `2.0` |
| `selection_mode` | `str` | Selection strategy used by ElectricBarometer. Must be | `'holdout'` |
| `cv` | `int` | Number of folds when | `3` |
| `random_state` | `int \| None` | Seed used for stochastic models and (when applicable) cross-validation. | `None` |
| `speed` | `SpeedType` | Controls which models are included and their approximate complexity. | `'balanced'` |
Notes
Optional engines are included only when their packages are installed: xgboost,
lightgbm, catboost.
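One common way to include engines only when their packages are installed is to probe for them without importing them. The sketch below uses the standard-library `importlib.util.find_spec`. The helper name `installed_optional_engines` is hypothetical, and whether AutoEngine detects packages this way is an assumption.

```python
from importlib.util import find_spec

# Package names taken from the note above.
OPTIONAL_ENGINES = ("xgboost", "lightgbm", "catboost")


def installed_optional_engines(candidates=OPTIONAL_ENGINES):
    """Return the candidate packages that can be located, in the order given."""
    # find_spec returns None for a top-level package that is not installed.
    return [name for name in candidates if find_spec(name) is not None]


available = installed_optional_engines()
```

The result depends on the current environment, so the same code can yield different zoos on different machines.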
available_models()
List the model names that would be included in the zoo for this AutoEngine.
This reflects:
- the selected speed preset, and
- which optional packages are available in the current environment.
Returns:
| Type | Description |
|---|---|
| `list[str]` | Model names in deterministic insertion order. |
build_zoo()
Build and return the unfitted model zoo.
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Mapping of |
Notes
The returned dict is a shallow copy so callers may filter/mutate it without affecting future calls.
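The following sketch shows what a shallow copy implies in general Python terms. `DummyModel`, `_ZOO`, and `build_zoo_sketch` are hypothetical stand-ins. In this sketch the model objects are shared between copies, which may differ from how the library creates its estimators.

```python
class DummyModel:
    """Hypothetical placeholder for an unfitted regressor."""

    def __init__(self, name):
        self.name = name


_ZOO = {"dummy_a": DummyModel("a"), "dummy_b": DummyModel("b")}


def build_zoo_sketch():
    # dict(...) copies the mapping itself but not the values it points to.
    return dict(_ZOO)


zoo = build_zoo_sketch()
del zoo["dummy_b"]          # filtering the returned dict...
fresh = build_zoo_sketch()  # ...does not affect a later call
```

Removing or adding keys on the returned dict is safe. Mutating a value in place is a different matter, because a shallow copy does not duplicate the objects stored in the mapping.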
build_selector(X, y, *, include=None, exclude=None, metric='cwsl', error_policy='warn_skip', time_budget_s=None, per_model_time_budget_s=None, refit_on_full=False)
Build an unfitted ElectricBarometer configured with the default model zoo.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `ndarray` | Feature matrix. Currently unused by the builder (reserved for future heuristics). | required |
| `y` | `ndarray` | Target vector. Currently unused by the builder (reserved for future heuristics). | required |
| `include` | `set[str] \| None` | Optional allowlist of model names to include from the zoo. | `None` |
| `exclude` | `set[str] \| None` | Optional blocklist of model names to exclude from the zoo. | `None` |
| `metric` | `_MetricName` | Selection objective used by ElectricBarometer to choose the winning model. | `'cwsl'` |
| `error_policy` | `_ErrorPolicy` | Behavior when a candidate model fails to fit/predict or otherwise errors. | `'warn_skip'` |
| `time_budget_s` | `float \| None` | Optional wall-clock time budget (seconds) for the full selection run. | `None` |
| `per_model_time_budget_s` | `float \| None` | Optional wall-clock time budget (seconds) per candidate model. | `None` |
| `refit_on_full` | `bool` | Forwarded to ElectricBarometer; controls whether the winning model is refit on train+validation in holdout mode. | `False` |
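The `include` and `exclude` parameters can be pictured as a filter over the zoo mapping. The sketch below is one plausible reading, in which the allowlist is applied first and the blocklist second, and names not present in the zoo are silently ignored. Both behaviors, the helper name `filter_zoo`, and the model names are assumptions for illustration.

```python
def filter_zoo(zoo, include=None, exclude=None):
    """Filter a name -> model mapping while preserving its insertion order."""
    names = list(zoo)
    if include is not None:
        names = [name for name in names if name in include]
    if exclude is not None:
        names = [name for name in names if name not in exclude]
    return {name: zoo[name] for name in names}


# Hypothetical zoo; the real model names come from available_models().
zoo = {"model_a": object(), "model_b": object(), "model_c": object()}

allowed = filter_zoo(zoo, include={"model_c", "model_a"})
selected = filter_zoo(zoo, include={"model_c", "model_a"}, exclude={"model_a"})
```

Because the filter iterates over the zoo rather than over the sets, the result keeps the zoo's deterministic order regardless of set ordering.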
Returns:
| Type | Description |
|---|---|
| `ElectricBarometer` | Unfitted selector instance. |