
Automated model selection

This section documents automated model selection utilities provided by eb-evaluation.

These utilities support automated evaluation and selection of candidate models based on performance metrics, diagnostics, and decision-oriented criteria.

eb_evaluation.model_selection.auto_engine

Auto model-zoo builder for Electric Barometer selection.

This module defines AutoEngine, a convenience factory that constructs an unfitted eb_evaluation.model_selection.electric_barometer.ElectricBarometer with a curated set of candidate regressors (a "model zoo") and asymmetric cost parameters.

The intent is to provide a batteries-included entry point:

  • choose asymmetric costs (cu, co)
  • choose a speed preset (fast, balanced, slow)
  • optionally include additional engines (XGBoost, LightGBM, CatBoost) when installed
  • get back an unfitted selector ready for cost-aware model selection

Model selection is performed using cost-aware criteria (e.g., CWSL) rather than symmetric error alone, enabling operationally aligned choices.
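To illustrate why a cost-aware criterion can rank models differently from a symmetric error, the sketch below scores two hypothetical forecasts with a simple asymmetric cost that charges cu per unit of shortfall and co per unit of excess. The function name asymmetric_cost, the normalization by total actual demand, and the sample data are assumptions made for illustration; the exact CWSL definition used by the library may differ.

```python
def asymmetric_cost(y_true, y_pred, cu=2.0, co=1.0):
    """Illustrative cost-weighted loss (an assumption, not the library's code).

    Shortfall (prediction below actual) is charged at cu per unit and
    excess (prediction above actual) at co per unit. The total is divided
    by total actual demand to give a unitless score.
    """
    shortfall = sum(max(t - p, 0.0) for t, p in zip(y_true, y_pred))
    excess = sum(max(p - t, 0.0) for t, p in zip(y_true, y_pred))
    return (cu * shortfall + co * excess) / sum(y_true)


def mean_absolute_error(y_true, y_pred):
    """Symmetric baseline: every unit of error costs the same."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [10.0, 20.0, 30.0]
under = [8.0, 18.0, 28.0]   # hypothetical model that is always 2 units short
over = [12.0, 22.0, 32.0]   # hypothetical model that is always 2 units over

mae_under = mean_absolute_error(y_true, under)
mae_over = mean_absolute_error(y_true, over)
cost_under = asymmetric_cost(y_true, under)
cost_over = asymmetric_cost(y_true, over)

print(mae_under, mae_over)      # identical under the symmetric metric
print(cost_under, cost_over)    # the under-forecasting model costs more
```

Both forecasts tie on mean absolute error, but with cu greater than co the under-forecasting model is penalized more heavily, so a cost-aware selector would prefer the other one.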

AutoEngine

Convenience factory for ElectricBarometer.

AutoEngine builds an ElectricBarometer with a curated set of candidate models chosen by a speed preset:

  • speed="fast": small, inexpensive zoo; suitable for quick experiments and CI.
  • speed="balanced" (default): trade-off between runtime and modeling power.
  • speed="slow": heavier ensembles/boosting; use when wall-clock time is acceptable.

Asymmetric costs define the primary selection objective via the cost ratio R = cu / co.
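A hedged usage sketch, based only on the constructor parameters and methods documented in this section. The import is guarded so the snippet still runs when eb-evaluation or NumPy is not installed; the synthetic data and variable names are illustrative assumptions, and nothing beyond constructing the unfitted selector is shown.

```python
try:
    import numpy as np
    from eb_evaluation.model_selection.auto_engine import AutoEngine
except ImportError:  # eb-evaluation (or NumPy) is not available here
    AutoEngine = None

cu, co = 2.0, 1.0
R = cu / co  # cost ratio: one unit of shortfall costs twice one unit of excess

selector = None
if AutoEngine is not None:
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))               # synthetic features (assumption)
    y = X @ np.array([1.0, 0.5, -0.2]) + 10.0   # synthetic target (assumption)

    engine = AutoEngine(cu=cu, co=co, speed="fast", random_state=0)
    print(engine.available_models())            # depends on installed extras
    selector = engine.build_selector(X, y)      # unfitted ElectricBarometer
```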

Parameters:

  • cu (float, default 2.0): Underbuild (shortfall) cost per unit. Must be strictly positive.
  • co (float, default 1.0): Overbuild (excess) cost per unit. Must be strictly positive.
  • tau (float, default 2.0): Tolerance parameter forwarded to ElectricBarometer for optional diagnostics (e.g., HR@tau).
  • selection_mode (str, default 'holdout'): Selection strategy used by ElectricBarometer. Must be "holdout" or "cv".
  • cv (int, default 3): Number of folds when selection_mode="cv".
  • random_state (int | None, default None): Seed used for stochastic models and (when applicable) cross-validation.
  • speed (SpeedType, default 'balanced'): Controls which models are included and their approximate complexity.

Notes

Optional engines are included only when their packages are installed: xgboost, lightgbm, catboost.
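One common way to detect optional packages without importing them is importlib.util.find_spec from the standard library. The sketch below is an assumption about how such a check could look, not the library's actual implementation; the helper name installed_optional_engines is hypothetical.

```python
from importlib.util import find_spec

OPTIONAL_ENGINES = ("xgboost", "lightgbm", "catboost")


def installed_optional_engines(candidates=OPTIONAL_ENGINES):
    """Return the candidate package names that are importable here.

    find_spec returns None for a top-level package that cannot be found,
    so no heavy import is triggered just to probe availability.
    """
    return [name for name in candidates if find_spec(name) is not None]


print(installed_optional_engines())  # result depends on the environment
```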

available_models()

List the model names that would be included in the zoo for this AutoEngine.

This reflects:

  • the selected speed preset, and
  • which optional packages are available in the current environment.

Returns:

list[str]: Model names in deterministic insertion order.

build_zoo()

Build and return the unfitted model zoo.

Returns:

dict[str, Any]: Mapping of {name: estimator} for candidate regressors.

Notes

The returned dict is a shallow copy so callers may filter/mutate it without affecting future calls.
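The shallow-copy behavior can be pictured with a toy stand-in. ToyEngine and its placeholder estimators are hypothetical and exist only to show the semantics: the mapping is copied, so removing a key on the caller's side does not change what a later call returns.

```python
class ToyEngine:
    """Hypothetical stand-in used only to illustrate shallow-copy semantics."""

    def __init__(self):
        # Placeholder objects stand in for unfitted estimators.
        self._zoo = {"dummy": object(), "linear": object()}

    def build_zoo(self):
        # dict(...) copies the mapping but not the objects it refers to.
        return dict(self._zoo)


engine = ToyEngine()
zoo = engine.build_zoo()
del zoo["linear"]            # caller-side filtering of the returned copy

fresh = engine.build_zoo()   # a later call is unaffected by the deletion
print(list(zoo), list(fresh))
```

In this sketch the placeholder objects are shared between copies, which is what "shallow" means in general. Whether the library also rebuilds estimator instances on each call is not stated here.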

build_selector(X, y, *, include=None, exclude=None, metric='cwsl', error_policy='warn_skip', time_budget_s=None, per_model_time_budget_s=None, refit_on_full=False)

Build an unfitted ElectricBarometer configured with the default model zoo.

Parameters:

  • X (ndarray, required): Feature matrix. Currently unused by the builder (reserved for future heuristics).
  • y (ndarray, required): Target vector. Currently unused by the builder (reserved for future heuristics).
  • include (set[str] | None, default None): Optional allowlist of model names to include from the zoo.
  • exclude (set[str] | None, default None): Optional blocklist of model names to exclude from the zoo.
  • metric (_MetricName, default 'cwsl'): Selection objective used by ElectricBarometer to choose the winning model.
  • error_policy (_ErrorPolicy, default 'warn_skip'): Behavior when a candidate model fails to fit/predict or otherwise errors.
  • time_budget_s (float | None, default None): Optional wall-clock time budget (seconds) for the full selection run.
  • per_model_time_budget_s (float | None, default None): Optional wall-clock time budget (seconds) per candidate model.
  • refit_on_full (bool, default False): Forwarded to ElectricBarometer; controls whether the winning model is refit on train+validation in holdout mode.

Returns:

ElectricBarometer: Unfitted selector instance.
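The include and exclude parameters can be read as an allowlist and a blocklist applied to the zoo's model names. The sketch below shows one plausible interpretation, assuming the allowlist is applied first and the blocklist second; the helper filter_zoo and the model names are hypothetical, and the library's precedence rules may differ.

```python
def filter_zoo(zoo, include=None, exclude=None):
    """Hypothetical helper: apply an allowlist, then a blocklist, to a zoo."""
    names = list(zoo)  # dicts preserve insertion order
    if include is not None:
        names = [n for n in names if n in include]
    if exclude is not None:
        names = [n for n in names if n not in exclude]
    return {n: zoo[n] for n in names}


# Placeholder strings stand in for unfitted estimators (illustrative names).
zoo = {"dummy": "est0", "ridge": "est1", "rf": "est2", "gbr": "est3"}
kept = filter_zoo(zoo, include={"ridge", "rf", "gbr"}, exclude={"rf"})
print(list(kept))
```

Passing None for both arguments leaves the zoo unchanged, which matches the documented defaults.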