objective
Use lightgbm.train with Dataset and early stopping for regression.
minimal code
import numpy as np
import lightgbm as lgb
# synthetic regression data: linear signal plus small Gaussian noise
rng = np.random.RandomState(0)
X = rng.randn(500, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]) + rng.randn(500) * 0.1
# first 400 rows for training, last 100 for validation;
# reference=dtrain lets the validation set reuse the training bin mappings
dtrain = lgb.Dataset(X[:400], label=y[:400])
dval = lgb.Dataset(X[400:], label=y[400:], reference=dtrain)
params = {
    "objective": "regression",
    "metric": "l2",
    "learning_rate": 0.05,
    "num_leaves": 31,
    "feature_fraction": 0.9,  # column subsampling per tree
    "bagging_fraction": 0.9,  # row subsampling
    "bagging_freq": 1,        # resample rows every iteration
    "seed": 0,
}
bst = lgb.train(
    params, dtrain,
    valid_sets=[dval],
    num_boost_round=5000,
    # since LightGBM 4.0, early stopping and eval logging are callbacks,
    # not train() keyword arguments
    callbacks=[
        lgb.early_stopping(stopping_rounds=100, verbose=False),
        lgb.log_evaluation(period=0),  # equivalent of the old verbose_eval=False
    ],
)
print(bst.best_iteration > 0)
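Early stopping also records the best validation score; continuing from the training snippet above, bst.best_score is a nested dict keyed by validation-set name (here the default valid_0) and metric:
# best_score: valid-set name -> metric -> value
print(bst.best_score["valid_0"]["l2"] < 1.0)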
usage
# Predict with best_iteration
pred = bst.predict(X[400:], num_iteration=bst.best_iteration)
print(len(pred) == 100)
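A quick way to sanity-check the fit is to compare pred against the held-out targets directly; a minimal sketch with plain NumPy (the 0.1 noise scale in the data generation puts the achievable MSE floor near 0.01):
# hold-out mean squared error; should sit close to the noise floor
mse = float(np.mean((pred - y[400:]) ** 2))
print(mse < 1.0)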
useful variant(s)
# Save/load (save_model keeps only up to best_iteration by default
# when early stopping fired)
bst.save_model("lgb.txt")
bst2 = lgb.Booster(model_file="lgb.txt")
print(len(bst2.feature_name()) == X.shape[1])
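The Booster API also supports an in-memory round-trip (model_to_string / model_str) and warm-starting extra rounds with init_model; a short sketch reusing params and dtrain from above:
# in-memory round-trip, no file on disk
bst3 = lgb.Booster(model_str=bst.model_to_string())
print(bst3.num_trees() > 0)
# continue boosting 10 more rounds on top of the trained model
bst4 = lgb.train(params, dtrain, num_boost_round=10, init_model=bst)
print(bst4.num_trees() > 0)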
notes
- With lgb.train, use lgb.Dataset for both train and valid sets; rely on best_iteration to freeze the model at its best round.
- Since LightGBM 4.0, early stopping and evaluation logging are configured through callbacks (lgb.early_stopping, lgb.log_evaluation) rather than train() keyword arguments.
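When no natural hold-out exists, lgb.cv can choose the round count via cross-validated early stopping instead; a minimal sketch (stratified=False is needed for regression, and the metric key names in the returned history vary across LightGBM versions, so the round count is read from the list length):
cv_hist = lgb.cv(
    params, dtrain,
    num_boost_round=5000,
    nfold=5,
    stratified=False,  # stratified folds only make sense for classification
    callbacks=[lgb.early_stopping(stopping_rounds=100, verbose=False)],
)
# cv_hist maps metric names to per-round lists; their length is the number of rounds kept
best_rounds = len(next(iter(cv_hist.values())))
print(best_rounds > 0)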