objective
Use lightgbm.train with Dataset and early stopping for regression.
minimal code
import numpy as np
import lightgbm as lgb
# synthetic regression data: linear signal plus small Gaussian noise
rng = np.random.RandomState(0)
X = rng.randn(500, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]) + rng.randn(500) * 0.1
# first 400 rows for training, last 100 for validation;
# reference=dtrain lets the validation set reuse the training bin mappings
dtrain = lgb.Dataset(X[:400], label=y[:400])
dval = lgb.Dataset(X[400:], label=y[400:], reference=dtrain)
params = {
    "objective": "regression",
    "metric": "l2",
    "learning_rate": 0.05,
    "num_leaves": 31,
    "feature_fraction": 0.9,  # column subsampling per tree
    "bagging_fraction": 0.9,  # row subsampling
    "bagging_freq": 1,        # resample rows every iteration
    "seed": 0,
}
bst = lgb.train(
    params, dtrain,
    valid_sets=[dval],
    num_boost_round=5000,
    # since LightGBM 4.0, early stopping and eval logging are callbacks,
    # not train() keyword arguments
    callbacks=[
        lgb.early_stopping(stopping_rounds=100, verbose=False),
        lgb.log_evaluation(period=0),  # equivalent of the old verbose_eval=False
    ],
)
print(bst.best_iteration > 0)
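Early stopping also records the best validation score; continuing from the training snippet above, bst.best_score is a nested dict keyed by validation-set name (here the default valid_0) and metric:
# best_score: valid-set name -> metric -> value
print(bst.best_score["valid_0"]["l2"] < 1.0)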
usage
# Predict with best_iteration
pred = bst.predict(X[400:], num_iteration=bst.best_iteration)
print(len(pred) == 100)
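A quick way to sanity-check the fit is to compare pred against the held-out targets directly; a minimal sketch with plain NumPy (the 0.1 noise scale in the data generation puts the achievable MSE floor near 0.01):
# hold-out mean squared error; should sit close to the noise floor
mse = float(np.mean((pred - y[400:]) ** 2))
print(mse < 1.0)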
useful variant(s)
# Save/load (save_model keeps only up to best_iteration by default
# when early stopping fired)
bst.save_model("lgb.txt")
bst2 = lgb.Booster(model_file="lgb.txt")
print(len(bst2.feature_name()) == X.shape[1])
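The Booster API also supports an in-memory round-trip (model_to_string / model_str) and warm-starting extra rounds with init_model; a short sketch reusing params and dtrain from above:
# in-memory round-trip, no file on disk
bst3 = lgb.Booster(model_str=bst.model_to_string())
print(bst3.num_trees() > 0)
# continue boosting 10 more rounds on top of the trained model
bst4 = lgb.train(params, dtrain, num_boost_round=10, init_model=bst)
print(bst4.num_trees() > 0)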
notes
- With lgb.train, use lgb.Dataset for both train and valid sets; rely on best_iteration to freeze the model at its best round.
- Since LightGBM 4.0, early stopping and evaluation logging are configured through callbacks (lgb.early_stopping, lgb.log_evaluation) rather than train() keyword arguments.
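When no natural hold-out exists, lgb.cv can choose the round count via cross-validated early stopping instead; a minimal sketch (stratified=False is needed for regression, and the metric key names in the returned history vary across LightGBM versions, so the round count is read from the list length):
cv_hist = lgb.cv(
    params, dtrain,
    num_boost_round=5000,
    nfold=5,
    stratified=False,  # stratified folds only make sense for classification
    callbacks=[lgb.early_stopping(stopping_rounds=100, verbose=False)],
)
# cv_hist maps metric names to per-round lists; their length is the number of rounds kept
best_rounds = len(next(iter(cv_hist.values())))
print(best_rounds > 0)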