
sklearn: Pipeline + StandardScaler

Chain scaling and a model cleanly.

goal

Fit the scaler and the classifier as a single estimator, so that the scaling statistics are learned only from the data passed to fit.

minimal code

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data: one feature, two classes.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

# fit() standardizes X, then trains the classifier on the scaled values.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)
print(hasattr(clf, "predict"))  # True
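
As a rough sketch of what the fitted pipeline offers, the example below refits the same toy data, predicts on two new points, and inspects the scaler through named_steps. The query points in X_new are made up for illustration. The step name "standardscaler" is the lowercased class name that make_pipeline assigns automatically.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)

# predict() scales the raw inputs with the fitted scaler, then classifies.
X_new = [[-1.0], [4.0]]  # illustrative query points, not from the original
pred = clf.predict(X_new)
proba = clf.predict_proba(X_new)

# make_pipeline names each step after its lowercased class name.
scaler = clf.named_steps["standardscaler"]
print(pred, proba.shape, scaler.mean_, scaler.scale_)
```

Because the scaler lives inside the pipeline, callers pass raw, unscaled values to predict and never apply the transformation by hand.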

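usage

from sklearn import get_config, set_config

# Make transformers return pandas DataFrames instead of NumPy arrays
# (needs scikit-learn >= 1.2; pandas must be installed when transform runs).
set_config(transform_output="pandas")
print(get_config()["transform_output"])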

useful variant(s)

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# One-hot encode column 0; remainder="drop" discards every other column.
ct = ColumnTransformer([("cat", OneHotEncoder(), [0])], remainder="drop")
print(hasattr(ct, "fit"))  # True
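
One possible way to combine this variant with the scaling pipeline is sketched below: a ColumnTransformer one-hot encodes a categorical column and standardizes a numeric one, then feeds a classifier. The toy data, the "num" step name, and the handle_unknown="ignore" setting are assumptions added for illustration, not part of the original snippet.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: column 0 is categorical, column 1 is numeric.
X = np.array(
    [["a", 0.0], ["b", 1.0], ["a", 2.0], ["b", 3.0]],
    dtype=object,
)
y = [0, 0, 1, 1]

ct = ColumnTransformer(
    [
        ("cat", OneHotEncoder(handle_unknown="ignore"), [0]),  # encode column 0
        ("num", StandardScaler(), [1]),  # standardize column 1
    ],
    remainder="drop",
)
model = make_pipeline(ct, LogisticRegression(max_iter=1000)).fit(X, y)

# Two one-hot columns plus one scaled numeric column.
Xt = model.named_steps["columntransformer"].transform(X)
print(Xt.shape)
print(model.predict(X).shape)
```

Keeping the ColumnTransformer as the first pipeline step means the encoder categories and the scaling statistics are both learned in the same fit call as the model.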

notes

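  • A Pipeline fits the scaler only on the data passed to fit, so no test-set information leaks into preprocessing, provided the whole pipeline is fitted on the training split or evaluated through cross-validation. It offers no protection if the data was already scaled outside the pipeline.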
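
The leakage point can be illustrated with a minimal sketch on synthetic data (the dataset, seed, and sizes below are assumptions chosen for the example). cross_val_score refits a clone of the whole pipeline on each training fold, and a manual split shows that the scaler's mean comes from the training rows only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y = (X[:, 0] > 5.0).astype(int)

pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Each fold refits a clone of the full pipeline on that fold's training
# rows, so the scaler never sees the held-out rows during fit.
scores = cross_val_score(pipe, X, y, cv=5)

# Same idea with a manual split: fit on the training rows only.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
pipe.fit(X_train, y_train)
train_mean = pipe.named_steps["standardscaler"].mean_

# The scaler's mean matches the training rows, not the full dataset.
print(scores.shape, np.allclose(train_mean, X_train.mean(axis=0)))
```

Calling StandardScaler().fit_transform(X) on the full dataset before splitting would bypass this safeguard, since the test rows would then contribute to the mean and variance.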