sklearn IterativeImputer (MICE)
objective
Explain and demonstrate how to perform iterative multivariate imputation of missing values.
minimal code
import numpy as np
from sklearn.experimental import enable_iterative_imputer # noqa: F401
from sklearn.impute import IterativeImputer
rng = np.random.RandomState(0)
X = rng.randn(200, 3)
X[rng.rand(*X.shape) < 0.1] = np.nan
imp = IterativeImputer(random_state=0, max_iter=10)
X_imp = imp.fit_transform(X)
X_imp[:2]
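usage
# check convergence: imp.n_iter_ is the number of rounds run (at most max_iter);
# imp.imputation_sequence_ lists the per-feature imputation steps in the order they were applied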
len(imp.imputation_sequence_), imp.n_iter_
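A value of n_iter_ equal to max_iter does not by itself say whether the stopping criterion was met. One hedged way to check is to look for the ConvergenceWarning that scikit-learn raises when the tolerance is not reached. The sketch below assumes the same synthetic data as above; the names caught and reached_tol are illustrative only.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Same synthetic data as in the minimal example (about 10% of entries missing)
rng = np.random.RandomState(0)
X = rng.randn(200, 3)
X[rng.rand(*X.shape) < 0.1] = np.nan

imp = IterativeImputer(random_state=0, max_iter=10)

# Record warnings so a missed stopping criterion can be detected in code
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    X_imp = imp.fit_transform(X)

# True when no ConvergenceWarning was emitted during fitting
reached_tol = not any(issubclass(w.category, ConvergenceWarning) for w in caught)

print("rounds run:", imp.n_iter_, "of", imp.max_iter)
print("stopping criterion reached:", reached_tol)
print("per-feature steps recorded:", len(imp.imputation_sequence_))
```

If the criterion is not reached, raising max_iter or loosening tol are the usual knobs to try.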
useful variant(s)
# use a nonlinear estimator such as RandomForestRegressor
from sklearn.ensemble import RandomForestRegressor
imp_rf = IterativeImputer(estimator=RandomForestRegressor(n_estimators=50, random_state=0), random_state=0)
Xm = imp_rf.fit_transform(X)
Xm[:1]
notes
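- Enable the experimental API by importing enable_iterative_imputer from sklearn.experimental before importing IterativeImputer.
- MICE can preserve relationships between variables, because each feature with missing values is modeled as a function of the other features.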
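To illustrate the last note, here is a small sketch comparing IterativeImputer with mean imputation on two strongly correlated columns. The data is synthetic and built only for this illustration (x1 is roughly 2 * x0), and the variable names are hypothetical. It is not a benchmark.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

# Synthetic, strongly correlated data (an assumption made for this illustration)
rng = np.random.RandomState(0)
n = 500
x0 = rng.randn(n)
x1 = 2.0 * x0 + 0.1 * rng.randn(n)
X_full = np.column_stack([x0, x1])

# Hide about 30% of the second column
X_miss = X_full.copy()
mask = rng.rand(n) < 0.3
X_miss[mask, 1] = np.nan

X_mean = SimpleImputer(strategy="mean").fit_transform(X_miss)
X_iter = IterativeImputer(random_state=0).fit_transform(X_miss)


def rmse(X_filled):
    """Root-mean-square error on the entries that were hidden."""
    return float(np.sqrt(np.mean((X_filled[mask, 1] - X_full[mask, 1]) ** 2)))


rmse_mean = rmse(X_mean)
rmse_iter = rmse(X_iter)
print("RMSE, mean imputation:     ", rmse_mean)
print("RMSE, iterative imputation:", rmse_iter)
```

Mean imputation ignores x0 entirely, whereas the iterative imputer predicts the hidden x1 values from x0, so its error on the hidden entries should be much smaller in this setup.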