sklearn IterativeImputer (MICE)

Multivariate iterative imputation of missing values

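Goal

Explain and demonstrate multivariate iterative imputation of missing values with scikit-learn's IterativeImputer, which models each feature that has missing entries as a function of the other features.

Minimal code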

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.RandomState(0)
X = rng.randn(200, 3)
X[rng.rand(*X.shape) < 0.1] = np.nan
imp = IterativeImputer(random_state=0, max_iter=10)
X_imp = imp.fit_transform(X)
X_imp[:2]

Usage

# check convergence: n_iter_ < max_iter means the stopping criterion was met early;
# imp.imputation_sequence_ lists the per-feature imputation steps in order
len(imp.imputation_sequence_), imp.n_iter_
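
The check above can be made a little more explicit. The following is a minimal sketch, reusing the same synthetic data as the minimal code, that compares n_iter_ with max_iter and looks inside one entry of imputation_sequence_. The names max_iter, stopped_early, steps_per_round and first are illustrative choices, not part of the original snippet.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Same synthetic data as the minimal example: ~10% of entries set to NaN.
rng = np.random.RandomState(0)
X = rng.randn(200, 3)
X[rng.rand(*X.shape) < 0.1] = np.nan

max_iter = 10
imp = IterativeImputer(random_state=0, max_iter=max_iter)
X_imp = imp.fit_transform(X)

# n_iter_ is the number of rounds actually run; fewer than max_iter means the
# tolerance-based stopping criterion was met before the iteration cap.
stopped_early = imp.n_iter_ < max_iter

# Each entry is a (feat_idx, neighbor_feat_idx, estimator) triplet, appended
# once per feature per round.
steps_per_round = len(imp.imputation_sequence_) // imp.n_iter_
first = imp.imputation_sequence_[0]
print(stopped_early, steps_per_round, first.feat_idx, first.neighbor_feat_idx)
```

If stopped_early is False, raising max_iter or loosening tol is a reasonable next step before trusting the imputed values.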

Useful variant(s)

# use a non-linear estimator such as RandomForestRegressor
from sklearn.ensemble import RandomForestRegressor
imp_rf = IterativeImputer(estimator=RandomForestRegressor(n_estimators=50, random_state=0), random_state=0)
Xm = imp_rf.fit_transform(X)
Xm[:1]
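
Another variant that may be useful: with its default settings, IterativeImputer returns a single imputation. A rough way to get several plausible imputations, closer in spirit to MICE, is to set sample_posterior=True and refit with different seeds. The sketch below assumes the default BayesianRidge estimator; the number of draws (5) and the names mask, draws and spread are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.RandomState(0)
X = rng.randn(200, 3)
mask = rng.rand(*X.shape) < 0.1  # remember which entries were hidden
X[mask] = np.nan

# sample_posterior=True draws each imputed value from the estimator's
# predictive distribution instead of using the point prediction.
draws = np.stack([
    IterativeImputer(sample_posterior=True, random_state=seed, max_iter=10).fit_transform(X)
    for seed in range(5)  # 5 draws is an arbitrary choice
])

# Observed entries are identical across draws; imputed entries vary.
spread = draws.std(axis=0)
print(draws.shape, spread[mask].mean())
```

The per-entry spread gives a crude indication of how uncertain each imputed value is; downstream analyses can be run on each draw and then pooled.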

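Notes

  • Enable the experimental API by importing enable_iterative_imputer before importing IterativeImputer.
  • Because each feature is predicted from the others, MICE can preserve relationships between variables that univariate imputation (for example, filling with the column mean) tends to weaken.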
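
To illustrate the last note, here is a small sketch on synthetic data where one column is nearly a linear function of another. It compares mean imputation (SimpleImputer) with IterativeImputer. The data-generating process, the 20% missing rate, and the helper rmse are assumptions made for the illustration only.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

rng = np.random.RandomState(0)
n = 500
x0 = rng.randn(n)
x1 = 2.0 * x0 + 0.1 * rng.randn(n)  # strongly related to x0
x2 = rng.randn(n)                   # unrelated column
X_true = np.column_stack([x0, x1, x2])

# Hide about 20% of the values in column 1 only.
X_missing = X_true.copy()
hidden = rng.rand(n) < 0.2
X_missing[hidden, 1] = np.nan

X_mean = SimpleImputer(strategy="mean").fit_transform(X_missing)
X_iter = IterativeImputer(random_state=0, max_iter=10).fit_transform(X_missing)

def rmse(X_filled):
    """Error on the hidden entries of column 1 (hypothetical helper)."""
    diff = X_filled[hidden, 1] - X_true[hidden, 1]
    return float(np.sqrt(np.mean(diff ** 2)))

rmse_mean, rmse_iter = rmse(X_mean), rmse(X_iter)
corr_mean = np.corrcoef(X_mean[:, 0], X_mean[:, 1])[0, 1]
corr_iter = np.corrcoef(X_iter[:, 0], X_iter[:, 1])[0, 1]
print(rmse_mean, rmse_iter, corr_mean, corr_iter)
```

On data like this, the iterative imputer should recover the hidden values far more closely and keep the correlation between the two columns near its original level, whereas mean imputation pulls it down.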