
sklearn: KMeans & MiniBatchKMeans

KMeans clustering and its MiniBatch variant for large datasets.

goal

Partition data into k clusters with KMeans, and switch to MiniBatchKMeans, which updates the centroids from small random batches, when the dataset is large.

minimal code

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# 200 synthetic 2-D points drawn around 3 centers.
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
# inertia_ is the within-cluster sum of squared distances, so it is never negative.
print(KMeans(n_clusters=3, n_init=10, random_state=0).fit(X).inertia_ >= 0.0)

usage

from sklearn.cluster import MiniBatchKMeans
# Reuses X from the minimal example; one center is learned per requested cluster.
print(MiniBatchKMeans(n_clusters=3, random_state=0).fit(X).cluster_centers_.shape[0] == 3)
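
When the data does not fit in memory at once, MiniBatchKMeans can also be trained incrementally with partial_fit. The sketch below is only an illustration: the chunking with np.array_split and the names X_big and mbk are assumptions made for the example, standing in for chunks that would normally be read from disk or a stream.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for a dataset that would normally arrive chunk by chunk.
X_big, _ = make_blobs(n_samples=3000, centers=3, random_state=0)

# n_init is set explicitly because its default changed across scikit-learn versions.
mbk = MiniBatchKMeans(n_clusters=3, n_init=3, random_state=0)

# Each partial_fit call updates the current centers using one chunk.
for chunk in np.array_split(X_big, 10):
    mbk.partial_fit(chunk)

# Assign every point to its nearest learned center.
labels = mbk.predict(X_big)
print(mbk.cluster_centers_.shape)  # (3, 2)
```

The result depends on the order in which chunks arrive, so shuffling the data before splitting it is usually a sensible precaution.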

useful variant(s)

from sklearn.metrics import silhouette_score
# Reuses X and KMeans from the minimal example; the silhouette score lies in [-1, 1].
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(silhouette_score(X, labels) <= 1.0)

notes

  • Try several values of k; standardize the features if their scales differ.
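
The note above can be sketched as follows: standardize the features, then compare the silhouette score over a few candidate values of k. This is a minimal illustration, not a prescribed procedure. The range 2 to 6, the artificial rescaling of one feature, and the names X_scaled, scores and best_k are assumptions chosen for the example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
X[:, 1] *= 100.0  # artificially put the second feature on a much larger scale

# Bring every feature to zero mean and unit variance before clustering.
X_scaled = StandardScaler().fit_transform(X)

# Score each candidate k; the silhouette score needs at least 2 clusters.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

# Keep the k with the highest silhouette score.
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The silhouette score is only one possible criterion; plotting inertia_ against k (the "elbow" heuristic) is a common alternative.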