sklearn: KMeans + silhouette

goal

Choose the number of clusters k by comparing silhouette scores.

minimal code

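from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Toy data: 200 points drawn around 3 centers.
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
k = 3
# n_init="auto" requires scikit-learn >= 1.2.
labels = KMeans(n_clusters=k, n_init="auto", random_state=0).fit_predict(X)
# The silhouette score lies in [-1, 1]; higher means better-separated clusters.
score = silhouette_score(X, labels)
print(round(score, 3) <= 1.0)  # sanity check: prints True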

usage

from sklearn.cluster import MiniBatchKMeans
# MiniBatchKMeans follows the same estimator interface as KMeans (fit, fit_predict).
print(hasattr(MiniBatchKMeans(n_clusters=2), "fit"))

useful variant(s)

from sklearn.metrics import silhouette_samples
# One silhouette value per sample (reuses X and labels from the minimal code).
print(len(silhouette_samples(X, labels)) == X.shape[0])

notes

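  • Avoid silhouette for k=1: silhouette_score needs between 2 and n_samples - 1 distinct labels and raises ValueError otherwise. Compute it for several values of k and compare.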
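
The comparison across several k could be sketched as follows. This is a minimal illustration, not a definitive recipe: the candidate range 2..6, the names scores and best_k, and the choice of n_init=10 (used here so the sketch also runs on scikit-learn versions older than 1.2) are assumptions, not part of the original snippet.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Same toy data as in the minimal code.
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

# Candidate values start at 2 because silhouette is undefined for k=1.
scores = {}
for k in range(2, 7):  # assumed candidate range
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Keep the k with the highest mean silhouette.
best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(k, round(s, 3))
print("best k:", best_k)
```

The highest score is only a heuristic; when two values of k score closely, inspecting silhouette_samples per cluster may help decide between them.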