goal
Choose the number of clusters k using the silhouette score.
minimal code
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.datasets import make_blobs
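# Synthetic 2-D data drawn from 3 Gaussian blobs
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
k = 3
labels = KMeans(n_clusters=k, n_init="auto", random_state=0).fit_predict(X)
# Sanity check: the silhouette score always lies in [-1, 1]
print(round(silhouette_score(X, labels), 3) <= 1.0)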
usage
from sklearn.cluster import MiniBatchKMeans
# MiniBatchKMeans exposes the same fit interface as KMeans
print(hasattr(MiniBatchKMeans(n_clusters=2), "fit"))
useful variant(s)
from sklearn.metrics import silhouette_samples
# silhouette_samples returns one coefficient per sample
print(len(silhouette_samples(X, labels)) == X.shape[0])
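The per-sample values can also be averaged within each cluster to see which clusters are well separated. This is a minimal sketch, not part of the original snippet. It rebuilds the same data so that it runs on its own; the name `per_cluster` and the choice of `n_init=10` are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Same synthetic data and k as in the minimal code.
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# One silhouette coefficient per sample.
sample_scores = silhouette_samples(X, labels)

# Mean coefficient per cluster (hypothetical helper dict).
per_cluster = {
    int(c): float(sample_scores[labels == c].mean())
    for c in np.unique(labels)
}
for cluster_id, mean_score in per_cluster.items():
    print(cluster_id, round(mean_score, 3))

# The overall silhouette score is the mean of the per-sample values.
overall = silhouette_score(X, labels)
print(abs(sample_scores.mean() - overall) < 1e-9)
```

A cluster whose mean is much lower than the others is a hint that it overlaps a neighbour or that k is poorly chosen.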
notes
- Do not use the silhouette score for k=1 (it is undefined with a single cluster); compare several values of k.
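The comparison across several values of k could be sketched as follows. This is an illustrative loop under stated assumptions, not a definitive selection procedure: the candidate range 2 to 6, the use of `n_init=10`, and the names `candidate_ks`, `scores` and `best_k` are all choices made for this example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Same synthetic data as in the minimal code.
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

# The silhouette score needs at least 2 clusters, so start at k=2.
candidate_ks = range(2, 7)  # assumed range, adjust to the data

scores = {}
for k in candidate_ks:
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
    print(k, round(scores[k], 3))

# Keep the k with the highest mean silhouette coefficient.
best_k = max(scores, key=scores.get)
print("best k:", best_k)
```

The highest score is a heuristic only. When two values of k score almost the same, inspecting the per-sample coefficients or using domain knowledge may be a better tie-breaker.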