objective
Choose a decision threshold that optimizes a chosen metric (e.g. F1 or recall).
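minimal code
from sklearn.metrics import f1_score
import numpy as np
# Predicted probabilities for the positive class, and the true labels.
proba = np.array([0.1, 0.4, 0.8, 0.6])
ytrue = np.array([0, 0, 1, 1])
# Candidate thresholds: 0.0, 0.1, ..., 1.0.
ths = np.linspace(0, 1, 11)
# Score every threshold and keep the (threshold, F1) pair with the highest F1.
# zero_division=0 avoids a warning when a threshold yields no positive predictions.
best = max(((t, f1_score(ytrue, proba >= t, zero_division=0)) for t in ths), key=lambda x: x[1])
print(0.0 <= best[0] <= 1.0)  # sanity check: the chosen threshold lies in [0, 1]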
usage
from sklearn.metrics import recall_score
import numpy as np
proba = np.array([0.2, 0.9, 0.4])
ytrue = np.array([0, 1, 0])
# Best recall over a few candidate thresholds; recall can never exceed 1.
print(max(recall_score(ytrue, proba >= t) for t in [0.3, 0.5, 0.7]) <= 1.0)
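When recall is the target metric, maximizing it alone is not very informative, because a threshold of 0 always gives recall 1. A common alternative is to keep the highest threshold that still meets a recall target. The following is a minimal sketch of that idea; the helper name `highest_threshold_with_recall`, its `target_recall` parameter, and the toy data are illustrative assumptions, not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import recall_score

def highest_threshold_with_recall(ytrue, proba, target_recall, thresholds):
    """Hypothetical helper: largest threshold whose recall >= target_recall, else None."""
    ok = [
        t for t in thresholds
        if recall_score(ytrue, (proba >= t).astype(int)) >= target_recall
    ]
    return max(ok) if ok else None

# Toy data chosen for illustration only.
proba = np.array([0.2, 0.9, 0.4, 0.7, 0.35])
ytrue = np.array([0, 1, 0, 1, 1])

chosen = highest_threshold_with_recall(
    ytrue, proba, target_recall=1.0, thresholds=[0.3, 0.5, 0.7]
)
print(chosen)
```

Raising the threshold as far as the recall constraint allows tends to cut false positives while keeping the required share of true positives.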
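useful variant(s)
from sklearn.metrics import precision_recall_curve
import numpy as np
# Same toy data as in the minimal code.
proba = np.array([0.1, 0.4, 0.8, 0.6])
ytrue = np.array([0, 0, 1, 1])
# A single call returns precision and recall for every score used as a threshold.
prec, rec, th = precision_recall_curve(ytrue, proba)
# prec and rec end with one extra point (precision 1, recall 0) that has no threshold.
print(len(th) == len(rec) - 1)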
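The arrays returned by `precision_recall_curve` can also be used to pick the threshold directly, without looping over a fixed grid. The sketch below computes F1 at each returned threshold and keeps the best one; it reuses the toy data from the minimal code, and the variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

proba = np.array([0.1, 0.4, 0.8, 0.6])
ytrue = np.array([0, 0, 1, 1])

prec, rec, th = precision_recall_curve(ytrue, proba)

# Drop the final (precision=1, recall=0) point, which has no matching threshold.
p, r = prec[:-1], rec[:-1]

# F1 = 2PR / (P + R), set to 0 wherever P + R == 0.
denom = p + r
f1 = np.divide(2 * p * r, denom, out=np.zeros_like(denom), where=denom > 0)

best_idx = int(np.argmax(f1))
best_threshold = th[best_idx]
best_f1 = f1[best_idx]
print(best_threshold, best_f1)
```

Compared with a fixed grid, this evaluates only thresholds that actually change the predictions, since the candidates are the observed scores themselves.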
notes
- Use a separate validation set to choose the threshold.
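The note above can be sketched as a three-way split: fit on the training set, choose the threshold on the validation set, then report once on the test set. Everything here is an assumption made for illustration: the synthetic data from `make_classification`, the `LogisticRegression` model, the 60/20/20 split, and the threshold grid.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic data, used here only for illustration.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)

# Three-way split: train (60%), validation (20%), test (20%).
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Choose the threshold on the validation set only.
val_proba = model.predict_proba(X_val)[:, 1]
ths = np.linspace(0.05, 0.95, 19)
val_f1 = [
    f1_score(y_val, (val_proba >= t).astype(int), zero_division=0) for t in ths
]
best_t = float(ths[int(np.argmax(val_f1))])

# Evaluate once on the held-out test set with the threshold frozen.
test_proba = model.predict_proba(X_test)[:, 1]
test_f1 = f1_score(y_test, (test_proba >= best_t).astype(int), zero_division=0)
print(f"threshold={best_t:.2f}  test F1={test_f1:.3f}")
```

Tuning the threshold on the same data used for the final score would make that score optimistic, which is why the test set is touched only after `best_t` is fixed.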