ROC curve with sklearn [python]

August 3, 2019

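I have trouble understanding how to use the ROC functions.

I want to plot a ROC curve in Python with
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html

I am writing a program that evaluates detectors (haarcascade, neural networks).
My data is saved in a file in the following format: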

 0.5 TP
 0.43 FP
 0.72 FN
 0.82 TN 
 ...

TP means true positive, FP false positive, FN false negative, and TN true negative.

I parse it and fill 4 arrays with this data set.

Then I want to pass it to

   fpr, tpr = sklearn.metrics.roc_curve(y_true, y_score, average='macro', sample_weight=None)

But how do I do that? What are y_true and y_score in my case?
Afterwards, I pass fpr and tpr to

auc = sklearn.metrics.auc(fpr, tpr)

Best answer

Quoting Wikipedia:

The ROC is created by plotting the FPR (false positive rate) vs the TPR (true positive rate) at various threshold settings.

To compute the FPR and TPR, you must provide the true binary labels and the target scores to the function sklearn.metrics.roc_curve.
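For the file format in the question, one way to obtain the two arrays is to read each outcome tag as a statement about the ground truth: TP and FN samples are actually positive, FP and TN samples are actually negative. The following is only a sketch. It assumes the number on each line is the detector's confidence for the positive class, and `parse_detections` and `TRUE_LABEL` are hypothetical names, not part of sklearn.

```python
# Ground-truth label implied by each outcome tag (assumption:
# TP and FN are real positives, FP and TN are real negatives).
TRUE_LABEL = {"TP": 1, "FN": 1, "FP": 0, "TN": 0}


def parse_detections(lines):
    """Turn lines like '0.5 TP' into (y_true, y_score) lists."""
    y_true, y_score = [], []
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or parts[1] not in TRUE_LABEL:
            continue  # skip blank, truncated, or unrecognized lines
        score, tag = parts
        y_true.append(TRUE_LABEL[tag])
        y_score.append(float(score))
    return y_true, y_score


# The four sample lines from the question.
sample = [" 0.5 TP", " 0.43 FP", " 0.72 FN", " 0.82 TN "]
y_true, y_score = parse_detections(sample)
```

The resulting `y_true` and `y_score` can then be passed to `roc_curve`. If the stored number is something other than a positive-class score (for example a confidence in whichever class was predicted), it would need to be converted first.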

So in your case, I would do something like this:

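import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve
from sklearn.metrics import auc

# Compute fpr, tpr, thresholds and roc auc
# y_true: ground-truth binary labels, y_score: detector scores
fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)  # auc takes the x and y coordinates of the curve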

# Plot ROC curve
plt.plot(fpr, tpr, label='ROC curve (area = %0.3f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--')  # random predictions curve
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.xlabel('False Positive Rate or (1 - Specificity)')
plt.ylabel('True Positive Rate or (Sensitivity)')
plt.title('Receiver Operating Characteristic')
plt.legend(loc="lower right")
plt.show()

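If you want to understand in more depth how the false positive rate and true positive rate are computed for all possible thresholds, I suggest reading this article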
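As a rough illustration of that threshold sweep, here is a pure-Python sketch that uses every distinct score as a threshold, counts true and false positives at each one, and integrates the curve with the trapezoidal rule. It is not sklearn's implementation. `roc_points` and `trapezoid_auc` are hypothetical helper names, the toy labels and scores are made up for illustration, and the code assumes both classes are present in `y_true`.

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr) lists with one point per distinct threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Start at (0, 0): a threshold above every score predicts no positives.
    fpr, tpr = [0.0], [0.0]
    for thr in sorted(set(y_score), reverse=True):
        # A sample is predicted positive when its score is >= the threshold.
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 0)
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear curve through the (fpr, tpr) points."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Toy data for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(y_true, y_score)
roc_auc = trapezoid_auc(fpr, tpr)
print(fpr, tpr, roc_auc)
```

sklearn's `roc_curve` may return fewer points than this sketch, because by default (`drop_intermediate=True`) it drops some thresholds that do not change the shape of the curve. The area under the curve is unaffected.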