Model Evaluation and Validation

You can find this article and source code at my GitHub

Testing

The two types of problems

[Figure: the two problem types]

Think about a simple case… How well is my model doing with a regression problem?

[Figure: two candidate models fitted to the same data points, left and right]

The line in the right graph fits the original data points more closely. But if we add one new data point for testing, the left one works better, because it generalizes better.

How do we measure generalization?

For a regression problem…

[Figure: training and testing points for a regression problem]

For a classification problem…

[Figure: training and testing points for a classification problem, two models side by side]

Notice that both models fit the training set well, but once we introduce the testing set, the model on the left makes fewer mistakes than the model on the right.

Splitting the data into a training set and a testing set is easy with the Python package scikit-learn (imported as “sklearn”).

from sklearn.model_selection import train_test_split
# 25% of the samples go to the test set.
# Return order: both X splits first, then both y splits.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

A golden rule is…

Never use your testing data for training.
That is, never let your model see the testing data before evaluation. Your model should not learn anything from it.
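To make the rule concrete, here is a minimal sketch of the whole workflow. The synthetic data (a noisy line) and the `random_state` value are assumptions made for illustration only; they are not from the original article.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): y = 3x + 2 plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 1, size=100)

# Hold out 25% of the samples; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# The model is fitted on the training split only.
model = LinearRegression()
model.fit(X_train, y_train)

# The test split is touched once, for evaluation (R^2 on unseen data).
test_score = model.score(X_test, y_test)
print(X_train.shape, X_test.shape, round(test_score, 3))
```

The test split appears in exactly one place, the final scoring call, which is the point of the rule.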

Evaluation

For classification problems, there is a tool called the “confusion matrix”.

[Figures: confusion matrix]

You can fill in the blanks yourself to check whether you understand this metric correctly.

[Figure: confusion matrix exercise with blanks to fill in]

The answers are 6, 1, 2 and 5 for True Positives, False Negatives, False Positives, and True Negatives, respectively.
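The same four counts can be obtained in code. The sketch below uses made-up label lists, constructed only so that they reproduce the counts 6, 1, 2 and 5; they are not the points from the exercise figure. It relies on sklearn's layout for binary labels, which is [[TN, FP], [FN, TP]].

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels (1 = positive, 0 = negative), built to match the counts:
# 7 actual positives (6 predicted positive, 1 predicted negative),
# 7 actual negatives (2 predicted positive, 5 predicted negative).
y_true = [1] * 7 + [0] * 7
y_predict = [1] * 6 + [0] * 1 + [1] * 2 + [0] * 5

# For labels {0, 1}, the matrix is [[TN, FP], [FN, TP]], so ravel() yields
# the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_predict).ravel()
print(tp, fn, fp, tn)
```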

Accuracy

We have a very basic method to calculate the accuracy…

[Figure: accuracy formula]

Again, “sklearn” can do this in a couple of lines of code.

from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_predict)
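As a quick check, accuracy is the number of correct predictions divided by the total number of predictions. The sketch below computes it by hand from the confusion matrix counts given earlier (6, 1, 2, 5) and compares it with `accuracy_score`. The label lists are hypothetical, constructed only to match those counts.

```python
from sklearn.metrics import accuracy_score

# Counts from the confusion matrix exercise above.
tp, fn, fp, tn = 6, 1, 2, 5

# Accuracy = correct predictions / all predictions = (TP + TN) / total.
manual_accuracy = (tp + tn) / (tp + fn + fp + tn)  # 11 / 14

# Hypothetical labels that reproduce the same counts.
y_true = [1] * 7 + [0] * 7
y_predict = [1] * 6 + [0] * 1 + [1] * 2 + [0] * 5
sklearn_accuracy = accuracy_score(y_true, y_predict)

print(manual_accuracy, sklearn_accuracy)
```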

Regression metrics

[Figure: mean absolute error]

from sklearn.metrics import mean_absolute_error
from sklearn.linear_model import LinearRegression

# Fit a linear regression on the training set only.
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Predict on the held-out test set and average the absolute errors.
guesses = regressor.predict(X_test)
error = mean_absolute_error(y_test, guesses)

One problem with the mean absolute error (MAE) is that the absolute value is not differentiable where the error is zero, so it is awkward to use with common methods we will rely on later, such as gradient descent.
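The differentiability issue can be seen numerically. The sketch below is an illustration, not part of the original article, and the helper `one_sided_slopes` is a hypothetical name. It estimates the slope of a loss at an error of exactly zero, once from the left and once from the right.

```python
def one_sided_slopes(loss, h=1e-6):
    """Return (left, right) finite-difference slopes of loss at error = 0."""
    left = (loss(-h) - loss(0.0)) / (-h)
    right = (loss(h) - loss(0.0)) / h
    return left, right

# Absolute error: the two slopes disagree (-1 vs +1), so there is no
# single derivative at zero.
abs_left, abs_right = one_sided_slopes(abs)

# Squared error: both slopes shrink towards 0 as h shrinks, so the
# derivative at zero exists (and equals 0).
sq_left, sq_right = one_sided_slopes(lambda e: e ** 2)

print(abs_left, abs_right)
print(sq_left, sq_right)
```

The kink at zero is the only trouble spot for the absolute error; everywhere else its slope is simply -1 or +1.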

An alternative method is the mean squared error (MSE).

[Figure: mean squared error]

from sklearn.metrics import mean_squared_error
from sklearn.linear_model import LinearRegression

# Fit a linear regression on the training set only.
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Predict on the held-out test set and average the squared errors.
guesses = regressor.predict(X_test)
error = mean_squared_error(y_test, guesses)
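To see what the two metrics compute, here is a small sketch that evaluates MAE and MSE by hand with NumPy and compares the results with sklearn. The four target values and predictions are made up for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Made-up targets and predictions (assumptions for illustration).
y_test = np.array([3.0, -0.5, 2.0, 7.0])
guesses = np.array([2.5, 0.0, 2.0, 8.0])

errors = y_test - guesses                 # [0.5, -0.5, 0.0, -1.0]
manual_mae = np.mean(np.abs(errors))      # (0.5 + 0.5 + 0 + 1) / 4 = 0.5
manual_mse = np.mean(errors ** 2)         # (0.25 + 0.25 + 0 + 1) / 4 = 0.375

print(manual_mae, mean_absolute_error(y_test, guesses))
print(manual_mse, mean_squared_error(y_test, guesses))
```

Squaring makes the single error of size 1 account for most of the MSE, which is why MSE penalizes large errors more heavily than MAE does.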

Another common metric we use here is the R2 score.

The formula is shown below; the errors in the two figures are calculated with the MSE formula.

[Figure: R2 score formula, with two figures showing the errors]

from sklearn.metrics import r2_score

y_true = [1, 2, 3]
y_pred = [3, 2, 3]

r2_score(y_true, y_pred)
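The R2 score compares the model's error with the error of a baseline that always predicts the mean of the true values: R2 = 1 - (model error) / (baseline error). A minimal sketch, using the same `y_true` and `y_pred` as above and computing both errors with MSE:

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1, 2, 3])
y_pred = np.array([3, 2, 3])

# Error of the model: mean of squared differences to the predictions.
model_mse = np.mean((y_true - y_pred) ** 2)            # (4 + 0 + 0) / 3

# Error of the baseline that always predicts the mean of y_true (= 2).
baseline_mse = np.mean((y_true - y_true.mean()) ** 2)  # (1 + 0 + 1) / 3

manual_r2 = 1 - model_mse / baseline_mse               # 1 - 2 = -1
print(manual_r2, r2_score(y_true, y_pred))
```

A score of 1 means a perfect fit, 0 means no better than the mean baseline, and a negative score, as here, means the model does worse than the baseline.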

Types of Errors

Error due to bias (underfitting)

[Figure: underfitting]

Error due to variance (overfitting)

[Figure: overfitting]

There is the trade-off…

[Figure: the trade-off between bias and variance]

Model Complexity Graph

[Figure: model complexity graph]
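A model complexity graph plots the training error and the testing error against model complexity. The sketch below produces the numbers behind such a graph under assumptions that are not in the original article: synthetic noisy sine data, polynomial degree as the measure of complexity, and a simple alternating train/test split.

```python
import numpy as np

# Synthetic data (an assumption for illustration): a noisy sine curve.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-1, 1, 40))
y = np.sin(3 * x) + rng.normal(0, 0.2, size=40)

# Alternate the sorted points between the training and the testing set.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def mse(y_true, y_pred):
    """Mean squared error between two arrays."""
    return float(np.mean((y_true - y_pred) ** 2))

degrees = [1, 3, 9]  # low, medium and high model complexity
train_errors, test_errors = [], []
for degree in degrees:
    # Least-squares polynomial fit on the training points only.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_errors.append(mse(y_train, np.polyval(coeffs, x_train)))
    test_errors.append(mse(y_test, np.polyval(coeffs, x_test)))
    print(degree, round(train_errors[-1], 4), round(test_errors[-1], 4))
```

The training error can only go down as the degree grows, because each higher-degree polynomial contains the lower-degree ones. The testing error typically stops improving at some point, and that gap is what the graph is meant to reveal.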

K-Fold Cross Validation

This is a very useful way to recycle our data…

[Figure: K-fold cross validation, with the data split into 4 folds]

With this algorithm, for example in the above graph, we train our model 4 times, each time holding out a different fold for testing and training on the rest. We then average the 4 results to get the final score for the model.

“sklearn” is awesome!

import numpy as np
from sklearn.model_selection import KFold

X = np.arange(12)        # 12 samples
kf = KFold(n_splits=3)   # 3 folds of 4 samples each
for train_idx, test_idx in kf.split(X):
    print(train_idx, test_idx)
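The index arrays become useful once they are used to train and score a model on each split. Here is a minimal sketch of the "train on each split, then average" procedure. The synthetic data, the linear model, and the choice of MSE as the score are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic data (an assumption for illustration): 12 noisy points on a line.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(12, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 1, size=12)

kf = KFold(n_splits=4)  # 4 folds, as in the graph above
fold_errors = []
for train_idx, test_idx in kf.split(X):
    # Train on 3 folds, test on the remaining one.
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    predictions = model.predict(X[test_idx])
    fold_errors.append(mean_squared_error(y[test_idx], predictions))

# The cross-validated score is the average over the 4 folds.
mean_error = float(np.mean(fold_errors))
print(fold_errors, mean_error)
```

Every sample is used for testing exactly once and for training three times, which is the sense in which the data is recycled.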

If we want to reduce possible bias from the order of the data, we can also shuffle the samples before they are split into folds.

[Figure: K-fold cross validation with randomized selection]

“sklearn” is awesome AGAIN!

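import numpy as np
from sklearn.model_selection import KFold

X = np.arange(12)                      # 12 samples
kf = KFold(n_splits=3, shuffle=True)   # shuffle before splitting into folds
for train_idx, test_idx in kf.split(X):
    print(train_idx, test_idx)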
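Shuffling changes which samples land in which fold, but each sample still appears in exactly one test fold. The sketch below checks this. The `random_state` value is an assumption, added only so the shuffle is repeatable between runs.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(12).reshape(-1, 1)  # 12 samples, one feature each

# random_state fixes the shuffle so the folds are the same on every run.
kf = KFold(n_splits=3, shuffle=True, random_state=0)
test_folds = [test_idx for _, test_idx in kf.split(X)]

# Put all test indices together: every sample should appear exactly once.
all_test = np.sort(np.concatenate(test_folds))
print(test_folds)
print(all_test)
```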

Thanks for reading. If you find any mistakes or typos in this blog, please don’t hesitate to let me know. You can reach me by email: jyang7[at]ualberta.ca

    Original author: Kulbear
    Original article: https://www.jianshu.com/p/42a4e6986ef5