05. Logistic Regression

  • Classification is the task of predicting a categorical variable.

  • Logistic regression is one of the basic ways to perform classification.

  • Logistic regression allows us to solve classification problems, where we are trying to predict discrete categories.

  • By convention, binary classification labels the two classes 0 and 1.

logistic

lr-logistic

The sigmoid (logistic) function maps any real-valued input to an output between 0 and 1, which can be read as the probability of class 1.
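A minimal sketch of the sigmoid function, using only the Python standard library. The names `sigmoid` and `predict_class` and the 0.5 cutoff are illustrative assumptions, not something these notes prescribe.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + e^(-z)), branched to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_class(z: float, threshold: float = 0.5) -> int:
    """Turn the sigmoid output into a 0/1 label (the 0.5 cutoff is an assumed default)."""
    return 1 if sigmoid(z) >= threshold else 0

for z in (-5.0, 0.0, 5.0):
    print(z, round(sigmoid(z), 4), predict_class(z))
```

An input of 0 gives exactly 0.5; large negative inputs approach 0 and large positive inputs approach 1.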

Evaluation

We train a logistic regression model on training data and we evaluate the model's performance on testing data.
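The train-then-evaluate workflow can be sketched as follows. This is a from-scratch illustration on synthetic one-feature data, not the method used in the linked examples; the function `fit_logistic`, the learning rate, the epoch count, and the 70/30 split are all assumptions chosen for the demo. In practice a library implementation would normally be used instead.

```python
import math
import random

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Synthetic data (assumed for illustration): label is 1 when the feature is positive.
rng = random.Random(0)
xs = [rng.uniform(-3.0, 3.0) for _ in range(200)]
ys = [1 if x > 0 else 0 for x in xs]

# Hold out 30% of the data; the model never sees it during training.
split = int(0.7 * len(xs))
x_train, x_test = xs[:split], xs[split:]
y_train, y_test = ys[:split], ys[split:]

w, b = fit_logistic(x_train, y_train)
preds = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in x_test]
test_accuracy = sum(p == y for p, y in zip(preds, y_test)) / len(y_test)
print("test accuracy:", test_accuracy)
```

Measuring accuracy only on the held-out portion gives an estimate of how the model behaves on data it has not seen.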

We can use a confusion matrix to evaluate classification models.

confusion-matrix

What constitutes a 'good' metric really depends on the situation.
In some cases accuracy matters most; in others, precision and recall are more important.

Accuracy = (TP + TN) / Total
Accuracy tells how often the model is correct.

Misclassification Rate (Error Rate) = (FP + FN) / Total
Error rate tells how often the model is wrong.

Type I Error -- False Positive
Type II Error -- False Negative
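The metrics above can be computed directly from the four confusion-matrix counts. The sketch below uses a made-up set of ten labels and predictions; `confusion_counts` is a hypothetical helper. Precision and recall are not defined in these notes, so the standard definitions are assumed: precision = TP / (TP + FP) and recall = TP / (TP + FN).

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1  # Type I error
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1  # Type II error
    return tp, fp, tn, fn

# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
total = tp + fp + tn + fn

accuracy = (tp + tn) / total
error_rate = (fp + fn) / total
precision = tp / (tp + fp)  # of the predicted positives, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found

print(tp, fp, tn, fn)                            # 4 1 4 1
print(accuracy, error_rate, precision, recall)   # 0.8 0.2 0.8 0.8
```

Accuracy and error rate always sum to 1, since every prediction is either correct or wrong.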

  • Binary classification has some metrics of its own.
  • These include visualizations of metrics from the confusion matrix.
  • The Receiver Operating Characteristic (ROC) curve was developed during World War II to help analyze radar data.
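An ROC curve plots the true positive rate against the false positive rate as the decision threshold is varied. A minimal sketch of how its points could be computed, assuming both classes are present in the labels; `roc_points` and `auc` are hypothetical helper names and the four scores are made-up values.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to loosest.

    Assumes y_true contains at least one 0 and at least one 1.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
print(points)       # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc(points))  # 0.75
```

An area of 1.0 would mean the scores rank every positive above every negative, while 0.5 corresponds to random ranking.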

roc-1

roc-2

Logistic Regression example1

Logistic Regression example2

Logistic Regression Project