Linear vs Logistic Regression
April 07, 2024
Overview
Linear and Logistic Regression are mathematical techniques used to fit a line of best fit to a set of data points and to perform binary classification, respectively. They appear in many fields; this blog post focuses on their use in Machine Learning.
Both techniques use training data to iteratively find the line of best fit via gradient descent.
These techniques, fitting a line of best fit to continuous data (linear regression) and binary classification (logistic regression), are provided by libraries like TensorFlow. However, it's important to understand how these libraries work under the hood. A great way to gain a solid, concise understanding is this Coursera course offered by Stanford: Supervised Machine Learning: Regression and Classification.
Table of Contents
- Gradient Descent
- Parameters and Features
- Linear Regression
- Logistic Regression
- Over- and Under-Fitting
- Conclusion
Gradient Descent
Gradient descent is a technique for finding the set of parameter values (weights) that minimizes a model's cost or loss.
- Linear Regression: The cost is typically the mean squared error: the average of the squared vertical distances between the line of best fit and each training example.
- Logistic Regression: The loss is the log loss (binary cross-entropy), which measures how far each predicted probability is from the example's true binary class and heavily penalizes confident wrong predictions. Both cost functions are sketched in code after this list.
In either case, the fitted line can be curved: adding polynomial features (such as x² or x³) adds more curvature.
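To make the two cost functions concrete, here is a minimal NumPy sketch. The function names mse_cost and log_loss are my own, not from any particular library:

```python
import numpy as np

def mse_cost(y_pred, y_true):
    """Mean squared error (linear regression): the average squared distance
    between predictions and labels. The 1/2 factor is a common convention
    that simplifies the derivative."""
    return np.mean((y_pred - y_true) ** 2) / 2

def log_loss(y_pred, y_true):
    """Log loss / binary cross-entropy (logistic regression): y_pred are
    predicted probabilities in (0, 1), y_true are the 0/1 class labels.
    Confident wrong predictions are penalized heavily."""
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```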
Gradient descent is a process that is run computationally (in code, using a library like TensorFlow, or by hand, as sketched below) and iteratively walks downhill on the cost surface toward a local minimum (which is hopefully also the global minimum). With two parameters, this surface can be visualized as a 3-dimensional graph. In Linear Regression with mean squared error, the surface is shaped like a bowl (it is convex), so there is only a single local minimum, and it is also the global minimum. In Logistic Regression, applying squared error to the S-shaped sigmoid would produce a bumpy, non-convex surface with many local minima, which is why the log loss, which restores convexity, is used instead.
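Here is a minimal, hand-rolled sketch of gradient descent for a one-feature linear model, written to show the mechanics rather than to compete with a real library. The data points and hyperparameter values are made up for illustration:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, steps=2000):
    """Minimal gradient descent for a one-feature linear model f(x) = w*x + b.
    Each step moves w and b a little way downhill along the gradient of the
    mean squared error cost. alpha is the learning rate (step size)."""
    w, b = 0.0, 0.0
    m = len(x)
    for _ in range(steps):
        error = (w * x + b) - y       # prediction error for every example
        w -= alpha * (error @ x) / m  # partial derivative of the cost w.r.t. w
        b -= alpha * error.mean()     # partial derivative of the cost w.r.t. b
    return w, b

# Usage: recover an approximately y = 2x + 1 relationship from noisy points.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
w, b = gradient_descent(x, y)
print(w, b)  # roughly 2.0 and 1.0
```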
Parameters and Features
Features are the input variables that describe each training example, and each feature appears as its own term in the equation of the model. Parameters (weights) are the numbers the model learns, one per feature, plus a bias term. For example, to describe the growth in house prices you might use lot size, age, number of bedrooms, and zip code as features, and the model learns a weight for each.
Learn more: Logistic regression: Many explanatory variables.
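As a toy illustration of one example's features and their learned weights (the numbers below are made up, not the result of any training run):

```python
import numpy as np

# One training example described by three features (values are made up):
# [lot size in square meters, age in years, number of bedrooms]
features = np.array([450.0, 12.0, 3.0])

# The model learns one parameter (weight) per feature, plus a bias term b.
# These weights are purely illustrative.
w = np.array([120.0, -500.0, 8000.0])
b = 50_000.0

price_estimate = np.dot(w, features) + b  # f(x) = w · x + b
print(price_estimate)  # 122000.0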
Linear Regression
In Linear Regression, we want to find a line of best fit through a set of data points (most commonly in a 2D space, but the technique extends to more dimensions). The purpose of this is to estimate the value of one variable (the target) from the value of the other variable (or variables).
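For example, here is a minimal sketch using scikit-learn (one library choice among many; the post itself mentions TensorFlow, and the data points below are made up):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Five (x, y) training points with a roughly linear relationship.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

model = LinearRegression().fit(X, y)
print(model.coef_[0], model.intercept_)  # slope and intercept of the fitted line
print(model.predict([[5.0]]))            # estimated y for a new x
```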
Logistic Regression
In Logistic Regression, we want to find a decision boundary (which can be a curved line) that separates two clusters of data, so that new points can be assigned to one cluster or the other. This is called binary classification. Logistic Regression can also be extended to more than two clusters of data through multinomial logistic regression.
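A minimal binary-classification sketch, again using scikit-learn, with made-up one-dimensional clusters:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Two made-up clusters on a number line: small x labeled 0, large x labeled 1.
X = np.array([[0.5], [1.0], [1.5], [3.5], [4.0], [4.5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict([[1.2], [3.8]]))  # [0 1]: each point assigned to a cluster
print(clf.predict_proba([[2.5]]))   # class probabilities near the boundary
```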
Over and Under-Fitting
Over- and under-fitting occur when the equation of the line of best fit does not match the training data well, typically because it uses too many or too few parameters (features). This can occur for two reasons:
Under-Fitting
Under-fitting is when too few parameters are used in the equation for the line of best fit. This results in a line that is too flat or straight to follow the trend in the training data.
Over-Fitting
Over-fitting is when too many parameters are used in the equation for the line of best fit. This results in a line that matches the training data very closely; however, it does not accurately predict new data.
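A classic way to see both failure modes is to fit polynomials of increasing degree to noisy data and compare the error on the training points with the error on held-out points. A minimal sketch (the sine-wave data is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 8)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, size=8)

# Held-out points from the same underlying curve, without noise.
x_test = np.linspace(0, 1, 50)
y_test = np.sin(2 * np.pi * x_test)

for degree in (1, 3, 7):  # under-fit, reasonable fit, over-fit
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, round(train_err, 4), round(test_err, 4))

# Degree 1 misses the curve (high error everywhere); degree 7 hits the eight
# training points almost exactly but typically does far worse on the
# held-out points.
```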
Conclusion
Linear and Logistic Regression are among the most fundamental tools used in Machine Learning for regression and classification. This blog post is intended to provide a concise overview.
This post will be updated with diagrams and equations (and a breakdown of those equations); it is currently intended as supplementary content to academic material that already contains them. It will also include a step-by-step walk-through of finding the line of best fit.
It will additionally be updated with nuances, for example, how these techniques relate to classification and clustering.