202412091014
Status: #idea
Tags: Classification, Discriminative Models
State: #nascent
Logistic Regression
One of the big two Linear Methods for Classification, the other being Linear Discriminant Analysis (LDA), and also just a classic classification method in its own right.
It arises from the desire to fit a model that can predict the posterior probability $P(G = k \mid X = x)$ given some vector $x$, while respecting two constraints:
- The probabilities of the classes sum to $1$.
- Every individual probability is between $0$ and $1$ inclusive.
For two classes and a single feature $x$ it takes the form:
$$P(G = 1 \mid X = x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}, \qquad P(G = 2 \mid X = x) = \frac{1}{1 + e^{\beta_0 + \beta_1 x}}$$
where, as we can see, we are passing the linear term $\beta_0 + \beta_1 x$ through the logistic (sigmoid) function $\sigma(z) = \frac{e^z}{1 + e^z}$.
This has two main effects (illustrated by the short sketch after this list):
- No matter how small or even negative the linear part $\beta_0 + \beta_1 x$ is, the output will be strictly positive (the more negative it gets, the closer the output is to $0$).
- As $\beta_0 + \beta_1 x$ gets very big, the output approaches $1$, as we'd expect.
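A minimal sketch of this squashing behaviour, assuming hypothetical coefficients $\beta_0 = -1$ and $\beta_1 = 2$ chosen purely for illustration:

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real z into the open interval (0, 1).
    return np.exp(z) / (1.0 + np.exp(z))

# Hypothetical coefficients, chosen only to illustrate the squashing effect.
beta0, beta1 = -1.0, 2.0

for x in [-10.0, -1.0, 0.0, 1.0, 10.0]:
    z = beta0 + beta1 * x          # the linear part, any real number
    p = sigmoid(z)                 # P(G=1 | X=x) under the model
    print(f"x={x:6.1f}  linear={z:7.1f}  P(G=1|x)={p:.6f}")
```

The printout shows the output pinned strictly between $0$ and $1$: very negative linear values give probabilities near $0$, very positive ones give probabilities near $1$.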
It is called logistic because if we take the logit (the log-odds) of the posterior probability, we recover a plain linear function:
$$\log \frac{P(G = 1 \mid X = x)}{P(G = 2 \mid X = x)} = \log \frac{P(G = 1 \mid X = x)}{1 - P(G = 1 \mid X = x)} = \beta_0 + \beta_1 x$$
Before the application of the logistic transformation, which is analogous to the activation function in typical machine learning, the output of the linear model $\beta_0 + \beta_1 x$ is an unbounded real number: the log-odds of class $1$.
The logistic model is in the family of Linear Methods for Classification exactly because of the above, as we see the logit is linear in $x$.
Note that while we show that the logit of the logistic regression is a linear regression model, you might mistakenly assume that the coefficient vector $\beta = (\beta_0, \beta_1)$ can be estimated by ordinary least squares; it cannot. This stems from the fact that while we recover the linear equation using the logit transformation, we never observe the log-odds themselves, only the class labels, so the coefficients are instead fit by maximum likelihood.
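A minimal sketch of what fitting by maximum likelihood looks like in practice, using scikit-learn's `LogisticRegression` on made-up one-dimensional data (the data, seed, and "true" coefficients are assumptions for illustration only; note that scikit-learn also applies an L2 penalty by default):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up 1-D data: class 1 becomes more likely as x grows (purely illustrative).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
p_true = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * x)))   # hypothetical "true" probabilities
y = (rng.uniform(size=200) < p_true).astype(int)

# beta_0 and beta_1 are found by maximising the (penalised) log-likelihood,
# not by running ordinary least squares on the 0/1 labels.
model = LogisticRegression().fit(x.reshape(-1, 1), y)
print("intercept (beta_0):", model.intercept_[0])
print("slope     (beta_1):", model.coef_[0, 0])
print("P(G=1 | x=0.5)    :", model.predict_proba([[0.5]])[0, 1])
```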
What if we have multiple features?
Well, in such a case you can easily extend the model to more features by analogy.
After all, if we know that the logit for a single variable is simply $\beta_0 + \beta_1 x$, then for features $x = (x_1, \dots, x_p)$ the logit becomes
$$\log \frac{P(G = 1 \mid X = x)}{1 - P(G = 1 \mid X = x)} = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p = \beta_0 + \beta^T x$$
The rest is just a matter of solving for $P(G = 1 \mid X = x)$, which gives back the same logistic form with $\beta^T x$ in place of $\beta_1 x$.
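For completeness, the solving step spelled out (writing $p$ for $P(G = 1 \mid X = x)$):
$$
\log\frac{p}{1 - p} = \beta_0 + \beta^T x
\;\Longrightarrow\;
\frac{p}{1 - p} = e^{\beta_0 + \beta^T x}
\;\Longrightarrow\;
p = \frac{e^{\beta_0 + \beta^T x}}{1 + e^{\beta_0 + \beta^T x}}, \qquad 1 - p = \frac{1}{1 + e^{\beta_0 + \beta^T x}}
$$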
Obviously, along the way we assumed only two classes ($K = 2$).
Covered more in depth in Multinomial Logistic Regression.
In the case of more classes, we can in theory still deal with it by adding more equations for everything that we are trying to compute (one probability for each class, or equivalently $K - 1$ log-odds equations against a reference class, since the probabilities must sum to $1$).
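As a sketch of what those equations look like in the standard formulation, the model uses $K - 1$ log-odds against a reference class (here class $K$), with the $K$-th probability determined by the sum-to-one constraint:
$$
\log\frac{P(G = k \mid X = x)}{P(G = K \mid X = x)} = \beta_{k0} + \beta_k^T x, \qquad k = 1, \dots, K - 1
$$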