202405181928
Status: #idea
Tags: Regression Analysis
Multiple Linear Regression
This is what we do when we have more than one regressor variable, i.e. more than one independent variable.
So the model looks like:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon$$
This is a linear relationship with $k$ regressor variables and $k + 1$ unknown parameters $\beta_0, \beta_1, \dots, \beta_k$.
Technically you could do The Method Of Least Squares by hand and do the differentiation for 2 regressors, 3 regressors, etc.
But that'd be dumb, painful and slow. So instead we generalize it by writing everything in matrix form.
Assumptions
The usual ones: the errors $\epsilon_i$ have mean $0$ and constant variance $\sigma^2$, and are uncorrelated with each other. For the hypothesis tests and confidence intervals below we additionally assume the errors are normally distributed.
Notation
The above format, while clear, is unwieldy and long.
We can instead write everything as matrices and store all of that same information compactly.
Define:
$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} 1 & x_{11} & \dots & x_{1k} \\ 1 & x_{21} & \dots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \dots & x_{nk} \end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
\boldsymbol{\epsilon} = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}$$
With all of those matrices defined I can write the regression for an arbitrary number of regressors:
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$$
and the least squares estimate is:
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$
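As a quick sanity check, here's a minimal numerical sketch of that estimator (made-up toy numbers, using numpy):

```python
import numpy as np

# Toy data: n = 6 observations, k = 2 regressors (made-up numbers).
X = np.column_stack([
    np.ones(6),                      # column of 1s for the intercept beta_0
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # regressor x_1
    [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],  # regressor x_2
])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 12.9])

# Least squares estimate: beta_hat = (X'X)^{-1} X'y.
# Solving the normal equations directly beats forming the inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # [beta_0_hat, beta_1_hat, beta_2_hat]
```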
Hypothesis Testing in Multiple Linear Regression
Covariance
We define the covariance of some vector $\mathbf{Y}$ as:
$$\operatorname{Cov}(\mathbf{Y}) = E\left[(\mathbf{Y} - E[\mathbf{Y}])(\mathbf{Y} - E[\mathbf{Y}])'\right]$$
The above is often referred to as the variance-covariance matrix: since the covariance of a variable with itself is equal to its variance, this matrix contains the variances on the diagonal entries and the covariances on the off-diagonal entries, hence variance-covariance. Note that this is a symmetric matrix.
From there it follows that:
$$\operatorname{Cov}(\mathbf{Y}) = E[\mathbf{Y}\mathbf{Y}'] - E[\mathbf{Y}]E[\mathbf{Y}]'$$
Multivariate Normal
We already know that in the univariate case, a random variable $X \sim N(\mu, \sigma^2)$ has the pdf:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
Therefore by analogy, we can easily define a pdf for a normal that would be distributed according to multiple variables as follows:
$$f(\mathbf{y}) = \frac{1}{(2\pi)^{p/2}\,|\boldsymbol{\Sigma}|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{y} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{y} - \boldsymbol{\mu})\right)$$
Where:
- $\boldsymbol{\mu}$ is the mean vector
- $\boldsymbol{\Sigma}$ is the variance-covariance matrix
- $p$ is the number of variables
We say that such a random variable is:
$$\mathbf{Y} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$$
Relevant Theorem
If $\mathbf{Y} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ and $\mathbf{A}$ is a constant matrix, then:
$$\mathbf{A}\mathbf{Y} \sim N(\mathbf{A}\boldsymbol{\mu},\ \mathbf{A}\boldsymbol{\Sigma}\mathbf{A}')$$
This is directly analogous to how in the univariate case, the linear combination of normally distributed random variables is a normally distributed variable.
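In particular, since $\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$ is just a linear transformation of $\mathbf{y} \sim N(\mathbf{X}\boldsymbol{\beta}, \sigma^2\mathbf{I})$, applying the theorem with $\mathbf{A} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$ gives:
$$\hat{\boldsymbol{\beta}} \sim N\left(\boldsymbol{\beta},\ \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\right)$$
This is the fact that makes all the tests and confidence intervals below work.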
Special Cases:
- One of the predictor variables is a qualitative variable (a dummy/indicator variable).
- One of the predictors has a quadratic (polynomial) relation with the response variable.
- The predictor variables have a linear relation with a transformation of Y, for example $\ln Y$.
- There is some interaction effect between predictor variables, represented as a product of the predictor variables (see the examples right after this list).
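For example, all of these still count as linear models because they stay linear in the $\beta$'s:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \epsilon \quad \text{(polynomial)}$$
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon \quad \text{(interaction)}$$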

Hypothesis Testing
Checking for Significance of Regression
It checks if there's a linear relationship between the response variable and any of the regressor variables. The hypotheses are:
$$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0 \qquad H_1: \beta_j \neq 0 \text{ for at least one } j$$
The test statistic is:
$$F_0 = \frac{SS_R / k}{SS_E / (n - k - 1)} = \frac{MS_R}{MS_E}$$
and we reject $H_0$ if $F_0 > f_{\alpha,\, k,\, n-k-1}$.
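A sketch of the F test, continuing the toy numpy example from above (so X, y, beta_hat are assumed to already exist):

```python
from scipy import stats

n, p = X.shape                                 # p = k + 1 parameters
k = p - 1

y_hat = X @ beta_hat
SS_E = float(np.sum((y - y_hat) ** 2))         # residual sum of squares
SS_R = float(np.sum((y_hat - y.mean()) ** 2))  # regression sum of squares

F0 = (SS_R / k) / (SS_E / (n - k - 1))
p_value = stats.f.sf(F0, k, n - k - 1)         # P(F > F0)
print(F0, p_value)
```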
Checking for Significance of Specific Parameters
Typically we know that the model is significant simply from looking at the data, or at least we're pretty sure of it. While it's typically a slam dunk, we still want to show it for completeness and safety.
But the issue is that the previous test only tells us that some $\beta_j$ is nonzero; it doesn't tell us which regressors actually matter, so we test the coefficients individually.
It is not scary, it's pretty much the same thing as the Simple Linear Regression case. We know that:
$$\hat{\beta}_j \sim N(\beta_j,\ \sigma^2 C_{jj})$$
where $C_{jj}$ is the $j$-th diagonal element of $(\mathbf{X}'\mathbf{X})^{-1}$.
Except we don't really ever have $\sigma^2$, so we plug in the estimate $\hat{\sigma}^2 = MS_E = \frac{SS_E}{n - k - 1}$, which turns the $Z$ statistic into a $T$ statistic.
Test On Individual Regression Coefficients (Assuming We Know Our Model Is Significant)
The hypotheses are $H_0: \beta_j = 0$ vs $H_1: \beta_j \neq 0$, and the test statistic is:
$$T_0 = \frac{\hat{\beta}_j}{\sqrt{\hat{\sigma}^2 C_{jj}}}$$
We reject $H_0$ if $|t_0| > t_{\alpha/2,\, n-k-1}$.
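Continuing the same toy sketch (reusing X, beta_hat, SS_E, n, k from the snippets above), the individual t statistics look like:

```python
sigma2_hat = SS_E / (n - k - 1)            # MS_E, the estimate of sigma^2
C = np.linalg.inv(X.T @ X)                 # (X'X)^{-1}
se = np.sqrt(sigma2_hat * np.diag(C))      # standard error of each beta_j_hat
t0 = beta_hat / se                         # test statistic per coefficient
p_values = 2 * stats.t.sf(np.abs(t0), n - k - 1)
```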
Confidence Intervals
Confidence Interval for Individual Coefficients
Are you able to do a hypothesis test for individual coefficients?
If so, your confidence interval is nothing more than:
$$\hat{\beta}_j \pm z_{\alpha/2}\sqrt{\sigma^2 C_{jj}}$$
As is a theme in stats, since we do not know $\sigma^2$, we replace it with $\hat{\sigma}^2 = MS_E$ and swap the $z$ quantile for a $t$ quantile:
$$\hat{\beta}_j \pm t_{\alpha/2,\, n-k-1}\sqrt{\hat{\sigma}^2 C_{jj}}$$
Recall that here, $C_{jj}$ is the $j$-th diagonal element of $(\mathbf{X}'\mathbf{X})^{-1}$.
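In code (continuing the toy sketch, reusing beta_hat, se, n, k from above):

```python
t_crit = stats.t.ppf(0.975, n - k - 1)     # 95% two-sided t quantile
lower = beta_hat - t_crit * se
upper = beta_hat + t_crit * se             # one CI per coefficient
```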
Confidence Interval for Mean Response
Future me, you have a brain. So read this, and remember what it means. I ain't typing allat.
The estimated mean response at a point $\mathbf{x}_0 = (1, x_{01}, \dots, x_{0k})'$ is $\hat{\mu}_{Y|\mathbf{x}_0} = \mathbf{x}_0'\hat{\boldsymbol{\beta}}$, and the confidence interval is:
$$\hat{\mu}_{Y|\mathbf{x}_0} \pm t_{\alpha/2,\, n-k-1}\sqrt{\hat{\sigma}^2\, \mathbf{x}_0'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_0}$$
Confidence Interval for Prediction of New Observations
As usual our estimate is $\hat{y}_0 = \mathbf{x}_0'\hat{\boldsymbol{\beta}}$, the same point estimate as for the mean response.
Then the confidence interval is simply:
$$\hat{y}_0 \pm t_{\alpha/2,\, n-k-1}\sqrt{\hat{\sigma}^2\left(1 + \mathbf{x}_0'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_0\right)}$$
Very much like the univariate case.
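Both intervals in code, continuing the toy sketch (reusing beta_hat, C, sigma2_hat, t_crit from above; x0 is a made-up new point):

```python
x0 = np.array([1.0, 3.5, 3.5])             # leading 1 for the intercept
y0_hat = x0 @ beta_hat                     # point estimate for both intervals

h = x0 @ C @ x0                            # x0'(X'X)^{-1} x0
ci_half = t_crit * np.sqrt(sigma2_hat * h)        # mean response CI
pi_half = t_crit * np.sqrt(sigma2_hat * (1 + h))  # prediction interval
print(y0_hat - ci_half, y0_hat + ci_half)
print(y0_hat - pi_half, y0_hat + pi_half)
```

The only difference is the extra $1$ under the square root, which accounts for the variance of the new observation itself.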
Simultaneous Confidence Interval (Joint CI)
If we want all $p = k + 1$ coefficient intervals to hold simultaneously with confidence $1 - \alpha$, the easy (Bonferroni) approach is to split $\alpha$ across the intervals:
$$\hat{\beta}_j \pm t_{\alpha/(2p),\, n-k-1}\sqrt{\hat{\sigma}^2 C_{jj}}, \qquad j = 0, 1, \dots, k$$
Where $p$ is the number of intervals we are building jointly.
We get the standard deviations for each $\hat{\beta}_j$ from the diagonal of $\hat{\sigma}^2(\mathbf{X}'\mathbf{X})^{-1}$.
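The Bonferroni version in code (continuing the toy sketch, reusing beta_hat, se, n, k, p from above):

```python
alpha = 0.05
t_joint = stats.t.ppf(1 - alpha / (2 * p), n - k - 1)  # alpha split over p intervals
joint_lower = beta_hat - t_joint * se
joint_upper = beta_hat + t_joint * se
```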