Multiple Regression

Forrest Young's Notes

Copyright © 1997-9 by Forrest W. Young.


Multiple Regression Example:
1997-99 GPA with Math and Verbal SAT

We return to the GPA, Math SAT and Verbal SAT variables.

The visualization for these data produces the plots shown below. We see positive relationships of both Verbal and Math SAT with GPA. Note that the black dots are for Psych30 1997, the blue for Psych30C 1998, and the red for Psych30C 1999.

Rescale the data. We now prepare to do the regression analysis. We divide the two SAT variables by 100 to clarify the discussion of the slope (so that a change of one unit is visible on the plot). Dividing a variable by a constant does not change its relationship with the other variables: the correlations, the fit of the regression, and the significance tests are all unchanged. Only the units of the slopes change, so that each slope now gives the change in GPA per 100 SAT points.

We do this by clicking on the data object, and then typing in the listener:

(transform
 :use pstat9799
 :variables '(GPA MSAT/100 VSAT/100)
 :program
 ;; Divide each SAT score by 100; GPA is passed through unchanged.
 (let ((a (/ MathSAT 100))
       (b (/ VerbSAT 100)))
   (list gpa a b)))
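
The effect of this rescaling can be checked outside ViSta. The following is a minimal Python sketch, not part of the ViSta session: the scores are made-up illustrative values (not the Psych30 data), and the helper names `pearson_r` and `slope` are hypothetical. It suggests that dividing a predictor by 100 leaves its correlation with GPA unchanged and multiplies the simple regression slope by 100.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def slope(x, y):
    """Least-squares slope for predicting y from the single predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Made-up illustrative values, NOT the Psych30 data.
msat = [480, 520, 560, 600, 640, 700]
gpa = [2.4, 2.9, 2.7, 3.2, 3.1, 3.7]

# The same rescaling as in the transform above: divide by 100.
msat_100 = [m / 100 for m in msat]

r_raw, r_scaled = pearson_r(msat, gpa), pearson_r(msat_100, gpa)
b_raw, b_scaled = slope(msat, gpa), slope(msat_100, gpa)

print("correlation, raw vs. rescaled:", r_raw, r_scaled)
print("slope, raw vs. rescaled:", b_raw, b_scaled)
```

The rescaled slope is read as the change in GPA per 100 SAT points, which is the unit used when discussing the plot.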

Now we run the regression analysis with ViSta's Regression Analysis module: click the Regres button on the workmap, then select GPA as the response variable and the two SAT variables as the predictor variables.

Report: We then ask for the regression report. It is shown below.

The regression analysis report has three major sections, each containing important information about the analysis:

  1. Parameter Estimates: This section of the report presents information about the intercept and about the slope for each SAT variable. The "Estimate" column gives the values of the intercept and slopes of the regression function, the function that produces the best fit to the points.

    The intercept and slopes are often called the "coefficients", because they are the coefficients of the regression function. They are called "estimates" (short for "estimated coefficients") because they are estimates of the values the coefficients have in the population.

  2. Summary of Fit
  3. Analysis of Variance: The analysis of variance tells us whether the entire regression model, which includes both slopes and the intercept simultaneously, fits the response variable significantly. The null hypothesis is that there is no relation between the response variable and the predictors. The F-Ratio and P-Value summarize the result of this test. The R-Squared tells us the proportion of the variance in GPA that is accounted for by the two SAT variables.
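
The quantities named in the report can be reproduced by hand for a small example. The following is a minimal Python sketch under stated assumptions: the data are made-up illustrative values (not the Psych30 data, and not ViSta's output), the function name `fit_two_predictors` is hypothetical, and the fit is ordinary least squares with two predictors. It computes the intercept, the two slopes, the R-Squared, and the F-Ratio. The P-Value is left out because it requires the F distribution with 2 and n - 3 degrees of freedom, which the Python standard library does not provide.

```python
from statistics import mean


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 (assumes x1, x2 not collinear)."""
    n = len(y)
    m1, m2, my = mean(x1), mean(x2), mean(y)

    # Work with deviations from the means; the intercept is recovered afterwards.
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]

    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))

    # Solve the 2x2 normal equations for the two slopes.
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2

    # Partition the total sum of squares into model and error parts.
    fitted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]
    sse = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))
    sst = sum(v * v for v in dy)
    ssr = sst - sse

    df_model, df_error = 2, n - 3
    return {
        "intercept": b0,
        "slope1": b1,
        "slope2": b2,
        "r_squared": ssr / sst,
        "f_ratio": (ssr / df_model) / (sse / df_error),
        "df": (df_model, df_error),
    }


# Made-up illustrative values, NOT the Psych30 data; SAT scores already divided by 100.
msat = [4.8, 5.2, 5.6, 6.0, 6.4, 7.0, 5.0, 6.6]
vsat = [5.0, 4.6, 6.0, 5.4, 6.8, 6.2, 5.8, 5.2]
gpa = [2.4, 2.6, 3.0, 3.1, 3.5, 3.6, 2.8, 3.2]

result = fit_two_predictors(msat, vsat, gpa)
for name, value in result.items():
    print(name, value)
```

Because the SAT variables are in units of 100 points, each slope here is the estimated change in GPA for a 100-point change in that SAT score, holding the other SAT score fixed.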