# Residual plot

last edited by 6 years, 8 months ago

A residual plot displays the residuals on the y-axis and the independent variable (or the predicted values) on the x-axis. It shows what is left over after the regression model has accounted for the relationship between the independent variable and the response variable. A residual is the observed value minus the predicted value (e = y − ŷ). The purpose of a residual plot is to determine whether a linear regression model is appropriate for the data. No model is perfect, so knowing how and where it fails is important, and analyzing the residuals shows us that. When a regression model is appropriate, it captures the underlying relationship, and nothing of interest is left behind in the residuals.
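As a minimal sketch of the definition e = y − ŷ, the Python below fits a least-squares line and computes one residual per observation. The data values and the names `household_size`, `plastic_weight`, `fit_line`, and `residuals` are made up for illustration and are not taken from the example on this page.

```python
from statistics import mean

# Hypothetical data, invented for this sketch only.
household_size = [1, 2, 2, 3, 4, 4, 5, 6]                  # explanatory variable (x)
plastic_weight = [0.9, 1.6, 1.9, 2.4, 3.3, 3.0, 4.1, 4.6]  # response variable (y)


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares regression line."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Return the residual (observed minus predicted) for each observation."""
    intercept, slope = fit_line(x, y)
    predicted = [intercept + slope * xi for xi in x]
    return [yi - y_hat for yi, y_hat in zip(y, predicted)]


resid = residuals(household_size, plastic_weight)
for xi, ei in zip(household_size, resid):
    print(f"x = {xi}, residual = {ei:+.3f}")
```

For a least-squares line with an intercept, the residuals sum to zero (up to rounding), so the interesting information is in how they are scattered, not in their average.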

A residual plot is used to check an important condition known as the Does the Plot Thicken? Condition. This condition states that there should be no "thickening," in other words no widening, of the residuals across the residual plot. The typical violation is that the spread of the residuals increases as the x-values or predicted values increase. When the condition is met, the residuals stretch horizontally with a neutral spread, meaning there is about the same amount of scatter throughout. The plot should also show no outliers and no bends. If the residual plot lacks any interesting features, such as direction or shape, you can carry on with your regression analysis. In other words, the residual plot should be the most boring display you have ever seen, ultimately displaying a cloud of "nothing".
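The condition is normally judged by eye, but a rough numeric companion can help. The sketch below compares the spread of the residuals in the lower half of the predicted values with the spread in the upper half. The function `spread_ratio` and both data sets are hypothetical, and the ratio is an informal heuristic, not a formal test.

```python
from statistics import pstdev


def spread_ratio(predicted, resid):
    """Spread of residuals in the upper half of predicted values divided by
    the spread in the lower half. Values near 1 suggest even scatter; values
    well above 1 suggest the plot "thickens" as predicted values increase.
    Assumes the lower half has non-zero spread."""
    pairs = sorted(zip(predicted, resid))       # order by predicted value
    half = len(pairs) // 2
    lower = [e for _, e in pairs[:half]]        # residuals at small predictions
    upper = [e for _, e in pairs[-half:]]       # residuals at large predictions
    return pstdev(upper) / pstdev(lower)


# Hypothetical residuals: one evenly scattered set, one fan-shaped set.
predicted = [1, 2, 3, 4, 5, 6, 7, 8]
even_resid = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
fan_resid = [0.1, -0.1, 0.2, -0.2, 0.8, -0.8, 1.0, -1.0]

print("even scatter:", round(spread_ratio(predicted, even_resid), 2))
print("fan shape:   ", round(spread_ratio(predicted, fan_resid), 2))
```

A two-way split is a deliberately crude choice. It will miss thickening that occurs only in the middle of the plot, so it does not replace looking at the residual plot itself.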

For example, suppose we are running a regression analysis of the relationship between household size and the weight of discarded plastic goods. We first check that the data meet the requirements for regression: both variables must be quantitative; the scatterplot of the two variables must follow a reasonably straight line; and there must be no outliers in the data. Once those requirements are met, we can create a residual plot in SPSS. For our example, we used household size as our independent (or explanatory) variable and the weight of discarded plastic goods as our dependent (or response) variable. It is important to remember that although SPSS labels residual plots as "scatterplots," they are not the scatterplots of two quantitative variables that are used to examine correlation. We can see from this residual plot that there is nothing of interest in the shape or direction of the residuals. If we saw a distinct pattern in the plot, we would conclude that the Does the Plot Thicken? Condition was not met, as explained above. Because this condition was met, we can continue with the regression analysis of the relationship between household size and weight of discarded plastic.

# Generating a Residual Plot in SPSS

• Go to the "Analyze" menu and select "Regression"
• Under the "Regression" options, select "Linear"
• In the "Linear Regression" dialogue box, click and drag the explanatory variable (x) into the "Independent" variable box
• In the "Linear Regression" dialogue box, click and drag the response variable (y) into the "Dependent" variable box
• On the right-hand side of the "Linear Regression" dialogue box, click the "Plots" option to open the "Linear Regression: Plots" dialogue box
• In the "Linear Regression: Plots" dialogue box, click and drag "*ZPRED" into the X box
• In the "Linear Regression: Plots" dialogue box, click and drag "*ZRESID" into the Y box
• Click "Continue" to close and save the "Linear Regression: Plots" dialogue box; then click "OK" in the "Linear Regression" dialogue box to close it
• SPSS has now created an output; scroll down to the charts section to view the residual plot
• Look for the chart titled "Scatterplot" with "Regression Standardized Residual" on the Y-axis and "Regression Standardized Predicted Value" on the X-axis. This is the residual plot.

**Please remember that although SPSS labels the residual plot as a "scatterplot," it is in fact a residual plot.**
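For readers without SPSS, the quantities on the two axes can be sketched in plain Python. This sketch assumes that the standardized predicted value is the z-score of the predicted values and that the standardized residual is the residual divided by the standard error of the estimate. That is an assumed reading of what "*ZPRED" and "*ZRESID" represent and should be checked against the SPSS documentation. The data and the function name `standardized_plot_points` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_plot_points(x, y):
    """Return (standardized predicted, standardized residual) pairs for a
    simple linear regression of y on x.

    Assumptions of this sketch (not confirmed against SPSS):
      * standardized predicted = z-score of the predicted values
      * standardized residual  = residual / standard error of the estimate
    Requires at least three observations and a non-zero slope."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar

    predicted = [intercept + slope * xi for xi in x]
    resid = [yi - y_hat for yi, y_hat in zip(y, predicted)]

    # Standard error of the estimate: n - 2 degrees of freedom for one predictor.
    s_e = sqrt(sum(e * e for e in resid) / (n - 2))
    z_resid = [e / s_e for e in resid]

    pred_mean, pred_sd = mean(predicted), stdev(predicted)
    z_pred = [(p - pred_mean) / pred_sd for p in predicted]
    return list(zip(z_pred, z_resid))


# Hypothetical data, invented for this sketch only.
household_size = [1, 2, 2, 3, 4, 4, 5, 6]
plastic_weight = [0.9, 1.6, 1.9, 2.4, 3.3, 3.0, 4.1, 4.6]

points = standardized_plot_points(household_size, plastic_weight)
for z_pred, z_resid in points:
    print(f"{z_pred:+.3f}  {z_resid:+.3f}")
```

Each pair is one point on the residual plot, with the first value on the X-axis and the second on the Y-axis. The pairs can be passed to any plotting tool.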

The following video demonstrates the steps for creating a residual plot using SPSS: