
# Linear regression


Linear regression models the relationship between two quantitative variables: an explanatory variable and a response variable. Running a linear regression gives us the equation of the regression line and lets us interpret both the slope and the intercept in contextually meaningful ways.

Before we can run a linear regression, we have to check all the conditions for correlation.

For example, suppose you want to run a linear regression on the relationship between household size (explanatory variable) and the amount of plastic waste (response variable). After conducting the appropriate correlation analysis on the data set, the first step is to check for homoscedasticity, that is, to ask "does the plot thicken?" To do this, run the linear regression, navigate to the residual plot, and judge whether the spread of the residuals grows or shrinks across the range of predicted values. In this scenario, the plot doesn't thicken, although there are some minor outliers.
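The "does the plot thicken" check is usually done by eye, but a rough numeric version can make the idea concrete. The sketch below is only an illustration under stated assumptions: the data values are hypothetical (they are not the data set used on this page), and the helper names `fit_line` and `residual_spread_ratio` are made up for this example. It fits a least-squares line, splits the residuals into those with lower and higher fitted values, and compares their spread. A ratio near 1 suggests the plot does not thicken.

```python
from statistics import mean, pstdev

def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def residual_spread_ratio(x, y):
    """SD of residuals in the upper half of fitted values divided by
    the SD of residuals in the lower half."""
    b0, b1 = fit_line(x, y)
    # Pair each fitted value with its residual, then order by fitted value
    pairs = sorted((b0 + b1 * xi, yi - (b0 + b1 * xi)) for xi, yi in zip(x, y))
    half = len(pairs) // 2
    low = [resid for _, resid in pairs[:half]]
    high = [resid for _, resid in pairs[half:]]
    return pstdev(high) / pstdev(low)

# Hypothetical illustration data, NOT the data set analyzed on this page
household_size = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6]
plastic_waste = [0.9, 1.1, 1.4, 1.5, 1.8, 1.9, 2.2, 2.3, 2.6, 2.7]

ratio = residual_spread_ratio(household_size, plastic_waste)
print(f"spread ratio (upper half / lower half): {ratio:.2f}")
```

A ratio far above or below 1 would be a reason to look more closely at the residual plot; it does not replace the visual check.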

Since the linear regression passes this condition, the next step is to find the equation of the regression line, which takes the form $\hat{y} = b_{0} + b_{1}x$, where $\hat{y}$ is the predicted value of the response variable (amount of plastic waste), $x$ is the explanatory variable (household size), $b_{0}$ is the intercept, and $b_{1}$ is the slope.

Coefficients (a)

| Model | Unstandardized Coefficients: B | Unstandardized Coefficients: Std. Error | Standardized Coefficients: Beta | t | Sig. |
|---|---|---|---|---|---|
| 1 (Constant) | .392 | .195 | | 2.014 | .049 |
| 1 Household size | .409 | .047 | .751 | 8.798 | .000 |

a. Dependent Variable: Weight of discarded plastic goods

By looking at the Coefficients table, we can state the equation to be:

$\hat{y} = 0.392 + 0.409x$

Now we have to interpret the slope in a contextually meaningful way. The model predicts that for each additional person in the household, the amount of plastic waste increases by 0.409 units on average. Equivalently, when the household size increases by 10, the predicted amount of plastic waste increases by 4.09 units.

Additionally, we interpret the intercept. Taken literally, the model predicts that when the household size is 0, the amount of plastic waste is 0.392 units. In context, this literal interpretation is not appropriate, because if there are no people in the house, there can't be any plastic waste. However, the scatterplot shows us that this is not extrapolation, since the prediction is not far from the actual data.
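The slope and intercept interpretations are just arithmetic on the fitted equation, which a short sketch can make explicit. The two coefficients are taken from the Coefficients table; the function name `predict_plastic_waste` is a hypothetical helper introduced only for this illustration, and the units are whatever units the data set uses for plastic waste.

```python
# Coefficients read from the SPSS Coefficients table
B0 = 0.392  # intercept, the "(Constant)" row
B1 = 0.409  # slope, the "Household size" row

def predict_plastic_waste(household_size):
    """Predicted amount of plastic waste for a given household size."""
    return B0 + B1 * household_size

# Slope: predictions for households that differ by 1 person and by 10 people
increase_per_person = predict_plastic_waste(4) - predict_plastic_waste(3)
increase_per_10 = predict_plastic_waste(13) - predict_plastic_waste(3)

# Intercept: the literal prediction for a household of size 0
at_zero = predict_plastic_waste(0)

print(f"per additional person: {increase_per_person:.3f}")
print(f"per 10 additional people: {increase_per_10:.2f}")
print(f"predicted at size 0: {at_zero:.3f}")
```

The differences depend only on the slope, so any two household sizes that are 10 apart give the same predicted increase of about 4.09 units.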

Model Summary (b)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .751 (a) | .563 | .556 | .70990 |

The R² value tells us that 56.3% of the variability in the amount of plastic waste can be explained by the linear relationship with household size.
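To see where a "percent of variability explained" figure comes from, here is a minimal sketch that computes R² as 1 minus the ratio of residual variation to total variation, and confirms that with a single explanatory variable it equals the square of the correlation coefficient. The data values are hypothetical (not the page's data set) and the function names are made up for this example.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Correlation coefficient r between x and y."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

def r_squared(x, y):
    """R^2 = 1 - SSE / SST for the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)                        # total variation
    return 1 - sse / sst

# Hypothetical illustration data, NOT the data set analyzed on this page
household_size = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6]
plastic_waste = [0.9, 1.1, 1.4, 1.5, 1.8, 1.9, 2.2, 2.3, 2.6, 2.7]

r = pearson_r(household_size, plastic_waste)
r2 = r_squared(household_size, plastic_waste)
print(f"r = {r:.3f}, r squared = {r * r:.3f}, R^2 = {r2:.3f}")
```

The same relationship holds in the Model Summary table: squaring the reported R of .751 gives about .564, which matches the reported R Square of .563 up to rounding.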

# Linear regression in SPSS:

• Go to the "Analyze" menu, scroll down to "Regression," and click "Linear."
• Add the explanatory variable to the Independent box and the response variable to the Dependent box.
• Click the "Plots" button, then move *ZRESID to the Y box and *ZPRED to the X box.
• Finally, click "Continue" and then click "OK".

Once SPSS has generated the output, you can find the coefficients needed to interpret the slope and the intercept in the "Coefficients" table. The residual plot, which you use to check whether the plot thickens, is at the bottom of the output.
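For readers who want to see what the residual plot is built from, the sketch below computes the two quantities plotted in the steps above for a simple linear regression. It rests on assumptions that are not stated on this page: that *ZPRED is the predicted value standardized to mean 0 and standard deviation 1, and that *ZRESID is the residual divided by the standard error of the estimate. The data are hypothetical and the function name `standardized_plot_points` is made up for this example.

```python
from math import sqrt
from statistics import mean, stdev

def standardized_plot_points(x, y):
    """Return (zpred, zresid) pairs for the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    predicted = [b0 + b1 * xi for xi in x]
    residuals = [yi - p for yi, p in zip(y, predicted)]
    # Assumed definition of ZPRED: z-score of the predicted values
    pred_mean, pred_sd = mean(predicted), stdev(predicted)
    zpred = [(p - pred_mean) / pred_sd for p in predicted]
    # Assumed definition of ZRESID: residual / standard error of the estimate
    std_error = sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    zresid = [r / std_error for r in residuals]
    return list(zip(zpred, zresid))

# Hypothetical illustration data, NOT the data set analyzed on this page
household_size = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6]
plastic_waste = [0.9, 1.1, 1.4, 1.5, 1.8, 1.9, 2.2, 2.3, 2.6, 2.7]

points = standardized_plot_points(household_size, plastic_waste)
for zp, zr in points:
    print(f"ZPRED = {zp:6.2f}   ZRESID = {zr:6.2f}")
```

Plotting the second value of each pair against the first gives a scatterplot of the same kind as the SPSS residual plot, where a roughly even vertical spread from left to right indicates that the plot does not thicken.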

The following video illustrates these steps: