# Using SPSS to carry out a quadratic regression analysis

You can follow the instructions below, or use the SPSS syntax:

*Syntax for example in chapter 3. Copy this syntax, paste it into a syntax window and run it.

*The following command causes the cases to be weighted by the design weight variable 'dweight'.

*The following commands cause SPSS to select for analysis those cases that belong to the Swedish sample (the cases whose value on the country variable is SE) and have lower values than 1975 on the birth year variable (& stands for AND, while < stands for 'less than').

*In this process, the commands create a filter variable (filter_$) with value 1 for the selected cases and value 0 for the non-selected cases.

*The values of the variable created by the following commands are the squared values of the two-digit 'birthyear' variable.

*These commands instruct SPSS to run a blockwise regression analysis with the variable 'birthyear' as independent variable in the initial model and to add the variable 'sqbirthyear' as a second independent variable in an expanded model.

Proceed as follows:

- Open the data set ‘Regression’ that you have downloaded from Nesstar WebView.
- Use the design weight to weight the cases.
- Select the Swedish cases. Remember to include only those respondents who were born before 1975.
- Compute the square of the two-digit birth year variable by first selecting ‘Compute’ from the ‘Transform’ menu. Then type a name for the variable in the ‘Target variable’ field (for example ‘sqbirthyear’) and an explanatory label in the field that pops up when you click ‘Type & Label’ (for example ‘Squared two-digit year of birth’). Finally, put the function to be used in the ‘Numeric expression’ field (birthyear * birthyear). Click ‘OK’. Information about the new variable will be added as a new row at the bottom of the Data Editor’s ‘Variable View’ window. If you want to view the new variable’s values, go to ‘Data View’ by double-clicking the new row’s first (shaded) cell.
- Finally, trick SPSS into performing a curvilinear regression analysis on the regression function y_{i} = a + b_{1}∙x_{i} + b_{2}∙x_{i}^{2} + e_{i} by adding the squared two-digit birth year as another independent variable beside the two-digit birth year variable in its original form. The technical procedure is essentially the same as before:
- Click ‘Regression’ and ‘Linear’ on the ‘Analyze’ menu. Put the name of the education length variable in the ‘Dependent’ field and the name of the two-digit birth year variable in the ‘Independent(s)’ field. From here, we could proceed by adding the name of the squared two-digit birth year variable to the same field, but in order to demonstrate the difference between a linear and a curvilinear regression model, we click ‘Next’, put the name of the squared variable in the now blank field (Figure 9) and finish with ‘OK’.
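For readers who want to see what the blockwise procedure computes, the following is a minimal Python sketch of the same two-step idea: fit a linear model first, then add the squared predictor and fit again. It is not SPSS output and does not use the ESS data. The data are made up for illustration, the names `eduyears`, `fit_ols` and `solve_linear_system` are hypothetical, and the design weight is left out for brevity.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * beta = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix, so that row operations act on both sides at once.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    beta = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * beta[k] for k in range(row + 1, n))
        beta[row] = (aug[row][n] - known) / aug[row][row]
    return beta


def fit_ols(columns, y):
    """Unweighted least-squares fit of y on a constant plus the given columns.

    Returns (coefficients, r_squared); coefficients[0] is the constant.
    """
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[a] * r[b] for r in design) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(design, y)) for a in range(p)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1.0 - ss_res / ss_tot


# Made-up illustration data (NOT the ESS sample): two-digit birth years 20..74
# and an education length that rises at a decreasing rate, plus a small
# deterministic disturbance.
birthyear = [float(v) for v in range(20, 75)]
noise = [0.15 * ((i * 7) % 5 - 2) for i in range(len(birthyear))]
eduyears = [2.0 + 0.3 * x - 0.002 * x * x + e for x, e in zip(birthyear, noise)]

# Same transformation as the 'Compute' step: birthyear * birthyear.
sqbirthyear = [x * x for x in birthyear]

# Block 1: linear model. Block 2: the squared term is added.
model1, r2_model1 = fit_ols([birthyear], eduyears)
model2, r2_model2 = fit_ols([birthyear, sqbirthyear], eduyears)

print("Model 1 (a, b):", model1, "R squared:", r2_model1)
print("Model 2 (a, b1, b2):", model2, "R squared:", r2_model2)
```

The two calls to `fit_ols` play the role of the two blocks in the SPSS dialogue: the second call differs from the first only in that the squared variable has been added.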

Figure 9. Running a quadratic regression analysis blockwise

By using the ‘Next’ option, we have made SPSS compute coefficients for two different models: first, a_{2} and b_{3} of the model (function) y_{i} = a_{2} + b_{3}∙x_{i} + e_{i}, and second, a_{1}, b_{1} and b_{2} of the model y_{i} = a_{1} + b_{1}∙x_{i} + b_{2}∙x_{i}^{2} + e_{i}. (We use different coefficient subscripts in the two functions because the coefficient values are different.) The resulting coefficients are presented in Table 3.

Table 3. SPSS output: Blockwise quadratic regression coefficients

The constant value (the a_{2}) of model 1 is very different from the one we estimated for Norway in example 2, see Table 1. The reason is that the zero point of the birth year variable now corresponds to year 1900 rather than year 0. Thus, those who dare extrapolate may interpret the constant as the predicted education length of a person born in year 1900.

The b coefficient of the linear Model 1 is of the same order as the Norwegian b coefficient (0.108 versus 0.097). As expected, however, the analysis indicates that the linear model is not the best choice in the Swedish case. The b_{2} coefficient of the quadratic Model 2 is small in absolute value (-0.001), but it is large enough to have a discernible impact on the regression curve. This can be seen from Figure 10, where the regression line (based on the Model 2 coefficients) clearly rises at a decreasing rate as the birth year value increases. Figure 10 seems to corroborate our expectation that a quadratic regression line would follow the cohort mean education length more closely than the straight regression line in Figure 8 does. The procedures used to create Figure 10 have been described in chapter 1. The only difference is that one must tick ‘Quadratic’ instead of ‘Linear’ in the last dialogue box of the ‘Chart Editor’ to get the curved regression line of Figure 10.
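Why a small negative b_{2} makes the curve rise at a decreasing rate can be seen from the slope of the quadratic function. The following is a general sketch that uses only the model's own symbols and no estimated values:

$$
\hat{y} = a_{1} + b_{1} x + b_{2} x^{2}
\quad\Longrightarrow\quad
\frac{d\hat{y}}{dx} = b_{1} + 2\, b_{2}\, x .
$$

With b_{1} > 0 and b_{2} < 0, the slope is positive for low values of x but shrinks by 2∙|b_{2}| for every one-unit increase in x. It would reach zero at x = -b_{1} / (2∙b_{2}), which may well lie outside the observed range of birth years.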

Figure 10. Scatterplot with quadratic regression line. Swedish ESS data

Finally, Table 4 reveals an (admittedly small) increase in R^{2} from Model 1 to Model 2, which indicates that the latter model fits the data somewhat better than the former. See more about this in the next chapter.
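As a reminder of what is being compared, R^{2} can be written in its standard textbook form, where ŷ_{i} is the value predicted by the model and ȳ is the mean of the dependent variable:

$$
R^{2} = 1 - \frac{\sum_{i} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i} (y_{i} - \bar{y})^{2}} .
$$

Because Model 2 contains everything in Model 1 plus one extra term, its residual sum of squares cannot be larger than that of Model 1 when both are fitted to the same cases by least squares. R^{2} therefore cannot decrease when the squared term is added, and the interesting question is how much it increases.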

Table 4. SPSS output: Blockwise quadratic regression goodness of fit statistics

Regression analyses based on the function type y_{i} = a_{1} + b_{1}∙x_{i} + b_{2}∙x_{i}^{2} + e_{i} can produce regression lines of many shapes. The shapes vary with the signs and values of the computed coefficients. Figures that illustrate the effects of coefficient sign changes can be found below.

Here are some examples that demonstrate the curve shapes that can be created by means of quadratic functions. For instance, Figure A2 presents the curve that corresponds to the function y = 6 - 2∙x + 0.5∙x^{2}. It descends towards the right for low values of x. The reason is that, for these values, the negative term - 2∙x dominates over the positive term + 0.5∙x^{2}. But for higher values of x, the positive term dominates over the negative one, because x^{2} grows much faster than x and, consequently, more than compensates for b_{2} being smaller in absolute value than b_{1}. (While x = 1 implies x^{2} = 1, x = 2 implies x^{2} = 4, and so forth.)
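The behaviour of this example function can be checked by evaluating it at a few points. The short Python sketch below does so; the function name `quadratic` and the chosen x values are only for illustration.

```python
def quadratic(x, a=6.0, b1=-2.0, b2=0.5):
    """Evaluate y = a + b1*x + b2*x^2 (defaults: y = 6 - 2x + 0.5x^2)."""
    return a + b1 * x + b2 * x ** 2


xs = [0, 1, 2, 3, 4, 5, 6]
ys = [quadratic(x) for x in xs]

# The curve turns where the slope b1 + 2*b2*x equals zero.
turning_point = -(-2.0) / (2 * 0.5)

for x, y in zip(xs, ys):
    print(x, y)
print("turning point at x =", turning_point)
```

The values fall from 6 at x = 0 to 4 at x = 2 and then rise again (4.5 at x = 3, 6 at x = 4, 8.5 at x = 5, 12 at x = 6), so the negative linear term dominates to the left of x = 2 and the positive squared term dominates to the right of it.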


## Exercise

Compute the squared ‘General subjective health’ variable and add it to the model you used in the exercise in chapter 2. What do the results tell you? Make a chart with a quadratic line by following the steps described in chapter 1, but tick ‘Quadratic’ instead of ‘Linear’ in the final dialogue box while in the ‘Chart editor’.