Using SPSS to carry out a quadratic regression analysis

You can follow the instructions below, or use the SPSS syntax:

Syntax for example

*Syntax for the example in chapter 3. Copy this syntax, paste it into a syntax window and run it.
*The following command causes the cases to be weighted by the design weight variable 'dweight'.

WEIGHT BY dweight.

*The following commands cause SPSS to select for analysis those cases that belong to the Swedish sample (the cases whose value on the country variable is SE) and have lower values than 1975 on the birth year variable (& stands for AND, while < stands for 'less than').
*In this process, the commands create a filter variable (filter_$) with value 1 for the selected cases and value 0 for the non-selected cases.

USE ALL.
COMPUTE filter_$=(cntry = 'SE' & yrbrn < 1975).
VARIABLE LABEL filter_$ "cntry = 'SE' & yrbrn < 1975 (FILTER)".
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
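
The commands further down use a two-digit birth year variable named 'birthyear', which this syntax does not itself create. If your data file does not already contain it, a minimal sketch for creating it could look as follows. It assumes that 'birthyear' is simply the four-digit ESS variable 'yrbrn' minus 1900, which is what the interpretation of the constant later in this chapter suggests; the variable label is our own wording.

```spss
* Assumption: birthyear is the four-digit variable yrbrn minus 1900.
* Skip this step if birthyear already exists in your data file.
COMPUTE birthyear = yrbrn - 1900.
VARIABLE LABELS birthyear 'Two-digit year of birth (yrbrn minus 1900)'.
EXECUTE.
```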

*The values of the variable created by the following commands are the squared values of the two-digit 'birthyear' variable.

COMPUTE sqbirthyear = birthyear * birthyear.
VARIABLE LABELS sqbirthyear 'Squared two-digit year of birth'.
EXECUTE.

*These commands instruct SPSS to run a blockwise regression analysis with the variable 'birthyear' as independent variable in the initial model and to add the variable 'sqbirthyear' as a second independent variable in an expanded model.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT eduyrs
  /METHOD=ENTER birthyear
  /METHOD=ENTER sqbirthyear.

GRAPH
  /SCATTERPLOT(BIVAR)=birthyear WITH eduyrs
  /MISSING=LISTWISE.

Proceed as follows:

Figure 9. Running a quadratic regression analysis blockwise

By using the ‘Next’ option, we have made SPSS compute coefficients for two different models: first, a₂ and b₃ of the model (function) yᵢ = a₂ + b₃∙xᵢ + eᵢ, and second, a₁, b₁ and b₂ of the model yᵢ = a₁ + b₁∙xᵢ + b₂∙xᵢ² + eᵢ. (We use different coefficient subscripts in the two functions because the coefficient values are different.) The resulting coefficients are presented in Table 3.

Table 3. SPSS output: Blockwise quadratic regression coefficients

The constant (a₂) of Model 1 is very different from the one we estimated for Norway in example 2 (see Table 1). The reason is that the zero point of the birth year variable now corresponds to year 1900 rather than year 0. Thus, those who dare to extrapolate may interpret the constant as the predicted education length of a person born in year 1900.

The b coefficient of the linear Model 1 is of the same order as the Norwegian b coefficient (0.108 as against 0.097). As expected, however, the analysis indicates that the linear model is not the best choice in the Swedish case. The b₂ coefficient of the quadratic Model 2 is small in absolute value (-0.001), but it is large enough to have a discernible impact on the regression curve. This can be seen from Figure 10, where the regression line (based on the Model 2 coefficients) clearly rises at a decreasing rate as the birth year value increases. Figure 10 seems to corroborate our expectation that a quadratic regression line would follow the cohort mean education length more closely than the straight regression line in Figure 8 does. The procedures used to create Figure 10 have been described in chapter 1. The only difference is that one must tick ‘Quadratic’ instead of ‘Linear’ in the last dialogue box of the ‘Chart editor’ to get the curved regression line of Figure 10.

Figure 10. Scatterplot with quadratic regression line. Swedish ESS data
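
One way to read 'rises at a decreasing rate' is through the slope of the Model 2 curve. The short derivation below uses only the coefficient symbols of Model 2 and is meant as a sketch, not as part of the SPSS output:

```latex
\hat{y} = a_1 + b_1 x + b_2 x^2
\quad\Longrightarrow\quad
\frac{d\hat{y}}{dx} = b_1 + 2 b_2 x
```

With a negative b₂, the slope becomes smaller for each additional birth year. Taking the rounded value b₂ = -0.001 at face value, the slope shrinks by about 0.002 education years per birth year, and the curve keeps rising only for as long as b₁ + 2∙b₂∙x stays positive, that is, for x below -b₁/(2∙b₂).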

Finally, Table 4 reveals an (admittedly small) increase in R² from Model 1 to Model 2, which indicates that the latter model fits the data somewhat better than the former. See more about this in the next chapter.

Table 4. SPSS output: Blockwise quadratic regression goodness of fit statistics
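
If you want SPSS to report the size of the increase in R² between the two blocks, together with an F test of that increase, one option is to add the CHANGE keyword to the STATISTICS subcommand. The sketch below is otherwise identical to the regression command in the syntax at the top of this chapter:

```spss
* Same blockwise regression, with CHANGE added to request R square change statistics.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA CHANGE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT eduyrs
  /METHOD=ENTER birthyear
  /METHOD=ENTER sqbirthyear.
```

The model summary table should then contain additional change statistics for each model, so that the contribution of the squared term can be read off directly rather than by subtracting the two R² values.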

Regression analyses based on the function type yᵢ = a₁ + b₁∙xᵢ + b₂∙xᵢ² + eᵢ can produce regression lines of many shapes. The shapes vary with the signs and values of the computed coefficients. Figures that illustrate the effects of coefficient sign changes can be found below.

Appendix: Quadratic functions (second degree polynomials)

Here are some examples that demonstrate the curve shapes that can be created by means of quadratic functions. For instance, Figure A2 presents the curve that corresponds to the function y = 6 - 2∙x + 0.5∙x². It descends towards the right for low values of x. The reason is that, for these values, the negative term - 2∙x dominates over the positive term + 0.5∙x². But for higher values of x, the positive term dominates over the negative one, because the value of the squared x becomes much higher than the value of x and, consequently, compensates for the lower absolute value of b₂ compared with b₁. (While x = 1 implies x² = 1, x = 2 implies x² = 4, and so forth.)

Figure A1. Graph of the function y = 6 + 2∙x + 0.5∙x²

Figure A2. Graph of the function y = 6 - 2∙x + 0.5∙x²

Figure A3. Graph of the function y = 6 - 2∙x - 0.5∙x²

Figure A4. Graph of the function y = 6 + 2∙x - 0.5∙x²
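
To see the numbers behind these curves, you can let SPSS compute the four functions for a handful of x values. The following is a minimal sketch. The x values and the variable names y_a1 to y_a4 are our own choices for illustration and have nothing to do with the ESS data. Note that DATA LIST replaces the active dataset, so save your ESS file first or run the sketch in a new session.

```spss
* Illustration only: hypothetical x values, not ESS data.
* DATA LIST replaces the active dataset, so save your work first.
DATA LIST FREE / x.
BEGIN DATA
-4 -2 0 2 4 6 8
END DATA.

* The four functions shown in Figures A1 to A4.
COMPUTE y_a1 = 6 + 2*x + 0.5*x*x.
COMPUTE y_a2 = 6 - 2*x + 0.5*x*x.
COMPUTE y_a3 = 6 - 2*x - 0.5*x*x.
COMPUTE y_a4 = 6 + 2*x - 0.5*x*x.
EXECUTE.

* Show the computed values in the output window.
LIST.
```

For y_a2, the function of Figure A2, the listed values are 22, 12, 6, 4, 6, 12 and 22: the curve falls until x = 2 and rises again thereafter, which is the pattern described in the text above.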

Exercise

Compute the squared ‘General subjective health’ variable and add it to the model you used in the exercise in chapter 2. What do the results tell you? Make a chart with a quadratic line by following the steps described in chapter 1, but tick ‘Quadratic’ instead of ‘Linear’ in the final dialogue box while in the ‘Chart editor’.
