# Estimation of a causal model with correction for measurement errors

In the social sciences, causal models are often used in order to estimate direct and indirect effects. For example, in the previous section we found that Income does not have a significant direct effect on satisfaction with democracy when we correct for measurement errors. However, this result can lead to the wrong conclusion that this variable has no effect at all on satisfaction with democracy. It is reasonable to assume that this variable affects the variables Free, Critic and Equal and thereby has an indirect effect on satisfaction with democracy. In order to study this, we have to use a causal model. This model is presented in Figure 5.4.

Because the regression analysis with correction for measurement errors indicated that Inc has no significant effect on Satdem, we have omitted this effect in this model. The correlations between the variables Free, Critic and Equal have been omitted in the figure to keep the picture simple, but these correlations have been introduced in the analyses. This type of model is often used in the social sciences. Therefore, we also want to show that it is very easy to estimate this type of model with correction for measurement errors.

Below, we will illustrate how to run the causal model specified above (Figure 5.4), using both LISREL[1] and Stata[2]. As both programs produce very similar results, you can follow either the LISREL or the Stata presentation below. The syntax of the causal model analysis with correction for measurement errors is available for the statistical computing program R[3] in Syntax A2.2 of Appendix 2.

In Figure 5.5, all effects have been indicated using the symbols from LISREL. The betas (be) represent the effects of the explanatory (or endogenous) variables (i.e. Free, Critic and Equal) on satisfaction with democracy (i.e. Satdem). For example, be(1,2) indicates the effect of the variable freedom and fairness of elections on satisfaction with democracy. Similarly, the gammas (ga) represent the effects of the control (or exogenous) variables (i.e. LRplace and Inc) on the other variables in the model. For example, ga(1,1) indicates the effect of the control variable ‘left-right placement’ on satisfaction with democracy, while ga(3,2) indicates the effect of the other control variable, Income (i.e. Inc), on the variable equality by law (i.e. Equal). The effect of the variable Inc on Satdem is drawn as a dashed line because it was omitted after proving not significant in the analysis with correction for measurement errors; we are, however, going to estimate this coefficient in the analysis without corrections.

We do not expect the control variables to completely explain the correlations that exist between the evaluation of democracy questions. We can therefore expect correlations between the disturbance terms of these variables (ζ_{22}, ζ_{33} and ζ_{44}). These correlations are not indicated in the model but are denoted in LISREL by ps(3,2), ps(4,2) and ps(4,3), while the variances of the disturbances are denoted as ps(1,1), ps(2,2), ps(3,3) and ps(4,4). For details of the procedure, we refer to the LISREL manual [Jör96] and introductions to the program LISREL [Sar84]. First, the LISREL input for this analysis without corrections is presented in Syntax 5.3. We then present the same input corrected for measurement errors (see Syntax 5.4).
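Before turning to the input, the specification just described can be summarised in matrix form. The following is a sketch in standard LISREL notation, reconstructed from the parameter names used in the text: η₁ is taken to be Satdem, ξ₁ LRplace and ξ₂ Inc (as implied by ga(1,1) and ga(1,2)), while η₂ to η₄ stand for the three evaluation variables in the order used in the program input.

$$
\eta = B\,\eta + \Gamma\,\xi + \zeta, \qquad \operatorname{Cov}(\zeta) = \Psi
$$

$$
B = \begin{pmatrix}
0 & \beta_{12} & \beta_{13} & \beta_{14} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix},
\quad
\Gamma = \begin{pmatrix}
\gamma_{11} & \gamma_{12} \\
\gamma_{21} & \gamma_{22} \\
\gamma_{31} & \gamma_{32} \\
\gamma_{41} & \gamma_{42}
\end{pmatrix},
\quad
\Psi = \begin{pmatrix}
\psi_{11} \\
0 & \psi_{22} \\
0 & \psi_{32} & \psi_{33} \\
0 & \psi_{42} & \psi_{43} & \psi_{44}
\end{pmatrix}
$$

Every symbol shown is a free parameter except γ₁₂ (ga(1,2)), which is fixed at zero in the model with correction for measurement errors; the zeros are fixed by the model specification.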

**Syntax 5.3:** The LISREL syntax for the estimation of the parameters of the causal model without correction for measurement errors*

*Note that the effect of Income on Satdem (ga(1,2)) has not been introduced in this syntax without correction for measurement errors.

**Syntax 5.4:** The LISREL syntax for the estimation of the parameters of the causal model with correction for measurement errors

The most important point is that the coefficients that have to be estimated are listed in the lines starting with ‘free’. Comparing the two inputs, we see that only the matrix with the data to be analysed has changed. In the input for the model with correction for measurement errors, the effects are estimated on the basis of the covariance matrix in Table 4.7 (i.e. the matrix with the correlations corrected for common method variance (cmv) and with the qualities on the diagonal). Because the data line requests that the matrix to be analysed (ma) be the correlation matrix (km), the program transforms Table 4.7 into the correlation matrix corrected for measurement errors, Table 4.8.
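The transformation from a matrix with qualities on the diagonal to a corrected correlation matrix can be illustrated with a few lines of Python. This is a minimal sketch, not the program's own code: it assumes the transformation is the usual rescaling of each element by the square roots of the two corresponding diagonal elements. Because Table 4.7 is not reproduced in this section, the sketch uses the matrix from Exercise 5.1 at the end of this chapter; the names `LOWER` and `to_correlations` are illustrative only.

```python
import math

# Lower triangle of the matrix used in Exercise 5.1 of this chapter:
# quality coefficients on the diagonal, cmv-corrected covariances below it.
LABELS = ["impcntr", "imwbcnt", "imbgeco", "imueclt"]
LOWER = [
    [0.763],
    [-0.351, 0.639],
    [-0.402, 0.440, 0.702],
    [-0.340, 0.447, 0.413, 0.641],
]


def to_correlations(lower):
    """Rescale a lower-triangular covariance matrix to correlations.

    Each element c_ij is divided by sqrt(c_ii * c_jj), so the diagonal
    becomes 1 and the off-diagonal elements become correlations.
    """
    diag = [row[i] for i, row in enumerate(lower)]
    return [
        [c / math.sqrt(diag[i] * diag[j]) for j, c in enumerate(row)]
        for i, row in enumerate(lower)
    ]


corrected = to_correlations(LOWER)
for label, row in zip(LABELS, corrected):
    print(f"{label:8s}", " ".join(f"{value:6.3f}" for value in row))
```

Rounded to three decimals, the off-diagonal values agree with the corrected correlations reported in the solution to Exercise 5.1 (for example, -0.503 between impcntr and imwbcnt).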

The nice feature of this approach, correcting the correlations for measurement errors before estimating the effects, is that the input for the analysis is exactly the same with and without correction for measurement errors, except for the matrix of correlations that is used in the analysis. This point is illustrated in the input for the analyses with and without correction for measurement errors presented in Syntaxes 5.3 and 5.4.

It is important in the estimation of causal models to test whether the model fits the data, i.e. that the model is not misspecified. Without going into detail (see [Sar09]), we can say that the model fits the data corrected for measurement errors very well. So there is no reason to change the model.

However, when we analyse the matrix without correction for measurement errors, the program indicates that the fit of the model is not good. This suggests that the effect, ga(1,2), of the control variable Income (Inc) on satisfaction with democracy has to be introduced in the model. If we do so, this model also fits the data well and we get the standardized results presented in Table 5.1.

In Syntaxes 5.3 and 5.4, we can observe that all standardized effects have been indicated using the Stata notation [Aco13]. We can expect covariances between the disturbance terms of the evaluation of democracy questions, which are denoted in Stata as e.free*e.critic, e.free*e.equal and e.critic*e.equal. Comparing the two inputs, we see that only the matrix with the data to be analysed has been changed. Focusing on the input for the model with correction for measurement errors, the effects will be estimated on the basis of the covariance matrix in Table 4.7 (i.e. the matrix with the correlations corrected for common method variance and with the qualities on the diagonal).

**Syntax 5.3:** The Stata syntax for the estimation of the parameters of the causal model without correction for measurement errors*

*Note that the effect of Income on Satdem has not been introduced in this syntax without correction for measurement errors.

**Syntax 5.4:** The Stata syntax for the estimation of the parameters of the causal model with correction for measurement errors

The nice feature of this approach, correcting the correlations for measurement errors before estimating the effects, is that the input for the analysis is exactly the same with and without correction for measurement errors, except for the matrix of correlations that is used in the analysis. This point is illustrated in the input for the analyses with and without correction for measurement errors presented in Syntaxes 5.3 and 5.4.

However, when we analyse the matrix without correction for measurement errors, the program indicates that the fit of the model is not good. This suggests that the effect of the control variable Income (Inc) on satisfaction with democracy has to be introduced in the model. If we do so, this model also fits the data well and we get the standardized results presented in Table 5.1.

If we compare the results with and without correction for measurement errors, we see first of all that the model is different. After correction for measurement errors, the standardized effect of the control variable (Inc) on satisfaction with democracy is not significantly different from zero, while, without correction for measurement errors, this standardized effect is necessary in order to achieve a good fit of the model. In the latter case, we say that this variable has a direct effect on satisfaction with democracy, while in the former analysis, we have to conclude that there is no direct effect, only an indirect effect.
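The distinction between direct and indirect effects can be written out explicitly. The expression below is a sketch that assumes, as in the figure, that the three evaluation variables affect only Satdem and not each other, and that Inc is the second exogenous variable (so that its effects are the parameters ga(j,2)):

$$
\text{total effect of Inc on Satdem}
= \underbrace{\gamma_{12}}_{\text{direct}}
+ \underbrace{\sum_{j=2}^{4} \gamma_{j2}\,\beta_{1j}}_{\text{indirect}}
$$

In the model with correction for measurement errors, γ₁₂ is fixed at zero, so the total effect of Inc consists entirely of the three products γ_{j2}β_{1j} that run through the evaluation variables.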

Furthermore, comparing Figures 5.6 and 5.7, we see that nearly all other standardized effects after correction for measurement errors are greater than without correction. The difference is in fact close to a factor of two in some cases. All significant effects are indicated in the figures by an asterisk (*). Using this approach we see that, in both analyses, the effect of left-right placement on freedom to criticise is not significant at the 5% level. This example again illustrates how different the results can be if one corrects for measurement errors. From the outputs, it can also be seen, as for the regression analysis, that, after correcting for measurement errors, the explained variance in the dependent variable increases from 22.7% to 46.6%.[4]

So far we have shown how correction for measurement errors can be done by first correcting the correlation matrix. In this way, we obtain the estimates of the standardized coefficients corrected for measurement errors.

In this chapter, we have shown that, after correction of the correlation matrix, the analyses with or without correction for measurement error are exactly the same. Thus, we have shown that analysis with correction for measurement errors is not a problem anymore. We have also seen that the correction for measurement errors should be performed because the results are quite different.

## Exercise 5.1

Repeat the estimation of the regression model for the variables introduced in exercise 3.1 as illustrated in the figure below.

In exercise 4.3, we have obtained the results of the correlation matrices with and without correction for measurement errors. These are reproduced in the following tables.

Note the large difference between the two matrices. We also expect big differences in the effects. Use this information to compute either in LISREL or Stata the results of the analysis for the explanation of the opinion about immigration by people from outside Europe to the Netherlands, with and without correction for measurement errors.

**LISREL syntax for the regression analysis without corrections for measurement errors:**\*

```
Regression analysis without correction for measurement errors
da ni=4 no=1801 ma=km
km
1.000
-0.351 1.000
-0.402 0.534 1.000
-0.340 0.557 0.530 1.000
labels
impcntr imwbcnt imbgeco imueclt
model ny=1 nx=3
pd
out nd=3
```

**LISREL syntax for the regression analysis with corrections for measurement errors:**\*

```
Regression analysis with correction for measurement errors
da ni=4 no=1801 ma=km
km
0.763
-0.351 0.639
-0.402 0.440 0.702
-0.340 0.447 0.413 0.641
labels
impcntr imwbcnt imbgeco imueclt
model ny=1 nx=3
pd
out nd=3
```

*Note that the only difference between the two inputs is the matrix used as input.

**Stata syntax for the regression analysis without corrections for measurement errors:**\*

```stata
*Regression analysis without correction for measurement errors
clear all
ssd init impcntr imwbcnt imbgeco imueclt
ssd set observations 1801
*Correlation matrix
#delimit ;
ssd set correlations
1.000\
-0.351 1.000\
-0.402 0.534 1.000\
-0.340 0.557 0.530 1.000;
#delimit cr
save ssdmatrix.dat, replace
*Regression model
clear
use ssdmatrix.dat
ssd list
sem (impcntr <- imwbcnt imbgeco imueclt), standardized
estat eqgof
```

**Stata syntax for the regression analysis with correction for measurement errors:**\*

```stata
*Regression analysis with correction for measurement errors
clear all
ssd init impcntr imwbcnt imbgeco imueclt
ssd set observations 1801
*Covariance matrix
#delimit ;
ssd set covariances
0.763\
-0.351 0.639\
-0.402 0.440 0.702\
-0.340 0.447 0.413 0.641;
#delimit cr
save ssdmatrix.dat, replace
*Regression model
clear
use ssdmatrix.dat
ssd list
sem (impcntr <- imwbcnt imbgeco imueclt), standardized
estat eqgof
```

*Note that the only difference between the two inputs is the matrix used as input.

From the results obtained we can observe that the correlations between all variables have increased considerably after correcting the correlation matrix for measurement errors. The corrected correlations are presented in the following table:

| | Allow | Better | Economy | Culture |
|---|---|---|---|---|
| (B37) Allow | 1.000 | | | |
| (B40) Better | -0.503 | 1.000 | | |
| (B38) Economy | -0.549 | 0.657 | 1.000 | |
| (B39) Culture | -0.486 | 0.698 | 0.616 | 1.000 |

Although we expected large differences in the effect estimates with and without correction for measurement errors because of the changes in the correlations, the results presented in the figures for the regression model without (the figure above) and with (the figure below) corrections for measurement errors indicate that all effects have increased only slightly after correcting for measurement errors. The reason for this small change is that not only do the correlations between Allow and the explanatory variables increase considerably after correction for measurement errors (see the table with the corrected correlations above), but the correlations between the explanatory variables increase as well. Because the effects on the variable Allow are direct effects, part of the effect of each of these variables on Allow is taken away by the similar increase of the effects of the other explanatory variables and the correlations between them. This is illustrated by the change in the explained variance (R^{2}) in the model. Without corrections, R^{2} is quite low (19.7%), while with corrections it increases to 34.9%. This means that these variables together explain much more. However, because of the increase in the correlations between the explanatory variables, the direct effects cannot increase very much. If there were only one explanatory variable instead of three, the effect would have been much larger (see Chapter 7).
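The argument can be checked numerically. The sketch below computes standardized regression coefficients and R^{2} directly from the two correlation matrices by solving the normal equations, which is an assumption about how to mimic the regression outside LISREL or Stata and not the programs' own estimation routine. The corrected correlations are the rounded values from the table above; `solve` and `standardized_regression` are illustrative helper names.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def standardized_regression(r_xx, r_xy):
    """Return (standardized coefficients, R^2) from correlations only."""
    betas = solve(r_xx, r_xy)
    r2 = sum(b * r for b, r in zip(betas, r_xy))
    return betas, r2


# Predictor order: Better (imwbcnt), Economy (imbgeco), Culture (imueclt).
betas_raw, r2_raw = standardized_regression(
    [[1.000, 0.534, 0.557], [0.534, 1.000, 0.530], [0.557, 0.530, 1.000]],
    [-0.351, -0.402, -0.340],
)
betas_cor, r2_cor = standardized_regression(
    [[1.000, 0.657, 0.698], [0.657, 1.000, 0.616], [0.698, 0.616, 1.000]],
    [-0.503, -0.549, -0.486],
)

print("without correction:", [round(b, 3) for b in betas_raw], round(r2_raw, 3))
print("with correction:   ", [round(b, 3) for b in betas_cor], round(r2_cor, 3))
```

The two R^{2} values come out at roughly 0.197 and 0.349, in line with the percentages quoted above, while each coefficient grows only modestly in absolute size.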

## Exercise 5.2

Taking into account the results obtained in exercise 5.1, repeat the estimation of the causal model presented in this chapter for the variables introduced in exercise 3.1. The correlation matrices with and without corrections obtained in exercise 4.3 were reproduced in the previous exercise. Use this information to compute either in LISREL or Stata the results for the analysis for the causal explanation of the opinion about immigration by people from outside Europe to the Netherlands with and without correction for measurement errors, illustrated in the next model.

**LISREL syntax for the causal analysis without corrections for measurement errors:**

```
Causal analysis without correction for measurement errors
da ni=4 no=1801 ma=km
km
1.000
-0.351 1.000
-0.402 0.534 1.000
-0.340 0.557 0.530 1.000
labels
impcntr imwbcnt imbgeco imueclt
model ny=2 nx=2 be=fu,fi ga=fu,fi ps=sy,fi
free be(1,2)
free ga(1,1) ga(1,2) ga(2,1) ga(2,2)
free ps(1,1) ps(2,2)
pd
out nd=3
```

**LISREL syntax for the causal analysis with corrections for measurement errors:**

```
Causal analysis with correction for measurement errors
da ni=4 no=1801 ma=km
km
0.763
-0.351 0.639
-0.402 0.440 0.702
-0.340 0.447 0.413 0.641
labels
impcntr imwbcnt imbgeco imueclt
model ny=2 nx=2 be=fu,fi ga=fu,fi ps=sy,fi
free be(1,2)
free ga(1,1) ga(1,2) ga(2,1) ga(2,2)
free ps(1,1) ps(2,2)
pd
out nd=3
```

**Stata syntax for the causal analysis without corrections for measurement errors:**

```stata
*Causal analysis without correction for measurement errors
clear all
ssd init impcntr imwbcnt imbgeco imueclt
ssd set observations 1801
*Correlation matrix
#delimit ;
ssd set correlation
1.000\
-0.351 1.000\
-0.402 0.534 1.000\
-0.340 0.557 0.530 1.000;
#delimit cr
save ssdmatrix.dat, replace
*Causal model
clear
use ssdmatrix.dat
ssd list
sem (impcntr <- imwbcnt imbgeco imueclt) ///
(imbgeco imueclt -> imwbcnt), ///
standardized
estat eqgof
```

**Stata syntax for the causal analysis with corrections for measurement errors:**

```stata
*Causal analysis with correction for measurement errors
clear all
ssd init impcntr imwbcnt imbgeco imueclt
ssd set observations 1801
*Covariance matrix
#delimit ;
ssd set covariances
0.763\
-0.351 0.639\
-0.402 0.440 0.702\
-0.340 0.447 0.413 0.641;
#delimit cr
save ssdmatrix.dat, replace
*Causal model
clear
use ssdmatrix.dat
ssd list
sem (impcntr <- imwbcnt imbgeco imueclt) ///
(imbgeco imueclt -> imwbcnt), ///
standardized
estat eqgof
```

The top figure presents the results for the causal model without corrections. The lower figure shows the results for the causal model with corrections. Again, we see from these models that all effects are significant. Comparing these two models, we again see, on the one hand, a small increase in the effects after correcting for measurement errors and, on the other hand, a considerable increase in the variance explained. For the causal model without corrections, the variance in Allow that is explained is still very low, 19.7%. For the causal model with corrections, the variance in Allow that is explained increases to 34.9%. The reason is the same as in the case of the regression model: the increase in the correlations between the explanatory variables.
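To see how the causal model splits each effect into a direct and an indirect part, the path coefficients can be approximated outside LISREL or Stata. Because the model in the syntax above is a saturated recursive model, equation-by-equation least squares on the correlation matrix should give the same standardized path coefficients as the program run; treat this as an assumption of the sketch, not as a substitute for the program output. The example uses the uncorrected correlations, and the helper names `solve` and `regress` are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Uncorrected correlations, in the order of the program input:
# impcntr (Allow), imwbcnt (Better), imbgeco (Economy), imueclt (Culture).
R = [
    [1.000, -0.351, -0.402, -0.340],
    [-0.351, 1.000, 0.534, 0.557],
    [-0.402, 0.534, 1.000, 0.530],
    [-0.340, 0.557, 0.530, 1.000],
]
ALLOW, BETTER, ECONOMY, CULTURE = range(4)


def regress(y, xs, r=R):
    """Standardized coefficients of variable y on variables xs."""
    r_xx = [[r[i][j] for j in xs] for i in xs]
    r_xy = [r[i][y] for i in xs]
    return dict(zip(xs, solve(r_xx, r_xy)))


to_better = regress(BETTER, [ECONOMY, CULTURE])         # ga(2,1), ga(2,2)
to_allow = regress(ALLOW, [BETTER, ECONOMY, CULTURE])   # be(1,2), ga(1,1), ga(1,2)
r2_allow = sum(to_allow[x] * R[x][ALLOW] for x in to_allow)

for x, name in [(ECONOMY, "Economy"), (CULTURE, "Culture")]:
    direct = to_allow[x]
    indirect = to_better[x] * to_allow[BETTER]  # path via Better
    print(f"{name}: direct {direct:.3f}, indirect {indirect:.3f}, "
          f"total {direct + indirect:.3f}")
print(f"R2 for Allow: {r2_allow:.3f}")
```

Replacing `R` with the corrected correlation matrix gives the corresponding decomposition for the model with correction for measurement errors.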

#### Footnotes

- [1] The following illustration and results are based on the LISREL 8.7 software version: Jöreskog, K.G. & Sörbom, D. (2004). LISREL 8.7 for Windows [Computer software]. Skokie, IL: Scientific Software International, Inc.
- [2] The following illustration and results are based on the Stata 12 software version: StataCorp. 2011. **Stata Statistical Software: Release 12**. College Station, TX: StataCorp LP.
- [3] R: A Language and Environment for Statistical Computing, R Core Team, R Foundation for Statistical Computing.
- [4] The explained variance can be obtained in LISREL from the section **Squared Multiple Correlations for Structural Equations** (R^{2}). Similarly, in Stata the command **estat eqgof** will show you the value of the R^{2}.

#### References

- [Aco13] Acock, A. C. (2013). Discovering Structural Equation Modeling Using Stata, Revised Edition. *Stata Press*.
- [Jör96] Jöreskog, K. G. and Sörbom, D. (1996). LISREL 8 User’s Reference Guide. *Scientific Software International*.
- [Sar09] Saris, W. E., Satorra, A. and Van der Veld, W. M. (2009). Testing Structural Equation Models or Detection of Misspecifications? *Structural Equation Modeling: A Multidisciplinary Journal, 16 (4), 561-582*.
- [Sar84] Saris, W. E. and Stronkhorst, L. H. (1984). Causal modelling in nonexperimental research: an introduction to the LISREL approach. *Sociometric Research Foundation*.