Estimation of causal models with correction for measurement errors

In the social sciences, causal models are often used to estimate direct and indirect effects. For example, in the previous section we found that Income does not have a significant direct effect on satisfaction with democracy when we correct for measurement errors. However, this result could lead to the wrong conclusion that Income has no effect at all on satisfaction with democracy. It is reasonable to assume that Income affects the variables Free, Critic and Equal and thereby has an indirect effect on satisfaction with democracy. To study this, we have to use a causal model, which is presented in Figure 5.4.

Figure 5.4: The causal model for the evaluation of democracy

Because the regression analysis with correction for measurement errors indicated that Inc has no significant effect on Satdem, we have omitted this effect in this model. The correlations between the variables Free, Critic and Equal have been omitted in the figure to keep the picture simple, but these correlations have been introduced in the analyses. This type of model is often used in the social sciences. Therefore, we also want to show that it is very easy to estimate this type of model with correction for measurement errors.

Below, we illustrate how to estimate the causal model specified above (Figure 5.4) using either LISREL or Stata. Both programs produce very similar results, so choose the program you prefer and continue with the corresponding section. The syntax of the causal model analysis with correction for measurement errors is also available for the statistical computing program R in Syntax A2.2 of Appendix 2.

Continue with LISREL

In Figure 5.5, all effects have been indicated using the symbols from LISREL. The betas (be) represent the effects of the explanatory (or endogenous) variables (i.e. Free, Critic and Equal) on satisfaction with democracy (i.e. Satdem). For example, be(1,2) indicates the effect of the variable freedom and fairness of elections on satisfaction with democracy. Similarly, the gammas (ga) represent the effects of the control (or exogenous) variables (i.e. LRplace and Inc) on the other variables in the model. For example, ga(1,1) indicates the effect of the control variable ‘left-right placement’ on satisfaction with democracy, while ga(4,2) indicates the effect of the other control variable, Income (i.e. Inc), on the variable equality by law (i.e. Equal), which is the fourth dependent variable in the syntax below. The effect of the variable Inc on Satdem is drawn as a dashed line because it was omitted from the analysis with correction for measurement errors, where it was not significant, but we are going to estimate this coefficient in the analysis without corrections.

Figure 5.5: The causal model for the evaluation of democracy with LISREL notation

We do not expect the control variables to completely explain the correlations that exist between the evaluation of democracy questions. We can therefore expect correlations between the disturbance terms of these variables (ζ2, ζ3 and ζ4). These correlations are not indicated in the model but are denoted in LISREL by ps(3,2), ps(4,2) and ps(4,3), while the variances of the disturbances are denoted as ps(1,1), ps(2,2), ps(3,3) and ps(4,4). For details of the procedure, we refer to the LISREL manual [Jör96] and introductions to the program LISREL [Sar84]. First, the LISREL input for this analysis without corrections is presented in Syntax 5.3. We then present the same input corrected for measurement errors (see Syntax 5.4).
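For readers who prefer equations to path diagrams, the model can also be written out as a set of structural equations. The block below is a sketch that assumes the variable order used in the labels line of the syntax (Satdem, Free, Critic, Equal as dependent variables 1 to 4; LRplace and Inc as control variables 1 and 2). The Greek symbols correspond to the LISREL abbreviations be, ga and ps. The dashed effect of Inc on Satdem, ga(1,2), is left out here, as it is in both syntaxes.

```latex
\begin{aligned}
\text{Satdem} &= \beta_{12}\,\text{Free} + \beta_{13}\,\text{Critic} + \beta_{14}\,\text{Equal}
                 + \gamma_{11}\,\text{LRplace} + \zeta_1 \\
\text{Free}   &= \gamma_{21}\,\text{LRplace} + \gamma_{22}\,\text{Inc} + \zeta_2 \\
\text{Critic} &= \gamma_{31}\,\text{LRplace} + \gamma_{32}\,\text{Inc} + \zeta_3 \\
\text{Equal}  &= \gamma_{41}\,\text{LRplace} + \gamma_{42}\,\text{Inc} + \zeta_4 \\[4pt]
\operatorname{var}(\zeta_i) &= \psi_{ii} \quad (i = 1,\dots,4), \qquad
\operatorname{cov}(\zeta_3,\zeta_2) = \psi_{32}, \quad
\operatorname{cov}(\zeta_4,\zeta_2) = \psi_{42}, \quad
\operatorname{cov}(\zeta_4,\zeta_3) = \psi_{43}
\end{aligned}
```

Each coefficient in these equations appears in one of the lines starting with ‘free’ in the LISREL input.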

Syntax 5.3: The LISREL syntax for the estimation of the parameters of the causal model without correction for measurement errors*

Causal analysis without corrections for measurement errors !Title
data ni=6 no=1468 ma=km !ni=number of variables no=number of observations ma=matrix
km !km=correlation matrix
1.00
.395 1.00
.268 .474 1.00
.310 .299 .271 1.00
.188 .070 .018 .094 1.00
.163 .227 .174 .064 .009 1.00
labels
satdem free critic equal lrplace inc !Labels of the variables
model ny=4 nx=2 be=fu,fi ga=fu,fi ps=sy,fi !Causal model ny=dependent variables nx=control variables
free be(1,2) be(1,3) be(1,4) !free=coefficients to be estimated
free ga(2,1) ga(2,2) ga(3,1) ga(3,2) ga(4,1) ga(4,2) ga(1,1)
free ps(1,1) ps(2,2) ps(3,3) ps(4,4) ps(3,2) ps(4,2) ps(4,3)
pd !To obtain a path diagram
out nd=3 !out= output nd=number of decimals

*Note that the effect of Income on Satdem (ga(1,2)) has not been introduced in this syntax without correction for measurement errors.

Syntax 5.4: The LISREL syntax for the estimation of the parameters of the causal model with correction for measurement errors

Causal analysis with correction for measurement errors
data ni=6 no=1468 ma=km
km
.710 !The correlation matrix corrected for measurement errors
.395 .643
.268 .333 .604
.310 .160 .114 .605
.112 .070 .018 .094 .682
.163 .227 .174 .064 .009 .624
labels
satdem free critic equal lrplace inc
model ny=4 nx=2 be=fu,fi ga=fu,fi ps=sy,fi
free be(1,2) be(1,3) be(1,4)
free ga(2,1) ga(2,2) ga(3,1) ga(3,2) ga(4,1) ga(4,2) ga(1,1)
free ps(1,1) ps(2,2) ps(3,3) ps(4,4) ps(3,2) ps(4,2) ps(4,3)
pd
out nd=3

The most important point is that the coefficients that have to be estimated are presented in the lines starting with ‘free’. Comparing these two inputs, we see that only the matrix with the data to be analysed has been changed. Focusing on the input for the model with correction for measurement errors, the effects will be estimated on the basis of the covariance matrix in Table 4.7 (i.e. the matrix with the correlations corrected for common method variance (cmv) and with the qualities on the diagonal). Because we ask in the data line that the matrix to be analysed (ma) should be the correlation matrix (km), Table 4.7 is transformed by the program into the correlation matrix corrected for measurement errors, Table 4.8.
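The transformation from the covariance matrix with the qualities on the diagonal to the corrected correlation matrix can be checked by hand. The sketch below is written in Python and is not part of the LISREL or Stata input. It assumes that the transformation is the usual rescaling of a covariance matrix, in which each element is divided by the square roots of the two corresponding diagonal elements. The function name to_correlations is our own, and the numbers are those of Syntax 5.4.

```python
import math

# Lower triangle of the matrix in Syntax 5.4: qualities on the diagonal,
# correlations corrected for common method variance below the diagonal.
labels = ["satdem", "free", "critic", "equal", "lrplace", "inc"]
lower = [
    [0.710],
    [0.395, 0.643],
    [0.268, 0.333, 0.604],
    [0.310, 0.160, 0.114, 0.605],
    [0.112, 0.070, 0.018, 0.094, 0.682],
    [0.163, 0.227, 0.174, 0.064, 0.009, 0.624],
]


def to_correlations(lower_triangle):
    """Rescale a lower-triangular covariance matrix to a correlation matrix."""
    # Square roots of the diagonal elements (here: of the qualities).
    sd = [math.sqrt(row[i]) for i, row in enumerate(lower_triangle)]
    return [
        [value / (sd[i] * sd[j]) for j, value in enumerate(row)]
        for i, row in enumerate(lower_triangle)
    ]


corrected = to_correlations(lower)

# Print the corrected correlations as a lower-triangular table.
for name, row in zip(labels, corrected):
    print(f"{name:8s}" + " ".join(f"{value:6.3f}" for value in row))
```

For example, the corrected correlation between Satdem and Free is .395 divided by the square root of .710 × .643, which is approximately .585. Under the assumption stated above, the printed matrix should correspond to Table 4.8.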

The nice feature of this approach, correcting the correlations for measurement errors before estimating the effects, is that the input for the analysis is exactly the same with and without correction for measurement errors, except for the matrix of correlations that is used in the analysis. This point is illustrated in the input for the analyses with and without correction for measurement errors presented in Syntaxes 5.3 and 5.4.

It is important in the estimation of causal models to test whether the model fits the data, i.e. that the model is not misspecified. Without going into detail (see [Sar09]), we can say that the model fits the data corrected for measurement errors very well, so there is no reason to change the model.

However, analysing the matrix without correction for measurement errors, the program indicates that the fit of the model is not good. This suggests that the effect ga(1,2) of the control variable Income (Inc) on satisfaction with democracy has to be introduced in the model. If we do so, this model also fits the data well and we get the standardized results presented in Table 5.1.

Table 5.1: The LISREL results of the estimation of the causal model presented in Figure 5.4 with and without corrections

Continue with Stata

In Syntaxes 5.3 and 5.4, we can observe that all standardized effects have been indicated using the Stata notation [Aco13]. We can expect covariances between the disturbance terms of the evaluation of democracy questions, which are denoted in Stata as e.free*e.critic, e.free*e.equal and e.critic*e.equal. Comparing the two inputs, we see that only the matrix with the data to be analysed has been changed. Focusing on the input for the model with correction for measurement errors, the effects will be estimated on the basis of the covariance matrix in Table 4.7 (i.e. the matrix with the correlations corrected for common method variance and with the qualities on the diagonal).

Syntax 5.3: The Stata syntax for the estimation of the parameters of the causal model without correction for measurement errors*

*Causal analysis without correction for measurement errors
clear all
ssd init satdem free critic equal lrplace inc /*variables*/
ssd set observations 1468 /*observations*/

*Correlation matrix
#delimit ;
ssd set correlations
1.00\
.395 1.00\
.268 .474 1.00\
.310 .299 .271 1.00\
.188 .070 .018 .094 1.00\
.163 .227 .174 .064 .009 1.00;
#delimit cr
save ssdmatrix.dat, replace

*Causal model
clear
use ssdmatrix.dat
ssd list
sem (satdem <- free critic equal lrplace) ///
(lrplace inc -> free critic equal), ///
covariance(e.free*e.equal e.free*e.critic e.critic*e.equal) ///
standardized
estat eqgof /*Equation-level goodness of fit*/

*Note that the effect of Income on Satdem has not been introduced in this syntax without correction for measurement errors.

Syntax 5.4: The Stata syntax for the estimation of the parameters of the causal model with correction for measurement errors

*Causal model with correction for measurement errors
clear all
ssd init satdem free critic equal lrplace inc
ssd set observations 1468
*Covariance matrix
#delimit ;
ssd set covariance /*The correlation matrix corrected for measurement errors */
.710\
.395 .643\
.268 .333 .604\
.310 .160 .114 .605\
.112 .070 .018 .094 .682\
.163 .227 .174 .064 .009 .624;
#delimit cr
save ssdmatrix.dat, replace
*Causal model
clear
use ssdmatrix.dat
ssd list
sem (satdem <- free critic equal lrplace) ///
(lrplace inc -> free critic equal), ///
covariance(e.free*e.equal e.free*e.critic e.critic*e.equal)
sem, standardized
estat eqgof /*Equation-level goodness of fit*/

The nice feature of this approach, correcting the correlations for measurement errors before estimating the effects, is that the input for the analysis is exactly the same with and without correction for measurement errors, except for the matrix of correlations that is used in the analysis. This point is illustrated in the input for the analyses with and without correction for measurement errors presented in Syntaxes 5.3 and 5.4.

However, analysing the matrix without correction for measurement errors, the program indicates that the fit of the model is not good. This suggests that the effect of the control variable Income (Inc) on satisfaction with democracy has to be introduced in the model. If we do so, this model also fits the data well and we get the standardized results presented in Table 5.1.

Table 5.1: The Stata results of the estimation of the causal model presented in Figure 5.4 with and without corrections

If we compare the results with and without correction for measurement errors, we see first of all that the model is different. After correction for measurement errors, the standardized effect of the control variable Income (Inc) on satisfaction with democracy is not significantly different from zero, while, without correction for measurement errors, this effect is necessary in order to achieve a good fit of the model. In the latter case, we say that this variable has a direct effect on satisfaction with democracy, while in the former, we have to conclude that there is no direct effect, only an indirect effect.
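The distinction between direct and indirect effects can be made explicit with the standard path-tracing rule for recursive models: an indirect effect is the sum, over all paths through intermediate variables, of the products of the coefficients along each path. The block below is a sketch in the LISREL notation of Figure 5.5 and assumes the variable order of the syntax (Free, Critic and Equal as dependent variables 2, 3 and 4; Inc as control variable 2).

```latex
\begin{aligned}
\text{indirect effect of Inc on Satdem}
  &= \gamma_{22}\,\beta_{12} + \gamma_{32}\,\beta_{13} + \gamma_{42}\,\beta_{14} \\
\text{total effect of Inc on Satdem}
  &= \gamma_{12} + \gamma_{22}\,\beta_{12} + \gamma_{32}\,\beta_{13} + \gamma_{42}\,\beta_{14}
\end{aligned}
```

In the model with correction for measurement errors, the direct effect ga(1,2) is not estimated and is therefore zero, so the total effect of Income equals its indirect effect. Stata users can request such a decomposition after sem with the postestimation command estat teffects.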

Figure 5.6: The standardized coefficients of the causal model for the evaluation of democracy without correction for measurement errors

Figure 5.7: The standardized coefficients of the causal model for the evaluation of democracy with correction for measurement errors

Furthermore, comparing Figures 5.6 and 5.7, we see that nearly all other standardized effects are greater after correction for measurement errors than without correction. The difference is in fact close to a factor of two in some cases. All significant effects are indicated in the figures by an asterisk (*). Using this approach we see that, in both analyses, the effect of left-right placement on freedom to criticise is not significant at the 5% significance level. This example again illustrates how different the results can be if one corrects for measurement errors. From the outputs, it can also be seen, as for the regression analysis, that, after correcting for measurement errors, the explained variance in the dependent variable increases from 22.7% to 46.6%.

So far we have shown how correction for measurement errors can be done by first correcting the correlation matrix. In this way, we obtain the estimates of the standardized coefficients corrected for measurement errors.

In this chapter, we have shown that, after correction of the correlation matrix, the analyses with and without correction for measurement errors are carried out in exactly the same way. Analysis with correction for measurement errors therefore no longer poses a practical problem. We have also seen that the correction for measurement errors should be performed because the results are quite different.

Exercise 5.1

Repeat the estimation of the regression model for the variables introduced in exercise 3.1 as illustrated in the figure below.

Regression model for attitudes towards immigration

In exercise 4.3, we have obtained the results of the correlation matrices with and without correction for measurement errors. These are reproduced in the following tables.

1) Correlation matrix (n = 1801):

2) Correlation matrix with correction for measurement errors:

Note the large difference between the two matrices. We also expect big differences in the effects. Use this information to compute either in LISREL or Stata the results of the analysis for the explanation of the opinion about immigration by people from outside Europe to the Netherlands, with and without correction for measurement errors.

Solution for LISREL users

  1. LISREL syntax for the regression analysis without corrections for measurement errors*:
    Regression analysis without correction for measurement errors
    da ni=4 no=1801 ma=km
    km
    1.000
    -0.351 1.000
    -0.402 0.534 1.000
    -0.340 0.557 0.530 1.000
    labels
    impcntr imwbcnt imbgeco imueclt
    model ny=1 nx=3
    pd
    out nd=3
  2. LISREL syntax for the regression analysis with corrections for measurement errors*:
    Regression analysis with correction for measurement errors
    da ni=4 no=1801 ma=km
    km
    0.763
    -0.351 0.639
    -0.402 0.440 0.702
    -0.340 0.447 0.413 0.641
    labels
    impcntr imwbcnt imbgeco imueclt
    model ny=1 nx=3
    pd
    out nd=3

*Note that the only difference between the two inputs is the matrix used as input.

Solution for Stata users

  1. Stata syntax for the regression analysis without corrections for measurement errors*:
    *Regression analysis without correction for measurement errors
    clear all
    ssd init impcntr imwbcnt imbgeco imueclt
    ssd set observations 1801
    *Correlation matrix
    #delimit ;
    ssd set correlations
    1.000\
    -0.351 1.000\
    -0.402 0.534 1.000\
    -0.340 0.557 0.530 1.000;
    #delimit cr
    save ssdmatrix.dat, replace
    *Regression model
    clear
    use ssdmatrix.dat
    ssd list
    sem (impcntr <- imwbcnt imbgeco imueclt), standardized
    estat eqgof
  2. Stata syntax for the regression analysis with correction for measurement errors*:
    *Regression analysis with correction for measurement errors
    clear all
    ssd init impcntr imwbcnt imbgeco imueclt
    ssd set observations 1801
    *Covariance matrix
    #delimit ;
    ssd set covariances
    0.763\
    -0.351 0.639\
    -0.402 0.440 0.702\
    -0.340 0.447 0.413 0.641;
    #delimit cr
    save ssdmatrix.dat, replace
    *Regression model
    clear
    use ssdmatrix.dat
    ssd list
    sem (impcntr <- imwbcnt imbgeco imueclt), standardized
    estat eqgof

*Note that the only difference between the two inputs is the matrix used as input.

Solution

From the results obtained we can observe that the correlations between all variables have increased considerably after correcting the correlation matrix for measurement errors. The corrected correlations are presented in the following table:

The corrected correlations
                 Allow    Better   Economy  Culture
(B37) Allow      1.000
(B40) Better    -0.503    1.000
(B38) Economy   -0.549    0.657    1.000
(B39) Culture   -0.486    0.698    0.616    1.000

The estimated standardized effects for the regression model without corrections for measurement errors

The estimated standardized effects for the regression model with corrections for measurement errors

Although we expected large differences in the effect estimates with and without correction for measurement errors because of the changes in the correlations, the results presented in the figures for the regression model without (the figure above) and with (the figure below) corrections indicate that all effects have increased only slightly after correcting for measurement errors. The reason for this small change is that not only the correlations between Allow and the explanatory variables increase considerably after correction for measurement errors (see the table with the corrected correlations above), but also the correlations among the explanatory variables themselves. Because the effects on the variable Allow are direct effects, part of the effect of each of these variables on Allow is taken away by the similar increase of the effects of the other explanatory variables and the correlations between them. This is illustrated by the change in the explained variance (R2) in the model. Without corrections, R2 is quite low (19.7%), while with corrections it increases to 34.9%. This means that these variables together explain much more. However, because of the increase in the correlations between the explanatory variables, the direct effects cannot increase very much. If there were only one explanatory variable instead of three, the effect would have been much larger (see Chapter 7).
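The argument above can be verified numerically without LISREL or Stata. The following Python sketch assumes that, for a single regression equation, the standardized effects are the solution of the normal equations based on the correlation matrix, and that R2 is the sum of the products of each standardized effect and the corresponding correlation with the dependent variable. The function names are our own, and the matrices are those given in the solutions above.

```python
import math

# Variable order: impcntr (Allow), imwbcnt (Better),
# imbgeco (Economy), imueclt (Culture).
RAW = [
    [1.000],
    [-0.351, 1.000],
    [-0.402, 0.534, 1.000],
    [-0.340, 0.557, 0.530, 1.000],
]
CORRECTED = [
    [0.763],
    [-0.351, 0.639],
    [-0.402, 0.440, 0.702],
    [-0.340, 0.447, 0.413, 0.641],
]


def full_correlations(lower):
    """Expand a lower triangle to a full matrix rescaled to unit diagonal."""
    n = len(lower)
    sd = [math.sqrt(lower[i][i]) for i in range(n)]
    full = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            full[i][j] = full[j][i] = lower[i][j] / (sd[i] * sd[j])
    return full


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def standardized_regression(lower):
    """Return (effects, R2) for the first variable regressed on the others."""
    r = full_correlations(lower)
    r_xx = [row[1:] for row in r[1:]]   # correlations among the predictors
    r_xy = r[0][1:]                     # correlations with the dependent variable
    effects = solve(r_xx, r_xy)
    r_squared = sum(b * c for b, c in zip(effects, r_xy))
    return effects, r_squared


betas_raw, r2_raw = standardized_regression(RAW)
betas_cor, r2_cor = standardized_regression(CORRECTED)
print("without correction:", [round(b, 3) for b in betas_raw], round(r2_raw, 3))
print("with correction:   ", [round(b, 3) for b in betas_cor], round(r2_cor, 3))
```

Under these assumptions, the computed R2 values agree with the 19.7% and 34.9% reported above, and each standardized effect is somewhat larger in absolute value after correction.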

Exercise 5.2

Taking into account the results obtained in exercise 5.1, repeat the estimation of the causal model presented in this chapter for the variables introduced in exercise 3.1. The correlation matrices with and without corrections obtained in exercise 4.3 were reproduced in the previous exercise. Use this information to compute either in LISREL or Stata the results for the analysis for the causal explanation of the opinion about immigration by people from outside Europe to the Netherlands with and without correction for measurement errors, illustrated in the next model.

Causal model for attitudes towards immigration

Solution for LISREL users

  1. LISREL syntax for the causal analysis without corrections for measurement errors:
    Causal analysis without correction for measurement errors
    da ni=4 no=1801 ma=km
    km
    1.000
    -0.351 1.000
    -0.402 0.534 1.000
    -0.340 0.557 0.530 1.000
    labels
    impcntr imwbcnt imbgeco imueclt
    model ny=2 nx=2 be=fu,fi ga=fu,fi ps=sy,fi
    free be(1,2)
    free ga(1,1) ga(1,2) ga(2,1) ga(2,2)
    free ps(1,1) ps(2,2)
    pd
    out nd=3
  2. LISREL syntax for the causal analysis with corrections for measurement errors:
    Causal analysis with correction for measurement errors
    da ni=4 no=1801 ma=km
    km
    0.763
    -0.351 0.639
    -0.402 0.440 0.702
    -0.340 0.447 0.413 0.641
    labels
    impcntr imwbcnt imbgeco imueclt
    model ny=2 nx=2 be=fu,fi ga=fu,fi ps=sy,fi
    free be(1,2)
    free ga(1,1) ga(1,2) ga(2,1) ga(2,2)
    free ps(1,1) ps(2,2)
    pd
    out nd=3

Solution for Stata users

  1. Stata syntax for the causal analysis without corrections for measurement errors:
    *Causal analysis without correction for measurement errors
    clear all
    ssd init impcntr imwbcnt imbgeco imueclt
    ssd set observations 1801
    *Correlation matrix
    #delimit ;
    ssd set correlation
    1.000\
    -0.351 1.000\
    -0.402 0.534 1.000\
    -0.340 0.557 0.530 1.000;
    #delimit cr
    save ssdmatrix.dat, replace
    *Causal model
    clear
    use ssdmatrix.dat
    ssd list
    sem (impcntr <- imwbcnt imbgeco imueclt) ///
    (imbgeco imueclt -> imwbcnt), ///
    standardized
    estat eqgof
  2. Stata syntax for the causal analysis with corrections for measurement errors:
    *Causal analysis with correction for measurement errors
    clear all
    ssd init impcntr imwbcnt imbgeco imueclt
    ssd set observations 1801
    *Covariance matrix
    #delimit ;
    ssd set covariances
    0.763\
    -0.351 0.639\
    -0.402 0.440 0.702\
    -0.340 0.447 0.413 0.641;
    #delimit cr
    save ssdmatrix.dat, replace
    *Causal model
    clear
    use ssdmatrix.dat
    ssd list
    sem (impcntr <- imwbcnt imbgeco imueclt) ///
    (imbgeco imueclt -> imwbcnt), ///
    standardized
    estat eqgof

Solution

The estimated standardized effects of the causal model without corrections for measurement errors

The estimated standardized effects of the causal model with corrections for measurement errors

The top figure presents the results for the causal model without corrections. The lower figure shows the results for the causal model with corrections. Again, we see from these models that all effects are significant. Comparing these two models, we again see, on the one hand, a small increase in the effects after correcting for measurement errors and, on the other hand, a considerable increase in the variance explained. For the causal model without corrections, the variance in Allow that is explained is still very low, 19.7%. For the causal model with corrections, the variance in Allow that is explained increases to 34.9%. The reason is the same as in the case of the regression model: the increase in the correlations between the explanatory variables.
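The causal model of this exercise also allows the effects of Economy and Culture on Allow to be split into a direct part and an indirect part that runs through Better. The Python sketch below illustrates this with the corrected matrix. It assumes that, because this recursive model estimates as many parameters as there are variances and covariances, each equation can be estimated separately from the correlation matrix with the same result as in LISREL or Stata. The function and variable names are our own.

```python
import math

# Corrected matrix (qualities on the diagonal), variable order:
# impcntr (Allow), imwbcnt (Better), imbgeco (Economy), imueclt (Culture).
CORRECTED = [
    [0.763],
    [-0.351, 0.639],
    [-0.402, 0.440, 0.702],
    [-0.340, 0.447, 0.413, 0.641],
]
ALLOW, BETTER, ECONOMY, CULTURE = range(4)


def full_correlations(lower):
    """Expand a lower triangle to a full matrix rescaled to unit diagonal."""
    n = len(lower)
    sd = [math.sqrt(lower[i][i]) for i in range(n)]
    full = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            full[i][j] = full[j][i] = lower[i][j] / (sd[i] * sd[j])
    return full


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def regress(r, y, xs):
    """Standardized effects of the variables in xs on variable y."""
    r_xx = [[r[i][j] for j in xs] for i in xs]
    r_xy = [r[i][y] for i in xs]
    return solve(r_xx, r_xy)


r = full_correlations(CORRECTED)

# The two structural equations of the causal model.
b_better, g_econ_allow, g_cult_allow = regress(r, ALLOW, [BETTER, ECONOMY, CULTURE])
g_econ_better, g_cult_better = regress(r, BETTER, [ECONOMY, CULTURE])

# Indirect effect = effect on Better times the effect of Better on Allow.
indirect_econ = g_econ_better * b_better
indirect_cult = g_cult_better * b_better
total_econ = g_econ_allow + indirect_econ
total_cult = g_cult_allow + indirect_cult

# Check: regressing Allow on Economy and Culture only gives the total effects.
reduced = regress(r, ALLOW, [ECONOMY, CULTURE])

print("Economy: direct", round(g_econ_allow, 3), "indirect", round(indirect_econ, 3))
print("Culture: direct", round(g_cult_allow, 3), "indirect", round(indirect_cult, 3))
```

The last step serves as a consistency check: the sum of the direct and indirect effect of each variable should equal its coefficient in a regression of Allow on Economy and Culture alone.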


Footnotes

References