Estimation of the causal model with complex concepts

Now that we have estimates of the quality of all the variables in the model, including the composite score for democracy level, the causal model can be estimated with correction for measurement errors in exactly the same way as the model with variables based on single questions.

Below, we will illustrate how to run the causal model for the composite score specified before in Figure 7.1 using both LISREL (footnote 1) and Stata (footnote 2). As both programs produce very similar results, please select the program with which you want to continue the analysis:

Continue with LISREL

In Figure 7.2, all effects have been indicated using the symbols from LISREL. The beta (be) represents the effect of the composite score (i.e. Demlevel) on satisfaction with democracy (i.e. Satdem). Similarly, the gammas (ga) represent the effects of the control variables (i.e. LRplace and Inc) on the explained variables in the model (i.e. Satdem and Demlevel). For example, ga(1,1) indicates the effect of the control variable left-right placement on satisfaction with democracy, while ga(2,2) indicates the effect of the other control variable, income, on the composite score variable, democratic level. The effect of the variable Inc on Satdem is drawn as a dashed line because it has been omitted from the model: it was not significant in the analysis with correction for measurement errors (Chapters 5 and 6).

Figure 7.2 The causal model for the evaluation of democracy by a composite score in LISREL notation
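
Written out as equations, the model in Figure 7.2 can be sketched as below. This is only a restatement of the figure under the assumption that the dashed effect of Inc on Satdem is fixed at zero; the disturbance symbols ζ1 and ζ2 are our own notation, and their variances correspond to ps(1,1) and ps(2,2) in LISREL notation.

```latex
\begin{aligned}
\text{Satdem}   &= \beta_{12}\,\text{Demlevel} + \gamma_{11}\,\text{LRplace} + \zeta_1 \\
\text{Demlevel} &= \gamma_{21}\,\text{LRplace} + \gamma_{22}\,\text{Inc} + \zeta_2 \\
\operatorname{var}(\zeta_1) &= \psi_{11}, \qquad \operatorname{var}(\zeta_2) = \psi_{22}
\end{aligned}
```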

The variances in the disturbances of the explained variables are denoted as ps(1,1) and ps(2,2). For the details of the procedure, we refer to the LISREL manual [Jör96] and introductions to the program LISREL [Sar84]. First, the LISREL input for this analysis without corrections is presented in Syntax 7.1. Next, we present the same input corrected for measurement errors (see Syntax 7.2).

Syntax 7.1: The LISREL syntax for the estimation of the parameters of the causal model including a composite score without correction for measurement errors
Complex analysis without correction for measurement errors !Title
data ni=4 no=1468 ma=km !ni=number of variables no=number of observations ma=matrix
km !km=correlation matrix
1.00
.429 1.00
.188 .086 1.00
.163 .190 .009 1.00
labels
satdem demlevel lrplace inc !Labels of the variables
model ny=2 nx=2 be=fu,fi ga=fu,fi ps=sy,fi !Causal model ny=dependent variables nx=control variables
free be(1,2) !free=coefficients to be estimated
free ga(2,1) ga(2,2) ga(1,1)
free ps(1,1) ps(2,2)
pd !To obtain a path diagram
out nd=3 !out= output nd=number of decimals

Syntax 7.2: The LISREL syntax for the estimation of the parameters of the causal model including a composite score with correction for measurement errors
Complex analysis with correction for measurement errors
data ni=4 no=1468 ma=km
km
.710 !The covariance matrix corrected for measurement errors
.429 .590
.112 .086 .682
.163 .190 .009 .624
labels
satdem demlevel lrplace inc
model ny=2 nx=2 be=fu,fi ga=fu,fi ps=sy,fi
free be(1,2)
free ga(2,1) ga(2,2) ga(1,1)
free ps(1,1) ps(2,2)
pd
out nd=3

The most important point is that the coefficients that have to be estimated are presented in the lines starting with ‘free’. Comparing these two inputs, we see that only the matrix with the data to be analysed has been changed. Focusing on the input for the model with correction for measurement errors, the effects will be estimated on the basis of the covariance matrix in Table 7.5 (i.e. the matrix with the correlations corrected for cmv and with the qualities on the diagonal). Because we ask in the data line that the matrix to be analysed (ma) should be the correlation matrix (km), Table 7.5 is transformed by the program into the correlation matrix corrected for measurement errors, Table 7.6.

Table 7.6: Correlations corrected for measurement errors using LISREL
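
The transformation from the corrected covariance matrix to corrected correlations can be illustrated with a few lines of Python. This is a minimal sketch, not what LISREL does internally: it simply divides each element of the matrix in Syntax 7.2 by the square roots of the two corresponding diagonal elements (the qualities). The function name `to_correlations` is our own.

```python
import math

# Lower triangle of the corrected covariance matrix from Syntax 7.2
# (qualities on the diagonal, cmv-corrected correlations off the diagonal).
labels = ["satdem", "demlevel", "lrplace", "inc"]
lower = [
    [0.710],
    [0.429, 0.590],
    [0.112, 0.086, 0.682],
    [0.163, 0.190, 0.009, 0.624],
]


def to_correlations(lower_triangle):
    """Rescale a lower-triangular covariance matrix to correlations."""
    # Standard deviation of each variable = square root of its diagonal element.
    sd = [math.sqrt(row[i]) for i, row in enumerate(lower_triangle)]
    return [
        [value / (sd[i] * sd[j]) for j, value in enumerate(row)]
        for i, row in enumerate(lower_triangle)
    ]


corrected = to_correlations(lower)
for name, row in zip(labels, corrected):
    print(f"{name:<9}" + " ".join(f"{v:6.3f}" for v in row))
```

For instance, the corrected correlation between Satdem and Demlevel is .429 divided by the square root of .710 × .590, which is considerably larger than the observed correlation of .429.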

The nice feature of this approach, correcting the correlations for measurement errors before estimating the effects, is that the input for the analysis is exactly the same with and without correction for measurement errors, except for the matrix of correlations that is used in the analysis. This point is illustrated in the input for the analyses with and without correction for measurement errors presented in Syntaxes 7.1 and 7.2.

In the estimation of causal models, it is important to test whether the model fits the data, i.e. that the model is not misspecified. Without going into detail (see [Sar09]), we can say that the model fits the data corrected for measurement errors very well. So there is no reason to change the model.

However, analysing the matrix without correction for measurement errors, the program indicates that the fit of the model is not good. This suggests that the effect, ga(1,2), of the control variable Income on satisfaction with democracy has to be introduced in the model. If we do so, this model also fits the data well and we obtain the results presented in Table 7.7.

Table 7.7: The LISREL results of the estimation of the causal model including the complex concept presented in Figure 7.1 with and without corrections

Continue with Stata

In Syntaxes 7.1 and 7.2, all effects are specified using the Stata notation [Aco13]. Comparing the two inputs, we see that only the matrix with the data to be analysed has been changed. Focusing on the input for the model with correction for measurement errors, the effects will be estimated on the basis of the covariance matrix in Table 7.5 (i.e. the matrix with the correlations corrected for cmv and with the qualities on the diagonal).

Syntax 7.1: The Stata syntax for the estimation of the parameters of the causal model including a complex concept without correction for measurement errors
*Complex analysis without correction for measurement errors
clear all
ssd init satdem demlevel lrplace inc /*variables*/
ssd set observations 1468 /*observations*/

*Correlation matrix
#delimit;
ssd set correlations
1.00\
.429 1.00\
.188 .086 1.00\
.163 .190 .009 1.00;
#delimit cr
save ssdmatrix.dat, replace

*Causal model with complex concept
clear
use ssdmatrix.dat
ssd list
sem (satdem <- demlevel lrplace) ///
(demlevel<- lrplace inc), ///
standardized
estat eqgof /*Equation-level goodness of fit*/

Syntax 7.2: The Stata syntax for the estimation of the parameters of the causal model including a complex concept with correction for measurement errors
*Complex analysis with correction for measurement errors
clear all
ssd init satdem demlevel lrplace inc
ssd set observations 1468

*Covariance matrix
#delimit ;
ssd set covariance /*The covariance matrix corrected for measurement errors*/
.710\
.429 .590\
.112 .086 .682\
.163 .190 .009 .624;
#delimit cr
save ssdmatrix.dat, replace

*Causal model with complex concept
clear
use ssdmatrix.dat
ssd list
sem (satdem <- demlevel lrplace) ///
(demlevel<- lrplace inc), ///
standardized
estat eqgof /*Equation-level goodness of fit*/

The nice feature of this approach, correcting the correlations for measurement errors before estimating the effects, is that the input for the analysis is exactly the same with and without correction for measurement errors, except for the matrix of correlations that is used in the analysis. This point is illustrated in the input for the analyses with and without correction for measurement errors presented in Syntaxes 7.1 and 7.2.

However, analysing the matrix without correction for measurement errors, the program indicates that the fit of the model is not good. This suggests that the effect of the control variable income (Inc) on satisfaction with democracy has to be introduced in the model. If we do so, this model also fits the data well and we obtain the results presented in Table 7.8.

Table 7.8: The Stata results of the estimation of the causal model including a complex concept presented in Figure 7.1 with and without corrections

If we compare the results with and without correction for measurement errors, we see first of all that the model is different. After correction for measurement errors, the effect of the control variable Inc on satisfaction with democracy is not significantly different from zero, while, without correction for measurement errors, this effect is necessary to achieve a good fit of the model. In the latter case, we say that this variable has a direct effect on satisfaction with democracy, while in the former analysis we have to conclude that there is no direct effect, only an indirect effect through the composite score Demlevel.

Figure 7.3: The parameter estimates of the causal model with a complex concept of evaluation of democracy without correction for measurement errors

Figure 7.4: The parameter estimates of the causal model with a complex concept of evaluation of democracy with correction for measurement errors

Furthermore, comparing Figures 7.3 and 7.4, we see that, after correction for measurement errors, nearly all other effects are much larger than without correction. All significant effects are indicated in the figures by an asterisk (*). In this example, a single composite variable takes the place of the three separate indicators of democracy level. The effect of this variable increases much more than before because it is no longer reduced by the correlations between the three indicators, i.e. the variable now stands alone. Furthermore, without corrections, the explained variance in the variable Satdem is 21.4%, while, after corrections, it increases to 44.4% (see footnote 3). This example again illustrates how different the results can be if measurement errors are corrected.
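
The two explained variances can be checked approximately from the matrices in Syntaxes 7.1 and 7.2. The sketch below is an illustration only: it uses ordinary standardized regression of Satdem on its predictors rather than the maximum likelihood estimation carried out by LISREL or Stata, so small differences from the program output are possible. The helper names (`solve`, `correlation_matrix`, `r_squared`) are our own.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def correlation_matrix(lower):
    """Expand a lower triangle to a full matrix, standardized by its diagonal."""
    n = len(lower)
    sd = [math.sqrt(lower[i][i]) for i in range(n)]
    full = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            full[i][j] = full[j][i] = lower[i][j] / (sd[i] * sd[j])
    return full


def r_squared(corr, y, predictors):
    """R-squared of the standardized regression of variable y on predictors."""
    rxx = [[corr[i][j] for j in predictors] for i in predictors]
    rxy = [corr[i][y] for i in predictors]
    betas = solve(rxx, rxy)
    return sum(b * r for b, r in zip(betas, rxy))


# Variable order: 0 = satdem, 1 = demlevel, 2 = lrplace, 3 = inc
uncorrected = correlation_matrix(
    [[1.00], [0.429, 1.00], [0.188, 0.086, 1.00], [0.163, 0.190, 0.009, 1.00]]
)
corrected = correlation_matrix(
    [[0.710], [0.429, 0.590], [0.112, 0.086, 0.682], [0.163, 0.190, 0.009, 0.624]]
)

# Without correction the effect of inc on satdem is included; with correction it is omitted.
r2_uncorrected = r_squared(uncorrected, 0, [1, 2, 3])
r2_corrected = r_squared(corrected, 0, [1, 2])
print(f"R2 without correction: {r2_uncorrected:.3f}")
print(f"R2 with correction:    {r2_corrected:.3f}")
```

Under these assumptions, the two values should come out close to the 21.4% and 44.4% reported above.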

Exercise 7.1

Compute the corrected correlation matrix for the variables introduced in exercise 3.1 using the composite score of the variables Economy and Culture as represented in the figure below. The composite score ‘Country threats’ is created as a simple sum:

Country threats (CS) = Economic threat + Cultural threat

The model to be estimated is:

Causal model for attitudes towards immigration with a composite score

Below, the correlation matrix is provided without corrections, together with the quality predictions obtained in exercise 3.1 using SQP and the descriptive statistics used in exercise 6.1 (adding the composite score descriptives).

  1. Correlation matrix (n=1801):

  2. Quality predictions obtained in exercise 3.1 using SQP:

  3. Descriptive statistics of the variables:

Use all this information to correct the correlation matrix for measurement errors.

Solution

In order to correct the correlations of a composite score, we first need to calculate its quality. This cannot be done by SQP. The alternative is to use the following formula:

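$$\text{Quality of CS} = 1 - \frac{\text{var}(e_{cs})}{\text{var}(CS)}$$
where, counting each pair of components $k \neq k'$ once:
$$\text{var}(e_{cs}) = \sum_{k} w_k^2\,\text{var}(e_k) + 2\sum_{k<k'} w_k w_{k'}\,\text{cov}(e_k, e_{k'})$$
$$\text{var}(e_i) = (1 - q_i^2)\,\text{var}(y_i)$$
$$\text{cov}(e_i, e_j) = cmv_{ij}\, s_i s_j = (r_i m_i m_j r_j)(s_i s_j)$$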

Using the quality predictions obtained in exercise 3.1, we can derive the quality of the composite score (Threats). To do so, we need to first compute the variance in the error of the composite score using the quality estimates of the variables Economy and Culture and the information from the table above, where we have presented their variances. Computing var(ei) and cov(eiej), we get:

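$$\text{var}(e_{B38}) = (1 - q_{B38}^2)\,\text{var}(B38) = (1 - 0.702) \times 3.640 = 1.085$$
$$\text{var}(e_{B39}) = (1 - q_{B39}^2)\,\text{var}(B39) = (1 - 0.641) \times 3.771 = 1.354$$
$$\text{cov}(e_{B38}, e_{B39}) = cmv_{B38,B39}\,(s_{B38} s_{B39}) = (r_{B38} m_{B38} m_{B39} r_{B39})(s_{B38} s_{B39})$$
$$= (0.896 \times 0.354 \times 0.418 \times 0.881) \times (1.908 \times 1.942) = 0.433$$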

Now we have all the components required to compute the variance in the errors of the composite score for the variable country threats, which is:

$$\text{var}(e_{threats}) = \sum_{k} w_k^2\,\text{var}(e_k) + 2\sum_{k<k'} w_k w_{k'}\,\text{cov}(e_k, e_{k'})$$
$$= (1.085 + 1.354) + 2 \times 0.433 = 3.305$$

Finally, the quality of the composite score of country threats can be computed as follows:

$$Q_{threats} = 1 - \frac{\text{var}(e_{threats})}{\text{var}(threats)} = 1 - \frac{3.305}{11.343} = 0.709$$
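
The computation of the quality of the composite score can be reproduced with a short Python sketch. It uses only the numbers given in this solution (squared qualities 0.702 and 0.641, the reliability and method coefficients, the standard deviations and the variance of the composite score); the function name `composite_quality` and the dictionary layout are our own choices for illustration.

```python
from itertools import combinations


def composite_quality(weights, q2, variances, error_cov, composite_variance):
    """Quality of a weighted composite score: 1 - var(e_cs) / var(CS)."""
    names = list(weights)
    # Weighted sum of the error variances of the components.
    error_var = sum(weights[k] ** 2 * (1 - q2[k]) * variances[k] for k in names)
    # Twice the weighted error covariance for every pair of components.
    for a, b in combinations(names, 2):
        error_var += 2 * weights[a] * weights[b] * error_cov[(a, b)]
    return 1 - error_var / composite_variance


r = {"B38": 0.896, "B39": 0.881}  # reliability coefficients
m = {"B38": 0.354, "B39": 0.418}  # method effect coefficients
s = {"B38": 1.908, "B39": 1.942}  # standard deviations

# Error covariance = common method variance times the two standard deviations.
error_cov = {
    ("B38", "B39"): r["B38"] * m["B38"] * m["B39"] * r["B39"] * s["B38"] * s["B39"]
}

quality = composite_quality(
    weights={"B38": 1, "B39": 1},       # simple sum: both weights equal 1
    q2={"B38": 0.702, "B39": 0.641},    # squared quality coefficients
    variances={"B38": 3.640, "B39": 3.771},
    error_cov=error_cov,
    composite_variance=11.343,
)
print(round(quality, 3))  # 0.709
```

Writing the formula as a function makes it easy to see what would change with more than two components or with unequal weights: only the dictionaries grow, and every additional pair contributes one more error covariance term.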

In this case, the common method variance between the composite score (Threats) and the variable Better (B40) also has to be taken into account. In this chapter, we have presented the formula for computing the cmv for this pair of variables:

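$$cmv_{B40,threats} = r_{B40}\, m_{B40} \left[ \frac{1}{\sigma_{threats}}\, m_{B38} r_{B38} + \frac{1}{\sigma_{threats}}\, r_{B39} m_{B39} \right]$$
$$= 0.853 \times 0.349 \left[ \frac{1}{3.368}(0.354 \times 0.896) + \frac{1}{3.368}(0.881 \times 0.418) \right] = 0.061$$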

As before, the common method variance has to be subtracted from the correlation between these two variables affected by the same method. So:

corr(B40,Threats) = 0.624 – 0.061 = 0.563
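
The same two steps, computing the cmv and subtracting it from the observed correlation, can be sketched in Python as follows. The function name `cmv_with_composite` is hypothetical; the formula is the one used in this solution, in which each component of the composite contributes its product of reliability and method coefficients divided by the standard deviation of the composite score.

```python
def cmv_with_composite(r_y, m_y, components, sd_composite):
    """Common method variance between a single item and a simple-sum composite.

    components holds one (reliability coefficient, method coefficient) pair
    per item of the composite that shares the method with the single item.
    """
    shared = sum(r_k * m_k / sd_composite for r_k, m_k in components)
    return r_y * m_y * shared


cmv = cmv_with_composite(
    r_y=0.853,                                    # reliability coefficient of B40
    m_y=0.349,                                    # method coefficient of B40
    components=[(0.896, 0.354), (0.881, 0.418)],  # B38 and B39
    sd_composite=3.368,                           # standard deviation of Threats
)
corrected_corr = 0.624 - cmv  # observed correlation minus cmv
print(round(cmv, 3), round(corrected_corr, 3))  # 0.061 0.563
```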

To conclude, we replace the variances on the diagonal for the variables Allow and Better with the qualities obtained from SQP, and the variance of the variable Threats with the quality computed above. The correlation between Better and Threats is replaced by the corrected value computed above. This will result in the correlation matrix corrected for measurement errors.


Exercise 7.2

Taking into account the results obtained in exercise 7.1, run the estimation of the same causal model with the complex concept. The correlation matrices with and without corrections obtained before are reproduced below. Use this information to compute, either in LISREL or Stata, the results for the analysis of the causal explanation of the opinion about immigration by people from outside Europe to the Netherlands, with and without correction for measurement errors.

  1. Correlation matrix (n = 1801):

  2. Correlation matrix with corrections:

Solution for LISREL users

  1. LISREL syntax for the estimation of the causal model without corrections for measurement errors:
    Causal model without correction for measurement errors
    data ni=3 no=1801 ma=km
    km
    1.000
    -0.351 1.000
    -0.424 0.624 1.000
    labels
    impcntr imwbcnt threats
    model ny=2 nx=1 be=fu,fi ga=fu,fi ps=sy,fi
    free be(1,2)
    free ga(2,1) ga(1,1)
    free ps(1,1) ps(2,2)
    pd
    out nd=3
  2. LISREL syntax for the estimation of the causal model with corrections for measurement errors:
    Causal model with correction for measurement errors
    data ni=3 no=1801 ma=km
    km
    0.763
    -0.351 0.639
    -0.424 0.563 0.709
    labels
    impcntr imwbcnt threats
    model ny=2 nx=1 be=fu,fi ga=fu,fi ps=sy,fi
    free be(1,2)
    free ga(2,1) ga(1,1)
    free ps(1,1) ps(2,2)
    pd
    out nd=3

Solution for Stata users

  1. Stata syntax for the estimation of the causal model without corrections for measurement errors:
    *Causal model without correction for measurement errors
    clear all
    ssd init impcntr imwbcnt threat
    ssd set observations 1801
    *Correlation matrix
    #delimit ;
    ssd set correlation
    1.000\
    -0.351 1.000\
    -0.424 0.624 1.000;
    #delimit cr
    save ssdmatrix.dat, replace
    *Causal model
    clear
    use ssdmatrix.dat
    ssd list
    sem (impcntr <- imwbcnt) ///
    (impcntr imwbcnt <- threat), ///
    standardized
    estat eqgof
  2. Stata syntax for the estimation of the causal model with corrections for measurement errors:
    *Causal model with correction for measurement errors
    clear all
    ssd init impcntr imwbcnt threat
    ssd set observations 1801
    *Covariance matrix
    #delimit ;
    ssd set covariance
    0.763\
    -0.351 0.639\
    -0.424 0.563 0.709;
    #delimit cr
    save ssdmatrix.dat, replace
    *Causal model
    clear
    use ssdmatrix.dat
    ssd list
    sem (impcntr <- imwbcnt) ///
    (impcntr imwbcnt <- threat), ///
    standardized
    estat eqgof

Solution

The figure at the top presents the results for the model with the composite score Threats before corrections, while the lower figure presents the results of the model after corrections. In this case, we observe that the differences in the effects are considerable, i.e. one of the effects is no longer significant, while the other two are much larger. From the exercises, we can conclude that the differences are larger in this small model. As before, the explained variance (R²) increases after correcting for measurement errors.

This exercise has shown once again how different the results can be if measurement errors are corrected. Furthermore, it has also shown that the correction for measurement errors can be done not only in models with simple concepts, but also in models with complex concepts.

Footnotes

References