# Chapter 5: Latent variable models with categorical indicators

### Example 2 on Latent trait models for binary items: Multigroup analysis for cross-national comparison of the factor means

*Note: The discussion of this example focuses on somewhat more advanced topics than most of the rest of this module.*

Using the same three binary indicators of trust in the procedural fairness of the police, estimate and compare the averages of this factor between the countries in the ESS.

This analysis expands that of Example 1 by adding a structural model in which the respondent's country is used as an explanatory variable for the factor. It is thus a multigroup analysis with country as the group, assuming cross-national equivalence of measurement for all of the observed items. Unlike the *sem* command for factor analysis and structural equation modelling, the *gsem* command in Stata does not have separate syntax for multigroup analysis. Instead, the group (here country) is simply specified as a categorical explanatory variable for the factor, entered in the form of dummy variables for the groups (omitting the dummy variable for one reference group, here taken to be Belgium).
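To illustrate what this dummy coding looks like in practice, here is a minimal sketch in Python with pandas (a hypothetical illustration, not part of the chapter's Stata analysis; the country codes and data are invented, with "BE" standing in for the Belgian reference group):

```python
import pandas as pd

# Toy data: respondents' countries (hypothetical codes).
df = pd.DataFrame({"country": ["BE", "BG", "CH", "BE", "CH"]})

# One 0/1 dummy variable per country...
dummies = pd.get_dummies(df["country"], prefix="cntry", dtype=int)

# ...omitting the dummy for the reference group (Belgium).
dummies = dummies.drop(columns="cntry_BE")

print(list(dummies.columns))   # ['cntry_BG', 'cntry_CH']
```

The remaining dummy variables would then enter the structural model as explanatory variables for the factor, so each dummy's coefficient is that country's factor mean relative to the reference group.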

The Stata commands included above show two ways of carrying out this analysis:

- Method 1: A one-step analysis where the measurement model, and the structural model for the latent factor conditional on the country, are estimated together. This approach is comparable to fitting a multigroup structural equation model for continuous items, with country as the group.
- Method 2: A three-step analysis where we (i) fit the measurement model for all the data, ignoring country; (ii) assign factor scores to each respondent based on the model from (i); and (iii) fit a linear regression model for these factor scores given country. Steps (i) and (ii) were already done in Example 1 above.

These approaches have slightly different characteristics:

- Method 2 is computationally easier. However, it has the theoretical disadvantage that, since the value of a factor score is not exactly equal to the value of the unobserved latent factor for each respondent, using the factor score in the role of the factor can induce a measurement error bias in the estimated structural model (here, where the factor score is used as a response variable in a linear model, the bias arises because the factor score is not an *unbiased* prediction of the factor).
- Method 1 is computationally more demanding. It avoids the measurement error problem of Method 2, because estimating the measurement model together with the structural model correctly allows for measurement error in the individual items as measures of the factor. However, this approach has the arguable disadvantage that the estimated measurement model itself, and thus the implied definition of the factor, is affected by the inclusion of the explanatory variable for the factor. This can be seen by observing that the estimated parameters of the measurement model are here somewhat different from what they were in Example 1. The estimated measurement model would change again every time the structural model was changed, for example if we added respondent's age and sex as explanatory variables.
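The attenuation mechanism behind the bias of Method 2 can be sketched with a small simulation (a hypothetical illustration in Python, not the chapter's analysis; the shrinkage factor of 0.7 and the group difference of 0.5 are invented numbers). If the factor score behaves like a shrunken prediction of the factor, regressing it on a group dummy attenuates the estimated group difference:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two hypothetical "countries", dummy-coded; true factor mean difference = 0.5.
group = rng.integers(0, 2, n)
eta = rng.normal(0.5 * group, 1.0)            # unobserved latent factor

# Shrunken factor scores: score = 0.7 * eta + noise, so the score is not an
# unbiased prediction of eta (0.7 is an assumed, illustrative shrinkage).
score = 0.7 * eta + rng.normal(0.0, 0.3, n)

# Step (iii) of Method 2: linear regression of the score on the group dummy.
X = np.column_stack([np.ones(n), group])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)

print(round(beta[1], 2))   # ~0.35 = 0.7 * 0.5, attenuated from the true 0.5
```

Because the attenuation multiplies all group differences by the same factor, the relative differences and rankings of the groups are preserved, which is consistent with Methods 1 and 2 giving very similar comparisons between the countries in this example.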

Two further characteristics of Method 2 are worth noting:

- Unlike Method 1, Method 2 does not require fixing the intercept and residual variance of the structural model to identify the scale of the factor, so these parameters are also estimated.
- Method 2 (as it is used here) takes the estimated measurement model from the first step as known. This means that the standard errors of the parameters of the structural model will be underestimated to some extent.

The differences between these ways of fitting the model ultimately matter only if they lead to meaningful differences in the main conclusions that we aim to draw from the analysis. In this example, the questions of interest are the comparisons of the average levels of the factor (trust in the procedural fairness of the police) between the countries. Figure 5.3 shows these country averages estimated using Methods 1 and 2. It is clear that the two sets of estimates are very similar, in that both give essentially the same relative differences and rankings of the countries. Many of the differences between the countries are clearly statistically significant (the standard errors of the means from Method 1 are around 0.05). The ordering of the countries shows fairly consistent geographic regularities, with levels of trust in the procedural fairness of the police mostly highest in the North and West of Europe and lower in the South and East.
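A rough back-of-the-envelope calculation shows what "clearly significant" means here (a sketch assuming independent country means, each with the standard error of about 0.05 quoted for Method 1):

```python
import math

se_mean = 0.05                      # approximate SE of each country mean
se_diff = math.sqrt(2) * se_mean    # SE of a difference of two independent means

# Differences larger than roughly 1.96 * se_diff are significant at the 5% level.
print(round(1.96 * se_diff, 2))     # ~0.14
```

So, under these assumptions, any pair of countries whose estimated factor means differ by more than about 0.14 would differ significantly at the 5% level.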

*Figure 5.3: Averages of the factor “Procedural fairness of the police” in the countries in the ESS, as estimated in Example 2. The plot shows two estimates of these averages, from a joint model (“Method 1” as discussed in the text, on the horizontal axis), and from a linear model for factor scores derived from the measurement model fitted in Example 1 (“Method 2”, on the vertical axis). The main conclusion from this comparison is that both sets of estimates give very similar results about comparisons between the countries.*