# Chapter 5: Latent variable models with categorical indicators

### Example 1: Latent trait models for binary items (a measurement model)

In this and the next example we continue to use the same data as in the rest of this module. Because these data contain no originally binary items, we create such items by dichotomizing three of the existing items: D15, D16 and D17, the three indicators of trust in the procedural fairness of the police. The items are dichotomized so that their original levels 1 and 2 (“Not at all often” and “Not very often”) are combined as the new level 0 (“Not often”), and levels 3 and 4 (“Often” and “Very often”) as the new level 1 (“Often”). The binary items derived from D15, D16 and D17 are labelled *respect*, *fair* and *explain* respectively. The commands included below show how they can be created in Stata.
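A sketch of the dichotomization in Stata, assuming the three items are stored under the variable names D15, D16 and D17 and coded 1–4 (the variable names in the actual ESS data file may differ):

```stata
* Recode levels 1-2 ("Not at all often", "Not very often") to 0 ("Not often")
* and levels 3-4 ("Often", "Very often") to 1 ("Often"),
* creating the new binary items respect, fair and explain.
recode D15 (1 2 = 0) (3 4 = 1), generate(respect)
recode D16 (1 2 = 0) (3 4 = 1), generate(fair)
recode D17 (1 2 = 0) (3 4 = 1), generate(explain)
```

Using *recode* with *generate()* leaves the original four-category items unchanged, which is useful because the next example returns to them.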

Fit a latent trait model for the three binary items *respect*, *fair* and *explain* given one latent factor, and using the data for all the countries together. Interpret the parameters of the estimated measurement model, and hence interpret the factor implied by this measurement model.

*Note*: Only Stata commands are included for this and the next example. In Stata, latent trait models can be fitted with the command *gsem*, which was first included in Stata Version 13. The R package *lavaan* does not currently include functions for fitting these models. There are other add-on packages in R for fitting latent trait models, but they are not used here.
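As an illustration of the *gsem* syntax, a one-factor latent trait model for the three binary items could be specified as follows (the factor name *F* is arbitrary, and the model is identified here by fixing the factor variance to 1 so that all three loadings are freely estimated):

```stata
* One latent factor F measured by three binary items, with a logit link.
* var(F@1) fixes the factor variance at 1, freeing all three loadings.
gsem (respect fair explain <- F), logit var(F@1)
```

An alternative, equivalent identification would fix the first loading to 1 (the *gsem* default) and estimate the factor variance instead.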

The estimated parameters of the measurement model are shown in Table 5.1, and the item response curves implied by them are plotted in Figure 5.1. The curves show the probabilities of the response coded as 1 – i.e. “Often” – for each item, given different values of the latent factor. All the factor loadings (discrimination parameters) are positive. This means that higher values of the factor correspond to higher probabilities of the response “Often”, and thus that higher values of the factor indicate higher levels of trust in the procedural fairness of the police.

Comparing the measurement parameters between the different items, we can see that the item *explain* has the lowest value of the intercept (difficulty) parameter. This means that given the value 0 of the factor (i.e. its mean value), this item has the lowest probability of the “Often” response (around 0.6, compared to nearly 0.9 for the other two items). The item *explain* also has the lowest discrimination parameter, so its item response probabilities are a little less strongly associated with the factor than are the probabilities of the other two items (this can be seen in Figure 5.1, where the item response curve of this item is the least steep of the three). The measurement models of the items *respect* and *fair*, on the other hand, are very similar to each other.
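The probabilities quoted above follow directly from the logistic form of the item response model. With the estimates in Table 5.1, the probability of the “Often” response for item $j$ at factor value $\eta$ is

$$
P(y_{j}=1\mid\eta)=\frac{\exp(\hat\nu_{j}+\hat\lambda_{j}\eta)}{1+\exp(\hat\nu_{j}+\hat\lambda_{j}\eta)},
$$

so at the mean of the factor ($\eta=0$) this reduces to $\exp(\hat\nu_{j})/\{1+\exp(\hat\nu_{j})\}$: about $\exp(0.44)/\{1+\exp(0.44)\}\approx 0.61$ for *explain*, compared with roughly $0.88$ for *respect* and $0.86$ for *fair*.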

*Table 5.1: Estimated parameters (and their standard errors) of the measurement model for binary items “respect”, “fair” and “explain”, given latent factor “Procedural fairness of the police”. The model is fitted to the pooled sample of all respondents in the ESS (n=50501).*

| Item | Intercept ν̂_{j} ("difficulty parameter") | Loading λ̂_{j} ("discrimination parameter") |
|---|---|---|
| respect | 1.99 (0.03) | 3.24 (0.05) |
| fair | 1.83 (0.03) | 3.52 (0.06) |
| explain | 0.44 (0.02) | 2.18 (0.03) |

*Figure 5.1: Item response curves for the binary items “respect”, “fair” and “explain”, given latent factor “Procedural fairness of the police”, from the estimated measurement model in Table 5.1.*