Chapter 5: Latent variable models with categorical indicators
More general models with continuous factors and categorical items
We have focused on the model with binary items and a single latent factor, in order to introduce the basic ideas of latent trait models in the simplest possible case. These models can be generalized in various ways, which we note very briefly here:
- Instead of being binary, categorical items may also have three or more categories, and these categories may be treated as being ordered (ordinal items) or unordered (nominal items). The measurement model of an item then needs to be specified in such a way that it is appropriate for a variable with multiple categories. The multinomial logistic model is used as the measurement model for nominal items, and the ordinal logistic model (proportional odds model) is commonly used for ordinal items.
- Instead of being all of one kind, different indicators for the same factor may be a mixture of binary, ordinal and nominal (and indeed even continuous) items. The measurement model of each individual item is then specified in whatever way is appropriate for that item. This situation of course assumes that we are dealing with items for which it still makes substantive sense to treat them as measures of the same latent variable.
- Instead of a single latent factor, a latent trait model can have two or more factors. The joint distribution of the factors is then assumed to be a multivariate normal distribution. This distribution and the assumptions required to identify the latent scales are specified in the same way as in factor analysis. The measurement model for multiple factors can again be an “exploratory” model where all items measure all factors, or a “confirmatory” model which imposes further constraints on the factor loadings. In an exploratory model, “rotation” of the factors again needs to be fixed, and this can be done in the same ways as in factor analysis.
- Instead of focusing only on measurement models, we can also define models which include structural models for associations and regressions among latent factors and observed explanatory and/or response variables. This is done in the same ways as in linear structural equation models (SEMs). In practice, however, the computational complexity of estimating latent trait measurement models for categorical items often limits the complexity of the structural models that it is practicable to estimate. When estimating the combined measurement and structural models in one step is infeasible, a practicable approach is often a “three-step” analysis where we (1) estimate the measurement model separately for each distinct factor, (2) use these models to calculate a factor score for each factor for each respondent, and (3) fit the structural model with the factor scores treated as observed values of each factor. This approach has advantages and disadvantages of its own, as discussed in the section on factor scores for factor analysis earlier in this module (Chapter 2) and, for models for categorical items, by, for example, [Bak13] (this article considers methods of three-step modelling for latent class models, but many of the comments there are also relevant for all latent variable models). An example of this approach is given in Example 2 later in this chapter.
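To make the first generalization above more concrete, the following is a minimal sketch of how the response probabilities of a single item could be computed for a given value of the latent factor, under a proportional odds model (ordinal item) and a multinomial logistic model (nominal item). The function names, parameter names, sign conventions and numerical values are illustrative assumptions and are not taken from this chapter; in particular, other parameterizations of the thresholds and loadings are equally common.

```python
import numpy as np

def ordinal_logit_probs(eta, thresholds, loading):
    """Category probabilities of one ordinal item (proportional odds model).

    Assumed parameterization (hypothetical, for illustration only):
        P(y <= k | eta) = 1 / (1 + exp(-(tau_k - loading * eta)))
    with increasing thresholds tau_1 < ... < tau_{K-1}.
    """
    thresholds = np.asarray(thresholds, dtype=float)
    # Cumulative probabilities P(y <= k | eta) for k = 1, ..., K-1
    cum = 1.0 / (1.0 + np.exp(-(thresholds - loading * eta)))
    # Pad with P(y <= 0) = 0 and P(y <= K) = 1, then difference
    cum = np.concatenate(([0.0], cum, [1.0]))
    return np.diff(cum)

def nominal_logit_probs(eta, intercepts, loadings):
    """Category probabilities of one nominal item (multinomial logistic model).

    Assumed parameterization (hypothetical): the first category is the
    reference, with intercept and loading fixed at 0.
    """
    intercepts = np.asarray(intercepts, dtype=float)
    loadings = np.asarray(loadings, dtype=float)
    # Linear predictors for all categories, reference category first
    lin = np.concatenate(([0.0], intercepts + loadings * eta))
    lin = lin - lin.max()  # subtract the maximum for numerical stability
    weights = np.exp(lin)
    return weights / weights.sum()

# Made-up parameter values for a three-category item of each type
p_ord = ordinal_logit_probs(0.0, thresholds=[-1.0, 1.0], loading=1.5)
p_nom = nominal_logit_probs(0.0, intercepts=[0.5, -0.5], loadings=[1.0, 2.0])
print(p_ord, p_nom)
```

In both functions the probabilities sum to one for any value of the factor. With a positive loading in this parameterization, larger values of the factor shift probability towards the higher categories of the ordinal item.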
References
- [Bak13] Bakk, Z., Tekle, F. B. and Vermunt, J. K. (2013). Estimating the association between latent class membership and external variables using bias-adjusted three-step approaches. Sociological Methodology, 43, 272-311.