Chapter 5: Latent variable models with categorical indicators

So far in this module we have described only linear factor analysis and structural equation models. In these models, all observed and latent variables are treated as continuous and normally distributed, and the structural and measurement models for them are specified as linear regression models. The one exception is observed variables that are used only as explanatory variables: these are not themselves modelled and can be of any type.

These assumptions are not the only ones possible, and latent variable models can also be defined under different assumptions about some of the variables. In particular, in many applications it is useful to consider models where some of the latent variables and/or their observed indicators are not continuous but categorical, that is, variables which take one of two or more discrete values (categories). In this chapter we briefly discuss such models. We focus on the case where the latent variables are still continuous but the indicators are categorical. Such models may be called latent trait models or simply “factor analysis models for categorical items”. They are also known as Item Response Theory (IRT) models; this is the most common term in educational and psychological testing, where these models are very widely used.
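
To make the idea of a continuous latent variable with categorical indicators concrete, the following is a minimal sketch that simulates binary items from a single normally distributed latent trait. It assumes a logistic measurement model for each item (the form used in the two-parameter logistic IRT model), which is one common choice and is not specified in the text above. The function names, the parameter names `intercept` and `loading`, and the item parameter values are all hypothetical and chosen only for illustration.

```python
import math
import random


def item_probability(theta, intercept, loading):
    """P(item = 1 | theta), assuming a logistic measurement model."""
    return 1.0 / (1.0 + math.exp(-(intercept + loading * theta)))


def simulate_responses(n_people, items, seed=0):
    """Simulate binary item responses from one continuous latent trait.

    items is a list of (intercept, loading) pairs, one pair per item.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_people):
        # The latent variable is continuous: here a standard normal draw.
        theta = rng.gauss(0.0, 1.0)
        # Each indicator is categorical: here binary, coded 0 or 1.
        row = [
            1 if rng.random() < item_probability(theta, a, b) else 0
            for (a, b) in items
        ]
        data.append(row)
    return data


# Hypothetical parameters for three items (illustrative values only).
items = [(-1.0, 1.5), (0.0, 1.0), (1.0, 0.5)]
responses = simulate_responses(1000, items)
```

In this sketch the loading plays a role similar to a factor loading in the linear model: the larger it is, the more strongly the probability of a positive response depends on the latent trait.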

The discussion in this chapter is brief and is meant to give a general introduction to latent trait models rather than a full description of them. Much more information can be found in the books listed below. These books also describe further types of latent variable models which are not discussed in this module. One very important class is that of latent class models, where both the latent variables and their indicators are categorical.

References on general types of latent variable models