# Latent variable modelling of cross-national survey data

**By Myrsini Katsikatsou and Jouni Kuha**

### Introduction

**Multi-item measures of latent constructs**

In the social sciences, the quantities of interest are often constructs that cannot be observed directly. Common types of such *latent constructs* include values, opinions, and attitudes towards social phenomena such as immigration or the role of the welfare state. For such a variable there is usually no single measurement instrument that would deliver a sufficiently valid and reliable measure of the construct. Instead, we measure it through a set of multiple observable variables, whose values are then combined to form a measure of the construct. In a survey, these observable measures or *items* are survey questions related to the same construct. For instance, in the example considered throughout this module, one of the constructs is the extent to which a respondent feels obliged to obey the police. This is measured with the following three items:

"To what extent is it your duty to ..."

- "… back the decisions made by the police even when you disagree with them"
- "… do what the police tell you even if you don’t understand or agree with the reasons"
- "… do what the police tell you to do, even if you don’t like how they treat you?"
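As a minimal sketch of what "combining" item values can mean in its simplest form, the following averages a respondent's answers to the three items. The item names, the numeric responses and the rule for handling unanswered items are all illustrative assumptions, not taken from the ESS questionnaire or dataset. Latent variable models, introduced next, offer a model-based alternative to this kind of simple averaging.

```python
from statistics import mean

# Hypothetical variable names for the three "duty to obey the police" items.
ITEMS = ("back_decisions", "obey_without_agreeing", "obey_despite_treatment")


def composite_score(responses):
    """Average the answered items; return None if no item was answered."""
    answered = [
        responses[item] for item in ITEMS if responses.get(item) is not None
    ]
    return mean(answered) if answered else None


# Made-up answers for one respondent (assumed numeric response scale).
respondent = {
    "back_decisions": 6,
    "obey_without_agreeing": 7,
    "obey_despite_treatment": 5,
}
score = composite_score(respondent)
print(score)
```

A simple average weights every item equally and ignores measurement error, which is one reason to prefer an explicit measurement model.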

**Latent variable models**

Statistical latent variable models provide a framework for working with latent constructs that are measured through multiple items. With such models we can examine the properties of the measurement, for example whether all the items are valid and reliable indicators of the construct we want to measure, and assign values of the construct to respondents. More importantly, the models can also be used to answer questions about the latent constructs themselves, for example how their means vary between populations or how different constructs are associated with each other.

The most commonly used family of latent variable models is *Structural Equation Models* (SEMs), a special case of which is *factor analysis*. These models are the latent variable analogue of standard linear models in regression analysis of observed data, where all response variables are treated as continuous variables. In the latent variable context, these response variables may be either the latent variables or the items which are used to measure them. If some of them are instead treated as categorical variables, we obtain different families of latent variable models such as *latent class* or *latent trait* (Item Response Theory) models. In this module we focus on structural equation models, but latent trait models are also discussed briefly at the end.
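To make the analogy with linear regression concrete, a one-factor model for $p$ continuous items can be written as below. The notation is an assumption introduced here for illustration and may differ from that used in later chapters: $y_{ij}$ is the response of respondent $i$ to item $j$, $\eta_i$ is the latent variable, $\tau_j$ and $\lambda_j$ are the intercept and loading of item $j$, and $\varepsilon_{ij}$ is a measurement error.

```latex
\begin{aligned}
y_{ij} &= \tau_j + \lambda_j \eta_i + \varepsilon_{ij}, \qquad j = 1, \dots, p, \\
\eta_i &\sim N(\kappa, \phi), \qquad \varepsilon_{ij} \sim N(0, \theta_j).
\end{aligned}
```

Each item is thus a linear regression on $\eta_i$, with the difference from ordinary regression that the explanatory variable $\eta_i$ is not observed.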

**Multigroup latent variable models for cross-national survey data**

In the analysis of a cross-national survey such as the European Social Survey (ESS), many research questions naturally involve comparisons between countries. In latent variable modelling, these questions can be formulated as *multigroup* models where the country of a respondent is treated as an explanatory variable for the latent variables. This allows us to examine differences between countries in the distributions of the latent constructs. Multigroup models can also be used to assess cross-national *measurement equivalence*, that is, whether the measurement of the latent variables by the observed items is comparable across the countries. Such multigroup models are one of the topics of this module.
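The following simulation is a rough sketch of why measurement equivalence matters for comparisons. It assumes that each item is generated as an intercept plus a loading times the latent value plus random error, a common way of writing a linear factor model. Two hypothetical groups have the same latent mean, but one item has a different intercept in the second group. All numbers and names are made up for illustration and are not ESS estimates.

```python
import random
from statistics import mean


def simulate_item_means(n, latent_mean, intercepts, loadings, seed):
    """Simulate n respondents from a one-factor model; return observed item means."""
    rng = random.Random(seed)
    totals = [0.0] * len(intercepts)
    for _ in range(n):
        eta = rng.gauss(latent_mean, 1.0)  # latent construct for one respondent
        for j, (tau, lam) in enumerate(zip(intercepts, loadings)):
            # item response = intercept + loading * latent value + error
            totals[j] += tau + lam * eta + rng.gauss(0.0, 0.5)
    return [total / n for total in totals]


# Same latent mean (0.0) in both groups; group B's third item has a
# shifted intercept, so the items are not measurement-equivalent.
group_a = simulate_item_means(20000, 0.0, [5.0, 5.0, 5.0], [1.0, 0.8, 0.9], seed=1)
group_b = simulate_item_means(20000, 0.0, [5.0, 5.0, 6.0], [1.0, 0.8, 0.9], seed=2)

# Difference between the groups in the average of the three item means.
gap_in_scores = mean(group_b) - mean(group_a)
print(round(gap_in_scores, 2))
```

Under these assumptions the averaged item scores differ between the groups even though the latent means are identical, so a naive comparison would suggest a difference in the construct that is really a difference in measurement.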

**Outline of the module**

The module draws on the following publications as references on factor analysis, structural equation modelling, and latent variable modelling in general: [Bar11], [Bar08], [Bol89], [Kap08], [Skr04].

### Chapter 1: Example and computing

Chapter 1 introduces a theoretical model that is used as an example throughout this module. The chapter also provides guidance on how to use the accompanying ESS dataset in two different statistical software packages (Stata and R).

### Chapter 2: Factor Analysis

This chapter provides a description of basic factor analysis.

### Chapter 3: Multigroup Factor Analysis

Multigroup factor analysis can be used to compare the distribution of latent constructs across groups. In this chapter, we use the ESS dataset to make cross-national comparisons. Multigroup factor analysis models and the condition of measurement invariance are discussed.

### Chapter 4: Structural modelling for single- and multigroup analysis

The fourth chapter introduces structural equation models, which extend factor analysis by also including regression models for relationships among latent and observed variables. Both single-group and multigroup structural equation models are described.
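As an illustration of what such a regression model can look like, the equation below regresses one latent variable on another latent variable and on an observed covariate. The symbols are assumptions chosen for this sketch rather than the chapter's own notation: $\eta_{1i}$ and $\eta_{2i}$ are latent variables for respondent $i$, $x_i$ is an observed explanatory variable, and $\zeta_i$ is a residual.

```latex
\eta_{2i} = \beta_0 + \beta_1 \eta_{1i} + \gamma x_i + \zeta_i,
\qquad \zeta_i \sim N(0, \psi).
```

Each latent variable is in turn measured by its own set of items through a factor analysis model.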

### Chapter 5: Latent trait modelling

The final chapter briefly introduces latent trait modelling, where the observed items are treated as dichotomous variables.

#### References

- [Bar08] Bartholomew, D. J., Steele, F., Moustaki, I. and Galbraith, J. I. (2008). Analysis of multivariate social science data (Second edition). Chapman & Hall/CRC.
- [Bar11] Bartholomew, D. J., Knott, M. and Moustaki, I. (2011). Latent Variable Models and Factor Analysis: a unified approach (Third edition). Wiley.
- [Bol89] Bollen, K. A. (1989). Structural equations with latent variables. Wiley.
- [Kap08] Kaplan, D. (2008). Structural equation modeling (Second edition). Sage.
- [Skr04] Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized latent variable modeling: multilevel, longitudinal, and structural equation models. Chapman & Hall/CRC.