Chapter 3: Multigroup Factor Analysis

Introduction

A key aim of many social surveys is to measure the same constructs in different groups in order to make cross-group comparisons of the distributions of the constructs. This is clearly the case in cross-national surveys such as the ESS, where the populations of individuals in the different countries are the key groups of interest. A defining purpose of a cross-national survey is to provide data for making comparisons between countries, often in terms of the distributions of latent constructs which are measured by multiple indicators.

Since the latent variables in factor analysis are assumed to follow a multivariate normal distribution, such cross-national comparisons focus on the means, variances and covariances of the factors, which together fully define the multivariate normal distribution. Multigroup factor analysis can be used for such comparisons. It extends the standard (single-group) factor analysis model by allowing some parameters of the model to vary across the groups.

In order for cross-group comparisons to be meaningful, the variable of interest should be measured in the same way and on the same scale across the groups. In the case of a latent variable this requirement amounts to the condition that at least a sufficiently large part of the measurement model of the variable should have the same form and identical parameter values in all of the groups. If this is the case, we can say that those parts of the measurement model are invariant (or equivalent) across the groups, and that cross-group measurement invariance (or measurement equivalence) holds for them.

In the rest of this chapter we first describe multigroup factor analysis models where measurement invariance is assumed to hold for the entire measurement model, before discussing how and when this condition may be checked and perhaps partially relaxed by allowing partial non-invariance of measurement.

Multigroup models under full measurement invariance

Consider again a model for one or more latent factors ηj, which are measured by multiple indicators in a way described by a factor analysis measurement model. Suppose now that we have data on respondents from G known groups such as countries. In this section we assume that complete measurement invariance across the groups holds for the measurement model, so that this model is exactly the same in all the groups. There is then nothing new to say about the measurement model, which is defined and interpreted in the same ways as in the single-group situation discussed before. We can thus focus on the changes in how the distribution of the latent factors is specified.

For concreteness of notation, suppose that there are two factors η1 and η2. We now assume that among individuals in each group g = 1, ..., G, the factors are jointly normally distributed with means

E(η1) = κ1(g) and E(η2) = κ2(g)

variances

var(η1) = φ1(g) and var(η2) = φ2(g)

and covariance

cov(η1, η2) = φ12(g)

In other words, this allows all of the parameters which describe the distribution of the factors to be different in different groups.

It is still necessary to impose some constraints on these parameters in order to identify the scales of the latent factors. This, however, now needs to be done only in one group (which can be chosen freely), leaving all the parameters free to be estimated in all the other groups. For example, we may choose group 1 as the reference group and fix the factor means κ1(1) and κ2(1) to be 0 and the factor variances φ1(1) and φ2(1) to be 1 in that group (the factor covariance φ12(g) can be freely estimated in all groups, including group 1). When this is done, the mean of 0 in group 1 becomes a benchmark against which the means κ1(g) and κ2(g) in the other groups, and values of the factors for individuals in all groups, can be compared, and similarly the value 1 in group 1 becomes a benchmark for the factor variances φ1(g) and φ2(g).
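The need for these constraints can be illustrated with a small numerical sketch in R. It assumes, for simplicity, a one-factor model with three items and two groups under full invariance; all parameter values and object names are made up for illustration and are not ESS estimates. The sketch shows that shifting and rescaling the factor in every group, with compensating changes to the common intercepts and loadings, leaves the implied item means and variances unchanged, so the data alone cannot fix the scale of the factor.

```r
# Illustrative values only (not estimates from the ESS data).
# One factor, three items, two groups, full measurement invariance.
nu     <- c(2.0, 2.5, 3.0)   # common intercepts
lambda <- c(1.0, 0.8, 1.2)   # common loadings
theta  <- c(0.5, 0.6, 0.4)   # common error variances
kappa  <- c(0.0, 0.4)        # factor means; group 1 fixed at 0
phi    <- c(1.0, 1.5)        # factor variances; group 1 fixed at 1

# Model-implied mean and variance of each item in group g
implied <- function(nu, lambda, theta, kappa, phi, g) {
  list(mean = nu + lambda * kappa[g],
       var  = lambda^2 * phi[g] + theta)
}

# Re-express the factor as eta* = a + b * eta in every group, and adjust
# the common measurement parameters to compensate.
a <- 3; b <- 2
kappa2  <- a + b * kappa
phi2    <- b^2 * phi
lambda2 <- lambda / b
nu2     <- nu - lambda2 * a

# The two parameterizations imply the same item moments in both groups
for (g in 1:2) {
  m1 <- implied(nu, lambda, theta, kappa, phi, g)
  m2 <- implied(nu2, lambda2, theta, kappa2, phi2, g)
  print(all.equal(m1, m2))
}
```

Fixing the mean at 0 and the variance at 1 in the reference group rules out the second parameterization (where they are 3 and 4), and this is what makes the remaining group-specific means and variances estimable.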

Such models can be easily fitted in standard software, and estimates from them allow us to compare distributions of factors across countries. This is illustrated by Example 1 later in this chapter.

Models with some non-invariance of measurement

To introduce the key concepts related to non-invariance of measurement in factor analysis models, we focus on the simple case of a model with one factor η. In a multigroup context, the measurement model for any item yj (j = 1, ..., p) for a respondent in group g = 1, ..., G can then be expressed as

yj = νj(g) + λj(g)η + εj

where εj is normally distributed with mean 0 and variance θj(g). In other words, such a measurement model allows any or all of the measurement parameters for an item (intercept νj(g), loading λj(g) and/or the error variance θj(g)) to have different values in different groups g.
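It can help to write out what this model implies for the observed items. Suppose the factor has mean κ(g) and variance φ(g) in group g (the one-factor version of the notation used earlier), and assume, as is standard in factor analysis, that the errors εj are uncorrelated with the factor and with each other. Under these assumptions the item moments in group g are:

```latex
\begin{aligned}
\mathrm{E}(y_j \mid g) &= \nu_j^{(g)} + \lambda_j^{(g)} \kappa^{(g)}, \\
\mathrm{var}(y_j \mid g) &= \bigl(\lambda_j^{(g)}\bigr)^2 \phi^{(g)} + \theta_j^{(g)}, \\
\mathrm{cov}(y_j, y_k \mid g) &= \lambda_j^{(g)} \lambda_k^{(g)} \phi^{(g)}, \qquad j \neq k.
\end{aligned}
```

A difference between groups in the mean of an item can thus come from the intercept, the loading or the factor mean. The factor means can be separated from the measurement parameters only if enough of the latter are equal across groups.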

The first question we should ask about comparability of measurement is whether the overall structure of the measurement model for the items is the same in all groups. In the one-factor case this is the question of whether a one-factor model is indeed adequate in all groups. More generally, it is the question of whether a measurement model with the same number of factors and the same pattern of zero and non-zero factor loadings is adequate in all groups. Example 2 in Chapter 2 was an illustration of this kind of analysis, there for a particular 2-factor confirmatory factor analysis model. If the same model is adequate in all groups in this sense, the measurement model is said to possess configural invariance (or construct invariance) across the groups. In essence this means that the items can be thought to measure the same latent constructs in each group, even if possibly with different exact values of the measurement parameters. If construct invariance does not hold, the items cannot really be used for meaningful comparisons of constructs between the groups. If it does hold, we can proceed to examine whether the parameters of the common measurement model also have equal values across groups.

If any of the parameters of the measurement model for item yj do vary across the groups, the item is non-invariant across the groups. If, in contrast, each of the measurement parameters has the same value in all groups (i.e. νj(g) = νj, λj(g) = λj and θj(g) = θj for g = 1, ..., G), full (or "strict") invariance of measurement holds for that item. When an item is fully invariant, it thus functions as a measure of the factor in exactly the same way in all of the groups. If full invariance holds for all items y1, ..., yp which are treated as measures of the factor η, the measurement of the factor itself is fully invariant.
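As a rough illustration of these definitions, the following R sketch classifies a single item by which of its measurement parameters differ across groups. The function name and the numerical values are hypothetical, and the labels for the intermediate levels follow the usage in Example 2 of this chapter (scalar invariance: only error variances free; metric invariance: error variances and intercepts free).

```r
# Classify one item's level of invariance from its group-specific
# parameter values (one value per group). Hypothetical helper for illustration.
invariance_level <- function(nu, lambda, theta, tol = 1e-8) {
  same <- function(x) diff(range(x)) < tol   # TRUE if equal in all groups
  if (!same(lambda)) return("non-invariant (loadings differ)")
  if (!same(nu))     return("metric invariance (equal loadings only)")
  if (!same(theta))  return("scalar invariance (equal loadings and intercepts)")
  "full invariance (equal loadings, intercepts and error variances)"
}

# Made-up parameter values for three groups
invariance_level(nu = c(2, 2, 2),   lambda = c(1, 1, 1), theta = c(.5, .5, .5))
invariance_level(nu = c(2, 2, 2),   lambda = c(1, 1, 1), theta = c(.5, .7, .6))
invariance_level(nu = c(2, 2.3, 2), lambda = c(1, 1, 1), theta = c(.5, .7, .6))
invariance_level(nu = c(2, 2.3, 2), lambda = c(1, .8, 1), theta = c(.5, .7, .6))
```

In practice the true parameter values are unknown, so the level of invariance is assessed by comparing fitted models rather than by inspecting estimates for exact equality.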

We may also consider models which possess partial invariance, meaning that some but not all items for a factor and/or some but not all measurement parameters for an item are non-invariant. The following terms are often used to refer to specific kinds of partial invariance in terms of types of parameters:

Identification of multigroup models

The main reason for estimating a multigroup factor analysis model is typically that we wish to estimate and compare means, variances or covariances of the factors between the groups. The requirement for the identifiability of the model is then that it should be possible to uniquely identify distinct values for these parameters in the different groups.

If the measurement model has full invariance of measurement, these country-specific distributions of the factors are identified if the measurement model is such that it would be identified also for single-group factor analysis (as discussed in Chapter 2) and if the factor means and variances are fixed in one group as discussed earlier in this chapter.
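A simple necessary (but not sufficient) check is to count parameters: a model cannot be identified if it has more free parameters than there are observed means, variances and covariances. The R sketch below does this count for a one-factor model with full invariance; the function name is hypothetical, and the count assumes that each item has one intercept, one loading and one error variance.

```r
# Count sample moments and free parameters for a one-factor multigroup
# model with full measurement invariance (hypothetical helper).
count_df <- function(p, G) {
  moments     <- G * (p + p * (p + 1) / 2)  # item means + (co)variances per group
  measurement <- 3 * p                      # intercepts, loadings, error variances
  factor_dist <- 2 * (G - 1)                # factor mean and variance in G - 1 groups
  params      <- measurement + factor_dist
  c(moments = moments, params = params, df = moments - params)
}

# Three items and three groups, as in Example 2 of this chapter
count_df(p = 3, G = 3)
```

For three items and three groups this gives 27 moments and 13 free parameters, so the counting condition is comfortably satisfied. Each measurement parameter that is freed in a non-reference group adds one parameter to the count.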

The remaining question is then whether and when the multigroup model is identified if the measurement model includes some non-invariance of measurement. Here we give some conditions for this. Consider first models which have different types of partial non-invariance for all of the observed items which are used as indicators of a factor:

In the case of models for one factor, we have the following results on partial non-invariance by item:

Effects of non-invariance of measurement on conclusions about latent factors

While it is possible to fit multigroup models with different levels of invariance and non-invariance of measurement, each such model will in general give different results for the distributions of the latent factors which are the focus of interest in the analysis. For example, the following results are worth bearing in mind:

Assessing levels of non-invariance of measurement

Goodness of fit of models with different levels of non-invariance of measurement can be examined and compared, to assess whether it is necessary to allow for some non-invariance to achieve a good fit to the data. In such comparisons, the model with full measurement invariance is the most restricted model. Models with different levels of non-invariance are less restricted and will therefore fit at least as well; the question is whether the improvement in fit is large enough to justify the additional parameters.

Standard likelihood ratio tests can be used for such comparisons, for example to compare the full invariance model to partial non-invariance models, or to compare nested pairs of non-invariance models (e.g. scalar invariance vs. full non-invariance for a given item, or non-invariance for one vs. two items) to each other. These tests are often quite sensitive in practice, so it is common for them to reject the full invariance model. Because of this sensitivity, the tests may be supplemented by other methods of model assessment such as the AIC and BIC statistics. Examples of the use of these statistics and likelihood ratio tests for such comparisons of different measurement models are given in Example 2 of this chapter.
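The arithmetic behind these comparisons is straightforward, and can be sketched in R as follows. The log-likelihoods, parameter counts and sample size below are hypothetical numbers chosen for illustration, not results from the ESS data.

```r
# Hypothetical log-likelihoods and parameter counts for two nested models
loglik_restricted <- -1000; k_restricted <- 13   # e.g. full invariance
loglik_free       <-  -990; k_free       <- 19   # e.g. error variances freed
n <- 500                                         # hypothetical sample size

# Likelihood ratio test of the restricted model against the freer model
lr_stat <- 2 * (loglik_free - loglik_restricted)
lr_df   <- k_free - k_restricted
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Information criteria (smaller is better)
aic <- function(loglik, k) -2 * loglik + 2 * k
bic <- function(loglik, k, n) -2 * loglik + k * log(n)

c(LR = lr_stat, df = lr_df, p = p_value)
c(AIC_restricted = aic(loglik_restricted, k_restricted),
  AIC_free       = aic(loglik_free, k_free))
c(BIC_restricted = bic(loglik_restricted, k_restricted, n),
  BIC_free       = bic(loglik_free, k_free, n))
```

With these made-up numbers the likelihood ratio test and the AIC favour the less restricted model, while the BIC, which penalizes additional parameters more heavily, favours the restricted one. Disagreements of this kind are one reason for looking at more than one criterion.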

Invariance vs. non-invariance models in practice: Sensitivity of main conclusions

Whatever methods we use for model assessment, in many applications it is a common conclusion that there is evidence of at least some non-invariance of measurement. This is certainly the case for large cross-national surveys of general populations, where it appears to be very rare that full invariance is formally judged to hold for any measurement scales with multiple items. This then raises the difficult question of what would be the best way to analyze latent constructs in such situations.

When the main purpose of a multigroup analysis is to obtain cross-group (e.g. cross-national) comparisons of the latent factors, the most relevant criterion for assessing the effect of non-invariance of measurement is how different specifications of the measurement models affect the main conclusions about distributions of the factors. If these conclusions are relatively insensitive in this respect, the choice of the measurement models does not matter much – and, in particular, we can with confidence use the simplest choice, the model with full invariance of measurement. Such sensitivity analysis can be done by fitting models with different levels of non-invariance in turn, and comparing the parameter estimates for the distributions of the factors. This is illustrated in Example 2 of this chapter. An even simpler approach has been proposed by [Obe14]. His EPC-interest statistic (expected parameter change in parameters of interest) requires only that the full invariance model is fitted, and gives a good approximation of how much estimates of parameters of interest such as factor means would change if different measurement parameters were freed to be non-invariant across groups.

Ultimately the outcome of such sensitivity analyses may be that the choice matters, i.e. that conclusions about cross-national comparisons do depend on how much non-invariance of measurement we allow in the model specification. In this situation it might seem natural to use the results obtained from the best-fitting non-invariance model. However, this approach also has its problems, even apart from the fact that there may be no partial non-invariance model which both fits well and is identified. Any non-invariance model presents additional complications of interpretation, for example that (as noted previously) comparative conclusions about the means of latent factors are then really based only on those observed items which are specified as invariant. The opposite alternative approach is to base the conclusions on the full invariance model even when it does not fit well according to formal model selection criteria. This is easy to do and ensures that each item is treated in the same way, for all countries; however, it also ignores the observed evidence of non-invariance and thus in effect defines the latent constructs to be measured on a common scale across the countries. In short, all possible choices on how to treat items which are thought to be cross-nationally non-invariant have their disadvantages as well as different advantages. [Kuh15] present some further discussion of these conceptually difficult questions.

Example 1 on Multigroup factor analysis: A model under measurement invariance

Consider the data in our example for the questions D18-D23, using a 2-factor confirmatory factor analysis model where questions D18-D20 measure only the factor "obligation to obey the police" and questions D21-D23 only the factor "moral alignment with the police". Fit this as a multigroup model to data from all 27 countries, specifying the measurement model to have full invariance of measurement across the countries. Use the results of the model to compare the estimated means, variances and correlations of the factors between the countries.

Stata commands:

// Example of multigroup analysis under full equivalence:
* Convert the country variable from string to a numeric variable
encode cntry, gen(country)
* Fit the model (factor variances are fixed at 1 in group 1):
sem (Obey -> bplcdc@v1 doplcsy dpcstrb) ///
    (MoralAlign -> plcrgwr@v2 plcipvl gsupplc), ///
    var(1: Obey@1) var(1: MoralAlign@1) method(mlmv) ///
    group(country) ginvariant(mcons mcoef merrvar)
** Standardized estimates, to get the estimated correlation
** (i.e. standardized covariance) between the factors:
sem, stand

R commands:

# Example of multigroup analysis under full equivalence:
library(lavaan)

countries <- unique(ESS5Police$cntry)
length(countries) # 27 countries
# The "c(1,rep(NA,26))*" constrains the factor variances to be 1 in the
# first country.
ModelSyntax <- '
  Obey =~ NA*bplcdc + doplcsy + dpcstrb
  MoralAlign =~ NA*plcrgwr + plcipvl + gsupplc
  Obey ~~ c(1,rep(NA,26))* Obey
  MoralAlign ~~ c(1,rep(NA,26))* MoralAlign
'
FittedModel.MG <- sem(model = ModelSyntax,
                      data = ESS5Police, group = "cntry",
                      meanstructure = TRUE, missing = "ml",
                      group.equal = c("intercepts", "loadings", "residuals"))
summary(FittedModel.MG)
# A more concise table of the estimates:
parameterEstimates(FittedModel.MG, fmi = FALSE, standardized = TRUE)

Estimated means, variances and correlation of the two factors from the multigroup model are shown in Table 3.1 for each of the countries, and also in graphical form in Figures 3.1-3.4. Note that here the factor means are fixed at 0 and factor variances at 1 for the first country, Belgium. From these results, we may for example observe the following:

Table 3.1: Estimated distributions of the factors for a 2-factor multigroup confirmatory factor analysis model for indicators of Obligation to obey the police and Moral alignment with the police, fitted to data from 27 countries in the ESS.

                          Means          Variances
Country                  Obey MoralAl    Obey MoralAl   corr.
Belgium (BE)                0       0       1       1    0.35
Bulgaria (BG)           -0.53   -0.37    2.56    1.81    0.46
Switzerland (CH)         0.36    0.26    1.25    0.83    0.12
Cyprus (CY)              0.45   -0.22    1.33    1.55    0.45
Czech Republic (CZ)      0.23   -0.50    1.54    1.47    0.31
Germany (DE)             0.31    0.36    1.17    0.84    0.35
Denmark (DK)             0.87    0.41    0.82    0.75    0.41
Estonia (EE)            -0.23    0.12    1.74    0.82    0.24
Spain (ES)              -0.10    0.10    0.94    1.06    0.39
Finland (FI)             0.75    0.57    0.60    0.62    0.50
France (FR)             -0.15   -0.30    1.12    1.50    0.36
United Kingdom (GB)     -0.01    0.07    1.10    1.03    0.44
Greece (GR)             -0.23   -0.62    1.42    1.72    0.55
Croatia (HR)            -0.36   -0.33    1.99    1.23    0.42
Hungary (HU)             0.38   -0.33    1.53    1.33    0.37
Ireland (IE)            -0.18    0.17    1.30    1.44    0.50
Israel (IL)              0.60   -0.72    1.57    1.86    0.29
Lithuania (LT)          -0.03   -0.21    1.64    0.96    0.40
Netherlands (NL)         0.26    0.10    0.87    0.78    0.35
Norway (NO)              0.39    0.41    0.94    0.67    0.52
Poland (PL)              0.04    0.11    1.37    0.87    0.27
Portugal (PT)            0.08   -0.11    1.26    0.94    0.43
Russia (RU)             -0.89   -0.98    1.77    1.71    0.47
Sweden (SE)              0.54    0.30    0.98    0.57    0.44
Slovenia (SI)           -0.71   -0.30    2.11    1.10    0.34
Slovakia (SK)           -0.10   -0.11    1.76    1.14    0.23
Ukraine (UA)            -0.67   -1.17    2.05    2.05    0.31
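The table reports variances and correlations, whereas the model is parameterized in terms of variances and covariances, and Figure 3.4 uses standard deviations. Converting between these is a one-line calculation, sketched here in R for four rows copied from Table 3.1 (the object and column names are only illustrative):

```r
# Four rows of Table 3.1 (values copied from the table)
tab <- data.frame(
  country   = c("BE", "BG", "FI", "UA"),
  var_obey  = c(1.00, 2.56, 0.60, 2.05),
  var_moral = c(1.00, 1.81, 0.62, 2.05),
  corr      = c(0.35, 0.46, 0.50, 0.31)
)

# Standard deviations and the factor covariance implied by the correlation
tab$sd_obey  <- sqrt(tab$var_obey)
tab$sd_moral <- sqrt(tab$var_moral)
tab$cov      <- tab$corr * tab$sd_obey * tab$sd_moral

cbind(tab["country"], round(tab[c("sd_obey", "sd_moral", "cov")], 2))
```

In the reference country (Belgium), where both variances are fixed at 1, the covariance and the correlation coincide.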

Figure 3.1: Estimated means of the factor Obligation to obey the police (with 95% confidence intervals) for each country, from the model shown in Table 3.1.

Figure 3.2: Estimated means of the factor Moral alignment with the police (with 95% confidence intervals) for each country, from the model shown in Table 3.1.

Figure 3.3: Estimated means of the two factors for each country, from the model shown in Table 3.1.

Figure 3.4: Estimated means against estimated standard deviations of the factor Moral alignment with the police for each country, from the model shown in Table 3.1.

Example 2 on Multigroup factor analysis: Assessing non-invariance of measurement

Consider the data on the questions D18-D20 which are treated as measures of the factor "obligation to obey the police", for data from Denmark, Norway and Sweden. Fit multigroup models with one factor, and compare models with different specifications of measurement invariance and non-invariance in the items. How well do these different models fit the data, and how do they affect conclusions about cross-national comparisons of the mean of the factor?

Here we consider only three countries and one factor, to keep the command file in Stata relatively short (the commands for all 27 countries and more factors and items would be an obvious extension of these).

Stata commands:

// Example of measurement invariance assessment,
// using three countries and the items for one factor
// (obligation to obey) for illustration
* Convert the country variable from string to a numeric variable
* (if not done already earlier):
encode cntry, gen(country)
* With all 27 countries encoded in alphabetical order of their codes,
* the group numbers are 7 = DK, 20 = NO and 24 = SE.
* Full invariance model:
sem (Obey -> bplcdc@v1 doplcsy dpcstrb) ///
    if cntry=="DK" | cntry=="NO" | cntry=="SE", ///
    var(7: Obey@1) method(mlmv) ///
    group(country) ginvariant(mcons mcoef merrvar)
** Likelihood ratio tests of freeing each of the measurement parameters:
estat ginvariant
estimates store full_invariance
* Scalar invariance: Free error variances for all items
sem (Obey -> bplcdc@v1 doplcsy dpcstrb) ///
    if cntry=="DK" | cntry=="NO" | cntry=="SE", ///
    var(7: Obey@1) method(mlmv) ///
    group(country) ginvariant(mcons mcoef)
estimates store scalar_invariance
* Different levels of non-equivalence for one item (here bplcdc)
** Scalar invariance: Free error variances
sem (Obey -> bplcdc@v1 doplcsy dpcstrb) ///
    if cntry=="DK" | cntry=="NO" | cntry=="SE", ///
    var(7: Obey@1) method(mlmv) ///
    group(country) ginvariant(mcons mcoef merrvar) ///
    var(20: e.bplcdc@v11) var(24: e.bplcdc@v12)
estimates store scalar_invariance1
** Metric invariance: Free error variances and intercepts
sem (Obey -> bplcdc@v1 doplcsy dpcstrb) ///
    (20: _cons@v31 -> bplcdc) ///
    (24: _cons@v32 -> bplcdc) ///
    if cntry=="DK" | cntry=="NO" | cntry=="SE", ///
    var(7: Obey@1) method(mlmv) ///
    group(country) ginvariant(mcons mcoef merrvar) ///
    var(20: e.bplcdc@v11) var(24: e.bplcdc@v12)
estimates store metric_invariance1
** Full non-equivalence: Free error variances, intercepts and loadings
sem (Obey -> doplcsy@v1 bplcdc dpcstrb) ///
    (20: _cons@v31 -> bplcdc) ///
    (20: Obey -> bplcdc@v21) ///
    (24: _cons@v32 -> bplcdc) ///
    (24: Obey -> bplcdc@v22) ///
    if cntry=="DK" | cntry=="NO" | cntry=="SE", ///
    var(7: Obey@1) method(mlmv) ///
    group(country) ginvariant(mcons mcoef merrvar) ///
    var(20: e.bplcdc@v11) var(24: e.bplcdc@v12)
estimates store full_nonequivalence1
* Likelihood ratio tests against full invariance model
lrtest full_invariance scalar_invariance
lrtest full_invariance scalar_invariance1
lrtest full_invariance metric_invariance1
lrtest full_invariance full_nonequivalence1

R commands:

# Example of measurement invariance assessment,
# using three countries and the items for one factor
# (obligation to obey) for illustration:
library(lavaan)

# Select respondents from Denmark, Norway and Sweden
# (%in% returns FALSE rather than NA when cntry is missing)
ind <- ESS5Police$cntry %in% c("DK", "NO", "SE")
# Full invariance model:
# The "c(1,rep(NA,2))*" fixes the factor variance at 1 in the first group.
# lavaan orders the groups as they first appear in the data.
ModelSyntax <- '
  Obey =~ NA*bplcdc + doplcsy + dpcstrb
  Obey ~~ c(1,rep(NA,2))* Obey
'
FittedModel.fullinv <- sem(model = ModelSyntax,
                           data = ESS5Police[ind,], group = "cntry",
                           meanstructure = TRUE, missing = "ml",
                           group.equal = c("intercepts", "loadings", "residuals"))
summary(FittedModel.fullinv)
# Scalar invariance: Free error variances for all items
FittedModel.scalar <- sem(model = ModelSyntax,
                          data = ESS5Police[ind,], group = "cntry",
                          meanstructure = TRUE, missing = "ml",
                          group.equal = c("intercepts", "loadings"))
summary(FittedModel.scalar)
# Different levels of non-equivalence for one item (here bplcdc)
## Scalar invariance: Free error variances
FittedModel.scalar1 <- sem(model = ModelSyntax,
                           data = ESS5Police[ind,], group = "cntry",
                           meanstructure = TRUE, missing = "ml",
                           group.equal = c("intercepts", "loadings", "residuals"),
                           group.partial = c("bplcdc~~bplcdc"))
summary(FittedModel.scalar1)
## Metric invariance: Free error variances and intercepts
FittedModel.metric1 <- sem(model = ModelSyntax,
                           data = ESS5Police[ind,], group = "cntry",
                           meanstructure = TRUE, missing = "ml",
                           group.equal = c("intercepts", "loadings", "residuals"),
                           group.partial = c("bplcdc~~bplcdc", "bplcdc~1"))
summary(FittedModel.metric1)
## Full non-equivalence: Free error variances, intercepts and loadings
FittedModel.noneq1 <- sem(model = ModelSyntax,
                          data = ESS5Police[ind,], group = "cntry",
                          meanstructure = TRUE, missing = "ml",
                          group.equal = c("intercepts", "loadings", "residuals"),
                          group.partial = c("bplcdc~~bplcdc", "bplcdc~1", "Obey=~bplcdc"))
summary(FittedModel.noneq1)
# Likelihood ratio tests against full invariance model:
anova(FittedModel.fullinv, FittedModel.scalar)
anova(FittedModel.fullinv, FittedModel.scalar1)
anova(FittedModel.fullinv, FittedModel.metric1)
anova(FittedModel.fullinv, FittedModel.noneq1)

Results for the fitted models are summarized in Table 3.2. Here we consider for illustration seven different specifications for the measurement models across the three countries: full invariance; scalar invariance for all three items; scalar invariance, metric invariance and complete non-invariance for item D18 only; and complete non-invariance for item D19 only and for item D20 only. Note that models with metric invariance in all the items or models with non-invariance in two or more items are not included, because such models would not allow the identification of distinct estimates of factor means for the different countries.

Considering first the goodness of fit of the models, likelihood ratio tests indicate that allowing for non-invariance in any one item would improve the fit compared to the full invariance model. The tests of partial invariance models shown for item D18 suggest that for this item at least the measurement intercepts in particular are significantly different between the countries. The AIC and BIC statistics indicate that the model preferred by each of them includes non-invariance of measurement. So in these data, even with just three items and three culturally and linguistically fairly similar countries, a multigroup analysis suggests significant deviations from exact invariance of measurement.

What matters most for substantive interpretation, however, is whether comparative conclusions about the constructs being measured are affected by different choices for the measurement models. Here they are not. Table 3.2 also shows that all the models considered here yield very similar estimates for the means of the factor. According to all of them, the average level of felt obligation to obey the police is around -0.52 in Norway and around -0.36 in Sweden, on a scale where the mean in Denmark is fixed at 0 and the standard deviation in Denmark is fixed at 1. The estimated standard errors of these estimates are around 0.04, so all the differences between the country means are statistically significant. Since all the models give similar results about the factors, here we could without difficulty focus on the simplest model which assumes invariance of measurement.

Table 3.2: Estimated factor means and model assessment statistics for a 1-factor multigroup confirmatory factor analysis model for indicators of Obligation to obey the police, under different specifications of non-invariance of measurement. The models are fitted to data from Denmark, Norway and Sweden.

P-value: LR-test against the full invariance model. The last three columns give the estimated mean (and standard error) of the factor [Obligation to obey the police]; the mean for Denmark is constrained to 0.

Measurement model                        P-value     AIC     BIC   Denmark   Norway          Sweden
Full invariance                                -   57417   57501         0   -0.52 (0.04)    -0.36 (0.04)
Scalar invariance                         <0.001   57394   57516         0   -0.51 (0.04)    -0.36 (0.04)
  (error variances free)
Scalar invariance for item D18              0.10   57417   57513         0   -0.52 (0.04)    -0.36 (0.04)
Metric invariance for item D18            <0.001   57384   57493         0   -0.50 (0.04)    -0.37 (0.04)
  (error variance and intercept free)
Complete non-invariance for item D18      <0.001   57381   57503         0   -0.50 (0.04)    -0.37 (0.04)
Complete non-invariance for item D19      <0.001   57397   57519         0   -0.57 (0.05)    -0.34 (0.04)
Complete non-invariance for item D20      <0.001   57396   57518         0   -0.52 (0.04)    -0.37 (0.04)
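The sensitivity of the conclusions to the choice of measurement model can be summarized directly from the estimates in Table 3.2. The R sketch below is only an illustration; the vectors hold the table's values in row order, and the object names are hypothetical.

```r
# Factor mean estimates from Table 3.2, in row order
# (Denmark is fixed at 0 in every model)
norway    <- c(-0.52, -0.51, -0.52, -0.50, -0.50, -0.57, -0.52)
sweden    <- c(-0.36, -0.36, -0.36, -0.37, -0.37, -0.34, -0.37)
se_norway <- c( 0.04,  0.04,  0.04,  0.04,  0.04,  0.05,  0.04)
se_sweden <- rep(0.04, 7)

# Spread of the estimates across the seven measurement specifications
spread <- c(Norway = diff(range(norway)), Sweden = diff(range(sweden)))
spread

# Approximate z-statistics for the comparisons with Denmark in each model
z_norway <- norway / se_norway
z_sweden <- sweden / se_sweden
range(z_norway)
range(z_sweden)
```

The estimates vary across the models by at most 0.07 for Norway and 0.03 for Sweden, which is of the same order as their standard errors, and the comparisons with Denmark are clearly significant under every specification. Testing the difference between Norway and Sweden would in addition require the covariance of the two estimates, which is not shown in the table.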
