
# CHAPTER 2: The comparability of attitude measurements

### Measurement equivalence

In this module, we study recent changes in anti-immigration attitudes in 17 European countries. Anti-immigration attitudes are operationalised by means of a scale consisting of three items. Thus, the purpose of this module is to compare an attitude scale over different time points and across countries. However, comparing abstract concepts - such as attitude scales - involves additional methodological issues. Before meaningful comparisons can be made over time and across countries, it is necessary to guarantee that the variable of interest is measured in a sufficiently comparable way. After all, it could be the case that respondents in different countries interpret the items in very divergent ways, due, for example, to culture-specific terms or translation errors. It is also possible that the interpretation of items changes over time. Obviously, if the meaning of items is not constant over time and across countries, then we end up comparing apples and oranges.

The notion of measurement equivalence refers to precisely this important question of the comparability of measurements. An instrument is said to be equivalent when it measures exactly the same attribute under different conditions (e.g. at various time points or in different countries) [Hor92]. When measurement equivalence is absent, our conclusions risk being flawed. Observed differences between countries or time points could reflect method bias rather than substantial differences in the concepts we intend to measure. Similarly, finding no differences would not necessarily guarantee that ‘real’ differences are absent. For these reasons, measurement equivalence should not be taken for granted, but, instead, should be regarded as a hypothesis that needs to be tested.

This second chapter explains how the comparability of measurement scales can be tested in practice, using AMOS. No previous knowledge of AMOS is required to understand this chapter. The chapter is nevertheless both conceptually and statistically more demanding than the other chapters in this module. The other side of the coin is that, once you have completed this chapter, you will have gained knowledge about a very powerful methodological tool that is rapidly gaining in importance in the field of comparative research. Do not let the statistical notation in the first paragraphs of this chapter discourage you. Once you have worked through the theory, the chapter offers a very concrete, step-by-step guide to performing equivalence tests. Those who are not interested in the topic, or who consider it too complex, can skip this chapter and go directly to Chapter 3.

# Multiple Group Confirmatory Factor Analysis, part 1

Several techniques have been proposed to test measurement equivalence. Multiple group confirmatory factor analysis (MGCFA) [Jör71] is one of the most popular techniques to assess measurement equivalence [Bil03] [Byr89] [Ren98] [Ste98]. MGCFA is a quite straightforward extension of conventional confirmatory factor analysis (CFA). In CFA, observed items are considered to be indicators of an unobserved (or latent) concept. These indicators are imperfect, in the sense that, besides measuring the intended concepts, they are also affected by measurement error. Systems of equations are used to describe the relations between observed items and the latent concepts these items are supposed to measure. In our case, we have three items measuring one latent concept, ‘anti-immigration attitudes’. More generally, let us assume that we have p items measuring m latent variables. The observed item scores x_{i} (i = 1,...,p) can then be written as linear functions of the latent variables **ξ**_{j} (j = 1,...,m):

$$\mathbf{x} = \boldsymbol{\tau} + \boldsymbol{\Lambda}\boldsymbol{\xi} + \boldsymbol{\delta} \tag{1}$$

In expression (1), **x** refers to a p × 1 vector containing the observed item scores. This vector is modelled as the sum of three components. **Λξ** is the product of a p × m matrix containing the factor loadings (**Λ**) and an m × 1 vector with the latent variable scores (**ξ**). The factor loadings can be seen as the slopes of a regression of x_{i} on **ξ**_{j}. **τ** is a p × 1 vector with the intercepts of the functions. These intercepts refer to the expected value of the observed items when the latent variable score is equal to zero. Finally, **δ** is a p × 1 vector containing stochastic error terms that are assumed to follow a multivariate normal distribution and to have the expected value 0.
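To make expression (1) more tangible, the following minimal sketch simulates item scores from a one-factor model with three items. All parameter values are hypothetical and chosen only for illustration; the sketch also assumes a normally distributed latent variable and uncorrelated error terms, which is a simplification of the general model.

```python
import random

def simulate_items(tau, lam, kappa, phi, theta, n, seed=0):
    """Draw n respondents from x_i = tau_i + lam_i * xi + delta_i (one latent variable)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        xi = rng.gauss(kappa, phi ** 0.5)  # latent score with mean kappa, variance phi
        # each item: intercept + loading * latent score + independent normal error
        row = [t + l * xi + rng.gauss(0.0, th ** 0.5)
               for t, l, th in zip(tau, lam, theta)]
        rows.append(row)
    return rows

# Hypothetical parameter values (not estimates from the ESS data)
tau = [2.0, 2.2, 2.5]      # intercepts
lam = [1.0, 1.1, 0.9]      # factor loadings
theta = [0.3, 0.25, 0.4]   # error variances

data = simulate_items(tau, lam, kappa=0.0, phi=0.5, theta=theta, n=20000)
item_means = [sum(col) / len(col) for col in zip(*data)]
print([round(m, 2) for m in item_means])  # close to tau, because the latent mean is 0
```

Because the latent mean is set to zero here, the simulated item means should lie close to the intercepts, which is exactly the interpretation of **τ** given above.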

# Multiple Group Confirmatory Factor Analysis, part 2

When correctly identified, this model results in the following mean structure **μ** and covariance structure
**Σ**:

$$\boldsymbol{\mu} = \boldsymbol{\tau} + \boldsymbol{\Lambda}\boldsymbol{\kappa} \tag{2}$$

$$\boldsymbol{\Sigma} = \boldsymbol{\Lambda}\boldsymbol{\Phi}\boldsymbol{\Lambda}' + \boldsymbol{\Theta} \tag{3}$$

where **μ** is a p × 1 vector with the item means, **κ** an m × 1 vector with the means of the latent variables **ξ**_{j}, **Λ**′ the transpose of **Λ**, **Σ** the p × p covariance matrix of the observed indicators, **Φ** an m × m covariance matrix of the latent variables and **Θ** a p × p matrix with the error (co)variances [Ste98]. Comparing these implied mean and covariance structures with the item means and covariances observed in the dataset makes it possible to assess how well the measurement model fits. The most straightforward way of assessing model fit is the chi-square test. However, the chi-square value is known to be very sensitive to large sample sizes. As an alternative, several goodness-of-fit indices have been developed, such as the Root Mean Square Error of Approximation (RMSEA) or the Comparative Fit Index (CFI) [Ben90] [Bro92].
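As an illustration of expressions (2) and (3), the sketch below computes the implied means and covariances for the special case of one latent variable (m = 1), where **Λ**, **κ** and **Φ** reduce to a list of loadings and two scalars. It assumes a diagonal **Θ** (no error covariances), and the numbers are hypothetical.

```python
def implied_moments(tau, lam, kappa, phi, theta):
    """Model-implied item means and covariances for a one-factor model."""
    p = len(tau)
    # mean structure: mu_i = tau_i + lam_i * kappa
    mu = [tau[i] + lam[i] * kappa for i in range(p)]
    # covariance structure: sigma_ij = lam_i * phi * lam_j, plus theta_i on the diagonal
    sigma = [[lam[i] * phi * lam[j] + (theta[i] if i == j else 0.0)
              for j in range(p)]
             for i in range(p)]
    return mu, sigma

# Hypothetical parameter values, for illustration only
mu, sigma = implied_moments(tau=[2.0, 2.2, 2.5], lam=[1.0, 1.1, 0.9],
                            kappa=0.5, phi=0.5, theta=[0.3, 0.25, 0.4])
print(mu)
for row in sigma:
    print(row)
```

Model fit is then a matter of how far these implied values lie from the means and covariances actually observed in the data.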

In order to be useful for measurement equivalence testing, the CFA model described above has to be extended to a multi-group setting [Jör71]. Specifically, this means that a CFA model is estimated separately but simultaneously for different groups g (g = 1,...,G) of respondents.

$$\mathbf{x}^{g} = \boldsymbol{\tau}^{g} + \boldsymbol{\Lambda}^{g}\boldsymbol{\xi}^{g} + \boldsymbol{\delta}^{g} \tag{4}$$

In this case, we have 51 different groups, namely the inhabitants of 17 countries at three time points (17 x 3 = 51).

# Various levels of measurement equivalence, part 1

In the MGCFA approach, measurement equivalence can be tested by examining how similar the measurement models are across groups. Obviously, the more similar the models are, the greater the comparability. Several levels of measurement equivalence can be distinguished, each with its own implications for the comparability of scores. These levels are ordered hierarchically, in the sense that higher equivalence levels presuppose lower ones. Higher equivalence levels are harder to obtain as they present a stronger test of cross-cultural equivalence, but also allow a more extensive form of cross-cultural or cross-time comparison.

The lowest level of equivalence is called configural equivalence [Hor92] [Ste98]. Configural equivalence means that the measurement model for the latent concept has the same factor structure across groups. In other words, configural equivalence implies that, if an item loads strongly on the latent factor in one group, it also has a high factor loading in other groups. However, the strength of the factor loadings can differ across countries and time points, as no restrictions are placed on the magnitude of these parameters [Ste98]. Generally, this basic level of measurement equivalence is relatively easy to reach. The other side of the coin is that configural equivalence does not guarantee comparability across groups. Configural equivalence only means that the latent concepts can be meaningfully discussed in all groups. Since configural equivalence is a prerequisite for further equivalence testing, it is often used as a baseline [Van00].

# Various levels of measurement equivalence, part 2

A second and higher level of equivalence is called metric equivalence [Ste98], which has also been referred to as construct equivalence [Van97]. Operationally, metric equivalence presupposes that factor loadings in the measurement model are equal across groups:

$$\boldsymbol{\Lambda}^{1} = \boldsymbol{\Lambda}^{2} = \dots = \boldsymbol{\Lambda}^{G} \tag{5}$$

where **Λ** stands for the factor loading vector, the superscript indicates the group (a country at a specific time point) and G is the number of groups. Metric equivalence implies the cross-cultural equality of the intervals of the scale on which the latent concept is measured. In other words, an increase of one unit on the measurement scale of the latent variable has the same meaning across groups. However, latent variable scores can still be uniformly biased upwards or downwards. Because of this possibility of additive bias, metric equivalence does not guarantee that latent means are comparable over groups. Nevertheless, metric equivalence is a necessary and sufficient condition for comparing statistics that are based on mean-corrected scores (such as regression coefficients and covariances) across groups [1].

An even higher level of equivalence, scalar equivalence, should be established to justify comparing the means of the latent variables across countries or over time [Mee97] [Ste98]. Scalar equivalence holds if, in addition to the factor loadings, the intercepts of the indicators in the measurement model are also equal across groups:

$$\boldsymbol{\tau}^{1} = \boldsymbol{\tau}^{2} = \dots = \boldsymbol{\tau}^{G} \tag{6}$$

where **τ** stands for the indicator intercept vector, the superscript indicates the group (a country at a specific time point) and G is the number of groups. Scalar equivalence implies that the measurement scales not only have the same intervals, but also share origins. This makes it possible to compare raw scores in a valid way, which is a prerequisite for latent mean comparisons across countries or over time.

To summarise, if we want to compare attitude means over countries and time points, factor loadings as well as intercepts should be equal across groups.
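A small worked example may help to show why equal intercepts matter. The sketch below uses the mean structure of the model (item mean = intercept + loading × latent mean) for two hypothetical groups that have the same latent mean and the same loading, but different intercepts. The values are invented for illustration.

```python
def expected_item_mean(tau, lam, kappa):
    """Expected observed item mean under the model: tau + lam * kappa."""
    return tau + lam * kappa

# Two hypothetical groups with the SAME latent mean and the SAME loading
kappa = 0.0
lam = 1.0
mean_a = expected_item_mean(tau=2.0, lam=lam, kappa=kappa)
mean_b = expected_item_mean(tau=2.3, lam=lam, kappa=kappa)  # intercept shifted upwards

# The observed gap reflects the intercept shift only, not a difference in the attitude
print(mean_b - mean_a)
```

In this situation metric equivalence holds, yet a naive comparison of item means would suggest a difference in attitudes that does not exist at the latent level. This is the additive bias that scalar equivalence rules out.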

- [1] Steenkamp and Baumgartner 1998: 80.

# Full vs. partial equivalence

When the parameters for all items in the measurement model are equal across groups, we speak of full measurement equivalence. However, Byrne et al. (1989) [Byr89] have argued that full equivalence is not a necessary condition for comparisons to be valid. If at least two items per latent variable (namely, the item that is fixed at unity to identify the model and one other item) are equivalent, comparisons can be validly made across countries and time points. Thus, partial equivalence does not necessarily require the invariance of all loadings and intercepts. This idea is also supported by Steenkamp and Baumgartner (1998) [Ste98].

In this module, we want to trace the evolution of anti-immigration attitudes in 17 different European countries. This boils down to comparing the means of the latent variable over the 51 groups. Before such comparisons can be made in a valid way, we need to test whether the scale possesses the characteristic of partial scalar equivalence. In other words, we need to assess whether, for at least two out of three items, factor loadings and indicator intercepts are equal across groups.

In practice, the different levels of measurement equivalence can be tested by fitting various, increasingly restrictive multi-group models. The first model will have the same factor structure across groups, but with no constraints on the parameters. In other words, factor loadings and intercepts can vary across countries (= configural equivalence). In a second model, we will constrain the factor loadings to be equal across groups (= metric equivalence). The third model will, besides loadings, also have equal intercepts across groups (= scalar equivalence). The level of measurement equivalence can then be determined by judging the fit of these various models. In addition, we will look at modification indices (cf. infra) to determine the sources of possible misfit.

# Exercise 2.1. A measurement equivalence test of the anti-immigration attitudes scale

In this exercise, we perform a test of measurement equivalence for the anti-immigration attitudes scale. We use the software package AMOS to estimate an MGCFA model for the three indicators. Our model will consist of 51 groups, namely 17 countries at three time points. We perform this analysis step-by-step, so that prior knowledge of AMOS is not required to perform the exercise. For more detailed information about AMOS, refer to the AMOS user’s guide [Arb05].

### Step 1: Create a group membership variable

Before a multi-group model can be estimated in AMOS, we should perform one more preparatory step in SPSS: creating a variable that indicates the group to which a respondent belongs. Combine CNTRY and ESSROUND into a new variable, called GROUP. Hint: one possible way of doing this is to use the CONCAT function for merging string variables. Do not forget to save your dataset afterwards.

- Open the file you worked with in the first chapter and start where it ended. Please do not forget to change ‘C:\’ to the path where you stored the ESS datasets.
- Create a new variable ‘group’, and define it as a character variable of length 12. This variable will be the group membership variable.
- Create a new variable ‘round_char’, and define it as a character variable of length 2. round_char contains the same information as ESSROUND, but is a character variable. This is necessary, because only character variables can be used with the CONCAT function.
- Merge the character variables cntry and round_char into ‘group’. The functions rtrim and ltrim are used to remove all blanks. If the blanks are not removed, AMOS will run into problems at a later stage of the analysis. Again, please do not forget to change ‘C:\’ to the path where you stored the ESS datasets.
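For readers who prepare their data outside SPSS, the logic of this step can be sketched in a few lines of Python. This is not the module's SPSS syntax; the function name `make_group` and the example records are hypothetical, and the sketch assumes the round number is stored as an integer or a string.

```python
def make_group(cntry, essround):
    """Concatenate country code and round number, removing surrounding blanks."""
    return str(cntry).strip() + str(essround).strip()

# Hypothetical records standing in for rows of the ESS data file
records = [
    {"cntry": "AT ", "essround": 1},
    {"cntry": " BE", "essround": 3},
]
for record in records:
    record["group"] = make_group(record["cntry"], record["essround"])

print([record["group"] for record in records])  # ['AT1', 'BE3']
```

As in the SPSS version, stripping the blanks is the important part: a value such as ‘AT 1’ would not match the group value ‘AT1’ used later on.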

### Step 2: Create groups and assign data

Save your work in SPSS and close the program.

AMOS is part of the SPSS software suite, but not all of you will have access to this package.

Open ‘AMOS Graphics’. This is AMOS’s graphical interface that makes it quite straightforward to perform multi-group analyses. Click ‘File > New’ to create a new Amos project. The first thing you do is to create 51 different groups. Click ‘Analyze > Manage groups...’.

The ‘Manage groups’ window pops up. This window can be used to create, delete and rename groups. Replace the text ‘Group number 1’ with the name of the first group (AT1). Next, click ‘New’ to create a new group and name it AT2. Continue until the 51 groups have been created. (The groups to be created are: AT1 AT2 AT3 BE1 BE2 BE3 CH1 CH2 CH3 DE1 DE2 DE3 DK1 DK2 DK3 ES1 ES2 ES3 FI1 FI2 FI3 FR1 FR2 FR3 GB1 GB2 GB3 HU1 HU2 HU3 IE1 IE2 IE3 NL1 NL2 NL3 NO1 NO2 NO3 PL1 PL2 PL3 PT1 PT2 PT3 SE1 SE2 SE3 SI1 SI2 SI3).
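Typing 51 names by hand invites mistakes, so it can be useful to generate the list once and check your entries against it. The following sketch simply combines the 17 country codes listed above with the three round numbers.

```python
countries = ["AT", "BE", "CH", "DE", "DK", "ES", "FI", "FR", "GB",
             "HU", "IE", "NL", "NO", "PL", "PT", "SE", "SI"]
rounds = [1, 2, 3]

# One label per country and round, in the same order as the list in the text
groups = [f"{country}{rnd}" for country in countries for rnd in rounds]

print(len(groups))     # 51
print(" ".join(groups))
```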

In the group pane, you can scroll through all groups that have been created. Check whether all necessary groups have been created. The next step consists of assigning data to the 51 groups. Click the ‘Select data file(s)’ tab.

The ‘Data Files’ window pops up. This window is used to specify what part of the dataset belongs to which group. Click on AT1 to select the first group. Now click ‘File Name’ and browse to the SPSS dataset we created before (ESS123_immig.sav). Next, click ‘Grouping Variable’, and select the variable indicating group membership. In our case, this is the variable ‘GROUP’. Finally, click ‘Group Value’, and select the value that corresponds to the selected group (here: AT1). These steps should be repeated for all 51 groups (which can be a fairly time-consuming task). If the data are correctly assigned, the number of observations in the group should appear in the last column of the ‘Data Files’ window. Group AT1, for example, contains 1,677 observations (out of 84,331 in the complete data set).

To avoid any risk of losing the work done, save the project (File > Save) under the name ‘basemodel’.

# Exercise 2.1, step 3: Specify the model

The graphic interface in AMOS includes a very user-friendly way of specifying models by making a drawing. You only have to specify one model that will be estimated in all 51 groups. Click on the ‘Draw latent variable’ tab. Move your cursor into the square on the right, and draw an oval (first click, then drag). This oval represents the latent variable. Now click three times inside the oval to draw the three indicators that measure the latent variable. The resulting drawing is the backbone of our model.

Next, we assign names to the variables in the model. The rectangles refer to observed indicators that are present in the dataset. Click the ‘List variables in dataset’ tab. The ‘Variables in Dataset’ window pops up, providing you with a list of all variables present in the dataset. Drag each of the three anti-immigration items (IMSMETN, IMDFETN, IMPCNTR) into one of the rectangles. Now the variable labels will appear in the rectangles (although they are hard to read because of the overlap). Close the ‘Variables in Dataset’ window.

Now we still have to assign names to the latent variables (represented by circles or ovals) in the model. These latent variables are by definition unobserved, and thus do not correspond to specific items in the dataset. As a result, the names are arbitrary. AMOS contains a useful tool for naming all latent variables at once. Select ‘Plugins’ from the main toolbar, and then click ‘Name Unobserved Variables’. AMOS automatically assigns the name F1 to the latent factor (the oval), and e1-e3 to the error terms (the three circles).

# Exercise 2.1, step 4: Estimating the model

Before the actual model is estimated, a few options need to be specified. Go to the ‘Analysis Properties’ Window (View > Analysis Properties). Check the box ‘Estimate means and intercepts’. By default, AMOS does not take the mean structure of the data into account; only factor loadings are estimated, no intercepts. In order to test for scalar equivalence, however, we also need to look at the intercepts.

Select the ‘Output’ tab of the Analysis Properties window. Make sure the following boxes are checked: ‘Standardized estimates’ and ‘Modification indices’.

The goal of this multi-group analysis is to test the equality of factor loadings and intercepts across groups. As mentioned, we do this by inspecting the fit of models with and models without constrained factor loadings and intercepts. Thus, different models with different configurations of constraints have to be estimated: an unconstrained model (configural equivalence), a model with equal factor loadings (metric equivalence) and a model with equal factor loadings and intercepts (scalar equivalence). Fortunately, AMOS contains a ‘magic button’ that allows you to estimate all the required models in just a few clicks: the ‘Multiple-group analysis’ tab.

If you click the ‘Multiple-group analysis’ tab, the ‘Multiple-group analysis’ window pops up. This window summarises all models that are estimated by default. Checked boxes indicate that certain parameters are constrained to be equal across groups. The models are nested. In model 1, for example, only measurement weights (i.e. factor loadings) are constrained to be equal. Model 2 contains equal factor loadings as well as intercepts. Only these first two models are of interest to the current analysis. We accept the default settings, and click OK.

Finally, click the ‘Calculate Estimates’ tab. The estimation procedure can take from a few seconds to a few minutes, depending on the computational power of your computer. If the estimation converged, the XXs in the model pane should have turned into OKs.

# Exercise 2.1, step 5: Interpretation of the output

Now comes the most important step of the analysis: the interpretation of the output. The output can be accessed by clicking the ‘View text’ button.

The estimation has produced a massive amount of output. The AMOS ‘Output window’ allows you to navigate through this output in a very efficient way by selecting different parts of the output, different groups and different models.

We start by inspecting the fit of the different models. Click ‘Model fit’, and then ‘CMIN’. The ‘CMIN’ heading contains the chi-square values and the number of degrees of freedom for the various models. Remember that we are only interested in three models: unconstrained (configural equivalence), measurement weights (metric equivalence) and measurement intercepts (scalar equivalence).

### Question

What do these chi-square values tell you about the fit of the models?

Solution: The unconstrained model has a chi-square value of 0. This is not surprising: models with only three indicators and without additional constraints are just-identified (this means that they have 0 degrees of freedom). Fit indices for this model are therefore not very informative. The ‘measurement weights’ model has a chi-square value of 1,573.024 for 100 degrees of freedom; the ‘measurement intercepts’ model has a chi-square value of 15,536.361 for 250 degrees of freedom. The higher the chi-square value, the larger the discrepancy between our model and the observed data. The p-values for both models are 0. This indicates that it is very unlikely that the model gives an adequate description of the data. In other words, the chi-square tests indicate bad model fit and reject metric as well as scalar equivalence. However, chi-square tests are known to be very sensitive to large sample sizes. If the sample size is sufficiently large, even negligible misspecifications in the model may lead to the rejection of the whole model. Because we are analysing a massive sample here (over 80,000 respondents!), it is better not to pay too much attention to these chi-square tests.
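The degrees of freedom reported above can be reproduced by counting observed moments and free parameters. The sketch below does this for a one-factor model; it assumes that one loading per group is fixed to 1, that the latent mean is fixed to 0 in every group, and that no parameters are freed when equality constraints are added. These assumptions are inferred from the reported values, not taken from the AMOS documentation.

```python
def df_multigroup(p, n_groups, equal_loadings=False, equal_intercepts=False):
    """Degrees of freedom of a one-factor model with p items in n_groups groups."""
    # observed moments per group: p means + p(p+1)/2 variances and covariances
    moments = n_groups * (p * (p + 3) // 2)
    # free parameters per group: (p-1) loadings, p intercepts,
    # 1 factor variance, p error variances
    free = n_groups * ((p - 1) + p + 1 + p)
    if equal_loadings:
        free -= (n_groups - 1) * (p - 1)   # each loading estimated once instead of per group
    if equal_intercepts:
        free -= (n_groups - 1) * p         # each intercept estimated once instead of per group
    return moments - free

print(df_multigroup(3, 51))                # 0   (unconstrained)
print(df_multigroup(3, 51, True))          # 100 (measurement weights)
print(df_multigroup(3, 51, True, True))    # 250 (measurement intercepts)
```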

To avoid the problems of these chi-square tests, alternative indices of model fit have been developed. Root Mean Square Error of Approximation (RMSEA) and Comparative Fit Index (CFI) are two very informative measures of how closely the model corresponds to the data. Click on ‘Baseline Comparisons’ and ‘RMSEA’ to view these alternative measures of model fit.

### Question

What do you conclude based on the CFI and RMSEA values?

Solution: As a rule of thumb, RMSEA values lower than 0.05 are considered to indicate acceptable model fit. Judging by RMSEA, the ‘measurement weights’ and the ‘measurement intercepts’ models fit the data reasonably well. Generally, it is assumed that models with CFI values larger than 0.90 are acceptable. Thus, the CFI-criterion also points in the direction of reasonable model fit, although the CFI of the ‘measurement intercepts’ model only marginally exceeds 0.90.
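To see how these indices relate to the chi-square values, the sketch below implements the common textbook point estimates of RMSEA and CFI. Software packages differ in how they adjust these formulas for multiple groups, so the results need not match the AMOS output to the last decimal. The CFI additionally requires the chi-square of the baseline (independence) model, which is not reported in this text, so only the function is shown.

```python
import math

def rmsea(chi2, df, n):
    """Textbook point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    if df <= 0:
        raise ValueError("RMSEA is undefined for a just-identified model (df = 0)")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2, df, chi2_base, df_base):
    """Comparative Fit Index relative to a baseline model."""
    d_model = max(chi2 - df, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base

n = 84331  # total sample size reported in this exercise
print(rmsea(1573.024, 100, n) < 0.05)    # 'measurement weights' model
print(rmsea(15536.361, 250, n) < 0.05)   # 'measurement intercepts' model
```

Both chi-square values are large in absolute terms, but once they are divided by the degrees of freedom and the sample size, the remaining misfit per degree of freedom is small.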

Overall fit indices such as RMSEA and CFI thus seem to provide some evidence that the REJECT-scale is comparable across countries and time points. However, to test measurement equivalence, we should not rely completely on RMSEA and CFI. RMSEA and CFI are measures of overall model fit. They summarise the goodness-of-fit of a complete model in a single number. The model could, for example, contain a severe misspecification in one of the groups or for one specific parameter, but still have a reasonable overall fit. Therefore, we should also check that no misfit is present in various parts of the model separately. In our case, a thorough test of measurement equivalence requires that we look at possible misfit due to the constrained factor loadings and intercepts.

So-called ‘modification indices’ are a very useful tool for inspecting local misfit. For all constrained parameters in the model, AMOS calculates a modification index. Modification indices indicate how much the chi-square value of a model would drop if the parameter were free instead of constrained (in other words, by how much the model fit would improve). Modification indices are in fact chi-square tests for individual equality constraints: high values indicate that the respective parameter constraint is ‘wrong’. However, since chi-square values are known to be very sensitive to large sample sizes (cf. supra), only really high values (larger than 20, say) should be taken as serious evidence of misfit. Because the chi-square values can be misleading, it is advisable to look at the expected parameter change as well. The expected parameter change indicates how much a parameter would change if the equality constraint were removed. Obviously, we are only interested in parameter changes that are substantial (larger than 0.10 in absolute value, say).
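The two-part rule of thumb (modification index above 20 and absolute parameter change above 0.10) is easy to apply mechanically once the values have been copied out of the AMOS output. A minimal sketch follows; the field names `mi` and `epc` are hypothetical. The first row is taken from the results reported later in this exercise, while the other two rows are invented to show cases that fail one of the two criteria.

```python
def violated(rows, mi_min=20.0, epc_min=0.10):
    """Keep rows where both the modification index and |parameter change| exceed the cut-offs."""
    return [row for row in rows
            if row["mi"] > mi_min and abs(row["epc"]) > epc_min]

rows = [
    {"group": "AT1", "item": "imsmetn", "mi": 33.94, "epc": 0.107},   # reported in this exercise
    {"group": "X1", "item": "imdfetn", "mi": 85.0, "epc": 0.04},      # hypothetical: large MI, trivial change
    {"group": "X2", "item": "impcntr", "mi": 6.5, "epc": -0.25},      # hypothetical: large change, small MI
]

for row in violated(rows):
    print(row["group"], row["item"])  # AT1 imsmetn
```

Requiring both conditions guards against flagging constraints that are statistically detectable in a very large sample but substantively negligible.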

To summarise, a conclusive test of measurement equivalence can be performed by looking at modification indices and expected parameter changes for the factor loadings and measurement intercepts. We start by looking at the equality constraints for the factor loadings. Click ‘modification indices’ and then ‘measurement weights’ (as mentioned above, these are the factor loadings). Next, select the ‘Measurement intercepts’ model in the bottom left corner of the output window (this is the model with the constrained factor loadings and intercepts). Now you see modification indices and parameter changes for the first group (AT1). Not all the modification indices presented refer to factor loadings: you should only pay attention to relations between items and the latent variable F1 (these modification indices are indicated by a statement with the following structure: ‘item name <--- F1’). Modification indices for the other groups can be examined by scrolling through the groups in the left-hand column.

- Select 'Modification indices - Regression weights'
- Select 'Measurement intercepts' model
- Scroll through the groups

### Question

Inspect modification indices and parameter changes for the factor loadings in each of the 51 groups. Do you see violated equality constraints (i.e. a modification index larger than 20 and a parameter change larger than 0.10 in absolute value)? If so, in which groups and for which items?

Solution: Quite a few equality constraints on the factor loadings appear to be violated. An overview of violations can be found in the table below:

Group | Parameter | Modification index | Parameter change |
---|---|---|---|
AT1 | imsmetn | 33.94 | 0.107 |
CH2 | imsmetn | 23.343 | -0.112 |
DK1 | imsmetn | 47.884 | -0.2 |
DK2 | imsmetn | 112.224 | -0.357 |
DK2 | imdfetn | 22.905 | 0.108 |
DK3 | imsmetn | 74.691 | -0.309 |
DK3 | imdfetn | 32.185 | 0.131 |
ES1 | imsmetn | 45.221 | 0.109 |
ES3 | imsmetn | 49.963 | 0.105 |
HU1 | imsmetn | 128.311 | -0.407 |
HU2 | imsmetn | 63.555 | -0.246 |
HU3 | imsmetn | 149.543 | -0.406 |
NO2 | imsmetn | 78.555 | -0.241 |
NO3 | imsmetn | 29.665 | -0.137 |
PT1 | imsmetn | 108.786 | 0.163 |
PT2 | imsmetn | 115.86 | 0.17 |
PT3 | imsmetn | 241.82 | 0.209 |
PT3 | imdfetn | 135.839 | -0.122 |
SI2 | imsmetn | 22.375 | 0.108 |

Modification index larger than 20 and a parameter change larger than 0.10.

In these specific cases, the assumption of equal factor loadings is untenable. If we were to give up the equality constraint on the factor loading of IMSMETN in group AT1 (Austria at time point 1), for example, the chi-square value would drop by 33.94 units. The freely estimated factor loading would be 0.107 higher than now estimated (this is the expected parameter change). Specifically, this means that, in this group, attitudes towards immigrants from the same ethnic group are more closely connected to anti-immigration attitudes in general than in other groups.

It is remarkable that almost all violations refer to the same item, namely IMSMETN. This indicates that factor loadings for the item on immigrants from the same group vary considerably across groups, while the meaning of the other two items is more cross-culturally robust. In the very few cases where a different item is involved (groups DK2, DK3 and PT3), the equality constraint on the factor loading for IMSMETN is violated as well, and to a much greater extent. Most probably, freeing the loading for IMSMETN would also solve the problem for the other item.

Now we take a closer look at the modification indices for the measurement intercepts. Click ‘intercepts’ in the upper left corner of the output window and scroll through the groups.

### Question

Do you see violated equality constraints (i.e. a modification index larger than 20 and a parameter change larger than 0.10 in absolute value)? If so, in which groups and for which items?

Solution: We find even more violated constraints on the intercepts than on the factor loadings:

Group | Parameter | Modification index | Parameter change |
---|---|---|---|
AT1 | imsmetn | 104.657 | 0.119 |
AT3 | imsmetn | 103.782 | -0.129 |
CH2 | imsmetn | 160.483 | -0.166 |
CH3 | imsmetn | 193.101 | -0.217 |
DE2 | imsmetn | 112.686 | -0.13 |
DE2 | impcntr | 74.386 | 0.1 |
DE3 | imsmetn | 131.481 | -0.138 |
DK1 | imsmetn | 147.913 | -0.197 |
DK2 | imsmetn | 287.81 | -0.324 |
DK2 | impcntr | 60.796 | 0.119 |
DK3 | imsmetn | 346.252 | -0.363 |
DK3 | impcntr | 76.311 | 0.132 |
ES1 | imsmetn | 140.924 | 0.133 |
ES2 | imsmetn | 126.432 | 0.133 |
ES3 | imsmetn | 274.059 | 0.181 |
FI3 | impcntr | 98.745 | 0.125 |
HU1 | impcntr | 109.019 | 0.137 |
HU1 | imdfetn | 102.614 | 0.128 |
HU1 | imsmetn | 195.478 | -0.331 |
HU2 | impcntr | 135.029 | 0.199 |
HU2 | imdfetn | 54.169 | 0.112 |
HU2 | imsmetn | 123.575 | -0.275 |
HU3 | impcntr | 244.826 | 0.273 |
HU3 | imdfetn | 59.741 | 0.124 |
HU3 | imsmetn | 203.752 | -0.378 |
NL1 | imsmetn | 140.326 | 0.112 |
NO1 | impcntr | 91.435 | -0.105 |
NO2 | imsmetn | 65.157 | -0.123 |
NO3 | imsmetn | 104.006 | -0.15 |
PT1 | imsmetn | 262.235 | 0.189 |
PT2 | imsmetn | 242.332 | 0.185 |
PT3 | imsmetn | 446.51 | 0.232 |
SE1 | impcntr | 118.767 | -0.109 |

Modification index larger than 20 and a parameter change larger than 0.10.

In Austria at time point 1, setting the intercept of IMSMETN free would cause the chi-square value of the model to drop by 104.657. The freely estimated intercept would increase by 0.119.

The vast majority of the violations are related to IMSMETN here as well.

# Exercise 2.1, step 6: Re-specifying the model

This inspection of the modification indices shows that not all factor loadings and intercepts are equal across groups. This indicates that there could be problems with the comparability of the attitude scale. Remarkably, virtually all deviations from full scalar equivalence relate to one item, namely IMSMETN. Apparently, the exact meaning of this specific item concerning immigrants from the same ethnic group differs considerably from one group to another. The parameters for the other two items, however, are stable across groups.

Based on these findings, we re-estimate the model, this time without equality constraints on the problematic item IMSMETN. This newly estimated model implies partial rather than full scalar equivalence, since factor loadings and intercepts are not equal for the whole set, but only for two items.

In order to avoid any confusion about variable names, it is advisable to completely re-specify the model to be estimated. First, save the current project under the name ‘fullequivalence’. Next, open the previously saved AMOS project ‘basemodel’, so that you do not have to create the groups and assign the data again. Repeat steps 3 (Specify the model) and 4 (Estimate the model) as explained before. However, there are two things that need to be done differently.

- When assigning the variable names to the indicators (i.e. dragging the variable names into the rectangles), drag IMDFETN into the first rectangle and IMSMETN into the second. By default, AMOS constrains the factor loading of the first item to 1. This item is called the marker item. The constraint on the marker item is a necessary condition to enable the model to be estimated. You can see that the first item is the marker item because the number ‘1’ appears next to the factor loading. Because the factor loading of the marker item is constrained to 1 in all groups, it is impossible to allow this loading to vary across groups. Because we want to set the factor loadings for IMSMETN free, we have to make sure that this item is not used as the marker item.
- Just before clicking the ‘Calculate Estimates’ button, we have to manually remove the equality constraints on IMSMETN. Double-click the ‘Measurement intercepts’ model in the left-hand column. The ‘Manage Models’ window pops up. Here, you get an overview of all parameter constraints that are imposed across groups. The combinations of letters and numbers refer to specific parameters in the model; they are also shown in the graphical representation of the model. The letter ‘a’ refers to factor loadings, and the letter ‘i’ to intercepts. a1_1, for example, refers to the factor loading of the second item in group 1 (not the factor loading of the first item; the factor loading of the marker item is not a parameter since it is constrained to one). i3_7 refers to the intercept for the third item in group 7. The equality signs indicate that two factor loadings and three intercepts are constrained to be equal across all 51 groups. To remove all constraints on IMSMETN, the first (with the a1s) and fourth (with the i2s) lines need to be deleted. Select these lines, and press delete. The remaining equality constraints refer to the other two items. Close the ‘Manage Models’ window.
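The naming scheme described in the list above can be made explicit with a short sketch that builds the five constraint lines and then drops the two that involve IMSMETN. The exact layout of the text in the ‘Manage Models’ window may differ; the format used here (labels joined by equality signs) is an assumption for illustration.

```python
def constraint_line(prefix, item, n_groups):
    """Build one equality constraint across groups, e.g. 'a1_1=a1_2=...=a1_51'."""
    return "=".join(f"{prefix}{item}_{g}" for g in range(1, n_groups + 1))

n_groups = 51
lines = [
    constraint_line("a", 1, n_groups),  # loading of the second item (IMSMETN)
    constraint_line("a", 2, n_groups),  # loading of the third item
    constraint_line("i", 1, n_groups),  # intercept of the first item (marker item)
    constraint_line("i", 2, n_groups),  # intercept of the second item (IMSMETN)
    constraint_line("i", 3, n_groups),  # intercept of the third item
]

# Partial equivalence: remove the first (a1) and fourth (i2) lines
partial = [line for line in lines if not line.startswith(("a1_", "i2_"))]

print(len(lines), len(partial))   # 5 3
print(partial[0][:17] + "...")    # a2_1=a2_2=a2_3=a...
```

After the deletion, one loading and two intercepts remain constrained across the 51 groups, in addition to the marker loading that is fixed to 1 everywhere.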

Now click the ‘Calculate Estimates’ button. Once the models are estimated, save the project as ‘partialequivalence’.

### Question

Look at the fit of the ‘measurement intercepts’ model (this is the model for which we deleted some equality constraints). What do you see?

The CFI of the re-specified model equals 0.94, and the RMSEA 0.027. These indices suggest that the model without equality constraints on IMSMETN fits the data quite well. Moreover, the CFI of the partially equivalent model is substantially better than the CFI of the previously estimated fully equivalent model (0.94 vs. the 0.902 we found before).
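The size of the improvement can be made explicit with a small calculation. The sketch below simply takes the fit indices reported in this exercise as given; the variable names are chosen for illustration and nothing is read from AMOS output.

```python
# Fit indices as reported in the text (typed in by hand, not read from AMOS).
cfi_full = 0.902       # fully equivalent model
cfi_partial = 0.94     # model without equality constraints on IMSMETN
rmsea_partial = 0.027  # RMSEA of the partially equivalent model

# Improvement in CFI obtained by freeing the IMSMETN parameters.
delta_cfi = round(cfi_partial - cfi_full, 3)

print(f"CFI improves by {delta_cfi:.3f} (from {cfi_full} to {cfi_partial})")
print(f"RMSEA of the partially equivalent model: {rmsea_partial}")
```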

# Ex. 2.1, step 7: Conclusion

To test the cross-time and cross-country comparability of the anti-immigration scale, we estimated a series of multi-group models with different sets of parameters constrained. The analysis showed that a model with equal factor loadings and intercepts - i.e. full scalar equivalence - fitted the data reasonably well. However, closer inspection of the modification indices and expected parameter changes showed that some of the equality constraints for one of the items, namely IMSMETN, were not tenable. The model was re-estimated without these constraints on IMSMETN, and model fit improved substantially.

These results indicate that our scale has the characteristic of partial scalar equivalence: for at least two items, factor loadings and intercepts are equal across the groups. It is therefore justified to compare means for the latent variable over all time points and countries in the study. In other words, our measurements are sufficiently comparable to continue the substantive analysis [1].

- [1] Ideally, we would continue the analyses with the scores for the latent variable. This is exactly what was done by Meuleman et al. (2009). In the remainder of this module, however, we will work with a sum scale based on the items instead. The use of this sum scale has the disadvantage that it does not correct for the fact that we only found partial and not full equivalence. However, this approach makes it possible for students with no interest in the topic of measurement equivalence to skip Chapter 2.

- [Arb05] Arbuckle, J. L. (2005). Amos 6.0 user’s guide. Chicago: SPSS.
- [Ben90] Bentler, P.M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-246.
- [Bil03] Billiet, J. (2003). Cross-cultural equivalence with structural equation modeling. In: Harkness, J., Van de Vijver, F., Mohler, P. (Eds.), Cross-cultural survey methods. John Wiley and Sons, Hoboken, NJ, pp. 247-264.
- [Bro92] Browne, M.W., and Cudeck, R. (1992). Alternative ways of assessing model fit. Sociological Methods & Research, 21(2), 230-258.
- [Byr89] Byrne, B. M., Shavelson, R. J. and Muthén, B. (1989). Testing for the equivalence of factor covariance and mean structures: the issue of partial measurement invariance. Psychological Bulletin, 105, 456-466.
- [Hor92] Horn, J. L. and McArdle, J. J. (1992). A practical and theoretical guide to measurement invariance in aging research. Experimental Aging Research, 18(3), 117-144.
- [Jör71] Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36(4), 408-426.
- [Mee97] Meertens, R. W. and Pettigrew, T. F. (1997). Is subtle prejudice really prejudice? Public Opinion Quarterly, 61(1), 54-71.
- [Ren98] Rensvold, R. B. and Cheung, G. W. (1998). Testing measurement models for factorial invariance: a systematic approach. Educational and Psychological Measurement, 58, 1017-1034.
- [Ste98] Steenkamp, J. E. and Baumgartner, H. (1998). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research, 25, 78-90.
- [Van00] Vandenberg, R. J. and Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3, 4-70.
- [Van97] van de Vijver, F. and Leung, K. (1997). Methods and data-analysis for cross-cultural research. Sage, London.