# Chapter 2: Factor Analysis

### Identifying the latent scales

**Identification of the model: Scales of individual factors**

Since a factor cannot be directly observed, it does not have a unique, natural scale of units on which it is measured. Instead, we need to make some assumptions to define the scales of all the factors and give meaning to factor values. To better understand the issue, we may first consider the same question for a directly observable variable such as temperature. Temperature exists and is a well-defined concept, but that does not mean that the scale we use to assign numbers to it is also uniquely defined. For example, if we say that the temperature is 20, what does this mean exactly? Is it cold or warm? The answer would be different for temperatures measured on the Fahrenheit and Celsius scales, which have different definitions for what the origin (zero) represents and how much of a change is a change of one unit. A specific temperature is assigned a different number depending on the temperature scale we use. For example, 20 degrees Celsius is 68 degrees Fahrenheit. However, both numbers convey the same message, that it is a rather warm day! Hence, when we report the temperature we do not only give a number but also name the scale we use.

Measuring a factor works in a similar way to measuring temperature. We need to make some assumptions to define a factor’s scale. These assumptions are arbitrary, in that infinitely many different sets of them will produce exactly the same fit for the observed data, so we are free to choose any one set of assumptions to identify the scales.

First, for each individual factor we need to specify its origin, i.e. what zero represents, and its measurement unit, i.e. how much of a change is a change of one unit. There are two common ways to define the origin: we either fix the expected value *κ* of the factor at 0, or fix at 0 the intercept of the measurement model of one item which measures the factor (e.g. *ν*_{1} = 0 if *y*_{1} measures the factor). The former is the default in almost all statistical packages. It implies that the average level of a factor in a population is given the value 0, which thus becomes the reference point for all other values. If instead one measurement intercept is set at 0, the indicator and the factor have the same origin, i.e. whenever the average of the corresponding indicator is 0, the mean of the factor is 0 as well.
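The equivalence of the two ways of fixing the origin can be checked numerically. The sketch below assumes a one-factor model in which the expected value of item *j* is *ν*_{j} + *λ*_{j}*κ*; all parameter values are hypothetical and chosen only for illustration. Shifting the origin of the factor by a constant `c` changes *κ* and the intercepts, but leaves the implied item means unchanged.

```python
import numpy as np

# Hypothetical parameter values for a one-factor model with three items.
lam = np.array([0.8, 1.2, 0.5])    # loadings (not affected by the origin)
nu_a = np.array([2.0, 3.0, 1.0])   # intercepts under identification (a)
kappa_a = 0.0                      # (a): factor mean fixed at 0

# (b): fix the intercept of item 1 at 0 instead.
# Shifting the factor origin by c turns nu_j into nu_j - lam_j * c,
# so c = nu_1 / lam_1 makes the first intercept exactly 0.
c = nu_a[0] / lam[0]
kappa_b = kappa_a + c              # the factor mean is now a free parameter
nu_b = nu_a - lam * c

# Implied item means: E(y_j) = nu_j + lam_j * kappa.
mu_a = nu_a + lam * kappa_a
mu_b = nu_b + lam * kappa_b

print("intercepts (b):", nu_b)
print("factor mean (b):", kappa_b)
print("same implied means:", np.allclose(mu_a, mu_b))
```

Both parameterisations describe the same observed means, so the data cannot tell them apart; the choice between them is purely a matter of convention.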

Analogously, there are two common ways to define the measurement unit of any one factor: fixing either the variance *φ* of the factor at 1, or fixing at 1 the loading of that factor in the measurement model of one item which measures the factor (e.g. *λ*_{11} = 1 above, if *y*_{1} measures factor *η*_{1}). The former implies that the standard deviation of the factor in the population is defined to be 1 unit, and the latter that the factor has approximately the same unit of measurement as the indicator for which the loading is set to be 1.
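The same kind of check works for the measurement unit. The sketch below assumes a one-factor model with implied covariance matrix *φ* **λλ**' + **Θ**, where **Θ** is a diagonal matrix of residual variances; the helper `implied_cov` and all numerical values are hypothetical and only illustrate the idea. Rescaling the factor so that the first loading equals 1 changes the loadings and the factor variance, but not the implied covariance matrix of the items.

```python
import numpy as np

def implied_cov(lam, phi, theta):
    """Implied covariance matrix of the items in a one-factor model."""
    return phi * np.outer(lam, lam) + theta

# Hypothetical values; residual variances are the same in both versions.
theta = np.diag([0.4, 0.3, 0.6])

# (a): factor variance fixed at 1, all loadings free.
lam_a = np.array([0.8, 1.2, 0.5])
phi_a = 1.0

# (b): loading of item 1 fixed at 1, factor variance free.
# Dividing all loadings by lam_1 multiplies the factor variance by lam_1^2.
lam_b = lam_a / lam_a[0]
phi_b = phi_a * lam_a[0] ** 2

sigma_a = implied_cov(lam_a, phi_a, theta)
sigma_b = implied_cov(lam_b, phi_b, theta)

print("loadings (b):", lam_b)
print("factor variance (b):", phi_b)
print("same implied covariance:", np.allclose(sigma_a, sigma_b))
```

Under (b) the factor variance becomes the squared loading that item 1 had under (a), which is one way of seeing that the factor has taken over the unit of that item.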

**Identification of the model: Direction (rotation) of the factor scales**

As the final part of identifying the latent scales, the "direction" of the scale of each factor needs to be fixed. For any one factor, we may reverse its scale (i.e. whether large values of the factor are associated with large or small values of the items) by changing the signs of all the loadings associated with it. If there are two or more factors, there are an infinite number of ways of choosing the directions of the scales. These choices are known as *rotations* of the factors. Statistical packages for factor analysis provide various default rules for choosing such rotations. An alternative but equivalent way to choose a rotation is to select for each factor one item (an "anchor item") and specify that the item measures only that factor, with factor loadings of 0 for all other factors. For example, in the two-factor model above, where in the initial specification each of the six items measures both factors, choosing *λ*_{12} = 0 and *λ*_{61} = 0 fixes the scales, here in such a way that *η*_{1} is the factor which is measured by all items except for *y*_{6}, and *η*_{2} the factor which is measured by all but *y*_{1}. A model with only this minimum number of zero loadings needed to fix a rotation is still an exploratory factor analysis model. A confirmatory factor analysis model which specifies more than the minimum number of zero loadings also automatically fixes the directions of the scales.
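A small numerical sketch can make the rotation problem concrete. It assumes a two-factor model for six items with implied covariance matrix **ΛΦΛ**' + **Θ**, where **Φ** is the covariance matrix of the factors. The starting loadings, the helper `implied_cov` and the rotation angle are hypothetical. The anchor-item step further assumes that the factor variances stay fixed at 1 while the correlation between the factors is left free, which is one common way of setting this up rather than the only one.

```python
import numpy as np

def implied_cov(Lam, Phi, Theta):
    """Implied covariance matrix of the items: Lam Phi Lam' + Theta."""
    return Lam @ Phi @ Lam.T + Theta

# Hypothetical starting solution: every item loads on both factors,
# and the factors are uncorrelated with variance 1.
Lam = np.array([[0.7, 0.2], [0.6, 0.3], [0.8, 0.1],
                [0.2, 0.7], [0.3, 0.6], [0.1, 0.8]])
Phi = np.eye(2)
Theta = np.diag(np.full(6, 0.4))
Sigma = implied_cov(Lam, Phi, Theta)

# 1. Reversing the scale of factor 1: flip the sign of its loadings.
Lam_flip = Lam * np.array([-1.0, 1.0])
print("sign flip:", np.allclose(implied_cov(Lam_flip, Phi, Theta), Sigma))

# 2. An orthogonal rotation by an arbitrary angle.
angle = 0.3
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
Lam_rot = Lam @ R   # Phi stays the identity because R'R = I
print("rotation:", np.allclose(implied_cov(Lam_rot, Phi, Theta), Sigma))

# 3. Anchor items: transform so that lambda_61 = 0 and lambda_12 = 0.
# New loadings are Lam @ M. Column 1 of M is chosen orthogonal to the
# loadings of y6 and column 2 orthogonal to the loadings of y1.
M = np.column_stack([[Lam[5, 1], -Lam[5, 0]],
                     [-Lam[0, 1], Lam[0, 0]]])
# Rescale the columns of M so that both factor variances remain 1.
T = np.linalg.inv(M)
M = M * np.sqrt(np.diag(T @ Phi @ T.T))
T = np.linalg.inv(M)

Lam_anchor = Lam @ M        # loadings in the anchored solution
Phi_anchor = T @ Phi @ T.T  # factors are now allowed to correlate

print("anchored loadings:\n", np.round(Lam_anchor, 3))
print("factor covariance:\n", np.round(Phi_anchor, 3))
print("anchor items:", np.allclose(implied_cov(Lam_anchor, Phi_anchor, Theta), Sigma))
```

All three transformed solutions reproduce the same implied covariance matrix, so the data alone cannot distinguish between them. The anchored solution is the one that satisfies the two zero-loading constraints, at the cost of a non-zero correlation between the factors.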