# Chapter 2: Factor Analysis

### Identification of the model: Introduction

Before a factor analysis model, or any other latent variable model, can be estimated, we need to make sure that it is identified. A statistical model is identified if any given distribution of the observed variables is produced by a unique set of values for the parameters of the model. If a model is not identified, exactly the same observed distribution is implied by different values of the model parameters, in which case it is impossible to give a single interpretation of what the fitted model is telling us about the questions we are using it to answer.

For example, suppose that we are interested in only one variable *y*, which is assumed to be normally distributed with mean 0 and variance *Var(y)*. Let us also assume that the hypothesized statistical model implies that *Var(y) = αβ*. Knowing the sample variance of *y* is then not enough to identify unique estimated values for the two model parameters *α* and *β* separately. If a specific pair of values *{α, β}* implies a variance which matches the sample variance of *y*, then so do *{α/2, 2β}*, *{3α, β/3}* and an infinite number of other pairs of values for the parameters. However, if the statistical model instead implies that *Var(y) = α*, using just the one parameter *α*, then knowing the variance of *y* we can identify a unique value for *α*.
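The single-variable example can also be checked numerically. The following is a minimal Python sketch, not part of the original material: the data values, the starting pair of parameter values, and the function names `implied_variance` and `normal_loglik` are all made up for illustration. It evaluates the zero-mean normal log-likelihood at three different pairs *{α, β}* that share the same product.

```python
import math

def implied_variance(alpha, beta):
    """Variance of y implied by the hypothetical model Var(y) = alpha * beta."""
    return alpha * beta

def normal_loglik(ys, variance):
    """Log-likelihood of the observations under a normal distribution with mean 0."""
    n = len(ys)
    sum_sq = sum(y * y for y in ys)
    return -0.5 * n * math.log(2 * math.pi * variance) - sum_sq / (2 * variance)

# Made-up observations, used only to have something to evaluate the likelihood on.
ys = [0.5, -1.2, 2.0, -0.3, 1.1]

# An arbitrary starting pair and the two rescaled pairs from the text.
alpha, beta = 2.0, 1.5
pairs = [(alpha, beta), (alpha / 2, 2 * beta), (3 * alpha, beta / 3)]

# Every pair implies the same variance, hence the same log-likelihood.
logliks = [normal_loglik(ys, implied_variance(a, b)) for a, b in pairs]

for (a, b), ll in zip(pairs, logliks):
    print(f"alpha={a}, beta={b}, Var(y)={implied_variance(a, b)}, loglik={ll:.6f}")
```

Because the likelihood depends on *α* and *β* only through their product, the three pairs fit the data equally well, and no amount of data on *y* can distinguish between them. This is what it means for the two parameters not to be separately identified.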

This example with just one variable is rather simplistic and artificial. In the context of multivariate models such as factor analysis, however, unidentified models may arise more easily and be harder to spot. For factor analysis models, two types of questions of identifiability need to be resolved. The first is the identification of the latent scales, and the second is the inherent identifiability of the model parameters even after the latent scales have been selected. These two questions are discussed separately over the next two pages.