# Chapter 2: Factor Analysis

### Model for the observed variables

A model which includes latent factors also implies a model for the distribution of the observed variables. For example, consider again the two-factor model for six items where

y_{1}=ν_{1}+λ_{11}η_{1}+ε_{1}

y_{2}=ν_{2}+λ_{21}η_{1}+ε_{2}

y_{3}=ν_{3}+λ_{31}η_{1}+ε_{3}

y_{4}=ν_{4}+λ_{42}η_{2}+ε_{4}

y_{5}=ν_{5}+λ_{52}η_{2}+ε_{5}

y_{6}=ν_{6}+λ_{62}η_{2}+ε_{6}

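where the factors *η*_{1} and *η*_{2} are assumed to be jointly normally distributed, with means *κ*_{1} and *κ*_{2}, variances *φ*_{1} and *φ*_{2}, and covariance *φ*_{12}. The measurement errors *ε*_{j} are assumed to have mean 0 and variances *θ*_{j}, and to be uncorrelated with the factors and with each other. This model implies that the means of the items depend on the model parameters as follows: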

E(*y*_{j}) = *ν*_{j} + *λ*_{j1}*κ*_{1} for *j* = 1, 2, 3

E(*y*_{j}) = *ν*_{j} + *λ*_{j2}*κ*_{2} for *j* = 4, 5, 6

and their variances as

*var*(*y*_{j}) = *λ*^{2}_{j1}*φ*_{1} + *θ*_{j} for *j* = 1, 2, 3

*var*(*y*_{j}) = *λ*^{2}_{j2}*φ*_{2} + *θ*_{j} for *j* = 4, 5, 6

and covariances between pairs of items are

*cov*(*y*_{j}, *y*_{k}) = *λ*_{j1}*λ*_{k1}*φ*_{1} for *j*, *k* in 1, 2, 3 with *j* ≠ *k*

*cov*(*y*_{j}, *y*_{k}) = *λ*_{j2}*λ*_{k2}*φ*_{2} for *j*, *k* in 4, 5, 6 with *j* ≠ *k*

*cov*(*y*_{j}, *y*_{k}) = *λ*_{j1}*λ*_{k2}*φ*_{12} for *j* in 1, 2, 3 and *k* in 4, 5, 6
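The item-by-item formulas for the implied means, variances and covariances can be collected into matrix form: the implied mean vector is ν + Λκ and the implied covariance matrix is ΛΦΛ′ + Θ, where Θ is diagonal with the error variances *θ*_{j}. The following is a minimal sketch of that calculation in Python with NumPy. All parameter values and the function name `implied_moments` are hypothetical, chosen only for illustration; they are not estimates from any data set.

```python
import numpy as np

# Hypothetical parameter values for the two-factor, six-item model.
nu = np.array([1.0, 1.5, 2.0, 0.5, 1.0, 1.5])       # intercepts nu_j
Lambda = np.array([                                  # loadings lambda_jk
    [1.0, 0.0],   # items 1-3 load only on factor 1
    [0.8, 0.0],
    [1.2, 0.0],
    [0.0, 1.0],   # items 4-6 load only on factor 2
    [0.0, 0.9],
    [0.0, 1.1],
])
kappa = np.array([0.0, 0.5])                         # factor means kappa_1, kappa_2
Phi = np.array([[1.0, 0.3],                          # phi_1 and phi_2 on the diagonal,
                [0.3, 2.0]])                         # phi_12 off the diagonal
theta = np.array([0.5, 0.4, 0.6, 0.3, 0.5, 0.4])     # error variances theta_j


def implied_moments(nu, Lambda, kappa, Phi, theta):
    """Return the model-implied mean vector and covariance matrix of the items."""
    mean = nu + Lambda @ kappa                       # E(y_j) = nu_j + lambda_jk * kappa_k
    cov = Lambda @ Phi @ Lambda.T + np.diag(theta)   # variances and covariances
    return mean, cov


mean, cov = implied_moments(nu, Lambda, kappa, Phi, theta)
print(np.round(mean, 3))
print(np.round(cov, 3))
```

Each entry of `cov` matches one of the formulas above: for example, `cov[0, 1]` is *λ*_{11}*λ*_{21}*φ*_{1}, `cov[0, 3]` is *λ*_{11}*λ*_{42}*φ*_{12}, and the diagonal entry `cov[0, 0]` is *λ*^{2}_{11}*φ*_{1} + *θ*_{1}.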

### Estimation of the models

The implied model for the observed variables is what allows a factor analysis model to be estimated from observed data. The basic idea is to find values (estimates) for the parameters such that the means, variances and covariances implied by the model are as close as possible to the sample means, variances and covariances of the observed variables. This idea can be implemented in different ways (methods of estimation), which correspond to different criteria for “as close as possible”. The method of estimation which we use in this module is that of *Maximum Likelihood (ML) estimation*.
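To make "as close as possible" concrete, the sketch below computes one commonly used form of the ML criterion for normally distributed data: a discrepancy between the sample moments and the model-implied moments that equals zero when they coincide and is positive otherwise. The text above does not state this formula, so treat it as an assumption about how ML estimation is typically set up rather than a description of what any particular software does. The function name `ml_discrepancy` and the numbers are hypothetical.

```python
import numpy as np


def ml_discrepancy(sample_mean, sample_cov, implied_mean, implied_cov):
    """Normal-theory ML discrepancy between sample and model-implied moments.

    F = log|Sigma| - log|S| + tr(S Sigma^-1) - p + (ybar - mu)' Sigma^-1 (ybar - mu)
    """
    p = len(sample_mean)
    diff = sample_mean - implied_mean
    sigma_inv = np.linalg.inv(implied_cov)
    # slogdet returns (sign, log|det|); the matrices are assumed positive definite.
    _, logdet_sigma = np.linalg.slogdet(implied_cov)
    _, logdet_s = np.linalg.slogdet(sample_cov)
    return (logdet_sigma - logdet_s
            + np.trace(sample_cov @ sigma_inv) - p
            + diff @ sigma_inv @ diff)


# Hypothetical sample moments for two observed variables.
ybar = np.array([1.0, 2.0])
S = np.array([[2.0, 0.5],
              [0.5, 1.0]])

# A model that reproduces the sample moments exactly, and one that does not.
f_perfect = ml_discrepancy(ybar, S, ybar, S)
f_misfit = ml_discrepancy(ybar, S, np.array([1.2, 1.9]), np.eye(2))

print(f_perfect, f_misfit)
```

In an actual estimation, a numerical optimiser would search over the parameter values, recomputing the implied means and covariances at each step, until this discrepancy is as small as possible.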