Chapter 2: Factor Analysis

Model for the observed variables

A model which includes latent factors also implies a model for the distribution of the observed variables. For example, consider again the two-factor model for six items where

y1 = ν1 + λ11η1 + ε1
y2 = ν2 + λ21η1 + ε2
y3 = ν3 + λ31η1 + ε3
y4 = ν4 + λ42η2 + ε4
y5 = ν5 + λ52η2 + ε5
y6 = ν6 + λ62η2 + ε6

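and where the factors η1 and η2 are assumed to be jointly normally distributed, with means κ1 and κ2, variances φ1 and φ2, and covariance φ12. The measurement errors εj have mean zero and variances θj, and are uncorrelated with the factors and with each other. This model implies that the means of the items depend on the model parameters as follows: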

E(yj) = νj + λj1κ1 for j = 1, 2, 3 and E(yj) = νj + λj2κ2 for j = 4, 5, 6

and their variances as

var(yj) = (λj1)²φ1 + θj for j = 1, 2, 3

var(yj) = (λj2)²φ2 + θj for j = 4, 5, 6

and covariances between pairs of items are

cov(yj,yk) = λj1λk1φ1 for j ≠ k in 1, 2, 3

cov(yj,yk) = λj2λk2φ2 for j ≠ k in 4, 5, 6

cov(yj,yk) = λj1λk2φ12 for j in 1, 2, 3 and k in 4, 5, 6
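The item-by-item formulas above can be collected into matrix form: the implied mean vector is ν + Λκ and the implied covariance matrix is ΛΦΛ' + Θ, where Λ is the 6 × 2 matrix of loadings, Φ is the covariance matrix of the factors and Θ is the diagonal matrix of the error variances θj. The following is a minimal sketch of that calculation in Python with NumPy. The function name implied_moments and all parameter values are made up for illustration; they are not estimates from any data.

```python
import numpy as np

def implied_moments(nu, lam, kappa, phi, theta):
    """Return the model-implied mean vector and covariance matrix of the items."""
    nu = np.asarray(nu, dtype=float)
    lam = np.asarray(lam, dtype=float)
    kappa = np.asarray(kappa, dtype=float)
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    mu = nu + lam @ kappa                        # E(yj) = nu_j + loadings * factor means
    sigma = lam @ phi @ lam.T + np.diag(theta)   # variances and covariances of the items
    return mu, sigma

# Illustrative (hypothetical) parameter values for the two-factor, six-item model.
nu = [1.0, 1.5, 2.0, 0.5, 1.0, 1.5]              # intercepts nu_1..nu_6
lam = [[1.0, 0.0],                               # items 1-3 load on factor 1 only
       [0.8, 0.0],
       [0.6, 0.0],
       [0.0, 1.0],                               # items 4-6 load on factor 2 only
       [0.0, 0.9],
       [0.0, 0.7]]
kappa = [0.0, 0.5]                               # factor means kappa_1, kappa_2
phi = [[1.0, 0.3],                               # phi_1 and phi_12
       [0.3, 2.0]]                               # phi_12 and phi_2
theta = [0.5, 0.4, 0.6, 0.5, 0.3, 0.7]           # error variances theta_1..theta_6

mu, sigma = implied_moments(nu, lam, kappa, phi, theta)
print(mu)
print(sigma)
```

Each entry of sigma matches the corresponding scalar formula. For example, the covariance between items 1 and 4 is λ11 λ42 φ12, and the variance of item 2 is (λ21)²φ1 + θ2.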

Estimation of the models

The implied model for the observed variables is what allows a factor analysis model to be estimated from observed data. The basic idea is to find values (estimates) for the parameters such that the means, variances and covariances implied by the model are as close as possible to the sample means, variances and covariances of the observed variables. Different methods of estimation implement this idea in different ways, corresponding to different criteria for “as close as possible”. The method of estimation which we use in this module is Maximum Likelihood (ML) estimation.
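To make “as close as possible” concrete, here is a sketch of one commonly used form of the ML criterion for normally distributed data. It measures the discrepancy between the sample moments (mean vector ȳ and covariance matrix S) and the model-implied moments (μ and Σ); ML estimates are the parameter values that minimise it. The function name ml_discrepancy and the small two-variable example are assumptions made for illustration, and the exact criterion used by any particular software may differ in detail (for example in how S is scaled).

```python
import numpy as np

def ml_discrepancy(ybar, S, mu, sigma):
    """ML discrepancy between sample moments (ybar, S) and implied moments (mu, sigma).

    F = log|Sigma| + tr(S Sigma^-1) - log|S| - p + (ybar - mu)' Sigma^-1 (ybar - mu)
    """
    ybar, S, mu, sigma = (np.asarray(a, dtype=float) for a in (ybar, S, mu, sigma))
    p = len(ybar)                                  # number of observed variables
    sigma_inv = np.linalg.inv(sigma)
    _, logdet_sigma = np.linalg.slogdet(sigma)     # log-determinant, numerically stable
    _, logdet_S = np.linalg.slogdet(S)
    diff = ybar - mu
    return (logdet_sigma + np.trace(S @ sigma_inv) - logdet_S - p
            + diff @ sigma_inv @ diff)

# Hypothetical sample moments for two observed variables.
ybar = [1.0, 2.0]
S = [[2.0, 0.5],
     [0.5, 1.0]]

# Implied moments that reproduce the sample moments exactly: discrepancy is zero
# (up to floating-point rounding).
exact = ml_discrepancy(ybar, S, ybar, S)

# Implied moments that miss one mean and ignore the covariance: discrepancy is positive.
worse = ml_discrepancy(ybar, S, [1.2, 2.0], [[2.0, 0.0], [0.0, 1.0]])

print(exact, worse)
```

In practice the implied moments μ and Σ are functions of the model parameters, and a numerical optimiser searches over the parameters for the values that make this discrepancy as small as possible.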
