# Chapter 2: Factor Analysis

### Example 1 on Factor analysis: Models in one country

Consider the data in our example for respondents in the UK, for questions D12–D17. It is proposed that questions D12–D14 measure only the factor "effectiveness of the police" and questions D15–D17 only the factor "procedural fairness of the police". Fit a 2-factor CFA model with this measurement structure and a correlation between the factors, and also fit a 1-factor and a 2-factor EFA model for these items. Do the models fit the data well? In particular, does the CFA model fit the data well enough to be adequate for subsequent use for these items? Interpret the scales of the two factors and their estimated correlation in the CFA model.

Some model assessment statistics are shown in Table 2.1. From them, we observe the following:

- The overall goodness of fit test indicates that all three models fit poorly (with p<0.001 in each case). This is not unusual and is not in itself conclusive, because with moderate to large sample sizes the test is sensitive to even small amounts of lack of fit.
- The 2-factor EFA model differs from the 2-factor CFA model in that the CFA model sets to 0 several factor loadings which are not zero in the EFA model. These are the "cross-loadings" between a factor and the items which are not expected to be indicators of that factor, such as the loading of Effectiveness in the measurement model for item D15 (plcrspc). The likelihood ratio test between these two models is a test of the null hypothesis that all of these cross-loadings are 0 in the population. Here this hypothesis is rejected (p<0.001), indicating that the EFA model fits better than the CFA model. However, this may again be partly due to the sensitivity of the test.
- The RMSEA and CFI fit indices suggest that the 1-factor model fits badly, the 2-factor EFA model fits very well, and the 2-factor CFA model has a moderate to good fit.
- Both the AIC and BIC statistics have their smallest values for the 2-factor EFA model, suggesting that this model achieves the best balance between goodness of fit and parsimony (number of parameters).
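
The quantities behind these comparisons can be written out in generic notation. The symbols below ($\ell$ for a maximised log-likelihood, $k$ for the number of free parameters, $n$ for the sample size) are introduced here for illustration only and do not appear in the table. The definitions are the standard ones, and a particular software package may report them in a slightly different form.

$$
\begin{aligned}
G^2 &= 2\,\bigl(\ell_{\text{EFA}} - \ell_{\text{CFA}}\bigr), \qquad G^2 \;\overset{\text{approx.}}{\sim}\; \chi^2_{\,k_{\text{EFA}} - k_{\text{CFA}}} \ \text{under } H_0, \\
\text{AIC} &= -2\ell + 2k, \\
\text{BIC} &= -2\ell + k \ln n .
\end{aligned}
$$

A large value of $G^2$ relative to its reference distribution gives a small p-value and so favours the less restricted EFA model. For AIC and BIC, a smaller value indicates a better trade-off between fit (through $\ell$) and parsimony (through $k$), with BIC penalising extra parameters more heavily when $\ln n > 2$.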

*Table 2.1: Model assessment statistics*

Model | LR test vs. saturated model: p-value | AIC | BIC | RMSEA | CFI
---|---|---|---|---|---
1-factor model | <0.001 | 39490 | 39595 | 0.188 | 0.808
2-factor EFA model | <0.001 | 38729 | 38863 | 0.007 | 1.000
2-factor CFA model | <0.001 | 38781 | 38891 | 0.054 | 0.986
LR test: 2-factor EFA vs. 2-factor CFA model | <0.001 | | | |
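
As a rough illustration of how the statistics in Table 2.1 might be read, the following Python sketch transcribes the table values and applies two simple checks: which model has the smallest AIC and BIC, and how the RMSEA and CFI values fall against cutoffs. The cutoffs used (RMSEA up to 0.05 "good" and up to 0.08 "acceptable"; CFI from 0.95 "good" and from 0.90 "acceptable") are commonly quoted rules of thumb and are an assumption here, not something stated in this text. The function and variable names are likewise hypothetical.

```python
"""Sketch: reading the fit statistics transcribed from Table 2.1."""

# Values copied from Table 2.1 (UK respondents, items D12-D17).
FIT_STATS = {
    "1-factor model": {"aic": 39490, "bic": 39595, "rmsea": 0.188, "cfi": 0.808},
    "2-factor EFA model": {"aic": 38729, "bic": 38863, "rmsea": 0.007, "cfi": 1.000},
    "2-factor CFA model": {"aic": 38781, "bic": 38891, "rmsea": 0.054, "cfi": 0.986},
}

# ASSUMPTION: conventional rule-of-thumb cutoffs, not taken from the text.
RMSEA_GOOD, RMSEA_ACCEPTABLE = 0.05, 0.08
CFI_GOOD, CFI_ACCEPTABLE = 0.95, 0.90


def rmsea_label(rmsea: float) -> str:
    """Label an RMSEA value; smaller values indicate better fit."""
    if rmsea <= RMSEA_GOOD:
        return "good"
    if rmsea <= RMSEA_ACCEPTABLE:
        return "acceptable"
    return "poor"


def cfi_label(cfi: float) -> str:
    """Label a CFI value; values closer to 1 indicate better fit."""
    if cfi >= CFI_GOOD:
        return "good"
    if cfi >= CFI_ACCEPTABLE:
        return "acceptable"
    return "poor"


def best_model(criterion: str) -> str:
    """Return the model with the smallest value of 'aic' or 'bic'."""
    return min(FIT_STATS, key=lambda name: FIT_STATS[name][criterion])


def differences_from_best(criterion: str) -> dict:
    """Difference between each model's criterion value and the smallest one."""
    best = FIT_STATS[best_model(criterion)][criterion]
    return {name: stats[criterion] - best for name, stats in FIT_STATS.items()}


if __name__ == "__main__":
    for name, stats in FIT_STATS.items():
        print(f"{name}: RMSEA {rmsea_label(stats['rmsea'])}, "
              f"CFI {cfi_label(stats['cfi'])}")
    print("Smallest AIC:", best_model("aic"))
    print("Smallest BIC:", best_model("bic"))
    print("AIC differences from best:", differences_from_best("aic"))
```

Under these assumed cutoffs the 2-factor CFA model is labelled "acceptable" on RMSEA and "good" on CFI, which is in line with the "moderate to good" reading given above. Such labels are only a reading aid; different cutoffs would shift the boundaries.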

A path diagram for this CFA model, with the estimated parameter values included, is shown in Figure 2.2 (this graph was obtained from Stata, where the default is also to show the measurement errors as circled latent variables). All of the factor loadings have positive values. Recalling the coding of the items, this implies that high values of the factors correspond to positive evaluations of the effectiveness and procedural fairness of the police. The estimated correlation of the factors is +0.58, so in the UK individuals who have a positive view of the effectiveness of the police also tend to have a positive view of their procedural fairness.

*Figure 2.2: Path diagram and parameter estimates for a 2-factor Confirmatory Factor Analysis model for indicators of Effectiveness and Procedural fairness of the police (UK respondents).*