# Analytical strategy

In OLS regression, there are two fundamentally different analytical strategies: top-down and bottom-up. A top-down strategy means starting with the full model and eliminating seemingly irrelevant explanatory variables; this is feasible given a reasonable number of explanatory variables. A bottom-up strategy means starting with a simple model and adding complexities. In multilevel analysis, the unanimous recommendation is to follow the bottom-up strategy, for two reasons. Firstly, adding complexities in steps can help complex models to converge, whereas a model that starts out complex may fail to converge. Secondly, the number of parameters can easily become unmanageable in multilevel models. The number of fixed (regression) coefficients is the same as in OLS regression, but the number of random parameters grows quickly: if k is the number of regression coefficients (including the intercept) that are defined as random, the number of variances and covariances to be estimated is k(k+1)/2.
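To see how quickly the random part grows, the count of variances and covariances can be sketched as follows (the function name is ours, for illustration only):

```python
def n_random_parameters(k):
    """Variances and covariances in an unstructured covariance matrix
    when k coefficients (including the intercept) are defined as random:
    k variances plus k(k-1)/2 covariances = k(k+1)/2 in total."""
    return k * (k + 1) // 2

# A random intercept alone needs 1 parameter; adding random slopes
# inflates the random part rapidly.
for k in (1, 2, 3, 4):
    print(f"k = {k}: {n_random_parameters(k)} random parameters")
```

With a random intercept and three random slopes (k = 4), ten variances and covariances must already be estimated.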

The recommended analytical strategy is explained in more detail in Hox (2010: 56-59). Here, we briefly review the recommended steps.

**Step 1: Estimate the null model**

The first step is to estimate the null model and compute the intraclass correlation.

Y_{ij} = β_{0} + u_{0j} + e_{ij}

The purpose of this is twofold: to evaluate whether a multilevel analysis is necessary, and, if so, to estimate the baseline values of the variance components (random parameters). The intraclass correlation indicates how large a proportion of the variance in Y stems from variation between the level 2 units (groups, contexts). There is no agreed threshold, but five per cent between-group variation would certainly justify a multilevel approach. Stata also automatically performs a useful likelihood ratio test comparing the multilevel model with the OLS equivalent. The estimated random parameters can be used to calculate explained variation at both levels (pseudo R squares).
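As a minimal sketch, the intraclass correlation follows directly from the null model's two variance components (the numbers below are hypothetical, purely for illustration):

```python
# Hypothetical variance components from a null model
var_u0 = 0.35   # level 2 (between-group) variance
var_e  = 4.20   # level 1 (residual) variance

# Intraclass correlation: the share of total variance at level 2
icc = var_u0 / (var_u0 + var_e)
print(f"ICC = {icc:.3f}")  # about 0.077, i.e. roughly 8% between groups
```

Here the ICC is just under eight per cent, which by the rule of thumb above would justify a multilevel approach.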

**Step 2: Develop the level 1 model**

The second step is to develop the full level 1 model, normally the individual level model. In the equation, K is the number of explanatory variables at the individual level.

Y_{ij} = β_{0} + β_{1}X_{1ij} + β_{2}X_{2ij} + ... + β_{K}X_{Kij} + u_{0j} + e_{ij}

The reason for developing a full level 1 model before proceeding is to avoid Hauser’s contextual fallacy and to find out how much of the level 2 variation is explained by *compositional* effects. Compositional effects arise when the distributions of important explanatory variables differ between contexts; in a study of wage determination, for example, firms may differ in the skill and educational levels of their employees.
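The pseudo R-squares mentioned under Step 1 can now be computed as the proportional reduction in each variance component when moving from the null model to the level 1 model. A sketch with hypothetical variance components:

```python
# Hypothetical variance components: null model vs. full level 1 model
var_e_null,  var_u0_null  = 4.20, 0.35
var_e_model, var_u0_model = 3.10, 0.21

# Pseudo R-squared: proportional reduction in variance at each level
r2_level1 = (var_e_null - var_e_model) / var_e_null
r2_level2 = (var_u0_null - var_u0_model) / var_u0_null
print(f"Level 1 pseudo R2 = {r2_level1:.2f}")  # 0.26
print(f"Level 2 pseudo R2 = {r2_level2:.2f}")  # 0.40
```

A substantial drop in the level 2 variance after adding only level 1 variables, as in this hypothetical example, is exactly the signature of compositional effects.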

**Step 3: Develop the random model**

In this step, we develop the random part of the model (this step comes later in Hox’s recommendation). Guided primarily by theory, we should test whether important regression coefficients in the individual level equations vary significantly among the level 2 units (contexts, groups). This should be done in steps, testing one regression coefficient, i.e. the effect of one level 1 variable, at a time. A first rough indication is whether the variance of the level 2 residuals exceeds twice its standard error. Then perform a likelihood ratio test comparing models with and without the random effect, defining the covariance matrix as unstructured, i.e. including the covariance(s) of the level 2 residuals. Finally, consider whether the covariance structure can be simplified by eliminating the covariance(s).

χ^{2}_{H} = (-2LL_{K-H}) - (-2LL_{K})
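The test statistic is simply the difference between the two deviances (−2 log likelihoods). A sketch with hypothetical deviances: testing one random slope with an unstructured covariance matrix adds two parameters (a variance and a covariance), and for 2 degrees of freedom the chi-square p-value has the closed form exp(−χ²/2). Note that because variance parameters lie on the boundary of the parameter space, this p-value is conservative in practice.

```python
import math

dev_restricted = 4852.3  # -2LL without the random slope (hypothetical)
dev_full       = 4840.1  # -2LL with random slope and covariance (hypothetical)

chi_sq = dev_restricted - dev_full   # LR statistic on 2 df
p_value = math.exp(-chi_sq / 2)      # exact chi-square tail for df = 2
print(f"chi2 = {chi_sq:.1f}, p = {p_value:.4f}")
```

With these illustrative numbers the statistic is 12.2 and the random slope would clearly be retained.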

It is also possible to model heteroskedasticity in the level 1 residual variance. This is most easily done using special-purpose software, such as MLwiN, which contains a step-by-step guide to modelling level 1 variance.

**Step 4: Add level 2 explanatory variables**

Level 2 explanatory variables may be unique characteristics (global variables) of the level 2 units (the location and age of the firm, or the context a country belongs to), or they may be aggregates, taken from official statistics or aggregated from the data set itself. They can have two kinds of effects: on the intercepts and/or on the slopes. The ‘main’ effect of any level 2 variable can be seen as affecting the intercept, as we saw in the separate equations for each level. It is recommended to add these main effects first.

Level 2 regressors can also affect one or more regression coefficients through cross-level interactions. In some software programs we must compute the interaction term ourselves, as the product of the two variables involved, while other programs can construct it directly in the model specification. It is recommended to include cross-level interactions only for individual level variables whose effects were found to vary in the previous step, but you will find exceptions to this in applications, especially when the varying effect is backed by strong theory. In situations with few level 2 units, the statistical power may also be too low for the variance components to reliably detect varying effects, which is a further reason for such theory-driven exceptions.

In situations with many level 1 units but few level 2 units, as is the case when analysing data from the European Social Survey, we need to be careful not to introduce too many explanatory variables at the second level. If our data set includes 20 countries, this n = 20 is what matters when deciding on the number of level 2 regressors: two to three is better than five to ten. If there are more candidate variables than that, try them out one by one. This is especially important if cross-level interactions are also included.

The full model including a cross-level interaction now has the following structure:

Y_{ij} = β_{0} + β_{1}X_{1ij} + β_{2}X_{2ij} + β_{3}Z_{j} + β_{4}X_{1ij}Z_{j} + (u_{0j} + u_{1j}X_{1ij} + e_{ij})
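One practical reading of the cross-level interaction in this equation: the fixed-part slope of X1 becomes β1 + β4·Z, so the effect of the level 1 variable differs across contexts. A small sketch with hypothetical coefficient estimates:

```python
def slope_x1(beta1, beta4, z):
    """Fixed-part slope of X1 in a context with level 2 value Z = z,
    i.e. beta1 + beta4 * z from the full model equation."""
    return beta1 + beta4 * z

# Hypothetical estimates: beta1 = 0.50, beta4 = -0.10
for z in (0.0, 1.0, 2.0):
    print(f"Z = {z}: slope of X1 = {slope_x1(0.50, -0.10, z):.2f}")
```

In this illustration the effect of X1 weakens from 0.50 to 0.30 as Z rises from 0 to 2; around that fixed-part slope, individual contexts still deviate by u_{1j}.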

In research applications, variance component models with and without level 2 regressors are found more frequently than models with cross-level interactions.