# The problem with ignoring the multilevel structure

Ignoring the multilevel structure of the data creates both conceptual and statistical problems. If we drop the contextual level, such as schools in studies of learning or firms in studies of wage determination, we ignore the arenas in which learning and wage determination take place. If we restrict the analysis to a single level, we also run into statistical problems. Ignoring the contextual level and conducting the analysis at the individual level can lead to underestimated standard errors and thus to invalid statistical tests, especially for contextual variables. The opposite solution, aggregating the data to the contextual level and ignoring the individual level, invites the ecological fallacy. The safest solutions are to correct for clustering and compute robust standard errors if the 'contexts' or clusters are artificial and of little theoretical interest, and to use multilevel analysis if the contexts are theoretically important.
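The underestimation of standard errors for contextual variables can be illustrated with a small simulation. The following is a minimal sketch, not a recommended estimation routine: the function names, the number of clusters, the cluster size and the variance settings are all assumptions chosen for illustration, and the cluster-robust formula is the plain sandwich estimator for a single predictor without any small-sample correction.

```python
import math
import random


def simulate_clustered(n_clusters=30, n_per_cluster=50, seed=1):
    """Simulate individuals nested in clusters (hypothetical settings).

    The predictor x is contextual: it is constant within each cluster.
    Each cluster also has an unobserved effect u shared by its members.
    """
    rng = random.Random(seed)
    rows = []  # each row: (cluster id, contextual x, outcome y)
    for g in range(n_clusters):
        x = rng.gauss(0, 1)  # contextual variable
        u = rng.gauss(0, 1)  # unobserved cluster effect
        for _ in range(n_per_cluster):
            y = 0.5 * x + u + rng.gauss(0, 1)
            rows.append((g, x, y))
    return rows


def slope_standard_errors(rows):
    """Return (slope, naive SE, cluster-robust SE) for y regressed on x."""
    n = len(rows)
    x_mean = sum(r[1] for r in rows) / n
    y_mean = sum(r[2] for r in rows) / n
    sxx = sum((r[1] - x_mean) ** 2 for r in rows)
    sxy = sum((r[1] - x_mean) * (r[2] - y_mean) for r in rows)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    resid = [r[2] - intercept - slope * r[1] for r in rows]

    # Naive SE: treats all individuals as independent observations.
    naive_se = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Cluster-robust SE: sum the score contributions within each cluster
    # before squaring, so within-cluster dependence is taken into account.
    scores = {}
    for (g, x, _), e in zip(rows, resid):
        scores[g] = scores.get(g, 0.0) + (x - x_mean) * e
    cluster_se = math.sqrt(sum(s * s for s in scores.values())) / sxx
    return slope, naive_se, cluster_se


slope, naive_se, cluster_se = slope_standard_errors(simulate_clustered())
print(f"slope={slope:.3f}  naive SE={naive_se:.3f}  cluster SE={cluster_se:.3f}")
```

With these settings the naive standard error should come out clearly smaller than the cluster-robust one, because the individual-level analysis counts every respondent as an independent piece of information about a variable that only varies between clusters.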

### Comparative research and multilevel models

Comparative research has traditionally involved cross-cultural comparisons of two or more countries. A wider definition includes any comparison of cases in space or time. The cases can be any type of macro-unit: organizational units (firms, schools) or geographic units (communities, regions, countries). A classification of comparative research is presented in Figure 2.1. The first dimension of the typology classifies the data structure. We can distinguish between the use of the micro level, the macro level, or both. In the latter situation, the micro-level units are embedded in the macro-units, such as respondents in countries in the European Social Survey. The other dimension classifies designs by the number of contexts (cases), i.e. countries in studies based on the European Social Survey.

Studies of a single case (country) may be made implicitly comparative by comparing them with studies of other countries that address similar research questions. Comparative studies have traditionally involved the comparison of two or a few cases. Here we can distinguish between comparative system analysis, separate micro-level analysis and pooled fixed-effects analysis. The first type is mainly qualitative, such as studies comparing political systems. The second type could involve estimating the same regression model in each of two countries, while the third would be based on a pooled data file with country fixed effects, i.e. a set of dummy variables representing the countries. This implies that the intercept in the regression model varies between countries. The regression coefficients are assumed to be common to all countries, but they can be allowed to vary by adding country-by-variable interactions. The main advantages of the two latter approaches are that the countries can be chosen for theoretical reasons and that they remain in focus throughout the statistical analysis.
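The pooled fixed-effects idea, with country-specific intercepts and a common slope, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the function name `within_estimator`, the country labels and the toy data are hypothetical, and the code uses the within (demeaning) transformation, which gives the same slope as a regression with a full set of country dummies.

```python
def within_estimator(rows):
    """Pooled fixed-effects fit for one predictor.

    rows: iterable of (country, x, y).
    Returns (common slope, {country: intercept}).
    """
    # Group the observations by country.
    groups = {}
    for country, x, y in rows:
        groups.setdefault(country, []).append((x, y))

    # Country means of x and y.
    means = {
        c: (sum(x for x, _ in obs) / len(obs), sum(y for _, y in obs) / len(obs))
        for c, obs in groups.items()
    }

    # Common slope from deviations around the country means; this removes
    # all between-country differences in level.
    sxx = sxy = 0.0
    for c, obs in groups.items():
        x_mean, y_mean = means[c]
        for x, y in obs:
            sxx += (x - x_mean) ** 2
            sxy += (x - x_mean) * (y - y_mean)
    slope = sxy / sxx

    # Country-specific intercepts, i.e. the coefficients of the dummies.
    intercepts = {c: y_mean - slope * x_mean for c, (x_mean, y_mean) in means.items()}
    return slope, intercepts


# Hypothetical noise-free data: two countries that differ only in level.
data = [("A", x, 1.0 + 2.0 * x) for x in (0.0, 1.0, 2.0)]
data += [("B", x, 3.0 + 2.0 * x) for x in (1.0, 2.0, 4.0)]

slope, intercepts = within_estimator(data)
print(slope, intercepts)
```

Allowing the slope itself to differ between countries would correspond to the country-by-variable interactions mentioned above; this sketch deliberately keeps a single common slope.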

In the last column, designs based on many countries are presented. The first is based on aggregated statistics, to which a longitudinal perspective can be added. The two remaining designs are widely used in the analysis of European Social Survey data. The pooled fixed-effects analysis uses dummy variables to represent countries. The multilevel design differs from the fixed-effects model in two important ways. Firstly, the variation in the intercept and/or the regression coefficients (slopes) is captured by variance components that constitute the random parameters in the multilevel model. Secondly, it allows explanatory variables at the country level to be entered into the analysis, whereas a full set of country dummies would absorb all variation between countries.
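The contrast between the two designs can be written out for a single individual-level predictor. The notation below is an assumption introduced for illustration and is not taken from the source: $y_{ij}$ is the outcome for respondent $i$ in country $j$, $x_{ij}$ an individual-level variable and $z_j$ a country-level variable. In the pooled fixed-effects model, each country has its own intercept $\alpha_j$, estimated as the coefficient of a dummy variable:

$$
y_{ij} = \alpha_j + \beta x_{ij} + e_{ij}
$$

In a multilevel model with random intercept and random slope, the country-specific coefficients are instead treated as draws from a distribution, summarized by variance components, and the intercept can be related to $z_j$:

$$
\begin{aligned}
y_{ij} &= \beta_{0j} + \beta_{1j} x_{ij} + e_{ij}, & e_{ij} &\sim N(0, \sigma^2_e) \\
\beta_{0j} &= \gamma_{00} + \gamma_{01} z_j + u_{0j}, & u_{0j} &\sim N(0, \sigma^2_{u0}) \\
\beta_{1j} &= \gamma_{10} + u_{1j}, & u_{1j} &\sim N(0, \sigma^2_{u1})
\end{aligned}
$$

Here $\sigma^2_{u0}$ and $\sigma^2_{u1}$ are the variance components for the intercept and the slope, and $\gamma_{01}$ is the effect of the country-level variable. In practice a covariance between $u_{0j}$ and $u_{1j}$ is usually estimated as well; it is left out of this sketch for brevity.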

Figure 2.1. A typology of comparative design