# Chapter 5: Taking the Sample Design into Account: Design Effects

The need for adequate consideration of the effects of a sample design on the precision of estimators is being recognised in an increasing number of sample survey projects. These projects make use of a concept called **design effect** that can serve as a measure of variance inflation in the estimator, due to a departure from simple random sampling. The European Social Survey was the first general social survey to make explicit use of design effects already at the planning stage [Ess05a]. In the ESS, each participating country is responsible for its sample meeting certain pre-defined quality criteria. One of these criteria concerns the precision of estimators: the samples of all participating countries shall yield estimators of comparable precision [1]. Design effects play a crucial role in the planning of samples that will yield estimators with these properties.

The foundations for sampling in Europe-wide surveys like the ESS are quite diverse. In some countries, such as Sweden, Norway or Finland, researchers are allowed to draw a sample directly from population registers. In other countries, such as Portugal, Spain or Poland, access to population registers is either limited or not possible at all. This diversity of sampling frames results in a diversity of sample designs. Whereas in countries in the first group, a simple random sample or a stratified random sample of contact persons can be drawn directly, this is not possible in the second group of countries due to the structure of the sampling frame. These countries often have to resort to more complex sample designs such as cluster or multi-stage sample designs. It is an empirical fact, however, that persons socialised within the same social context (e.g. living in the same neighbourhood or municipality) are more similar to each other than to persons who are socialised in a different social context in their responses to many questions in a general social survey like the ESS. This **homogeneity** can have a negative effect on the precision of estimators.

The precision of an estimator calculated from a simple random sample differs from the precision of the same estimator calculated from a cluster sample of the kind described above, even when the two samples are of the same size. Nevertheless, all samples in the ESS have to comply with the aforementioned quality standards in terms of the precision of estimators. The question, then, is how to design different samples so that these criteria are met under the practical restriction of divergent sampling frames.

- [1] ESS, 2005b, 1.

# Example 5

One way to ensure that the precision of an estimator is independent of the sample design is to plan samples with an equal **effective sample size**. The effective sample size is a concept that incorporates the design effect. However, there is no unique design effect for a given sample. Design effects will vary in magnitude depending on the characteristics of the item under study. The following example illustrates the connection between a) a study variable, b) the definition of clusters and c) the effects of different sample designs [Kis89].

Let us assume a clustered population in which values of the study variable are distributed as shown in the following table. The column and row means are shown along with the variable values.

| | Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Row means |
|---|---|---|---|---|---|---|
| Row 1 | 1 | 6 | 11 | 16 | 21 | 11 |
| Row 2 | 2 | 7 | 12 | 17 | 22 | 12 |
| Row 3 | 3 | 8 | 13 | 18 | 23 | 13 |
| Row 4 | 4 | 9 | 14 | 19 | 24 | 14 |
| Row 5 | 5 | 10 | 15 | 20 | 25 | 15 |
| Column means | 3 | 8 | 13 | 18 | 23 | |

From this population, n = 10 elements are to be drawn by a) simple random sampling with replacement (srswr) and b) cluster sampling, where clusters are defined either by the columns (clu-col) or by the rows (clu-row) of the matrix. In cluster sampling, all elements in two randomly selected columns or rows are selected. Under srswr, n elements are chosen randomly with replacement. In either case the sample mean ȳ is to be calculated. The means and the variances of ȳ under srswr, Var_{(srswr)}(ȳ), under column-wise cluster sampling, Var_{(clu-col)}(ȳ), and under row-wise cluster sampling, Var_{(clu-row)}(ȳ), are shown in the following table.

| | srswr | clu-col | clu-row |
|---|---|---|---|
| Mean | 13.00 | 13.00 | 13.00 |
| Variance | 5.20 | 18.75 | 0.75 |

It can be seen that the population mean Ȳ = 13 is estimated without bias under all sample designs, but the variances of the estimates vary dramatically. The variance of ȳ under srswr (5.2) will serve as a reference.

Under column-wise cluster sampling, the variance of the sample mean is 18.75, which is Var_{(clu-col)}(ȳ) / Var_{(srswr)}(ȳ) = 18.75/5.2 = 3.61 times Var_{(srswr)}(ȳ).

Unless the selected columns are exactly symmetrical about the third column (i.e. first and fifth, or second and fourth), the sample mean differs from the population mean, and the difference can be very large.

If rows are sampled, the variance of the sample mean is very low: Var_{(clu-row)}(ȳ) / Var_{(srswr)}(ȳ) = 0.75/5.2 = 0.14.

This is due to the very low heterogeneity of the row-wise means. Even if, in one of the worst cases, the two upper rows are selected, the sample mean is 11.5, which is much closer to the population mean than in the corresponding worst cases of column-wise selection (i.e. if the first two or the last two columns are selected), where the sample mean is 5.5 and 20.5, respectively.

This example illustrates that cluster sampling can yield better or worse results (in terms of precision) than srswr. The magnitude of the loss or gain in precision depends on how the distribution of the study variable relates to the structure and definition of the clusters in a sample design. In most real-world sample surveys, however, the two are related in such a manner that precision is lost.
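The figures in the table can be checked by brute force. The following Python sketch rebuilds the 5 x 5 population from the example, enumerates every possible pair of columns (or rows) and computes the exact variance of the resulting sample means. The variable and function names are illustrative and not part of the original text.

```python
from itertools import combinations
from statistics import fmean, pvariance

# Population from the example: values 1..25, filled column by column.
rows = [[r + 5 * c for c in range(5)] for r in range(1, 6)]
cols = [list(col) for col in zip(*rows)]
population = [y for row in rows for y in row]
n = 10  # sample size under every design


def cluster_mean_variance(clusters, m=2):
    """Exact variance of the sample mean over all choices of m clusters."""
    sample_means = [
        fmean(y for cluster in pick for y in cluster)
        for pick in combinations(clusters, m)
    ]
    return pvariance(sample_means)


# Under srswr the variance of the mean is the population variance divided by n.
var_srswr = pvariance(population) / n
var_clu_col = cluster_mean_variance(cols)
var_clu_row = cluster_mean_variance(rows)

print(var_srswr, var_clu_col, var_clu_row)
print(round(var_clu_col / var_srswr, 2), round(var_clu_row / var_srswr, 2))
```

Because all ten possible cluster pairs are equally likely, the population variance of the ten sample means is the design variance, so no simulation is needed.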

# Estimation of Design Effects

Let us assume a population of elements grouped into M PSUs, each of size N_{i}, i = 1, ..., M, and let N = Σ N_{i}. Furthermore, let B = N/M be the average cluster size. For the time being, let us assume that the PSUs are of equal size, so that N_{i} = B. Finally, let y_{ij} denote the value of the variable of interest for the jth respondent in the ith cluster, as before. Consequently, y_{i} = Σ_{j} y_{ij} denotes the sum of the study variable in the ith cluster. A simple random sample of m clusters is drawn at the first stage and then all B elements of each selected PSU are selected. The homogeneity of y introduced by geographical clustering leads to the design effect, deff, which is defined by Kish [1] as

(20) |

but here, following Lohr [2], it shall be expressed more generally as

$$ deff = \frac{Var_{c}(\hat{\theta})}{Var_{srs}(\hat{\theta})} \qquad (21) $$

where Var_{c} is the variance of the estimator under the actual complex design (here: a one-stage cluster sample design) and Var_{srs} is the variance of the same estimator under a (hypothetical) simple random sample [3]. Put less formally, the design effect is the factor by which the naive formula, which assumes simple random sampling, under- or overestimates the variance of an estimator under a complex design.

The ratio

$$ n_{eff} = \frac{n}{deff} \qquad (22) $$

is referred to as the **effective sample size** and is the number of ultimate sample elements required in an srs to yield the same precision for a certain estimator as a given complex sample design. Kish [4] showed that (20) can be expressed as

$$ deff_{one-stage} = 1 + (B - 1)\rho \qquad (23) $$

if all B elements in a selected cluster are selected (one-stage or cluster sampling) and

$$ deff_{two-stage} = 1 + (b - 1)\rho \qquad (24) $$

if b elements in a selected cluster are sub-sampled randomly (two-stage sampling). The factor ρ is a measure of homogeneity and will be discussed below.
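Expressions (22) to (24) translate directly into code. The sketch below is a minimal illustration. The function names and the input values (b = 10 interviews per cluster, ρ = 0.05, n = 1500) are hypothetical and chosen only to show the arithmetic.

```python
def deff_cluster(b, rho):
    """Design effect due to clustering, 1 + (b - 1) * rho.

    Pass the full cluster size B for one-stage sampling (23) or the
    number of sub-sampled elements b for two-stage sampling (24).
    """
    return 1.0 + (b - 1) * rho


def effective_sample_size(n, deff):
    """Effective sample size n_eff = n / deff, as in (22)."""
    return n / deff


# Hypothetical planning values, not taken from the text.
deff = deff_cluster(b=10, rho=0.05)
n_eff = effective_sample_size(n=1500, deff=deff)
print(round(deff, 2), round(n_eff, 1))
```

Read the other way round, the same relation gives the net sample size needed to reach a target effective sample size: n = n_eff * deff.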

If cluster sizes vary, for example due to non-response, Gabler et al. [Gab99] showed that the design effect can be formulated as the product of two factors in the following way:

$$ deff = deff_{p} \times deff_{c} \qquad (25) $$

In this expression, deff_{p} is the **design effect due to unequal inclusion probabilities** and deff_{c} is the **design effect due to clustering**. In most cases, both components have to be estimated from sample data, so that the estimated design effect can be expressed as

$$ \widehat{deff} = \widehat{deff}_{p} \times \widehat{deff}_{c} \qquad (26) $$

where the first term refers to the estimated design effect due to unequal inclusion probabilities. This factor can be expressed as

(27) |

The second term is the estimated design effect due to clustering, which is defined as

$$ \widehat{deff}_{c} = 1 + (b^{*} - 1)\hat{\rho} \qquad (28) $$

The factor b* is the weighted average cluster size, which can be expressed as

(29) |

Obviously, the estimated design effect due to unequal inclusion probabilities does not depend on the distribution of the study variable. It is therefore fixed for a given sample, regardless of the item under study. However, the magnitude of the design effect due to clustering will vary from item to item, because the magnitude of ρ (and its estimator) varies with the distribution of the study variable. The next subsection provides a brief overview of the definition and the estimation of ρ.
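The decomposition can be sketched in Python. The sketch assumes the forms commonly given in the literature for the two components: Kish's weighting factor n Σw² / (Σw)² for the design effect due to unequal inclusion probabilities, and the weighted average cluster size b* = Σ_c (Σ_j w_cj)² / Σ_c Σ_j w_cj² associated with [Gab99]. The function names, the design weights and the value of ρ are hypothetical.

```python
def deff_p(weights):
    """Assumed form: n * sum(w^2) / (sum(w))^2 over all design weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def weighted_avg_cluster_size(weights_by_cluster):
    """Assumed form: b* = sum_c (sum_j w_cj)^2 / sum_c sum_j w_cj^2."""
    numerator = sum(sum(ws) ** 2 for ws in weights_by_cluster)
    denominator = sum(w * w for ws in weights_by_cluster for w in ws)
    return numerator / denominator


def deff_c(b_star, rho):
    """Design effect due to clustering, 1 + (b* - 1) * rho."""
    return 1.0 + (b_star - 1) * rho


# Hypothetical design weights for three clusters of unequal size.
weights_by_cluster = [[1.0, 1.0, 2.0], [1.0, 2.0], [1.0, 1.0, 1.0, 2.0]]
all_weights = [w for ws in weights_by_cluster for w in ws]

dp = deff_p(all_weights)
b_star = weighted_avg_cluster_size(weights_by_cluster)
dc = deff_c(b_star, rho=0.05)  # rho is an assumed value
print(dp, b_star, dp * dc)
```

Only `deff_c` takes ρ as an input. The weighting component is computed from the weights alone, which is why it is the same for every item in a given sample. With equal weights and equal cluster sizes, `deff_p` reduces to 1 and `b_star` to the common cluster size.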

- [1] Kish 1965, 162.
- [2] Lohr 1999, 239.
- [3] The ratios of variances in the sample means in the above examples correspond to the design effect as defined by the above formula.
- [4] Kish 1965, 162.

# Measures of Homogeneity

In expressions (23), (24) and (28), ρ is used in the definition of the design effect (and its estimator) due to clustering. It serves as a measure of homogeneity. As such, it indicates the degree to which elements of the same cluster are more similar to each other than to all other elements in the population or sample, respectively. For the population, ρ is defined as

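$$ \rho = \frac{\sum_{i=1}^{M} \sum_{j=1}^{B} \sum_{k \neq j} (y_{ij} - \bar{Y})(y_{ik} - \bar{Y})}{(B-1)(MB-1) S_{y}^{2}} \qquad (30) $$

with S_{y}^{2} = SST/(MB-1), where Ȳ denotes the population mean. This can also be written as

$$ \rho = 1 - \frac{B}{B-1} \cdot \frac{SSW}{SST} \qquad (31) $$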

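where SSW is the sum of squared deviations of the y_{ij} from their cluster means (the within-cluster sum of squares) and SST is the sum of squared deviations of the y_{ij} from the population mean (the total sum of squares). The domain of ρ ranges from -1/(B-1) to one. The value of ρ can be negative when most or all of the total variation can be attributed to variation within clusters. This makes sense theoretically but will almost never occur in practical applications, where ρ typically takes small positive values of around 0.02 to 0.15, depending on the variable under study.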

It is obvious that the population quantities in (30) and (31) are unknown and have to be estimated from sample data. Numerous estimators have been proposed in the literature, but the so-called ANOVA or AOV estimator of ρ has proven to be a very reliable candidate in many studies. It can be expressed as

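$$ \hat{\rho} = \frac{MSB - MSW}{MSB + (K - 1) MSW} \qquad (32) $$

where MSB and MSW denote the between-cluster and within-cluster mean squares and K is the number of sampled elements per cluster.

Most statistical software packages use this estimator by default. Even if a software package does not provide a pre-defined function, the estimator can be constructed quite easily, as MSB and MSW can be obtained from any function that yields an analysis of variance table.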

### Exercise 6

Based on the data from Example 5, calculate ρ twice: once with the columns defining the clusters and once with the rows defining the clusters. Explain the results.

Now, assume that we define the rows as clusters and select a cluster sample of m = 2 clusters with a total size of n = 10. Assume that a sample has been selected according to this design so that all elements in the first and the third row have been selected. Calculate the estimate of ρ.

Substituting the within-cluster and total sums of squares in (31) gives,

- for the columns, ρ = 1 - B/(B-1) * SSW/SST = 1 - 5/4 * 50/1300 = 0.95,
- and for the rows, ρ = 1 - B/(B-1) * SSW/SST = 1 - 5/4 * 1250/1300 = -0.20.
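The two values can be checked with a short Python sketch that computes SSW and SST for the Example 5 population and then applies the relation ρ = 1 - B/(B-1) * SSW/SST used in the two bullets above. The function name `rho_population` is illustrative.

```python
# Example 5 population: values 1..25, filled column by column.
rows = [[r + 5 * c for c in range(5)] for r in range(1, 6)]
cols = [list(col) for col in zip(*rows)]


def rho_population(clusters):
    """rho = 1 - B/(B-1) * SSW/SST for equal-sized clusters."""
    B = len(clusters[0])
    values = [y for cluster in clusters for y in cluster]
    grand_mean = sum(values) / len(values)
    # Total sum of squares around the population mean.
    sst = sum((y - grand_mean) ** 2 for y in values)
    # Within-cluster sum of squares around each cluster mean.
    ssw = sum(
        (y - sum(cluster) / len(cluster)) ** 2
        for cluster in clusters
        for y in cluster
    )
    return 1 - B / (B - 1) * ssw / sst


rho_cols = rho_population(cols)
rho_rows = rho_population(rows)
print(round(rho_cols, 2), round(rho_rows, 2))
```

The row-wise value lies close to the lower bound of -1/(B-1) = -0.25, which reflects that almost all of the variation is found within the rows.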

These results are caused by the high degree of heterogeneity of the column-wise means and the high homogeneity of the row-wise means. Unless the selected columns are exactly symmetrical about the third column (i.e. first and fifth, or second and fourth), the difference between the sample mean and the population parameter can be very large. If rows are sampled, the variance of the sample mean is very low. This is due to the very low heterogeneity of the row-wise means. Even if, in one of the worst cases, the two upper rows are selected, the sample mean is 11.5, which is much closer to the population mean than in the corresponding worst cases of column-wise selection (i.e. if the first two or the last two columns are selected), where the sample mean is 5.5 and 20.5, respectively.

Under the specified sample design, the estimated value of ρ is obtained by substituting the sample values in (32). We then get ρ̂ = (MSB - MSW)/(MSB + (K-1)*MSW) = (10 - 62.5)/(10 + 4*62.5) = -0.2019.
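The same calculation can be reproduced from a one-way analysis of variance written out by hand. The sketch below assumes equal cluster sizes and uses the first and third row of the Example 5 population as the sample. The function name `rho_anova` is illustrative.

```python
def rho_anova(clusters):
    """ANOVA estimator (MSB - MSW) / (MSB + (K - 1) * MSW), equal cluster sizes."""
    m = len(clusters)        # number of sampled clusters
    K = len(clusters[0])     # elements per cluster
    n = m * K
    grand_mean = sum(y for cluster in clusters for y in cluster) / n
    cluster_means = [sum(cluster) / K for cluster in clusters]
    # Between-cluster and within-cluster sums of squares.
    ssb = sum(K * (cm - grand_mean) ** 2 for cm in cluster_means)
    ssw = sum(
        (y - cm) ** 2
        for cluster, cm in zip(clusters, cluster_means)
        for y in cluster
    )
    msb = ssb / (m - 1)
    msw = ssw / (n - m)
    return (msb - msw) / (msb + (K - 1) * msw)


# Rows 1 and 3 of the Example 5 population.
sample = [[1, 6, 11, 16, 21], [3, 8, 13, 18, 23]]
rho_hat = rho_anova(sample)
print(round(rho_hat, 4))
```

Here MSB = 10 with one degree of freedom and MSW = 500/8 = 62.5 with eight degrees of freedom, which matches the values substituted above.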

- [Ess05a] ESS (2005a). *European Social Survey, Round 3: Specification for participating countries*. Specification, European Social Survey.
- [Gab99] Gabler, S., Häder, S., and Lahiri, P. (1999). A model based justification of Kish's formula for design effects for weighting and clustering. *Survey Methodology*, 25(1): 105-106.
- [Kis89] Kish, L. (1989). Deffs: Why, when and how? A review. In *Proceedings of the Survey Research Methods Section, American Statistical Association*.