Inclusion Probabilities and Design Weights

Probability samples assign known selection probabilities not only to every possible sample but also to each element of the universe. The latter are called inclusion probabilities. According to Fuller [1], the inclusion probability of element i is defined as

πi = 'the sum of the sample probabilities for all samples that contain element i'.

The inverse of πi is called the design weight and is defined as

wi = 1/πi
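As an illustration of both definitions, the following minimal sketch computes inclusion probabilities and design weights from an explicitly listed sample design. The three-element universe, the sample probabilities and the function names are hypothetical and chosen only to keep the arithmetic small.

```python
from fractions import Fraction

# Hypothetical sample design: every possible sample (a set of element
# labels) is mapped to its known selection probability. The probabilities
# sum to 1.
design = {
    frozenset({"A", "B"}): Fraction(5, 10),
    frozenset({"A", "C"}): Fraction(3, 10),
    frozenset({"B", "C"}): Fraction(2, 10),
}


def inclusion_probabilities(design):
    """pi_i = sum of the sample probabilities of all samples containing i."""
    pi = {}
    for sample, prob in design.items():
        for element in sample:
            pi[element] = pi.get(element, Fraction(0)) + prob
    return pi


def design_weights(pi):
    """w_i = 1 / pi_i for every element."""
    return {element: 1 / p for element, p in pi.items()}


pi = inclusion_probabilities(design)
w = design_weights(pi)

for element in sorted(pi):
    print(element, "pi =", pi[element], "w =", w[element])
```

In this made-up design, element C has the smallest inclusion probability and therefore receives the largest design weight.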

We have already used the design weight in the examples in the preceding section. It is now easy to see that, if all elements are drawn into the sample with equal probabilities, πi = c for every element and, instead of wi = 1/c, we are free to define wi = 1. In other words, if a sample design assigns equal inclusion probabilities to all elements in the population, we can ignore the design weights when estimating quantities such as means and proportions, because a constant weight cancels out and is therefore equivalent to multiplication by 1. (Estimates of population totals still require the factor 1/c.) This is one reason why it is important to know the details of the sample design when it comes to data analysis.
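The cancellation of a constant weight can be checked with a short sketch. The sample values and the helper `weighted_mean` are hypothetical; the only assumption is that the estimator of interest is a weighted mean.

```python
from fractions import Fraction


def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * y_i) / sum(w_i)."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


y = [12, 7, 9, 14]     # hypothetical observed values in the sample
c = Fraction(2, 5)     # constant inclusion probability, pi_i = c

weights_design = [1 / c] * len(y)       # w_i = 1/c
weights_unit = [Fraction(1)] * len(y)   # w_i = 1

mean_design = weighted_mean(y, weights_design)
mean_unit = weighted_mean(y, weights_unit)

# The weighted sums, in contrast, differ by the factor 1/c.
sum_design = sum(w * v for w, v in zip(weights_design, y))
sum_unit = sum(w * v for w, v in zip(weights_unit, y))

print(mean_design, mean_unit)
print(sum_design, sum_unit)
```

Both means coincide, whereas the weighted sums do not, so whether constant weights can be dropped depends on the quantity being estimated.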

We can also deduce directly from the above equation that elements with a low inclusion probability receive a high design weight and vice versa. This means that, if elements are sampled with unequal inclusion probabilities, an element that was very unlikely to be included in the sample (i.e. has a small value of πi) is weighted up, which makes it 'more important' than an element that had a very high chance of being selected (i.e. has a large value of πi), which is weighted down and hence made 'less important'.

Inclusion probabilities, and hence design weights, depend on the sample design, the sample size and the size of the population. Let us again assume that we draw a sample of size n = 4 from our population of size N = 10 using srswor. Under srswor, each of the $\binom{N}{n} = \binom{10}{4} = 210$ possible samples is equally likely. If unit i is in the sample, the remaining n - 1 = 3 elements must be chosen from the remaining N - 1 = 9 elements in the population. There are $\binom{N-1}{n-1} = \binom{9}{3} = 84$ ways of doing so, and each of them, combined with element i, gives a sample that contains i. Hence, the probability of selecting a sample that includes unit i is

$$\pi_i(\text{srswor}) = \frac{\binom{N-1}{n-1}}{\binom{N}{n}} = \frac{n}{N}$$

and is equal for all elements of the population.

In our example πi(srswor) = 4/10 = 0.4.
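This value can be checked by brute force. The sketch below enumerates every possible srswor sample of size 4 from a population of 10 labelled elements and counts those containing a fixed element; the labels 1 to 10 and the variable names are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

N, n = 10, 4
population = range(1, N + 1)   # elements labelled 1..N (arbitrary labels)

# Under srswor every unordered sample of size n is equally likely.
samples = list(combinations(population, n))


def inclusion_probability(i):
    """Share of all equally likely samples that contain element i."""
    containing_i = sum(1 for s in samples if i in s)
    return Fraction(containing_i, len(samples))


pi = {i: inclusion_probability(i) for i in population}

print(len(samples), comb(N, n))   # number of possible samples
print(pi[1], Fraction(n, N))      # enumerated value vs. n/N
```

Because the count is the same for every element, the enumeration also confirms that the design is an equal probability design.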

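Using srswr, all elements are again treated identically, so the corresponding value is also constant. Each of the n independent draws selects element i with probability 1/N, which gives

πi(srswr) = n/N

Strictly speaking, because an element can be drawn more than once under srswr, n/N is the expected number of times element i is selected; the probability that it appears in the sample at least once is 1 - (1 - 1/N)^n, which is close to n/N when n is small relative to N.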
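The srswr case can be examined in the same brute-force way. With replacement, an element can be drawn more than once, so the sketch below reports two quantities for the running example (N = 10, n = 4): the expected number of times a fixed element is selected, and the probability that it is selected at least once. The labels and variable names are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

N, n = 10, 4
population = range(1, N + 1)   # elements labelled 1..N (arbitrary labels)
i = 1                          # any fixed element

# Under srswr all N**n ordered sequences of draws are equally likely.
sequences = list(product(population, repeat=n))
p = Fraction(1, len(sequences))

# Expected number of times element i is selected.
expected_count = sum(p * seq.count(i) for seq in sequences)

# Probability that element i is selected at least once.
prob_at_least_once = sum(p for seq in sequences if i in seq)

print(expected_count, Fraction(n, N))
print(prob_at_least_once, 1 - Fraction(N - 1, N) ** n)
```

The expected count equals n/N exactly, while the at-least-once probability is slightly smaller; the gap shrinks as n becomes small relative to N.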

Sample designs that produce constant inclusion probabilities are called equal probability of selection method (epsem), self-weighting [2], or equal probability sample designs. Sample designs of this type have some very desirable properties [3] when it comes to estimation, as we will see in Chapter 3. Simple random sampling is not the only design of this kind: there are other, more complex sample designs that also produce equal inclusion probabilities. For example, careful use of probability proportional to size (pps) sampling in a multi-stage design can yield constant inclusion probabilities.
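A minimal sketch of how pps sampling can lead to constant inclusion probabilities, under stated assumptions: a two-stage design in which a clusters are selected so that each cluster's selection probability is exactly a times its share of the population, followed by srswor of a fixed number b of elements within each selected cluster. The cluster names, sizes and the values of a and b are hypothetical.

```python
from fractions import Fraction

# Hypothetical clusters and their sizes (number of elements in each).
cluster_sizes = {"c1": 50, "c2": 100, "c3": 150, "c4": 200}

a = 2    # clusters selected at stage 1
b = 10   # elements selected per chosen cluster at stage 2
M = sum(cluster_sizes.values())   # total population size


def two_stage_pps_inclusion(size):
    """Inclusion probability of any element in a cluster of the given size."""
    # Stage 1 (assumed): cluster inclusion probability proportional to size.
    stage1 = Fraction(a * size, M)
    # Stage 2: srswor of b elements out of `size` within the cluster.
    stage2 = Fraction(b, size)
    return stage1 * stage2


pi = {name: two_stage_pps_inclusion(size) for name, size in cluster_sizes.items()}

for name in sorted(pi):
    print(name, pi[name])
```

The cluster size cancels between the two stages, so every element ends up with the same inclusion probability a·b/M regardless of the cluster it belongs to. The sketch assumes that a times the largest cluster's share does not exceed 1.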

Apart from equal probability sample designs, there are unequal probability sample designs, which assign non-constant inclusion probabilities to the elements. Many (but not all) of the so-called complex sample designs produce non-constant inclusion probabilities; for examples involving variants of cluster sampling or multi-stage sampling, see the section 'Cluster Sampling and Multi-Stage Sampling'.
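To illustrate how a complex design can produce unequal inclusion probabilities, consider a hypothetical two-stage design in which clusters are selected with equal probabilities (srswor of a out of A clusters) and a fixed number b of elements is then drawn by srswor within each selected cluster. The cluster names, sizes and the values of a and b below are assumptions.

```python
from fractions import Fraction

# Hypothetical clusters and their sizes (number of elements in each).
cluster_sizes = {"c1": 50, "c2": 100, "c3": 150, "c4": 200}

a = 2                     # clusters selected at stage 1 (srswor)
b = 10                    # elements selected per chosen cluster (srswor)
A = len(cluster_sizes)    # number of clusters in the population

# pi for an element = P(cluster selected) * P(element selected | cluster).
pi = {
    name: Fraction(a, A) * Fraction(b, size)
    for name, size in cluster_sizes.items()
}
weights = {name: 1 / p for name, p in pi.items()}

for name in sorted(pi):
    print(name, "pi =", pi[name], "w =", weights[name])
```

Elements in large clusters are less likely to be selected and therefore receive larger design weights than elements in small clusters.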
