Simple Random Sampling

A very simple and intuitive sample design that we have already used in the previous examples is defined by the following rule: ‘Draw n of N elements randomly and do not return any element to the population after it has been drawn’. This sample design is called simple random sampling without replacement, abbreviated srswor or simply srs. Using srswor, the probability P(s) of realising any given sample s of size n from a population of size N is

$$P(s) = \frac{1}{\binom{N}{n}} = \frac{n!\,(N-n)!}{N!} \qquad (1)$$

The exclamation mark means `factorial': n! is the product of the integers 1, 2, 3, ..., n. Hence

3! = 1*2*3 = 6
4! = 1*2*3*4 = 24
5! = 1*2*3*4*5 = 120

and so on.

In practical applications, the number of possible samples of size n is enormous. Even with population and sample sizes as moderate as those used in our previous examples, N=10 and n=4, the number of possible samples is already 210. If the population contains N=500 elements and the sample size is n=10, there are about $2.46 \times 10^{20}$ possible samples. That is roughly 246 followed by 18 zeros. As impressive as these figures might be, the important thing for us to know is that all samples have the same chance of being realised; the specific value of P(s) is not important yet.
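The counts quoted above can be checked with a few lines of Python. This is a minimal sketch; the function names `n_samples_srswor` and `p_srswor` are illustrative choices, not notation from the text. It evaluates the factorial formula directly and compares it with the standard-library binomial coefficient.

```python
from fractions import Fraction
from math import comb, factorial

def n_samples_srswor(N, n):
    """Number of possible srswor samples: N! / (n! (N-n)!)."""
    return factorial(N) // (factorial(n) * factorial(N - n))

def p_srswor(N, n):
    """Exact probability of realising one particular srswor sample."""
    return Fraction(1, n_samples_srswor(N, n))

# The factorial formula agrees with the built-in binomial coefficient.
assert n_samples_srswor(10, 4) == comb(10, 4)

print(n_samples_srswor(10, 4))             # number of samples for N=10, n=4
print(f"{n_samples_srswor(500, 10):.3e}")  # number of samples for N=500, n=10
print(p_srswor(10, 4))                     # P(s) for N=10, n=4
```

Using `Fraction` keeps P(s) exact, which avoids rounding to zero when the number of samples becomes very large.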

A less practical but theoretically appealing variant of srswor is to return each element to the population after it has been selected, thus giving it a chance of being re-selected in the next draw, i.e. to use replacement. This sample design is called simple random sampling with replacement, abbreviated srswr. When srswr is used, each sample of size n again has the same probability of realisation, which is expressed as

$$P(s) = \frac{1}{N^n} \qquad (2)$$

The number of possible samples using srswr is even larger than when using srswor. In our example with N=500 and n=10, the number of possible samples is approximately $9.77 \times 10^{26}$. Again, the magnitude of P(s) does not matter, only the fact that the functional form of the sample design in (2) does not `favour' a specific sample but treats all possible samples in the same way by assigning each of them the probability $1/N^n$ of being realised.
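The srswr counts can be reproduced in the same way. A minimal sketch, with illustrative function names that do not come from the text; note that $N^n$ counts ordered sequences of draws, so the same elements drawn in a different order count as a different sample.

```python
from fractions import Fraction

def n_samples_srswr(N, n):
    """Number of possible srswr samples (ordered draws): N**n."""
    return N ** n

def p_srswr(N, n):
    """Exact probability of realising one particular srswr sample."""
    return Fraction(1, N ** n)

print(n_samples_srswr(10, 4))             # N=10, n=4
print(f"{n_samples_srswr(500, 10):.3e}")  # N=500, n=10
print(p_srswr(10, 4))
```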

Example 1

From our universe, we want to draw n=4 people using both srswor and srswr. We know that in the first case there are 210 possible samples, but if we allow each element to be returned after it has been selected, there are $N^n = 10^4 = 10{,}000$ possible samples. Analogously, if the population is N=500 and n=10, the number of possible samples is $9.77 \times 10^{26}$, a very, very large number. Again, what is important is not the magnitude of P(s), but the fact that all samples are also equally likely under srswr.

If we were only interested in the sample data, we could calculate the mean age of the persons in the sample for all possible srswor and srswr samples as $\bar{y} = \frac{1}{n}\sum_{k=1}^{n} y_k$, where $y_k$ is the age of the person selected in the k-th draw. Of course, the resulting value will depend on the composition of the sample. If, for example, only the four oldest persons were selected (i.e. elements 3, 6, 7 and 9), the sample mean would be very high (55 years). If the sample consists of the four youngest people (i.e. elements 1, 4, 8 and 10), the sample mean would be very low (27 years). Note that these figures are much higher and lower, respectively, than the mean age in the population of 40.4 years. The important thing is that, under srswor, the sample that includes the oldest persons, the sample that includes the youngest persons, and the remaining 208 possible samples are all equally likely. Hence the mean age computed from any one of these samples is just as likely to be realised as the mean computed from any other. Please note, however, that the number of possible values of the sample mean need not be the same as the number of possible samples, because different samples can yield the same mean. In our example with n=4 and sampling using srswor, there are 80 distinct possible values of $\bar{y}$, some occurring only once, some twice and others even eight times. Similarly, for srswr, there are not 10,000 (the number of possible srswr samples of size n=4 from N=10) but only 140 possible sample means.

The following figures show the distributions of the 80 and 140 possible sample means based on 210 srswor and 10,000 srswr possible samples of size n=4 from N=10.

Example 1. Sample mean (srswor)

Example 1. Sample mean (srswr)
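The enumeration behind Example 1 can be sketched in Python as follows. The ages used here are HYPOTHETICAL values invented for this illustration; they are not the population data of the example, so the number of distinct means printed will generally differ from the 80 and 140 reported above. The helper name `mean_distribution` is likewise an illustrative choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product

# HYPOTHETICAL ages, invented for illustration only;
# NOT the population data used in Example 1.
ages = [21, 25, 30, 34, 38, 41, 47, 52, 58, 63]
n = 4

def mean_distribution(samples):
    """Count how many samples produce each distinct sample mean."""
    return Counter(Fraction(sum(s), len(s)) for s in samples)

# srswor: unordered samples, no element can appear twice.
srswor = mean_distribution(combinations(ages, n))
# srswr: ordered draws, an element may be drawn repeatedly.
srswr = mean_distribution(product(ages, repeat=n))

for name, dist in (("srswor", srswor), ("srswr", srswr)):
    print(name, "samples:", sum(dist.values()), "distinct means:", len(dist))
```

Each sample is equally likely, so the probability of observing a particular mean is its count divided by the total number of samples. Means are stored as exact fractions so that equal values are not split apart by floating-point rounding.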

Exercise 3

In Round III of the ESS, Norway applied a simple random sample design without replacement. Suppose that the size of the Norwegian ESS target population is N=3,733,370 and the gross sample size is n=2,750. Calculate P(s) under both srswor and srswr.

Solution

The numbers are very large. The value of P(s) is not important; the important thing to note is that all s are treated equally.

Probability of realisation using srswor:

$$P(s) = \frac{n!\,(N-n)!}{N!} = \frac{2750!\,(3733370-2750)!}{3733370!}$$

Probability of realisation using srswr:

$$P(s) = \frac{1}{N^n} = \frac{1}{3733370^{2750}}$$
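Both probabilities are far too small to be represented as ordinary floating-point numbers, so one practical way to evaluate them is on a logarithmic scale. The following is a minimal sketch under that assumption; the function names are illustrative, and `math.lgamma(k + 1)` is used because it returns the natural logarithm of k!.

```python
from math import lgamma, log, log10

def log10_p_srswor(N, n):
    """log10 of n!(N-n)!/N!, computed via log-factorials."""
    ln_p = lgamma(n + 1) + lgamma(N - n + 1) - lgamma(N + 1)
    return ln_p / log(10)

def log10_p_srswr(N, n):
    """log10 of 1/N**n."""
    return -n * log10(N)

N, n = 3_733_370, 2_750
print("srswor: P(s) is about 10 **", round(log10_p_srswor(N, n), 2))
print("srswr:  P(s) is about 10 **", round(log10_p_srswr(N, n), 2))
```

The printed exponents give the order of magnitude of P(s) under each design; as the solution stresses, the point is not the value itself but that every sample s receives the same probability.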
