Chapter 4: Statistical testing of hypotheses about regression coefficients and R²

Do these analyses tell us anything about the entire populations of those countries from which the European Social Survey samples were obtained? The answer is yes, provided that the surveys were properly conducted on random samples, which we shall assume to be the case. But, of course, the leap from conclusions about samples to conclusions about whole populations includes the possibility of error.

Fortunately, statisticians have proved that regression analyses on random samples of the sizes used in the ESS program produce regression coefficients that are distributed in well-known ways. Suppose, as a thought experiment, that we collect all possible equally sized samples from a population and conduct a regression analysis on each one. We then get a large set of regression coefficients whose values have an approximately normal (bell-shaped) distribution, with many coefficients of similar size in the middle and few coefficients with extreme values in the tails. This distribution is called the sampling distribution of the regression coefficient. We know that its mean is identical to the population coefficient, and we know the distances from this population coefficient, measured in standard errors, within which a given share of the sample coefficients will fall (say 90%, 95%, 99% or whatever share we are interested in).

A standard error is the standard deviation of a sampling distribution. In other words, it measures how widely dispersed the sampling distribution is. The higher the standard error, the further from the population coefficient the values of the sample coefficients tend to lie. Hence, a sample regression coefficient is likely to be a more accurate estimate of the population regression coefficient if the standard error is small than if it is large. The standard error becomes smaller as the sample grows, so results obtained from large samples are, other things being equal, more reliable than results obtained from small samples.
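The thought experiment can be imitated on a computer. The following Python sketch is only an illustration and is not part of the ESS material: it assumes a made-up population in which y = 1.0 + 0.5x plus normally distributed noise, and it draws 1,000 random samples rather than all possible samples. The function name simulate_slopes, the sample sizes (50 and 1,000) and the population values are hypothetical choices made for this example. For each sample it computes the bivariate regression coefficient, so that the mean and the spread (the standard error) of the resulting sampling distribution can be inspected.

```python
import numpy as np

def simulate_slopes(n, n_samples=1000, beta=0.5, seed=0):
    """Draw n_samples random samples of size n from an assumed population
    and return the estimated regression slope from each sample."""
    rng = np.random.default_rng(seed)
    # Hypothetical population model: y = 1.0 + beta * x + normal noise
    x = rng.normal(size=(n_samples, n))
    y = 1.0 + beta * x + rng.normal(size=(n_samples, n))
    # Bivariate least-squares slope for each sample (each row):
    # sum of cross-products of deviations / sum of squared deviations of x
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    return (xc * yc).sum(axis=1) / (xc ** 2).sum(axis=1)

small = simulate_slopes(n=50)     # many small samples
large = simulate_slopes(n=1000)   # many large samples

for label, slopes in (("n=50", small), ("n=1000", large)):
    se = slopes.std(ddof=1)  # standard deviation of the sampling distribution
    # Share of sample coefficients within 1.96 standard errors of the
    # population coefficient (about 95% under a normal distribution)
    share = np.mean(np.abs(slopes - 0.5) <= 1.96 * se)
    print(f"{label}: mean={slopes.mean():.3f}  SE={se:.3f}  within 1.96 SE={share:.1%}")
```

Under these assumptions, the mean of the simulated coefficients should lie close to the population value of 0.5 for both sample sizes, while the standard error should be clearly smaller for the larger samples.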