# Basic OLS regression models

Let us estimate the regression model, first by using the familiar regression routine in SPSS and Stata and then by using the Mixed procedures for estimating multilevel models. In SPSS, use *Regression* to estimate the regression of wage on years of education, age and gender. In Stata, use *Regress* to estimate the model. We keep wage as the dependent variable, although most wage analyses are based on the natural logarithm of wage.
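For readers without access to SPSS or Stata, the same kind of model can be sketched in plain Python by solving the normal equations directly. This is only an illustration under stated assumptions: the data are simulated, the "true" coefficients used to generate them are invented and are not the estimates from the text, and the function names `solve_linear_system` and `ols` are hypothetical helpers rather than part of any package.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Return OLS coefficients [constant, b1, ..., bk] from (X'X) b = X'y."""
    x = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve_linear_system(xtx, xty)


# Simulated data: the generating coefficients below are invented for
# illustration only and are NOT the estimates reported in the text.
random.seed(1)
rows, wages = [], []
for _ in range(500):
    edyears = random.randint(7, 20)
    age = random.randint(18, 66)
    female = random.randint(0, 1)
    wage = 50.0 + 5.0 * edyears + 1.0 * age - 15.0 * female + random.gauss(0.0, 10.0)
    rows.append((edyears, age, female))
    wages.append(wage)

for name, b in zip(["constant", "edyears", "age", "female"], ols(rows, wages)):
    print(f"{name:>8}: {b:8.3f}")
```

With real data one would normally rely on the statistical package, which also reports standard errors, t-values and confidence intervals; the sketch only shows where the coefficients themselves come from.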

Table 1.3. Regression coefficients^{a} - SPSS output

The table shows the unstandardized (metric) regression coefficients, their standard errors, the standardized regression coefficients (SPSS only), the t-values, their probability levels, and 95 per cent confidence intervals for the regression coefficients (Stata only). To evaluate the statistical significance of a regression coefficient, we use the t-statistic, which is approximately normally distributed in large samples. The implicit statistical hypotheses are:

H_{0} : β = 0 and H_{1} : β ≠ 0

The test statistic is the t-ratio formed by dividing the estimated regression coefficient by its standard error:

t = b / s_{b}

The critical values in the two-sided test at the five per cent level of significance are ±1.96, so we reject H_{0} when |t| > 1.96. In our example, all coefficients have very large t-ratios and they are statistically significant at any conventional level.
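The test can be sketched in a few lines of Python using the large-sample normal approximation mentioned above. The coefficient 4.87 is the education estimate discussed in this section; the standard error of 0.25 is a made-up placeholder, not a value from Table 1.3, and the function names are hypothetical.

```python
from statistics import NormalDist


def t_ratio(b, se_b):
    """t-ratio: estimated coefficient divided by its standard error."""
    return b / se_b


def two_sided_p_value(t):
    """Large-sample two-sided p-value, using the normal approximation."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def reject_null(t, alpha=0.05):
    """True if H0: beta = 0 is rejected in a two-sided test at level alpha."""
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    return abs(t) > critical


b_edyears = 4.87   # coefficient for years of education, from the text
se_edyears = 0.25  # HYPOTHETICAL standard error, not taken from Table 1.3

t = t_ratio(b_edyears, se_edyears)
print(f"t = {t:.2f}, p = {two_sided_p_value(t):.4f}, reject H0: {reject_null(t)}")
```

In small samples the critical value should be taken from the t-distribution with the appropriate degrees of freedom rather than from the normal distribution.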

- How do we interpret the regression coefficients of the continuous variables: years of education and age?
- What about the interpretation of female, which is a dummy variable?

*Edyears*: for each added year of education, we expect the hourly wage to increase by NOK 4.87, controlling for age and gender. Alternatively, the marginal effect of one added year of education is to increase the expected wage by NOK 4.87.

*Female*: since female is a dummy variable, the regression coefficient is simply the difference in means between the two categories, adjusted for the other variables in the model. Since the difference is negative, women (coded 1) earn NOK 17.60 less per hour than men (coded 0) on average, controlling for age and education.
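Both interpretations can be checked with a small prediction function. The coefficients for education (4.87) and female (-17.60) come from the text; the constant and the age coefficient are hypothetical placeholders chosen only so that the function can be evaluated, and they cancel out of both comparisons.

```python
B_EDYEARS = 4.87   # NOK per added year of education (from the text)
B_FEMALE = -17.60  # NOK difference, women minus men (from the text)
B_CONST = 20.0     # HYPOTHETICAL constant, not from the text
B_AGE = 1.0        # HYPOTHETICAL age coefficient, not from the text


def predicted_wage(edyears, age, female):
    """Expected hourly wage in NOK for given covariate values."""
    return B_CONST + B_EDYEARS * edyears + B_AGE * age + B_FEMALE * female


# One added year of education, with age and gender held fixed.
effect_of_year = predicted_wage(13, 40, 0) - predicted_wage(12, 40, 0)

# Women versus men at the same age and education.
gender_gap = predicted_wage(12, 40, 1) - predicted_wage(12, 40, 0)

print(round(effect_of_year, 2), round(gender_gap, 2))
```

Because the model is linear and has no interaction terms, the two differences do not depend on the age or education level at which they are evaluated.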