# Preliminary analysis

The first step is to create the set of variables to be used in the analysis. This sometimes requires recoding or other operations on variables in the data set. Our data set is ready to use and contains only a small number of variables. Let us familiarize ourselves with them by looking at descriptive statistics, and examine the dependent variable more closely by means of a histogram.

The first exercise consists of opening the file and learning the variable names.

1. Download the file wage89.sav (SPSS) or wage89.dta (Stata) from the respective link and open it.
2. Next, present descriptive statistics for the variables and show the distribution of the dependent variable, wage, in a histogram.
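
For step 1, opening the file and listing the variable names in Stata might look like the sketch below. It assumes that wage89.dta has been saved in the current working directory; the path is an assumption, so adjust it to wherever you stored the download.

```stata
* A minimal sketch, assuming wage89.dta is in the current working directory.
* Load the data set, replacing any data currently in memory.
use "wage89.dta", clear

* List the variable names, storage types and labels.
describe
```

The `clear` option discards any data already in memory, so save unsaved work before running it.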

*You can copy, paste and run this syntax.

*Descriptive, chapter 1, page 2, exercise 2.

DESCRIPTIVES VARIABLES=wage edyears age female
/STATISTICS=MEAN STDDEV MIN MAX.

Table 1.1. Descriptive statistics - SPSS output

*Histogram, chapter 1, page 2, exercise 2.

GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=wage MISSING=LISTWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: wage=col(source(s), name("wage"))
GUIDE: axis(dim(1), label("Hourly wage in NOK"))
GUIDE: axis(dim(2), label("Frequency"))
ELEMENT: interval(position(summary.count(bin.rect(wage))), shape.interior(shape.square))
END GPL.

Figure 1.1. Histogram - SPSS output

*You can copy, paste and run this syntax.

*Descriptive, chapter 1, page 2, exercise 2.

. summarize wage edyears age female

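Table 1.2. Descriptive statistics - Stata output

| Variable | Obs | Mean | Std. Dev. | Min | Max |
| --- | --- | --- | --- | --- | --- |
| wage | 3759 | 90.15 | 30.31 | 25 | 343.75 |
| edyears | 4127 | 2.69 | 2.56 | 0 | 11 |
| age | 4127 | 39.65 | 12.36 | 16 | 74 |
| female | 4127 | .47 | .50 | 0 | 1 |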

Only two decimals are shown.
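
Table 1.2 reports 3759 observations for wage but 4127 for the other variables. Assuming the data set holds 4127 cases in total, wage would be missing for 368 of them (4127 - 3759). One way to check this directly is sketched below; it is an optional illustration rather than part of the exercise.

```stata
* Count the observations with a missing value on wage.
count if missing(wage)

* Report missing-value counts for the four variables.
* By default, variables without missing values are omitted from the output.
misstable summarize wage edyears age female
```

Knowing how many cases lack a wage matters later, because analyses that include wage will be based only on the cases where it is observed.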

*Histogram, chapter 1, page 2, exercise 2.

. histogram wage, frequency

Figure 1.2. Histogram - Stata output
