# Preparing the data

Please open the data file you downloaded above. In order to have a better point of departure for the analyses, you need to work a bit more with the variables in this file. If you find it difficult to do the exercises, you may look at and copy from the syntax at the bottom of this page.

1. Open the file with data.

### The dependent variable

In all ESS rounds, happiness is measured by the following question: ‘Taking all things together, how happy would you say you are? Please use this card.’ The card shows a scale from 0, ‘extremely unhappy’ to 10, ‘extremely happy’. This variable can be regarded as having an ordinal scale, but with 11 categories. We will treat the raw scores as having been measured at the interval level.

2. Perform a frequency analysis of the variable ‘happy’.

### The explanatory variables

We want to have the following variables available for the analysis:

Demographic variables: age, age centered, age squared, gender.

3. Find the mean age in the data and compute a centred version of age (‘agea’).
4. Use the centred age variable to compute a squared version of it.
5. Recode gender (‘gndr’) into a dummy variable with 0 = male and 1 = female.
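
The three demographic steps can be sketched in Stata as follows. This is only an illustration on made-up toy data, not the module's own syntax: it subtracts the exact mean stored by `summarize` in `r(mean)`, whereas the syntax at the bottom of this page subtracts the rounded value 48. The coding of `gndr` (1 = male, 2 = female) is assumed from that syntax.

```stata
* Toy data standing in for the ESS variables agea and gndr (values are made up)
clear
input agea gndr
25 1
40 2
55 2
70 1
end

* summarize stores the mean in r(mean); subtracting it centres age exactly
summarize agea
generate agec = agea - r(mean)

* Squared centred age, and a 0/1 dummy (assumes gndr: 1 = male, 2 = female)
generate agec2 = agec^2
generate female = gndr - 1

* Check the results
summarize agea agec agec2
tabulate gndr female
```

Centring on the rounded mean, as the page's syntax does, gives a label that is easier to read; centring on `r(mean)` avoids typing the value by hand.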

Socioeconomic variables: education in years, education level, household income, evaluation of current income.

Years of education (eduyrs) can be used directly, although the variable is not the best indicator of educational attainment in the ESS. It also has some problematic high values. The best indicator of education is eisced.

6. Recode ‘eisced’ into a new variable with three levels: primary (1, 2), secondary (3, 4) and tertiary (5, 6, 7).
7. Create dummy indicators for the three levels of education.
8. Recode ‘hincfel’ into a new dummy variable called ‘copeinc’: values 1 and 2 on ‘hincfel’ become 1 on ‘copeinc’, and values 3 and 4 become 0.
9. The variable ‘hinctnta’, ‘Household’s net income, all sources’, has ten categories in addition to missing. We suggest simplifying it by computing a new variable, 'hinc4', with the values low income, medium income, high income and missing. The categories that define, for example, 'low income' are not the same in all countries, so we suggest recoding one country at a time. Please open the syntax at the bottom of this page, copy the 'Exercise 9' syntax and compute the simplified variable ‘hinc4’.
10. Compute dummies for each of the four values of 'hinc4'.
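
As an alternative to one `recode` per dummy, Stata can create all indicators for a categorical variable in a single step. The sketch below uses made-up toy data and a hypothetical stub name `ed_`; the variable names it produces (`ed_1`, `ed_2`, `ed_3`) therefore differ from the `primed`, `seced` and `terted` used in the syntax at the bottom of this page.

```stata
* Toy data standing in for the three-level education variable (values are made up)
clear
input edlev3
1
2
3
2
.
end

* tabulate with generate() creates one 0/1 indicator per observed category:
* ed_1, ed_2 and ed_3 (the stub name ed_ is an arbitrary choice)
tabulate edlev3, generate(ed_)

* Observations with missing edlev3 are missing on all three indicators
list
```

The same command would work for 'hinc4', where it would produce four indicators, one of them for the 'missing' category.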

Social support variables: living together with others or alone, meeting with friends.

11. The variable ‘icpart2’ distinguishes between persons who live together with a husband, wife, partner or cohabitant and those who live alone. Recode ‘icpart2’ into a variable called ‘cohab’, where cohabitants are coded 1 and all others are coded 0.
12. Meeting socially with friends, relatives and colleagues is measured by the ‘sclmeet’ variable. Recode ‘sclmeet’ into a new variable called ‘social’. Set the values 1 to 5 to 0, and 6 and 7 to 1.
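
Both social support dummies can also be written as logical expressions in Stata. The following is a minimal sketch on made-up toy data; it assumes that 'icpart2' is coded 1 for living with a partner and 2 otherwise, as in the syntax at the bottom of this page, and that missing answers are stored as Stata missing values.

```stata
* Toy data standing in for icpart2 and sclmeet (values are made up)
clear
input icpart2 sclmeet
1 7
2 3
1 6
. 5
end

* A logical expression evaluates to 1 (true) or 0 (false);
* the if-condition keeps missing answers missing instead of turning them into 0
generate byte cohab  = (icpart2 == 1)        if !missing(icpart2)
generate byte social = inrange(sclmeet, 6, 7) if !missing(sclmeet)

list
```

Without the `if !missing(...)` condition, respondents with a missing answer would silently be coded 0.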

Country level variables: Welfare state classification.

Welfare state classification based on Ferrera

| Welfare regime | Countries |
| --- | --- |
| Nordic | Sweden, Norway, Denmark, Finland |
| Liberal | Ireland, United Kingdom |
| Continental | Netherlands, Luxembourg, Germany, Switzerland, Belgium, Austria, France |
| Southern | Spain, Israel, Italy, Greece, Turkey, Portugal, Cyprus |
| Eastern | Russia, Estonia, Czech Republic, Poland, Croatia, Hungary, Latvia, Romania, Slovenia, Slovakia, Ukraine |

The welfare state classification is made by recoding the country variable (cntry). The classification is inspired by Esping-Andersen (1990), but has been expanded by adding a Southern and an Eastern welfare regime (Arts, W. and Gelissen, J. 2002).

13. Recode 'cntry' into a new variable 'welstate'.
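
In Stata, the classification can be built directly from the two-letter country codes, which avoids depending on the numeric codes that `encode` assigns. This is a hedged sketch on made-up toy data; the country groups are taken from the SPSS `RECODE` command at the bottom of this page.

```stata
* Toy data: one respondent from each regime (values are made up)
clear
input str2 cntry
"SE"
"GB"
"DE"
"ES"
"PL"
end

generate byte welstate = .
replace welstate = 1 if inlist(cntry, "SE", "NO", "DK", "FI")
replace welstate = 2 if inlist(cntry, "IE", "GB")
replace welstate = 3 if inlist(cntry, "NL", "LU", "DE", "CH", "BE", "AT", "FR")
replace welstate = 4 if inlist(cntry, "ES", "IL", "IT", "GR", "TR", "PT", "CY")
* inlist() accepts only a limited number of string arguments,
* so the Eastern group is split over two commands
replace welstate = 5 if inlist(cntry, "RU", "EE", "BG", "CZ", "PL", "HR")
replace welstate = 5 if inlist(cntry, "HU", "LV", "RO", "SI", "SK", "UA")

label define welstate 1 "Nordic" 2 "Liberal" 3 "Continental" 4 "Southern" 5 "Eastern"
label values welstate welstate

* Check that every country ended up in the intended regime
tabulate cntry welstate
```

Because the rules refer to the country codes themselves, they stay valid if countries are added to or dropped from the data file.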

* SPSS syntax: prepares data for chapters 5 and 6 in the EduNet module 'Multilevel models'.
* Data downloaded from ESS MD (edition 2 of the ESS 5, http://ess.nsd.uib.no/ess/essmd/).

#### *Exercise 1.

*Please remember to change the path to the location where your dataset is and, later, to write the path to the location where you would like to save your work.

GET FILE='c:\data\ESSMDw5e2.sav'.

fre happy.

#### *Exercise 3.

* Descriptive analysis - in order to find the mean value of age.
* Compute age centred.

desc agea.
compute agec = agea - 48.
var labels agec 'Age centred: age- 48'.

#### * Exercise 4.

* Compute centered age squared.
* Descriptive analysis of the three age variables.

compute agec2 = agec*agec.
var labels agec2 'Age centred squared'.
desc agea agec agec2.

#### * Exercise 5.

* Compute dummy variable, female.
* Check that the frequencies are identical.

compute female = gndr-1.
var label female 'Female gender from gndr'.
fre female gndr.

#### * Exercise 6.

* Look at the frequencies for the variable ‘years of education’.
* Note that this variable is not considered to be the best indicator of education.
* Note further that there are about 50 persons reporting more than 30 years of education.
* The best alternatives are edulvlb and eisced.
* eisced has seven values, recode to three levels.
*Check the frequencies of the variables.

fre eduyrs.
recode eisced (1,2=1)(3,4=2)(5,6,7=3)into edlev3.
var labels edlev3 'Education in three levels from eisced'.
value labels edlev3 1 'Primary' 2 'Secondary' 3 'Tertiary'.
fre eisced edlev3.

#### * Exercise 7.

* Create dummy indicators for the three levels of education.
* Check the frequencies of the variables.

recode edlev3 (1=1)(2,3=0)into primed.
recode edlev3 (2=1)(1,3=0)into seced.
recode edlev3 (3=1)(1,2=0)into terted.
var labels primed 'Primary education, Edlev=1'.
var labels seced 'Secondary education, Edlev=2'.
var labels terted 'Tertiary education, Edlev=3'.
fre edlev3 primed seced terted.

#### * Exercise 8.

* Recode hincfel - Feeling about household's current income.
* Check the frequencies of these two variables.

recode hincfel (1,2=1)(3,4=0)into copeinc.
var labels copeinc 'Living comfortably or coping on present income'.
fre copeinc hincfel.

#### * Exercise 9.

* Note that Portugal lacks household income.
* Recode hinctnta into hinc4.
* Check the frequencies of these two variables.

do if (cntry='BE').
recode hinctnta (1 thru 5=1)(6,7=2)(8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='BG').
recode hinctnta (1=1)(2 thru 6=2)(7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='CH').
recode hinctnta (1 thru 4=1)(5,6,7=2)(8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='CZ').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7 thru 10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='CY').
recode hinctnta (1 thru 2=1)(3,4,5=2)(6 thru 10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='DE').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7 thru 10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='DK').
recode hinctnta (1 thru 4=1)(5,6,7=2)(8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='EE').
recode hinctnta (1 thru 4=1)(5,6,7=2)(8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='ES').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7 thru 10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='FI').
recode hinctnta (1 thru 4=1)(5,6,7=2)(8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='FR').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='GB').
recode hinctnta (1 thru 3=1)(4,5,6,7=2)(8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='GR').
recode hinctnta (1 thru 4=1)(5,6,7=2)(8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='HR').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='HU').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='IE').
recode hinctnta (1,2=1)(3,4,5=2)(6,7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='IL').
recode hinctnta (1 thru 4=1)(5,6=2)(7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='NL').
recode hinctnta (1 thru 4=1)(5,6,7=2)(8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='NO').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='PL').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='RU').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='SE').
recode hinctnta (1 thru 4=1)(5,6,7,8=2)(9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='SI').
recode hinctnta (1 thru 3=1)(4,5,6=2)(7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='SK').
recode hinctnta (1 thru 4=1)(5,6,7=2)(8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='TR').
recode hinctnta (1,2=1)(3,4,5=2)(6,7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
do if (cntry='UA').
recode hinctnta (1,2=1)(3,4,5=2)(6,7,8,9,10=3)(77,88,99=4)into hinc4.
end if.
var labels hinc4 'Household income in 3 cat + missing from hinctnta'.
value labels hinc4 1 'Low' 2 'Medium' 3 'High' 4 'Missing'.
fre hinctnta hinc4.

#### * Exercise 10.

* Recode hinc4 into dummies.
* Check frequencies.

recode hinc4 (1=1)(2,3,4=0)into lowinc.
recode hinc4 (2=1)(1,3,4=0)into medinc.
recode hinc4 (3=1)(1,2,4=0)into highinc.
recode hinc4 (4=1)(1,2,3=0)into missinc.
var labels lowinc 'Low household income, hinc4=1'.
var labels medinc 'Medium household income, hinc4=2'.
var labels highinc 'High household income, hinc4=3'.
var labels missinc 'Missing income, hinc4=4'.
fre hinc4 lowinc to missinc.

#### * Exercise 11.

* Recode icpart2 into cohab, living with husband, wife, partner or cohabiting.
* Check frequencies.

recode icpart2 (1=1)(2=0)into cohab.
var labels cohab 'Living with husband wife partner or cohabiting'.
fre cohab.

#### * Exercise 12.

* Recode sclmeet into social, meeting socially.
* Check frequencies.

recode sclmeet (1,2,3,4,5=0)(6,7=1)into social.
var labels social 'Meet several times a week with friends, relatives, colleagues'.
fre sclmeet social.

#### * Exercise 13.

* Recode cntry into welstate.
* Check frequencies.

RECODE cntry ('SE', 'NO', 'DK', 'FI' =1) ('IE', 'GB'=2) ('NL','LU', 'DE'=3)('CH', 'BE','AT', 'FR'=3)('ES','IL','IT', 'GR', 'TR','PT', 'CY'=4) ('RU', 'EE', 'BG','CZ','PL','HR', 'HU','LV','RO','SI','SK','UA'=5) INTO welstate .
var labels welstate 'Welfare state classification based on Ferrera'.
value labels welstate 1 'Nordic' 2 'Liberal' 3 'Continental' 4 'Southern' 5 'Eastern'.
fre welstate.

#### *Save the changes.

SAVE OUTFILE='c:\data\Multilevel.sav'
/COMPRESSED.

* Stata syntax: prepares data for chapters 5 and 6 in the EduNet module 'Multilevel models'*
* Data downloaded from ESS MD (edition 2 of the ESS 5, http://ess.nsd.uib.no/ess/essmd/)*

#### *Exercise 1*

*Please remember to change the path to the location where your dataset is and, later, to write the path to the location where you would like to save your work.

use "c:\data\ESSMDw5e2.dta", clear

tabulate happy

#### *Exercise 3*

* Descriptive analysis - in order to find the mean value of age*
* Compute age centred*

summarize agea
generate agec = agea - 48
label variable agec "Age centered: age - 48"

#### * Exercise 4*

* Compute centered age squared*
* Descriptive analysis of the three age variables*

generate agec2 = agec*agec
label variable agec2 "Age centered squared"
sum agea agec agec2

#### * Exercise 5*

* Compute dummy variable, female*
* Check that the frequencies are identical*

gen female = gndr-1
label variable female "Female gender from gndr"
tab1 female gndr

#### * Exercise 6*

* Look at the frequencies for the variable ‘years of education’*
* Note that this variable is not considered to be the best indicator of education*
* Note further that there are about 50 persons reporting more than 30 years of education*
* The best alternatives are edulvlb and eisced*
* eisced has seven values, recode to three levels*
*Check the frequencies of the variables*

tabulate eduyrs
recode eisced (1/2=1) (3/4=2) (5/7=3), gen(edlev3)
replace edlev3 = .n if eisced == 55
label define edlev 1 "Primary" 2 "Secondary" 3 "Tertiary"
label values edlev3 edlev
tab1 eisced edlev3

#### * Exercise 7*

* Create dummy indicators for the three levels of education*
* Check the frequencies of the variables*

recode edlev3 (1=1) (2/3=0), gen(primed)
recode edlev3 (2=1) (1 3=0), gen(seced)
recode edlev3 (3=1) (1/2=0), gen(terted)
label variable primed "Primary education, Edlev=1"
label variable seced "Secondary education, Edlev=2"
label variable terted "Tertiary education, Edlev=3"
tab1 edlev3 primed seced terted

#### * Exercise 8*

* Recode hincfel - Feeling about household's current income*
* Check the frequencies of these two variables*

recode hincfel (1/2=1) (3/4=0), gen(copeinc)
label var copeinc "Living comfortably or coping on present income"
tab1 copeinc hincfel

#### * Exercise 9*

* Note that Portugal lacks household income*
* Recode hinctnta into hinc4*
* Check the frequencies of these two variables*

recode hinctnta (1/5=1) (6/7=2) (8/10=3) (.a .b .c . =4) if cntry == "BE", gen (hincBE)
gen hinc4 = hincBE
drop hincBE
***
recode hinctnta (1=1) (2/6=2) (7/10=3) (.a .b .c . =4) if cntry == "BG", gen (hincBG)
replace hinc4 = hincBG if hinc4 ==.
drop hincBG
***
recode hinctnta (1/4=1) (5/7=2) (8/10=3) (.a .b .c . =4) if cntry == "CH", gen (hincCH)
replace hinc4 = hincCH if hinc4 ==.
drop hincCH
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "CZ", gen (hincCZ)
replace hinc4 = hincCZ if hinc4 ==.
drop hincCZ
***
recode hinctnta (1/2=1) (3/5=2) (6/10=3) (.a .b .c . =4) if cntry == "CY", gen (hincCY)
replace hinc4 = hincCY if hinc4 ==.
drop hincCY
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "DE", gen (hincDE)
replace hinc4 = hincDE if hinc4 ==.
drop hincDE
***
recode hinctnta (1/4=1) (5/7=2) (8/10=3) (.a .b .c . =4) if cntry == "DK", gen (hincDK)
replace hinc4 = hincDK if hinc4 ==.
drop hincDK
***
recode hinctnta (1/4=1) (5/7=2) (8/10=3) (.a .b .c . =4) if cntry == "EE", gen (hincEE)
replace hinc4 = hincEE if hinc4 ==.
drop hincEE
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "ES", gen (hincES)
replace hinc4 = hincES if hinc4 ==.
drop hincES
***
recode hinctnta (1/4=1) (5/7=2) (8/10=3) (.a .b .c . =4) if cntry == "FI", gen (hincFI)
replace hinc4 = hincFI if hinc4 ==.
drop hincFI
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "FR", gen (hincFR)
replace hinc4 = hincFR if hinc4 ==.
drop hincFR
***
recode hinctnta (1/3=1) (4/7=2) (8/10=3) (.a .b .c . =4) if cntry == "GB", gen (hincGB)
replace hinc4 = hincGB if hinc4 ==.
drop hincGB
***
recode hinctnta (1/4=1) (5/7=2) (8/10=3) (.a .b .c . =4) if cntry == "GR", gen (hincGR)
replace hinc4 = hincGR if hinc4 ==.
drop hincGR
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "HR", gen (hincHR)
replace hinc4 = hincHR if hinc4 ==.
drop hincHR
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "HU", gen (hincHU)
replace hinc4 = hincHU if hinc4 ==.
drop hincHU
***
recode hinctnta (1/2=1) (3/5=2) (6/10=3) (.a .b .c . =4) if cntry == "IE", gen (hincIE)
replace hinc4 = hincIE if hinc4 ==.
drop hincIE
***
recode hinctnta (1/4=1) (5/6=2) (7/10=3) (.a .b .c . =4) if cntry == "IL", gen (hincIL)
replace hinc4 = hincIL if hinc4 ==.
drop hincIL
***
recode hinctnta (1/4=1) (5/7=2) (8/10=3) (.a .b .c . =4) if cntry == "NL", gen (hincNL)
replace hinc4 = hincNL if hinc4 ==.
drop hincNL
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "NO", gen (hincNO)
replace hinc4 = hincNO if hinc4 ==.
drop hincNO
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "PL", gen (hincPL)
replace hinc4 = hincPL if hinc4 ==.
drop hincPL
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "RU", gen (hincRU)
replace hinc4 = hincRU if hinc4 ==.
drop hincRU
***
recode hinctnta (1/4=1) (5/8=2) (9/10=3) (.a .b .c . =4) if cntry == "SE", gen (hincSE)
replace hinc4 = hincSE if hinc4 ==.
drop hincSE
***
recode hinctnta (1/3=1) (4/6=2) (7/10=3) (.a .b .c . =4) if cntry == "SI", gen (hincSI)
replace hinc4 = hincSI if hinc4 ==.
drop hincSI
***
recode hinctnta (1/4=1) (5/7=2) (8/10=3) (.a .b .c . =4) if cntry == "SK", gen (hincSK)
replace hinc4 = hincSK if hinc4 ==.
drop hincSK
***
*** Turkey not in data***
***recode hinctnta (1/2=1) (3/5=2) (6/10=3) (.a .b .c . =4) if cntry == "TR", gen (hincTR)***
***replace hinc4 = hincTR if hinc4 ==.***
***drop hincTR***
***
recode hinctnta (1/2=1) (3/5=2) (6/10=3) (.a .b .c . =4) if cntry == "UA", gen (hincUA)
replace hinc4 = hincUA if hinc4 ==.
drop hincUA
***
label var hinc4 "Household income in 3 cat + missing from hinctnta"
label define hinc4 1 "Low" 2 "Medium" 3 "High" 4 "Missing"
label values hinc4 hinc4
tab1 hinctnta hinc4

#### * Exercise 10*

* Recode hinc4 into dummies*
* Check frequencies*

recode hinc4 (1=1) (2/4=0), gen (lowinc)
recode hinc4 (2=1) (1 3/4=0), gen (medinc)
recode hinc4 (3=1) (1/2 4=0), gen (highinc)
recode hinc4 (4=1) (1/3=0), gen (missinc)
lab var lowinc "Low household income, hinc4=1"
lab var medinc "Medium household income, hinc4=2"
lab var highinc "High household income, hinc4=3"
lab var missinc "Missing income, hinc4=4"
tab1 hinc4 lowinc medinc highinc missinc

#### * Exercise 11*

* Recode icpart2 into cohab, living with husband, wife, partner or cohabiting*
* Check frequencies*

recode icpart2 (1=1) (2=0), gen (cohab)
lab var cohab "Living with husband wife partner or cohabiting"
tab cohab

#### * Exercise 12*

* Recode sclmeet into social, meeting socially*
* Check frequencies*

recode sclmeet (1/5=0) (6/7=1), gen (social)
lab var social "Meet several times a week with friends, relatives, colleagues"
tab1 sclmeet social

#### * Exercise 13*

* Recode cntry into welstate*
* Check frequencies*
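* First create numeric country variable*
* encode numbers the countries alphabetically, so the codes below only fit this data file; check them with: label list cntry_num*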

encode cntry, gen (cntry_num)
recode cntry_num (23 19 7 10 = 1)(16 12=2) (18 6=3)(3 1 11=3)(9 17 13 21 4 = 4)(22 8 2 5 20 14 15 24 25 26=5), gen(welstate)
lab var welstate "Welfare state classification based on Ferrera"
* Define a label group*
label define welstate 1 "Nordic" 2 "Liberal" 3 "Continental" 4 "Southern" 5 "Eastern"
*Assign the label group to the variable*
label values welstate welstate
tab welstate

#### *Save the changes*

save "c:\data\Multilevel.dta", replace
