Chapter 2: A framework - the compositing methodology
Before starting this chapter, be sure to try out exercise 2 from the previous chapter. Everyone doing this exercise will almost certainly come up with a slightly different set of questions. To make it easier to identify a set, an appropriate framework is needed. The framework suggested here is a working model, based partly on the data available in the ESS. Students are welcome to consider its advantages and shortcomings at the end of this topic.
National accounts of well-being framework
In this approach we consider well-being to be:
‘a dynamic process that gives people a sense of how their lives are going, through the interaction between their circumstances, activities and psychological resources or “mental capital”’.
It goes beyond life satisfaction, to include social dimensions of well-being, as well as feelings, functionings and psychological resources.
Personal well-being is made up of five main components, some of which are broken down further into subcomponents. These are:
- Emotional well-being. The overall balance between the frequency of experiencing positive and negative emotions, with higher scores showing that positive emotions are felt more often than negative ones. It comprises the subcomponents:
- Positive emotions – How often positive emotions are felt.
- Absence of negative emotions – The frequency with which negative emotions are felt, with higher scores representing less frequent negative emotions.
- Satisfying life. Having a positive evaluation of your life overall, representing the results of four questions about satisfaction and life evaluations.
- Vitality. Having energy, feeling well-rested and healthy, and being physically active.
- Resilience and self-esteem. A measure of individuals’ psychological resources. It comprises the subcomponents:
- Self-esteem – Feeling good about yourself.
- Optimism – Feeling optimistic about your future.
- Resilience – Being able to deal with life’s difficulties.
- Positive functioning. This can be summed up as ‘doing well’. It includes four subcomponents:
- Autonomy – Feeling free to do what you want and having the time to do it.
- Competence – Feeling accomplishment from what you do and being able to make use of your abilities.
- Engagement – Feeling absorbed in what you are doing and that you have opportunities to learn.
- Meaning and purpose – Feeling that what you do in life is valuable, worthwhile and valued by others.
Social well-being is made up of two main components:
- Supportive relationships. The extent and quality of interactions in close relationships with family, friends and others who provide support.
- Trust and belonging. Trusting other people, being treated fairly and respectfully by them, and feeling a sense of belonging with and support from people where you live.
In addition to these indicators, as an example of a well-being indicator within a specific life domain, a satellite indicator of well-being at work is also included. This measures job satisfaction, satisfaction with work-life balance, the emotional experience of work, and assessment of work conditions.
A full list of the questions included in each indicator, subcomponent and component can be found in Appendix 3 of the National Accounts of Well-Being report.
The framework presented above emerges from a top-down, deductive, approach. Based on the latest theories of well-being, which in turn are based on empirical evidence, we posited a structure for what well-being looks like.
Another, very different, approach is the bottom-up, inductive one. Such an approach avoids reference to theory and attempts to allow the data to determine a structure. The most popular methodology for doing this is factor analysis.[1] Having assumed a multi-dimensional structure of well-being, we might hope that such analyses would identify distinct factors within the ESS data. Questions on social life should intercorrelate more with each other than they do with questions on, for example, feelings of autonomy. Do exercise 3 now to see whether that’s the case.
[1] Factor analysis groups together questions whose responses intercorrelate highly. For example, imagine you presented respondents with a quiz about Europe. Half the questions are about European geography (e.g. “what is the longest river in Europe?”, “which country has the southernmost point in mainland Europe?”). The other half are about European history (e.g. “who led the Reformation of the church in the 16th century?”, “in which century did Serbia become independent from Ottoman rule?”). The chances are that people who are good at geography will get more geography questions right, and people who are good at history will get more history questions right. In other words, someone who knows which is the highest mountain in Europe is more likely to know which is the most southerly country in Europe. This association is likely to be weaker between the two sets of questions than within each set. As a result, two factors determine responses to the questions: geography knowledge and history knowledge.
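To make the quiz example concrete, here is a minimal Python sketch that simulates it. Everything in it is made up for illustration: the item names, the two independent ‘abilities’ and the way they drive correct answers are assumptions, not ESS data. It only shows that items driven by the same underlying ability correlate more with each other than with items driven by a different one, which is the pattern factor analysis picks up.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_quiz(n_people=2000, seed=1):
    """Simulate two geography and two history items (1 = correct, 0 = wrong).

    Each person has two independent hypothetical abilities; the chance of
    answering an item correctly rises with the matching ability only.
    """
    rng = random.Random(seed)
    items = {"geo1": [], "geo2": [], "hist1": [], "hist2": []}
    for _ in range(n_people):
        geo, hist = rng.random(), rng.random()  # abilities in [0, 1)
        for name in items:
            ability = geo if name.startswith("geo") else hist
            items[name].append(1 if rng.random() < ability else 0)
    return items

quiz = simulate_quiz()
within = pearson(quiz["geo1"], quiz["geo2"])    # same underlying ability
between = pearson(quiz["geo1"], quiz["hist1"])  # different abilities
print(f"geography with geography: {within:.2f}")
print(f"geography with history:   {between:.2f}")
```

The first correlation should come out clearly positive and the second close to zero.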
To carry out this factor analysis, you will need to use SPSS or a similar statistical package.
- Open the dataset in SPSS. Make sure that the combined weight is on.
- Choose ‘Analyze’ from the menu, and proceed with ‘Data Reduction’ – ‘Factor’.
- Put the indicators from the table below into the frame called ‘Variables’.
Table 2.1. Items to be included in the factor analysis in exercise 3
|Variable label|Variable name|
|---|---|
|How satisfied with life as a whole|STFLIFE|
|How happy are you|HAPPY|
|How often socially meet with friends, relatives or colleagues|SCLMEET|
|Anyone to discuss intimate and personal matters with|INMDISC|
|Take part in social activities compared to others of same age|SCLACT|
|Subjective general health|HEALTH|
|Always optimistic about my future|OPTFTR|
|In general feel very positive about myself|PSTVMS|
|At times feel as if I am a failure|FLRMS|
|On the whole life is close to how I would like it to be|LFCLLK|
|Felt depressed, how often past week|FLTDPR|
|Felt everything did as effort, how often past week|FLTEEFF|
|Sleep was restless, how often past week|SLPRL|
|Were happy, how often past week|WRHPP|
|Felt lonely, how often past week|FLTLNL|
|Enjoyed life, how often past week|ENJLF|
|Felt sad, how often past week|FLTSD|
|Could not get going, how often past week|CLDGNG|
|Had lot of energy, how often past week|ENRGLOT|
|Felt anxious, how often past week|FLTANX|
|Felt tired, how often past week|FLTTRD|
|Absorbed in doing, how often past week|ABSDDNG|
|Felt calm and peaceful, how often past week|FLTPCFL|
|Felt bored, how often past week|FLTBRD|
|Felt rested when woke up in morning, how often past week|FLTRSTM|
|Free to decide how to live my life|DCLVLF|
|Seldom time to do things I really enjoy|ENJSTM|
|Little chance to show how capable I am|LCHSHCP|
|Feel accomplishment from what I do|ACCDNG|
|Like planning and preparing for future|PLPRFTR|
|When things go wrong in my life it takes a long time to get back to normal|WRBKNRM|
|My life involves a lot of physical activity|PACTLOT|
|Satisfied with how life turned out so far|STFLFSF|
|Satisfied with standard of living|STFSDLV|
|How much of the time spent with immediate family is enjoyable|FMLENJ|
|How much of the time spent with immediate family is stressful|FMLSTRS|
|Chance to learn new things|CHLRNNW|
|Feel people in local area help one another|PPLAHLP|
|Feel people treat you with respect|TRTRSP|
|Feel people treat you unfairly|TRTUNF|
|Feel you get the recognition you deserve for what you do|RCNDSRV|
|Feel what I do in life is valuable and worthwhile|DNGVAL|
|There are people in my life who care about me|PPLLFCR|
|Feel close to the people in local area|FLCLPLA|
- Setting the parameters:
- Click ‘Extraction method’, select the maximum likelihood method, including all factors with eigenvalues of 1 or over. Click ‘Continue’.
- Click ‘Rotation’ and select varimax rotation. Select ‘Loading plot(s)’. Click ‘Continue’.
- Click ‘Options’, select ‘Suppress absolute values less than’ and set the value 0.3. This will suppress all factor loadings below 0.3 from the output.
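Two of these settings are simple rules that can be written out directly. The Python sketch below is not a factor analysis and does not reproduce SPSS output. It only illustrates what ‘eigenvalues of 1 or over’ and ‘suppress absolute values less than 0.3’ do to a set of numbers. The function names, eigenvalues and loadings are hypothetical.

```python
def retained_factors(eigenvalues, threshold=1.0):
    """Return the eigenvalues kept under the 'eigenvalue of 1 or over' rule."""
    return [ev for ev in eigenvalues if ev >= threshold]

def suppress_small_loadings(loadings, cutoff=0.3):
    """Blank out (None) loadings whose absolute value is below the cutoff."""
    return [[value if abs(value) >= cutoff else None for value in row]
            for row in loadings]

# Hypothetical eigenvalues and loadings, made up for illustration only.
eigenvalues = [9.4, 2.1, 1.3, 0.9, 0.4]
loadings = [[0.72, 0.10], [-0.45, 0.28], [0.05, 0.61]]

print(len(retained_factors(eigenvalues)))   # 3 factors kept
print(suppress_small_loadings(loadings))
```

Note that the cutoff applies to the absolute value, so a loading of -0.45 is kept while one of 0.28 is blanked.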
If you are getting stuck, you might want to use the SPSS syntax here.
Interpreting the results
You should get 11 factors with eigenvalues above 1. Note that the first factor accounts for 21.4% of the variance. This means that over one fifth of the variance in all the data is explained by a single factor. This is a lot, but it also means that, by including the 44 questions we do here, we are getting almost five times as much information about well-being (100 / 21.4 ≈ 4.7) as we would by looking at that single factor alone.
Have a look at how different questions have loaded onto different factors. Attempt to summarise what each of the 11 factors captures, in terms of the concepts tapped by the questions loading on it. For example, you might broadly label the second factor as evaluative because it includes questions that ask people to assess their life as a whole, and doesn’t directly ask about their emotions - whether they feel ‘happy’ or ‘sad’. If you can’t think of anything, then click here for some suggestions.
|Factor|Suggested label|
|---|---|
|4|Sense of community|
|5|Family and community|
|6|Optimism and self-esteem|
|9|Autonomy and competence|
This pattern of results does seem promising, and might lead one to feel that something similar to the framework shown in Figure 2.1 has been confirmed, and that one should use these 11 factors to define well-being. However, digging deeper reveals that things are not that simple. Another factor has been conflated[1] with the conceptual dimension: question format and response codes.
It is well recognised that question formats and response codes can lead to patterns in results. Some individuals are more likely to agree with statements, whatever they are; others are more likely to always answer ‘neither agree nor disagree’. Some individuals use the extremes of a scale when offered the choice; others tend to stick to somewhere around the mid-points. Question ordering is also known to play an important part, with people’s responses to one question dependent on their responses to previous questions. Let’s see how these effects may have played a role in the factor structure generated above.
Take, for example, the first factor, which we have labelled ‘negative affect’. With one exception, all the questions loading highly onto this factor ask respondents to report how often they have felt different ways in the past week. Furthermore, all the questions are negatively worded, which means that saying you felt a certain way ‘often’ means you had low well-being. Importantly, and this is where the factor structure seems to follow the structure of the questions more than their content, no distinction emerges between the various types of feelings identified in the survey: between those that are more about having energy, those that are about sadness or depression, and those that are about stimulation. These are all quite different concepts, and you would expect them to load onto different factors if the factor structure were based on the actual content of the questions rather than their format. However, because they all share the same question format and response codes, and all appear together as a group in the survey, responses to any one of these questions tend to correlate with responses to all the others.
What about the ‘positive affect’ questions? The good news here is that the two questions on how often respondents enjoy life or feel happy (factor 7) do indeed separate from the more vitality-type questions (factor 8) - there is some evidence then for factors being driven by concepts rather than merely question format. However, the problem here is an absence: As well as the question on how often someone feels happy, there is also a question simply asking them how happy they are (on a scale of 0-10). Of course these are slightly different questions, but one would expect them to both load onto the same factor. They don’t - the ‘how happy are you’ question instead loads most with the evaluative questions (factor 2), which incidentally include three questions that use the same 0-10 scale.
Another example to note is factor 9. This actually includes just two questions, which don’t really have much in common in terms of content - one is about having free time/autonomy, the other is about competence. Other questions on autonomy and competence in the question set load elsewhere. So why do these particular two load together? Perhaps it’s the phrasing of the questions (which are both to be responded to on Likert scales):
- In my daily life, I seldom have time to do the things I really enjoy.
- In my daily life, I get very little chance to show how capable I am.
In conclusion: the factor structure does appear to identify certain conceptual groupings. However, it also appears to be governed by determinants of little substantive interest, such as question format and response codes. The two interact, sometimes contradicting each other and sometimes becoming conflated with each other, so that for some apparently conceptually driven factors it is hard to confirm that they really do reflect conceptual groups. Given this situation, it may be prudent to abandon any attempt to use a purely data-driven approach to grouping questions, and to rely more on conceptual approaches instead.
Exercise 3 reveals that, whilst at first sight it might appear that the ESS has a factor structure which could be used as a framework for accounts of well-being, this structure seems more likely to be the product of question format than similarities in substantive meaning. As a result, we are forced to discard the purely bottom-up approach for determining our framework. Furthermore, factor analysis relies on an assumption that items that should be grouped together are likely to correlate more. This may be true in some cases, but it is not necessary in the case of well-being. For example, consider questions on family and friends. It makes sense to group them together under an overall heading such as social well-being, or supportive relationships. However, there are many reasons why the quality of family life need not correlate with the quality of relationships with friends. Indeed, the opposite may be more likely, as different people or cultures satisfy the need for supportive relationships in different ways.
[1] When two factors are conflated, this means that it is hard to separate their roles in terms of explaining results. For example, imagine there was evidence to suggest that lawyers were happier than hairdressers (in fact the opposite is true!). A researcher might assume that this is because they have a higher income. However, they would be conflating income with (at least) one other variable, for example level of education. In this context it would be hard to separate out the effects of the two different variables, as well as many others.
Once a framework is in place, two options are available. Firstly one can simply present the data as is, structured, but with results on each question presented separately. This could be useful if you are interested in particular aspects of well-being, but does not allow one to see any patterns or overall picture. Some degree of aggregation is necessary. This may be only to a minimal level (e.g. aggregating the three questions on positive feelings) or it may be to a higher level (i.e. creating a single overall indicator of well-being). Either way many problems are posed and must be resolved. For more discussion on the difficulties of creating composite indicators, look at the joint OECD/JRC Handbook on Constructing Composite Indicators.
Before going on, take a look at the questions that are important to well-being in the ESS. How might you be able to combine them? How can you compare responses to one set of questions with those from another? How could you tell if a country or individual is doing well on a particular aspect of well-being?
Difficult, isn’t it? Well, the methodology presented here was developed for the following purposes:
- To allow different aspects of well-being to be considered separately, but also in aggregate as necessary
- To allow comparison between countries, demographic groups, and over time
- To allow comparison, for a given group, between different aspects of well-being
- To be easily interpretable
It involves three stages: standardisation, aggregation and transformation.
One of the biggest problems with trying to bring together or compare different types of information is that they are measured in different units and on different scales. This is just as true when looking at survey data as any other information. For example, consider the following two questions:
- How much of the time spent with your immediate family is enjoyable?
- To what extent do you feel that people treat you with respect?
In both cases, the response scales go from 0 to 6. However, for the first question ‘0’ means ‘none of the time’ and 6 means ‘all of the time’, whilst for the second question ‘0’ means ‘not at all’ and 6 means ‘a great deal’. The two scales are not comparable. A ‘4’ for question 1 is not necessarily the same as a ‘4’ in question 2. If the mean for a country is slightly higher, for example, on question 1, this does not necessarily indicate that country has higher levels of family well-being than general levels of respect. Similarly, if we wanted to bring these questions together, alongside others that measure some aspect of social well-being, there is no way of knowing whether a ‘4’ for question 1 and a ‘3’ for question 2 is better or worse than a ‘3’ for question 1 and a ‘4’ for question 2.
These problems exist for all types of data. For some well-known indicators, it is a bit easier to gauge their levels based simply on the numbers. For example, returning to the HDI, if a country had an average life expectancy of 45 years, and a literacy rate of 95%, most people would agree that health was its primary concern. In the real world, however, comparison presents problems even for these indicators. Consider Gambia – where the life expectancy is 54 years, and the literacy rate is 38%. Which indicator represents the more pressing concern? Perhaps an expert on development would be able to immediately recognise that it is Gambia’s literacy rate that is particularly low, but the rest of us would find this hard to spot immediately.
This is why standardisation is useful. It gives us some way of comparing apples with oranges. Scores for each question are transformed such that they are expressed in the same terms: the distance from the mean for that question. Questions where higher figures indicated lower well-being are reversed such that higher numbers now indicate higher well-being. Standardisation follows a well-known formula. For a given individual:
$$z = \frac{x - \bar{x}}{s}$$
where $x$ is the individual’s response to the question, $\bar{x}$ is the mean response to that question and $s$ is the standard deviation of responses to it.
The unit for standardised scores (also called z-scores) is a standard deviation. So, for example, a z-score of 2.0 on a certain question would indicate that an individual’s response was 2 standard deviations above the mean response for that question. A z-score of -0.5 would indicate that their response was half a standard deviation below the mean for that question. A z-score of 0.0 would indicate the individual’s response is the mean for that question. This allows direct comparison to be made between responses to different questions. If an individual’s z-score for question 1 is higher than their z-score for question 2, then we can be sure that their relative ‘family enjoyment’ is higher than their relative ‘feelings of respect’.
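As a minimal sketch of the two steps just described (reversing negatively worded questions, then standardising), here is a small Python example. The responses are made up, the function names are hypothetical, and for simplicity no survey weights are applied, whereas the exercises in this chapter use weighted means and standard deviations.

```python
from statistics import mean, stdev

def reverse_code(values, low, high):
    """Flip a scale so that higher numbers mean higher well-being."""
    return [low + high - v for v in values]

def z_scores(values):
    """Express each response as its distance from the mean in standard deviations."""
    m, s = mean(values), stdev(values)  # stdev uses the n - 1 (sample) formula
    return [(v - m) / s for v in values]

# Hypothetical responses on a 0-6 scale from five respondents.
enjoyable = [6, 5, 4, 5, 3]   # higher = better, used as is
stressful = [1, 2, 4, 2, 5]   # higher = worse, so reverse it first

z_enjoyable = z_scores(enjoyable)
z_stressful = z_scores(reverse_code(stressful, 0, 6))
for a, b in zip(z_enjoyable, z_stressful):
    print(f"{a:+.2f} {b:+.2f}")
```

Each printed row is one respondent; because both columns are now in standard deviation units, they can be compared directly.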
It is vital to note the use here of the word relative. Standardising the scores provides no way of knowing absolute levels. If everyone says that they find no time with their family enjoyable (i.e. a ‘0’ on the 0–6 scale), someone who circles ‘1’ (still very low of course) will come out with a positive z-score. By the same token, standardising implies that we cannot compare scores for different questions for the dataset as a whole. The means for Europe for all questions (using z-scores) will be 0 – we are not able to say that Europe as a whole is doing well on one aspect of well-being or another. We can only make comparisons within Europe, between countries, individuals, or demographic groups and, if data are collected for future years, over time. If identical data are collected for other countries in the world, we would be able to draw conclusions about Europe as a whole, but again, these would be relative to the rest of the world. However, without absolute targets of what high well-being looks like (in terms of survey data), and without absolute reference points to allow comparison between different aspects of well-being, nothing else is possible. This problem is not unique to well-being data. Without a reference point, there is no way of knowing that Angola’s GDP of £3440 per capita in 2007 is likely to be associated with poor living standards. Indeed, in 1950, such a level of GDP per capita would have actually been quite high, around that seen in Italy at the time. We cannot conclude from this that living conditions in Angola now are similar to those in Italy in 1950. The only way that we can understand £3440 per capita is in comparison to other figures.
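The family example can be checked with a few lines of Python. The sample below is hypothetical (99 people answering ‘0’ and one answering ‘1’), and the unweighted sample standard deviation is an assumption made to keep the sketch short.

```python
from statistics import mean, stdev

# Hypothetical sample: 99 respondents answer 0, one answers 1 (0-6 scale).
responses = [0] * 99 + [1]

m, s = mean(responses), stdev(responses)
z_of_the_one = (1 - m) / s
print(round(z_of_the_one, 1))  # 9.9
```

An answer of ‘1’ is still very low in absolute terms, yet it lies almost ten standard deviations above this sample’s mean.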
A few further technical details. Before calculating the z-scores included in the data here, we excluded any respondents who had missing data on any of the questions included in the calculations (so-called listwise deletion).[1] We also excluded Russia because, although there are no more respondents for Russia than for any other country, its large population means that it is weighted very highly (a quarter of the total weighted count for Europe). The results of a single Russian respondent are weighted around ten times more than those of a single Belgian one, meaning that patterns emerging amongst the 2000 or so Russian respondents would dominate our conclusions.
In exercise 4 you will have a chance to try out calculating standardised scores. Whilst there is a way to get SPSS to do this automatically, we encourage you to use the formula shown above.
[1] The exception to this rule was the question on having lots of energy, which was not asked in Hungary. So as not to exclude the entire country from the analysis, we simply ascribed the mean z-score (i.e. ‘0’) to all respondents in Hungary for this question.
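The two data-handling rules above (listwise deletion, and assigning the mean z-score of 0 where a question was not asked) can be sketched in Python as follows. The record layout and function names are hypothetical, and `None` is used here to mark a missing answer.

```python
def listwise_delete(respondents, items):
    """Keep only respondents with a valid answer on every listed item."""
    return [r for r in respondents if all(r.get(i) is not None for i in items)]

def impute_mean_z(z_value):
    """Replace a missing z-score with 0, the mean of any z-score."""
    return 0.0 if z_value is None else z_value

# Hypothetical records; None marks a missing answer.
respondents = [
    {"id": 1, "fmlenj": 5, "trtrsp": 4},
    {"id": 2, "fmlenj": None, "trtrsp": 3},
    {"id": 3, "fmlenj": 6, "trtrsp": 6},
]
complete = listwise_delete(respondents, ["fmlenj", "trtrsp"])
print([r["id"] for r in complete])  # [1, 3]
print(impute_mean_z(None))          # 0.0
```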
We assume that you have downloaded the dataset and that you have SPSS available.
Follow the procedure below to calculate standardised scores for two questions: ‘FMLSTRS - how much of the time spent with immediate family is stressful?’ and ‘SCLMEET - how often socially meet with friends, relatives or colleagues?’:
- First you should weight the data: Select ‘Data’ on the menu, then ‘Weight cases’, ‘Weight cases by’, find the combined weight and click ‘OK’.
- The next step is to calculate the means and standard deviations for the two variables.
- Then compute the z-scores using the formula shown earlier in this chapter: z-score = (x-mean)/s.d., where x is the value for each respondent. Use at least three decimal places in the calculations.
- If you want to check this has worked, now calculate means and standard deviations for the two new variables: the means should both be 0 and the standard deviations should both be 1, apart from small rounding differences.
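If you would like to see the same procedure outside SPSS, the Python sketch below follows the steps with made-up answers and weights. The denominator used for the weighted standard deviation (sum of weights minus 1) is an assumption about how a statistical package treats weights, so results from another package may differ slightly.

```python
def weighted_mean(values, weights):
    """Mean of the values, each counted according to its weight."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def weighted_sd(values, weights):
    """Weighted standard deviation with a (sum of weights - 1) denominator.

    This denominator is an assumption; other packages may differ slightly.
    """
    m = weighted_mean(values, weights)
    ss = sum(w * (v - m) ** 2 for v, w in zip(values, weights))
    return (ss / (sum(weights) - 1)) ** 0.5

def standardise(values, weights):
    """z-score = (x - mean) / s.d., using the weighted mean and s.d."""
    m, s = weighted_mean(values, weights), weighted_sd(values, weights)
    return [(v - m) / s for v in values]

# Hypothetical answers and weights for six respondents.
answers = [2, 4, 5, 6, 3, 5]
weights = [0.8, 1.2, 1.0, 0.5, 1.5, 1.0]

z = standardise(answers, weights)
# Check: the weighted mean of the new variable is 0 and its weighted s.d. is 1.
print(round(weighted_mean(z, weights), 6), round(weighted_sd(z, weights), 6))
```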
Now you are going to compare the countries’ mean scores on these two variables. For which countries is the ‘friends’ score relatively higher than the ‘family’ one, and for which is the opposite true?
- Change the weighting to design weight.
- Calculate mean scores for the new variables by country.
The weighted means and standard deviations for the two original variables are: for FMLSTRS, mean = 4.17 and s.d. = 1.58; for SCLMEET, mean = 5.03 and s.d. = 1.54. These numbers are used to compute the standardised variables.
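With these four numbers the standardised variables and the country comparison can be sketched in Python. The means and standard deviations are the ones quoted above. The respondents, the country codes ‘AA’ and ‘BB’ and the `country` key are hypothetical, and the country means here are unweighted for brevity, whereas the exercise applies the design weight at that step.

```python
from collections import defaultdict

# Weighted means and standard deviations quoted in the text.
STATS = {"fmlstrs": (4.17, 1.58), "sclmeet": (5.03, 1.54)}

def z_score(variable, x):
    """z-score = (x - mean) / s.d. for one response."""
    m, sd = STATS[variable]
    return (x - m) / sd

def country_means(rows, variable):
    """Unweighted mean z-score per country (a simplification, see above)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["country"]].append(z_score(variable, row[variable]))
    return {c: sum(zs) / len(zs) for c, zs in grouped.items()}

# Hypothetical respondents, for illustration only.
rows = [
    {"country": "AA", "fmlstrs": 5, "sclmeet": 4},
    {"country": "AA", "fmlstrs": 6, "sclmeet": 5},
    {"country": "BB", "fmlstrs": 3, "sclmeet": 7},
]
family = country_means(rows, "fmlstrs")
friends = country_means(rows, "sclmeet")
for c in family:
    print(c, "friends higher" if friends[c] > family[c] else "family higher")
```

In this made-up data ‘AA’ comes out as ‘family higher’ and ‘BB’ as ‘friends higher’.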
Generally, family life scores higher in more Eastern countries, and friends score higher in more Western countries. The exceptions to this are Ireland and Denmark. Countries where family life scores better than general social life: Bulgaria, Cyprus, Denmark, Estonia, Hungary, Ireland, Poland, Slovenia, Slovakia, Ukraine. Countries where family life scores worse: Austria, Belgium, Switzerland, Germany, Spain, Finland, France, UK, Netherlands, Norway, Portugal, Sweden.
You could also copy this syntax and paste it into a syntax window in SPSS:
What problem might there be from drawing strong conclusions from the results of exercise 4? Well, one problem is that they are based simply on two questions. For example, by choosing the question on family life being stressful, we are ignoring the question on family life being enjoyable. Surely we should take account of both questions to make judgements about family life overall for a given country. That’s why we aggregate.
Once the scores for individual questions have been standardised, the next task is to aggregate them based on the framework that we presented in Figure 2.1 (and the question list available in the report). Using this hierarchical structure, it is possible to aggregate up to different levels. So, for example, one might bring together just the three positive emotional well-being questions. Or, one might bring these together with the negative emotional well-being questions to produce an overall emotional well-being score. Or, one might, in turn, bring these together with all the other personal well-being components, to produce a personal well-being index. Or, finally, one might decide to combine this with a social well-being index to produce an overall well-being index. How far one goes up the hierarchy depends on what your research questions are. For government, this depends on which of the questions mentioned earlier they are trying to answer.
At each level, the higher level indicator score is calculated by simply taking the unweighted mean[1] of the z-scores for the lower level indicators or questions.
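The hierarchical rule can be written as a short recursive function. The Python sketch below is an illustration only: the mini-framework, the question names and the z-scores are hypothetical and much smaller than the structure used in the dataset.

```python
def aggregate(node, z):
    """Score a node of the framework for one respondent.

    A node is either a question name (a string) or a list of child nodes.
    Each level is the unweighted mean of the scores one level below it.
    """
    if isinstance(node, str):
        return z[node]
    child_scores = [aggregate(child, z) for child in node]
    return sum(child_scores) / len(child_scores)

# Hypothetical mini-framework and z-scores for one respondent.
positive_emotions = ["happy_week", "enjoyed_life"]
negative_emotions = ["felt_sad", "felt_depressed"]
emotional_wellbeing = [positive_emotions, negative_emotions]

z = {"happy_week": 0.4, "enjoyed_life": 0.8, "felt_sad": -0.2, "felt_depressed": 0.0}
print(round(aggregate(positive_emotions, z), 2))    # 0.6
print(round(aggregate(emotional_wellbeing, z), 2))  # 0.25
```

Because the mean is taken level by level, each subcomponent counts equally towards its component, however many questions it contains.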
As this process can be quite laborious, we have already calculated scores for the 16 components and sub-components in our structure, as well as the top-level figures for personal well-being, social well-being and well-being at work. They are included in the dataset and ready to use. If you like, you can also develop your own structure and calculate your own scores based on it. We do warn you though that the exercises that follow are all based on the scores that are already in the dataset, and it will be easier to check your solutions if you use the same scores.
[1] Unweighted between components: this means each component is given equal importance.
Calculate aggregated scores for ‘autonomy’ on the one hand and ‘social well-being’ on the other. You can use the structure we present, or decide for yourself which questions go into each component. Make sure you don’t overwrite any pre-existing variables.
Please note that the variables in the dataset are recoded so that low values indicate low autonomy, supportive relations or trust and belonging, and high values indicate high autonomy, supportive relations or trust and belonging.
Variables used to measure the concepts:
- Autonomy
- ‘ENJSTM - Seldom time to do things I really enjoy’
- ‘DCLVLF - Free to decide how to live my life’
- Supportive relations
- ‘FMLENJ - How much of the time spent with immediate family is enjoyable’
- ‘FMLSTRS - How much of the time spent with immediate family is stressful’
- ‘SCLMEET - How often socially meet with friends, relatives or colleagues’
- ‘PPLLFCR - There are people in my life who care about me’
- ‘INMDISC - Anyone to discuss intimate and personal matters with’
- ‘FLTLNL - Felt lonely, how often past week’
- Trust and belonging
- ‘PPLAHLP - Feel people in local area help one another’
- ‘TRTRSP - Feel people treat you with respect’
- ‘FLCLPLA - Feel close to the people in local area’
- ‘TRTUNF - Feel people treat you unfairly’
- ‘PPLTRST - Most people can be trusted or you can't be too careful’
The first step is to find the weighted (combined weight) mean and standard deviation of each of the variables.
Then you should compute the z-scores by inserting the mean and standard deviation into the formula: z=(variable x – mean of variable x)/standard deviation of variable x. The SPSS syntax for the variable dclvlf: COMPUTE zDCLVLF=(DCLVLF-3.9417)/0.89258. EXECUTE.
The third step is to use the z-scores to compute the aggregated scores of the three concepts: ‘autonomy_user’, ‘trust_and_belonging_user’ and ‘supportive_relations_user’. The SPSS syntax for the variable ‘autonomy_user’: COMPUTE autonomy_user=mean(zDCLVLF,zENJSTM). EXECUTE.
Finally, you should use the variables ‘trust_and_belonging_user’ and ‘supportive_relations_user’ to compute ‘social_WBI_user’.
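For readers who want to check the logic of these steps without SPSS, here is a rough Python equivalent for a single respondent. The mean and standard deviation for DCLVLF are the ones quoted in the syntax above. The ENJSTM constants, the respondent’s answers and the two social component scores are hypothetical placeholders, and the helper names are made up for this sketch.

```python
def z(value, mean, sd):
    """z = (value - mean) / standard deviation."""
    return (value - mean) / sd

def mean_of_valid(*scores):
    """Mean of the non-missing arguments, mirroring MEAN() in the SPSS syntax."""
    valid = [s for s in scores if s is not None]
    return sum(valid) / len(valid) if valid else None

# DCLVLF constants are quoted in the text; the ENJSTM ones are hypothetical.
respondent = {"dclvlf": 5, "enjstm": 3}
z_dclvlf = z(respondent["dclvlf"], 3.9417, 0.89258)
z_enjstm = z(respondent["enjstm"], 3.0, 1.2)   # hypothetical mean and s.d.
autonomy_user = mean_of_valid(z_dclvlf, z_enjstm)

# Hypothetical component scores for the same respondent.
supportive_relations_user = 0.30
trust_and_belonging_user = -0.10
social_wbi_user = mean_of_valid(supportive_relations_user, trust_and_belonging_user)
print(round(autonomy_user, 3), round(social_wbi_user, 3))
```

The social score is the unweighted mean of its two components, so each component counts equally even though they contain different numbers of questions.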
Then, plot the relationships between income band and the two scores you have created, making sure to include confidence intervals.
- Select ‘Graphs – Line – Multiple’. Select ‘Summaries of Separate Variables’ and click ‘Define’.
- Lines should represent the variables ‘autonomy_user’ and ‘social_WBI_user’ and income should be placed in the box for ‘Category Axis’.
- Click ‘Options’ and check ‘Display error bars’.
- Press ‘OK’, or paste the syntax into a syntax window before running it.
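The numbers behind such a chart are a mean and a confidence interval for each income band. The Python sketch below computes them without drawing anything. The scores and band labels are made up, and the interval uses a normal approximation at roughly the 95% level, which is an assumption and may not match the error bar settings you choose in SPSS.

```python
from statistics import mean, stdev

def mean_with_ci(values, z_crit=1.96):
    """Mean and an approximate 95% confidence interval (normal approximation)."""
    m = mean(values)
    half_width = z_crit * stdev(values) / len(values) ** 0.5
    return m, m - half_width, m + half_width

# Hypothetical autonomy scores grouped by hypothetical income bands.
scores_by_band = {
    "low": [-0.6, -0.2, 0.1, -0.4, 0.0],
    "middle": [0.1, -0.1, 0.3, 0.0, 0.2],
    "high": [0.2, 0.5, -0.1, 0.4, 0.3],
}
for band, scores in scores_by_band.items():
    m, lo, hi = mean_with_ci(scores)
    print(f"{band:>6}: {m:+.2f} [{lo:+.2f}, {hi:+.2f}]")
```

Where the intervals for two bands overlap heavily, the difference between their means should be read with caution.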
If you use the structure as we’ve presented, you should get a graph something like that in Figure 2.3.
How does income seem to determine these two components of well-being?
Solution and SPSS syntax
Focus on the parts of the income distribution where most respondents are (3 600 - 60 000 Euro). As you can see, whilst the social score continues to rise steadily between these bands, the autonomy score actually drops slightly. In other words, aside from the very rich and the very poor, increasing income actually, if anything, has a marginally negative impact on feelings of autonomy.
Following aggregation, each score is the mean of a set of z-scores – as such, for each indicator, the mean across Europe for all respondents is 0 (though the standard deviation is not necessarily 1 as these scores are the means of true z-scores, not z-scores themselves). This makes comparison easy. If an individual has a positive mean for positive functioning for example, then one immediately knows that their functioning is better than the European average. If their score for vitality is lower than their score for positive emotions, then one can say that, relative to Europe, low vitality is more of an issue for them than lack of positive emotions.
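To see why the standard deviation of an aggregated score is generally not 1, consider the simplest case: an indicator that is the mean of just two z-scores, $z_1$ and $z_2$, whose correlation is $r$ (the symbols are introduced here for illustration only). Each z-score has mean 0 and variance 1, so:

$$\operatorname{E}\!\left[\frac{z_1 + z_2}{2}\right] = 0, \qquad \operatorname{Var}\!\left(\frac{z_1 + z_2}{2}\right) = \frac{1 + 1 + 2r}{4} = \frac{1 + r}{2}$$

The mean stays at 0, but the standard deviation is $\sqrt{(1 + r)/2}$, which equals 1 only if the two questions are perfectly correlated ($r = 1$) and is smaller otherwise. With $r = 0.5$, for example, it is $\sqrt{0.75} \approx 0.87$.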
These aggregated standardised scores are the most appropriate for analysis purposes. However, for presentation outside the academic sphere, they are somewhat clumsy. Most people are not used to dealing with z-scores. Furthermore, whilst z-scores tell you very clearly how well a particular score compares to the mean, they tell you nothing about how well they compare to the theoretical minimum or maximum[1] for a given indicator. For one indicator, the theoretical maximum could be 1.2; for another it may be 4.2. To resolve this issue we looked for a transformation metric that maps the z-scores for each indicator onto 0–10 scales (t-scores), where a ‘0’ is the minimum for that indicator, ‘10’ is the maximum, and ‘5’ is the mean for Europe.
Before reading on, take a sheet of squared paper and try to imagine what an appropriate transformation metric would look like, using the autonomy score. The x-axis is for the z-scores, which range in this case from -2.57 to 1.52. The y-axis is for the desired t-scores, from 0 to 10. On the sheet, plot the three points that our transformation metric must pass through. For example, we know that, when the z-score is -2.57, the t-score should be 0. What are the other two points we know? Look at Figure 2.4.
As you can see quickly, the three points (0, 5 and 10) do not sit on a straight line. This means that no linear transformation (i.e., of the form y = mx + c) is possible. Note that, if we weren’t interested in the middle point (i.e. a z-score of 0 did not need to equate to a t-score of 5), then we would not have this problem. A straight line could be drawn between the point (-2.57,0) and (1.52,10). Any idea what the formula would be?
The answer is found on the next page.
[1] By which we mean the lowest or highest score that an individual could possibly achieve on a given set of questions.
Here are the steps to get there:
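- A standard linear equation takes this format: y = mx + c, where x is the z-score and y is the t-score.
- Putting in the two points we know, (min, 0) and (max, 10), we have: 0 = m × min + c and 10 = m × max + c
- With two unknowns, these equations can be solved: m = 10 / (max - min) and c = -10 × min / (max - min)
- Putting m and c back into the original equation we have: y = 10 × (x - min) / (max - min)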
- When the minimum and maximum values for the competence indicator are included equation 1 emerges.
As you can see, if you put in a z-score of 0, the t-score will not be 5 unless the maximum and minimum values are of the same magnitude (e.g. 1.2 and -1.2), which they never will be for the standardised scores from survey questions. Instead, the minimum is almost always further from 0 (the mean) than the maximum is. This is because the mean tends to be above the mid-point of the distribution of untransformed response codes, a problem related to negative skew. For example, in response to the question “how much of the time during the past week could you not get going”, most people responded ‘never’ or ‘some of the time’, the bottom two responses. Very few people chose the other two responses (‘most of the time’ or ‘all of the time’).
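To make the point concrete, here is a minimal Python sketch of the straight-line transformation, using the autonomy minimum and maximum quoted in this chapter. The function and constant names are illustrative only and are not part of the course materials.

```python
# Sketch only: names are illustrative, not taken from the course materials.
AUTONOMY_MIN = -2.5690  # minimum autonomy z-score quoted in this chapter
AUTONOMY_MAX = 1.5155   # maximum autonomy z-score quoted in this chapter


def linear_t_score(z, z_min, z_max):
    """Straight line through (z_min, 0) and (z_max, 10)."""
    return 10 * (z - z_min) / (z_max - z_min)


# Evaluate the line at the minimum, the European mean (z = 0) and the maximum.
for z in (AUTONOMY_MIN, 0.0, AUTONOMY_MAX):
    t = linear_t_score(z, AUTONOMY_MIN, AUTONOMY_MAX)
    print(f"z = {z:+.4f} -> t = {t:.2f}")
```

With these values the European mean (z = 0) lands at roughly 6.3 rather than 5, which is why a straight line is not enough.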
As a result, if we want the mean untransformed score to always be halfway along our transformed distribution, then we have to stretch parts of that distribution. The smoothest way to do this is by having a linear stretch function, such that the end of the distribution which is closer to the mean is stretched whilst that furthest from the mean is compressed. Including a linear stretch makes the transformation curvilinear. The function takes the following form:
If you are feeling brave, why not try to work out what the constants m, d and c are for a given indicator i with a minimum z-score of min_i and a maximum z-score of max_i?
The trick is to start by putting in the three known points and trying to solve the equations that drop out.
Well, it comes out like this:
The whole thing simplifies to this:
And looks something like this:
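As a worked sketch, assume the function is the quadratic t = m z^2 + d z + c, i.e. a linear stretch (m z + d) applied to z, plus a constant c. This is one form consistent with the description and the constants named above; the parameterisation intended by the course may differ, although any quadratic through the same three points describes the same curve. Putting in the three known points (min_i, 0), (0, 5) and (max_i, 10) gives:

```latex
\begin{gather*}
t(z) = m z^{2} + d z + c \\
t(0) = 5 \;\Rightarrow\; c = 5 \\
t(\mathrm{min}_i) = 0 \;\Rightarrow\; m\,\mathrm{min}_i^{2} + d\,\mathrm{min}_i = -5 \\
t(\mathrm{max}_i) = 10 \;\Rightarrow\; m\,\mathrm{max}_i^{2} + d\,\mathrm{max}_i = 5 \\
m = \frac{5\,(\mathrm{min}_i + \mathrm{max}_i)}{\mathrm{min}_i\,\mathrm{max}_i\,(\mathrm{max}_i - \mathrm{min}_i)}
\qquad
d = -\frac{5\,(\mathrm{min}_i^{2} + \mathrm{max}_i^{2})}{\mathrm{min}_i\,\mathrm{max}_i\,(\mathrm{max}_i - \mathrm{min}_i)}
\end{gather*}
```

Under this assumption, the autonomy values quoted in this chapter (a minimum of -2.5690 and a maximum of 1.5155) give roughly m ≈ 0.33 and d ≈ 2.80, with c = 5.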
Transformation should always be done at the last possible moment before presenting data, as the curvilinear relationship can distort patterns. For example, to get the mean score for a particular country on a particular indicator, the average of the z-scores across all individuals should first be calculated, and only then transformed; rather than transforming the z-scores for each individual and then taking the average.
Even taking these precautions, transformed scores must be treated with caution. The curvilinear transformation results in scores at one end of the distribution being stretched more than those at the other end. However, one should not be overly concerned with this distortion, because the concern itself assumes that the original scales used in the questions are linear. Such faith would be ill-founded. For example, it is not necessarily the case that the difference between ‘all or almost all of the time’ (a response scored as ‘4’ for some questions) and ‘most of the time’ (scored as ‘3’) is the same as the difference between ‘most of the time’ (‘3’) and ‘some of the time’ (‘2’).
To demonstrate that the point at which one transforms scores makes a difference, try the following:
Calculate mean z-scores for autonomy, by country, using appropriate weighting. Then transform the resulting country means using the equations shown above (do this last step in Excel).
- Weight data by design weight.
- Find the mean, minimum and maximum values of each country on the autonomy variable.
- Copy the table from the output and paste it into an Excel sheet.
- Compute each country’s transformed score:
Next use SPSS to compute transformed autonomy scores for each individual (using the same formula). Then produce a mean transformed score for each country, using appropriate weighting.
How different are the two sets of scores?
Solution and SPSS syntax
The scores we calculated are shown here, in rank order:
As you can see, the rank orders are almost the same, but there are quite big differences in the actual scores. If you have problems replicating these, check out the SPSS syntax for methodology A and methodology B. The ‘min’ score was -2.5690 and the max score was 1.5155 (both to four decimal places).
* methodology A.
* methodology B.
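For readers without SPSS, the two approaches can be sketched in Python. This is not the course's SPSS syntax. The respondent rows and weights below are made up, the labels A (average, then transform) and B (transform, then average) assume the two methodologies follow the order of the exercise, and the transformation is assumed to be the quadratic passing through (min, 0), (0, 5) and (max, 10).

```python
# Sketch only: data, weights and function names are hypothetical.
Z_MIN, Z_MAX = -2.5690, 1.5155  # autonomy min and max quoted in this chapter


def quadratic_t_score(z, z_min=Z_MIN, z_max=Z_MAX):
    """Assumed quadratic through (z_min, 0), (0, 5) and (z_max, 10)."""
    denom = z_min * z_max * (z_max - z_min)
    m = 5 * (z_min + z_max) / denom
    d = -5 * (z_min ** 2 + z_max ** 2) / denom
    return m * z * z + d * z + 5


def weighted_mean(values, weights):
    """Mean of values, weighted by (for example) the design weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# (country, autonomy z-score, design weight): made-up rows for illustration.
rows = [
    ("AA", -1.8, 1.0), ("AA", 0.4, 0.8), ("AA", 1.2, 1.2),
    ("BB", -0.3, 1.1), ("BB", 0.1, 0.9), ("BB", 0.6, 1.0),
]


def country_scores(rows):
    """Return {country: (score_a, score_b)} for the two orders of operation."""
    grouped = {}
    for country, z, w in rows:
        grouped.setdefault(country, []).append((z, w))
    results = {}
    for country, pairs in grouped.items():
        zs = [z for z, _ in pairs]
        ws = [w for _, w in pairs]
        # A: weighted mean of z-scores first, then transform the mean.
        score_a = quadratic_t_score(weighted_mean(zs, ws))
        # B: transform each individual first, then take the weighted mean.
        score_b = weighted_mean([quadratic_t_score(z) for z in zs], ws)
        results[country] = (score_a, score_b)
    return results


for country, (a, b) in country_scores(rows).items():
    print(f"{country}: A = {a:.2f}, B = {b:.2f}, difference = {b - a:.2f}")
```

Because the assumed curve bends upwards for these minimum and maximum values, B is never lower than A, and the gap grows with the spread of z-scores within a country.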