Appendix 1: SQP coding process for the variables in the analysis
Initially, the SQP coding may not seem simple, because many decisions have to be made in order to properly describe which characteristics best define the question you are coding. For complete guidelines to SQP and the Codebook, we refer to the documentation available in SQP.
Before the coding begins, we first indicate some basic steps.
Step 0: Access SQP via sqp.upf.edu
The first screen provides information about the purpose of the program and other practical information. Click ‘Start’.
If you have not registered before, create a new account and you will be able to log in using the username and password you specify.
SQP allows each user to participate in the development of a database of questions with quality predictions. A user can create his/her own study in which the specific questions will be introduced and can be coded by the user and others. A user can also check the coding results for any question introduced by any user and in any study.
For the purpose of this illustration, from the above screen, we should go to ‘All Available Questions’. The Questions list that appears contains all the questions available in SQP, with and without predictions.
Step 1: Find and open questions in SQP. As we are going to illustrate the coding process using the variables of the analysis presented in the introduction, we will focus on the questions from ESS Round 6 in the United Kingdom. To open a question in SQP, the user needs to be logged in. Once in the Questions list (shown above), the user needs to set the following filters: All Studies = ESS Round 6; All languages = English; and All countries = United Kingdom. To search for a particular question, you can enter its name, for example E17, under ‘Containing text’. This is what will appear if you click on the line for information about the question:
As you can see above, Question E17 has already been coded and the authorized quality prediction is 0.643, which is far from 1. The prediction shown is authorized because a member of the SQP team has coded the question.
Step 2: As the SQP text is just a reminder of the question to be coded and should be used as the object of coding, besides finding the question in SQP, it is also very important, to ensure that the characteristics are coded correctly, to find and open the question in the ESS questionnaire for Round 6 in the UK. This can be done as follows: go to the ESS website, select Data and Documentation, then search for United Kingdom in the section Data and Documentation by Country, and, in the section Documents of ESS Round 6 – 2012, download the questionnaire and also the show cards. Find Question E17 in the questionnaire and the corresponding card (Show card 38). This is what you should obtain:
Note that the two texts (in SQP and in the questionnaire) should be and are exactly the same, with the exception of the instructions for the interviewers, the don’t know option and the show cards, which will not be visible in SQP. In order to code these characteristics properly, we need the actual questionnaire and that is why we have opened it.
These steps should be applied to all the other questions under study before the coding starts.
Step 3: To start with the coding process, you have to select from the last screen ‘Code question to create my own quality prediction’ and then ‘Begin coding’. The program will display the series of characteristics the user will have to code. Note that the yellow windows in SQP provide help and further explanations.
First, some basic characteristics of the topic of the question will be displayed. The Domain refers to the subject under study. For this question, it is ‘National politics’ because E17 asks about the freedom and fairness of national elections. Once the option National politics is selected (as shown in the screen below), you are able to be more specific and select the category ‘Elections’ from Domain: National politics.
Next, the coding process continues with more characteristics of the basic aspects of the question, such as the Concept.
The Concept is the characteristic indicating what the question measures. E17 measures an ‘Evaluative belief’ because the question asks for a judgement, which is not neutral as it has an implicit positive connotation with the words ‘free’ and ‘fair’. It is not an Evaluation because then we would expect words like ‘good’, ‘bad’, ‘positive’, ‘negative’, etc. Moreover, it is also not a Judgement because in that case we would expect neutral words and not words with a positive or negative connotation like ‘free’ and ‘fair’. Both Evaluation and Judgement can be found under the concept ‘All other simple concepts’.
This procedure continues until all the characteristics are coded. In order to explain our decisions, we will continue with the illustration, mainly focusing on characteristics that could present some difficulties. We have therefore skipped some characteristics that we believe to be simple. The complete coding is available, however, and anyone can check and compare our authorized coding with their own in SQP 2.0.
Next is Social Desirability. This characteristic is specific to the population under study, so cultural and time references must be taken into account. In this case, since we are coding a question from ESS Round 6 in the UK, we must take into account that the survey took place in 2012 and that the respondents were the British population. For this reason, we suppose that the code should be: ‘A bit’ sensitive. The same applies to the Centrality characteristic. The coder must try to think like the respondents and envisage whether the topic will be central or not in the minds of the survey respondents. Referring to the British population in 2012, we suppose that national elections are a ‘Central’ topic in their minds.
Secondly, the characteristics of the request are displayed. Starting with the characteristic Formulation of the request for an answer, we should code E17 as the first question in a battery of questions consisting of a request and a statement. In this case, the request is ‘Indirect’, because of the pre-instruction ‘please tell me....’. The request uses a WH word in the expression ‘to what extent’, which should be coded as ‘How (extremity)’ because it measures how much the statement applies. Confusion could be caused by the WH word ‘what’, because the word ‘what’ is sometimes not intended to measure ‘how much’ but to identify something (e.g. What did you eat yesterday?). This reasoning also holds for other languages like Spanish and German. In Spanish ‘en qué medida’ (to what extent) also measures ‘how much’ although it uses the WH word ‘qué’ (what). E17 uses an ‘Imperative’ Request for an answer type because it explicitly asks for an answer by stating ‘please tell me’. Furthermore, the Balance of the request is ‘Not applicable’ because ‘apply’ does not have two possible poles. An Encouragement to answer is present in the expression ‘please tell me to what extent…’, and an Emphasis on subjective opinion is also present because of the words ‘… you think…’. For this particular question, it is very important to note that a Stimulus or Statement is present, because E17 is part of a battery and the statement is ‘National elections are free and fair’.
Thirdly, the characteristics of the response options are provided. In E17, the Response scale is provided with ‘Categories’ because the respondents have more than 3 and fewer than 12 options to choose from. The 11-point category scale is ‘Partially labelled’ because only categories 0 and 10 have labels. The Correspondence between the numbering of the categories and the labels is ‘High’ because the lowest number corresponds to the lowest (or most negative) label of the options and the highest number also corresponds to the highest (or most positive) label of the response options. The Theoretical range of the scale is ‘Unipolar’ because, theoretically, two opposite poles cannot be formulated with ‘apply’: ‘not apply’ is not the opposite pole but the zero point of ‘apply’. Furthermore, the Number of fixed reference points is 2 because only the labels ‘Not apply at all’ and ‘Applies completely’ are fixed and extreme labels. In this context, ‘fixed’ means that there is no doubt about the position of the label on the scale from extremely low to extremely high. Finally, it is asked whether a Don’t know option is provided on the scale. As we see, this cannot be answered without looking at the ESS questionnaire because, while it is not present in SQP, it is present in the questionnaire. In this case, the don’t know option is separate from the defined answer options, so we should mark that the Don’t know option is only registered by the interviewer if necessary.
Fourthly, SQP provides a set of characteristics regarding the instructions. In the first place, like the Don’t know option, the Interviewer instructions are not present in the SQP text. However, it is easy to see from the ESS questionnaire that there are several such instructions, namely ‘Card 38’, which indicates that the interviewer has to provide Card 38 to the respondent, and ‘Read out each statement and code the grid’. Several Respondent instructions are also provided in E17, but, in this case, they can be detected from the SQP text, as these are instructions read by the interviewer to the respondent. Examples of these instructions to the respondent are: ‘Using this card’ and ‘0 means you think the statement does not apply at all and 10 means you think it applies completely’. Furthermore, the latter definition of the scale can also be identified as an Extra motivation, information or definition; it is extra because it is not really needed by the respondent to answer the question properly.
Fifthly, the complexity of the questions is measured by the linguistic characteristics of the introduction, the request and the answer options text in SQP. In this case, since the Introduction is present, the linguistic characteristics will not only apply to the Request and Answer options texts but also to the Introduction text. Such characteristics include: number of words, nouns, abstract nouns, syllables, subordinate clauses, etc. SQP usually provides a value suggestion for these characteristics, but it is not always correct. For example, the nouns are denoted by NN, but they are sometimes wrong. The number of syllables in the answer options also includes all the numbers before the labels, which should not be counted. For these reasons, we suggest double checking the suggested value given by SQP. In the latter case, the coder always has to subtract the number of categories from the number of syllables suggested by the program.
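The correction for the syllables of the answer options described above amounts to a single subtraction. A minimal sketch in Python follows; the function name and the suggested count of 25 syllables are hypothetical and are not taken from SQP, while the 11 categories correspond to the 0 to 10 scale of E17:

```python
def corrected_syllable_count(suggested_syllables: int, number_of_categories: int) -> int:
    """Remove the category numbers from a suggested syllable count.

    Follows the rule stated in the text: subtract the number of
    categories from the number of syllables suggested by the program.
    """
    if suggested_syllables < 0 or number_of_categories < 0:
        raise ValueError("counts must be non-negative")
    if number_of_categories > suggested_syllables:
        raise ValueError("more categories than suggested syllables")
    return suggested_syllables - number_of_categories


# Hypothetical suggestion of 25 syllables for an 11-point scale (0 to 10).
corrected = corrected_syllable_count(25, 11)
print(corrected)
```

The result should still be compared with a manual count of the labels, since the suggested values for the other linguistic characteristics may also be wrong.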
Sixthly, the show card characteristics have to be coded. Because we have seen Card 38, we will indicate that a Show card is used. The show card characteristics can easily be detected. Card 38 has a ‘Horizontal’ scale; there is ‘Overlap present’ because the labels of 0 and 10 are not directly above these two numbers; the categories are ordered by ‘Numbers’; and the Card shows neither the start of the response sentence (‘Not a start of the response sentence’) nor the question or a picture.
Seventh, and lastly, SQP provides you with a few characteristics to identify the mode of data collection. These refer to the whole questionnaire and not just to the particular question, E17, so we have to find out whether the questionnaire was computer based, whether an interviewer read the questions and the position of the question in the questionnaire. In the ESS, the Computer and Interviewer information can be found in the Data and Documentation Report (searching for the specific country). In ESS6 UK, the questionnaire was administered using CAPI (Computer assisted personal interviewing) and we can therefore say that a computer was used and that an interviewer asked the questions. We can say that the interviewer is present, not only because of that, but also because of the instructions we have already identified. If the questionnaire is self-administered, then the presentation is Visual. In this case, a personal interviewer asks the questions and the presentation of the questionnaire is ‘Oral’. Finally, the Position of this question in the questionnaire is 123.
With the specification of the position of the question in the questionnaire, the coding of question E17 is complete. We can now obtain the Quality Prediction by clicking ‘View Quality Prediction’.
We see above that the quality of the question is far from perfect (.643). This means that 35.7% of the variance in the answers to the question is error.
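The arithmetic behind this statement can be sketched in a few lines of Python. The function name is hypothetical; the only input taken from the text is the predicted quality of E17:

```python
def error_variance_share(quality: float) -> float:
    """Share of the observed variance that is error, given the quality."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must lie between 0 and 1")
    return 1.0 - quality


# Authorized quality prediction for E17, as reported in the text.
share = error_variance_share(0.643)
print(f"{share:.1%}")  # prints 35.7%
```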
The user is not only able to obtain the quality, reliability and validity, but also the reliability and validity coefficients, which can and will be used later in the correction for measurement errors. You can obtain the quality coefficients by clicking ‘View quality coefficients’.
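The relation between these quantities is not spelled out in this appendix. As a sketch, assuming the definitions commonly used in the SQP literature, the quality is the product of the reliability and the validity, and the quality coefficient is the product of the reliability and validity coefficients. The symbols r (reliability coefficient), v (validity coefficient) and q (quality coefficient) are chosen here for illustration:

```latex
q^2 = r^2 \cdot v^2, \qquad q = r \cdot v
% For E17, with the predicted quality reported above:
q^2 = 0.643 \quad\Rightarrow\quad q = \sqrt{0.643} \approx 0.802
```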
The same procedure must be applied to all the other variables under study. Continuing to the rest of the questions in the battery, we will illustrate the coding of question E20.
Again, we can see from SQP that Question E20 has also been coded and that one of the codings is authorized because it was done by the SQP members. The quality obtained by the authorized coding is 0.604, which is even lower than the quality of Question E17 and also rather far from 1.
As shown in the questionnaire above, E20 follows the request in E17, but this time E20 is not the first item provided in the battery (it is the fourth). Because most of the characteristics will be the same as for E17, here, we will only explain the codes that will be different and that are highlighted in the following screen:
First, and most importantly, note that in this case the request text and the introduction are no longer provided in SQP, because in a battery these texts are not read again before each statement, only before the first one. In such cases the request only consists of the stimulus or statement and the answer categories.
- Domain – National politics: the Domain for E20 will again be National politics, but this time not specifically related to Elections but to ‘Political parties’, as it asks about opposition parties’ freedom to criticise.
- Concept: E20 will be coded as ‘Evaluative belief’, because of the implicit positive connotation of the words ‘free to criticise’.
- Social Desirability and Centrality: can change across cultures and time periods, depending on the topic. Focusing on the British population of 2012, we suppose that the topic ‘opposition parties’ freedom to criticise’ is still ‘A bit’ sensitive and ‘Central’ in their minds.
- Formulation of the request for an answer: a request, either direct or indirect, is usually provided. However, in this case, there is ‘No request present’, because E20 is not the first item in the battery. The request characteristics are then no longer presented in SQP.
- Use of stimulus or statement in the request: is indeed present because it is actually the only text provided in SQP. The statement is: ‘Opposition parties in the UK are free to criticise the government’.
- Interviewer and Respondent instructions: there are no instructions present in E20, neither for the interviewer nor for the respondent. Note that all the instructions coded in E17 were placed in the request, so now that the request is not present, we should not code them.
- Linguistic characteristics: as the introduction is not present in E20, its linguistic characteristics are no longer provided. Focusing on the linguistic characteristics of the request, the code will only be based on the text of the statement.
- Position: the position of question E20 in the UK ESS Round 6 questionnaire is 126.
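The differences listed above can be summarised by treating each coding as a record and comparing the two. The sketch below is only an illustration: the dictionary keys are hypothetical labels chosen for readability, not SQP field names, and only a subset of the characteristics discussed in the text is included.

```python
# Selected codes for E17, as described in the text.
e17 = {
    "domain": "National politics",
    "domain_detail": "Elections",
    "concept": "Evaluative belief",
    "social_desirability": "A bit",
    "centrality": "Central",
    "formulation_of_request": "Indirect",
    "stimulus_or_statement_present": True,
    "interviewer_instruction_present": True,
    "respondent_instruction_present": True,
    "position": 123,
}

# E20 keeps the E17 codes and overrides only those the text says differ.
e20 = {
    **e17,
    "domain_detail": "Political parties",
    "formulation_of_request": "No request present",
    "interviewer_instruction_present": False,
    "respondent_instruction_present": False,
    "position": 126,
}


def differing_codes(a: dict, b: dict) -> dict:
    """Return {characteristic: (code_in_a, code_in_b)} where the codes differ."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


changed = differing_codes(e17, e20)
for name, (old, new) in changed.items():
    print(f"{name}: {old!r} -> {new!r}")
```

The same comparison could be applied to a user's own coding and the authorized coding to see where the two diverge.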
The coding of E20 is now done and predictions of the quality, reliability and validity can be obtained.
E25 has been coded very similarly to E20, except for the Domain, Social Desirability, Centrality and Position. Question E25, which also belongs to the same battery, presents the statement: ‘The courts in UK treat everyone the same’. In this case, we therefore coded the Domain of E25 as ‘National politics’ and under it we specified ‘National institutions’, with reference to the courts. Again based on the British population of 2012, we would say that the topic is ‘A bit’ sensitive and ‘Rather central’ to them. Looking at the predictions below, we see that the differences are not great: the quality of E25 is only 0.001 higher than that of E20.
Let us continue with the illustration of how we coded the control question about household income, which is question F41 of ESS Round 6. The question in the ESS questionnaire and its show card are shown below:
As you can see from the next screen, we have highlighted the characteristics, which we are going to explain below:
- Domain: for question F41, the domain should be coded as ‘Living conditions and background variables’.
- Domain - Living conditions and background variables: more specifically, we can say that the subject of the household income question is ‘Income’.
- Concept: the concept of question F41 should be coded as ‘Facts, background or behaviour’ because it is an objective question and the information could, in principle, be obtained from a source other than the respondent.
- Social Desirability and Centrality: in the case of the British population of 2012, we would say that the topic has high social desirability, is ‘A lot’ sensitive, and also ‘Rather central’ to the respondents.
- Formulation of the request for an answer: F41 is an indirect request because of the formal pre-request ‘Using this card, please tell me …’.
- Request for an answer type: F41 uses an ‘Imperative’ request because it explicitly asks for an answer using the phrase ‘please tell me’.
- Balance of the request: is ‘Not applicable’ because ‘which letter describes’ does not have two possible poles.
- Presence of encouragement to answer: is present in the expression ‘please tell me…’.
- Use of stimulus or statement in the request: unlike questions E17, E20 and E25, question F41 does not belong to a battery of questions and there is therefore no Stimulus or Statement present.
- Response scale - basic choice: in F41, the answer options are provided with several categories labelled from J to H. So it is a category scale. The respondents have to give their approximate income estimates, weekly, monthly or annually. Because of this specific formulation, we will code the characteristics of the response options (also the linguistics) based on the first option, ‘weekly’.
- Labels of categories: the 10-point category scale is fully labelled because categories J to H have specific labels if we look at the Show card.
- Correspondence between labels and numbers of the scale: the correspondence between the lettering of the categories and the labels (the labels from the weekly column) is ‘Low correspondence’ because the lettering does not follow an alphabetical order corresponding to the order of the labels, which go from the lowest amount to the highest.
- Theoretical range of the scale bipolar/unipolar: theoretically, the concept of question F41 is Unipolar because there are not two possible poles.
- Number of fixed reference points: is 8 because labels from categories R to D are closed ranges, while categories J and H are open.
- Don’t know option: is separate from the defined answer options J to H, and it is not present in the Show card. We should therefore mark that the Don’t know option is only registered by the interviewer when a specific category is not chosen.
- Interviewer instruction: we can easily see from the ESS questionnaire that there is an instruction, ‘Card 58a’, which indicates that the interviewer has to provide Card 58a to the respondent.
- Respondent instruction: several instructions to the respondent are also provided and can be detected from the SQP text, such as: ‘Using this card’, ‘If you don’t know the exact figure, please give an estimate’ and ‘Use the part of the card that you know best: weekly, monthly or annual income’.
- Extra motivation, info or definition available?: is provided by the extra, informative sentence ‘Use the part of the card you know best: weekly, monthly or annual income’.
- Linguistic characteristics: as the introduction is not present in F41, the linguistic characteristics will only be counted for the texts of the request and answer options. Such characteristics include: number of words, nouns, abstract nouns, syllables, subordinate clauses, etc. As stated in the illustration of E17, we suggest double checking the value suggested by SQP for these characteristics.
- Show card used: as we have seen, Card 58a is used to show the different income labels available as answer options.
- Overlap of text and categories: the vertical scale provides a text that is clearly connected to the categories.
- Mode of data collection: as we already know from the previous questions coded, the British questionnaire of the ESS in Round 6 was administered by a Computer and an Interviewer asked the questions. The interviewer instructions coded before can also give us a clue about the presence of the interviewer.
- Position: the position of question F41 in the UK ESS Round 6 questionnaire is 192.
From this coding procedure for question F41, we obtain the following coefficients for the quality, reliability, validity and common method variance. Compared with the previous questions coded, we see that F41 has lower quality than question E17, but higher than E20 and E25.
Using the same procedure, questions B23, satisfaction with democracy, and B19, left or right placement, were coded. The coding procedure for these two questions will follow the reasoning described in the preceding illustrations. Here, we will therefore only indicate the major differences encountered in these questions.
|Characteristic|B19 (left or right placement)|B23 (satisfaction with democracy)|
|---|---|---|
|Domain|National politics - national government.|National politics - national institutions, because democracy is represented by the work of institutions like the Parliament.|
|Concept|All other simple concepts - Judgement, because the evaluation is neutral.|Feeling, because satisfaction is an affective evaluation.|
|WH word used in the request|Where (place)|How (extremity)|
|Balance of the request|Balanced, because the request allows answering on the left or on the right side.|Unbalanced, because just one pole, "satisfied", is mentioned in the request, while "dissatisfied" is the opposite pole.|
|Correspondence between labels and numbers of the scale|Medium, because the scale is bipolar and the numbering on the left side should have negative values, the neutral point should be the zero category and the right side should have positive values.|Medium, because the scale is bipolar and the numbering in the "dissatisfied" part of the scale should be negative, the neutral point should be the value 0 and the "satisfied" part should have positive values.|
|Theoretical range of the scale|Bipolar, because two poles, "left" and "right", can be formulated.|Bipolar, because the two poles can be formulated in English.|
|Range of the scale|Bipolar, because two poles, "left" and "right", are mentioned in the scale.|Bipolar, because the two poles, "satisfied" and "dissatisfied", are mentioned in the scale.|
|Number of fixed reference points|1, which is the implicit neutral point in option 5, because the extreme labels are not extreme options: one can be extremely on the left side or on the right side.|3, the two extremes and the implicit neutral category in option 5 of the response scale.|
For further information about the coding of these variables, we refer to SQP 2.0. Users can log in and check all the choices made in these questions, which will appear under the authorized coding.
The quality predictions and coefficients obtained from the SQP coding of these two questions are presented below: