Impact of scales on respondents' answers
Surveys are a popular source of marketing data, and a large number of ready-made studies are available today. The data obtained by different companies can vary greatly, and sometimes the differences are due to the choice of response scale. Because these scales are so diverse, it is almost impossible to compare the results of different surveys directly.
How to compare the results of different surveys?
This article offers market research specialists a practical guide to “translating” answers between the most frequently used scales, which can simplify the comparison of survey results.
This was made possible by a study conducted by Sara Dolnicar and Bettina Grün in 2011.
Data and method
The experiment used data from two online surveys. Respondents (n = 2609) were asked to fill out two brand image questionnaires approximately two weeks apart. The two versions of the questionnaire were identical except for the response scales. This made it possible to observe, at the individual level, how responses translate from one scale to another: the collected data link the answers of the same respondent on different scales. Variations in answers were not due to individual differences between participants or to changes in their perception of the brand, since the gap between the two assessments was too short and, during this time, there were no changes in advertising or in the brand's market position that could have affected the ratings.
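The individual-level link between the two waves described above can be pictured as a table of transition shares. The following is a minimal sketch in Python, not the authors' procedure: the function name `transition_shares` and the sample answer pairs are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter, defaultdict


def transition_shares(pairs):
    """Return {first_answer: {second_answer: share}} from (first, second) pairs."""
    counts = defaultdict(Counter)
    for first, second in pairs:
        counts[first][second] += 1
    return {
        first: {second: n / sum(row.values()) for second, n in row.items()}
        for first, row in counts.items()
    }


# Hypothetical paired answers: wave 1 on a binary scale, wave 2 on a 4-point Likert scale.
pairs = [
    ("yes", "agree"), ("yes", "agree"), ("yes", "fully agree"), ("yes", "disagree"),
    ("no", "disagree"), ("no", "completely disagree"), ("no", "disagree"), ("no", "agree"),
]
shares = transition_shares(pairs)
print(shares["yes"])  # share of each second-wave answer among first-wave "yes" respondents
```

Each inner dictionary corresponds to one row of bars in the figures: the shares for a given first answer add up to 1.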
Respondents evaluated two brands: McDonald's and Red Rooster. The survey evaluated five characteristics: tasty, fast, cheap, useful and convenient.
Nine response scales were tested:
1. Affirmative binary scale (the respondent ticks the statement or leaves it unticked)
McDonald’s is a tasty choice.
2. Fully binary scale (forced choice between “yes” and “no”)
McDonald’s is delicious.
3. 5-point Likert scale with full description
Fully agree / Agree / Partly agree, partly not / Disagree / Completely disagree
Tasty [ ] [ ] [ ] [ ] [ ]
4. 5-point Likert scale with a description of extreme points
Fully agree (+2) / +1 / 0 / -1 / Completely disagree (-2)
Tasty [ ] [ ] [ ] [ ] [ ]
5. 4-point Likert scale with full description
Fully agree / Agree / Disagree / Completely disagree
Tasty [ ] [ ] [ ] [ ]
6. Monopolar 4-point scale with full description
Not at all / A little / Quite / Very
Tasty [ ] [ ] [ ] [ ]
7. Bipolar 7-point scale with full description
Very / Pretty / A little / Neither / A little / Pretty / Very
Tasty [ ] [ ] [ ] [ ] [ ] [ ] [ ] Tasteless
8. Bipolar 7-point scale with a description of extreme points
Very (+3) / +2 / +1 / 0 / -1 / -2 / Very (-3)
Tasty [ ] [ ] [ ] [ ] [ ] [ ] [ ] Tasteless
9. Bipolar 6-point scale with full description
Very / Pretty / A little / A little / Pretty / Very
Tasty [ ] [ ] [ ] [ ] [ ] [ ] Tasteless
Translation of answers from the fully binary scale
In the first analysis, correspondences were obtained between responses on the fully binary scale and on all other fully described scales included in the experiment. A control group also answered on the fully binary scale both times (row 1 in Table 1) in order to determine the baseline level of instability. Fig. 2 presents the results. The top row shows how respondents who answered “yes” the first time (on the fully binary scale) answered the second time, when the following scales were used: fully binary (column 1), affirmative binary (column 2), 4-point Likert scale with full description (column 3), 5-point Likert scale with full description (column 4), monopolar 4-point scale with full description (column 5), bipolar 6-point scale with full description (column 6) and bipolar 7-point scale with full description (column 7). The bottom row shows how respondents who answered “no” on the fully binary scale in the first survey answered on the different scales in the second survey. The height of each bar corresponds to the share of answers for that option.
The effect of the scale on the respondents
For example, as column 3 in Fig. 2 shows, only 22% of respondents who chose “yes” in the first survey chose “fully agree” in the second, when the 4-point Likert scale with a full description was used, while a further 68% chose “agree”. Of those who chose “no” the first time, 16% chose “completely disagree” and 59% chose “disagree”.
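The comparison in this example amounts to collapsing the 4-point answers back onto the binary scale and adding up the shares that stay on the same side. A minimal sketch in Python, using only the percentages quoted above; the variable and function names are hypothetical, and the shares that crossed to the opposite side are not quoted in the text and are therefore left out.

```python
# Shares (in %) of second-wave 4-point Likert answers, by first-wave binary answer,
# as quoted in the text. Shares that crossed sides are not given and are omitted.
likert4_given_binary = {
    "yes": {"fully agree": 22, "agree": 68},
    "no": {"disagree": 59, "completely disagree": 16},
}

POSITIVE = {"fully agree", "agree"}
NEGATIVE = {"disagree", "completely disagree"}


def share_preserved(binary_answer, shares):
    """Sum the shares that stay on the same side as the original binary answer."""
    same_side = POSITIVE if binary_answer == "yes" else NEGATIVE
    return sum(pct for label, pct in shares.items() if label in same_side)


preserved = {
    answer: share_preserved(answer, shares)
    for answer, shares in likert4_given_binary.items()
}
print(preserved)  # {'yes': 90, 'no': 75}
```

These sums differ by one point from the 89% and 76% listed for this scale in Table 2, presumably because the quoted shares are rounded.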
Baseline level of instability
The baseline level of instability for the fully binary scale is about 15%. This figure is the proportion of respondents who changed their answer at the second assessment even though the response scale remained identical (see column 1).
Asymmetric use of an affirmative binary scale
On the affirmative binary scale, respondents answered asymmetrically. They chose “yes” less often than on a scale that forces a choice between the options offered (as the fully binary scale does). As shown in column 2, only 63% of the “yes” answers on the fully binary scale are preserved on the affirmative binary scale, so 37% switched, far more than the baseline instability of 14%. On the other hand, 92% of the “no” answers remained unchanged. For a practitioner, this means that a “yes” carries a stronger meaning on the affirmative binary scale, and that a “no” should not always be interpreted as negative, as it would be on a forced-choice binary scale.
For all other scales, the tendency to choose positive answers is higher than the tendency to choose negative ones, and scales with a center point pull both positive and negative responses toward the neutral middle position. The practical translation results presented in Table 1 can serve as a guide for comparing results obtained with different scales in real conditions.
Translation of answers from a scale without a center point to a scale with a center point
The 4-point Likert scale with a full description (no center point) was also compared with the 5-point Likert scale with a full description (with a center point); Fig. 3 shows these translation results on the left. The same comparison was made between the bipolar 6-point scale with a full description (no center point) and the bipolar 7-point scale with a full description (with a center point), shown on the right.
Table 2. Repetition rates of positive and negative answers and summary results
Scale | Positive answers (%) | Negative answers (%) | All (%)
Fully binary scale | 86 | 83 | 85
Affirmative binary scale | 63 | 92 | 75
4-point Likert scale with full description | 89 | 76 | 84
5-point Likert scale with full description | 73 | 53 | 65
Monopolar 4-point scale with full description | 96 | 38 | 72
Bipolar 6-point scale with full description | 91 | 62 | 79
Bipolar 7-point scale with full description | 76 | 48 | 65
When the results were compared, the following key conclusions were drawn.
In the transition from the 4-point to the 5-point Likert scale, “completely agree” answers rarely move into the “partly agree, partly not” zone, although only 52% of respondents chose “fully agree” again. For all other initial answers, the shift to the center is substantial: 27% moved there from “agree”, 42% from “disagree” and, most surprisingly, 18% from “completely disagree”. The practical conclusion is that the neutral zone becomes a convenient answer for respondents who do not fully agree with the statement. The rather high rate of transition from “completely disagree” to the center point clearly shows that one cannot assume that respondents who are unsure of their answer simply choose at random when no neutral option is offered. Judging by these results, it is preferable to omit the center point, at least when assessing brand image, if the choice is between a 4-point and a 5-point scale.
The translation of answers from the bipolar 6-point scale with a full description (no center point) to the 7-point scale gives a different picture: only a few respondents move from the extreme points to the center (6% for “very (tasty)” and 2% for “very (tasteless)”).
The transition from “pretty (tasteless)” and “pretty (tasty)” is symmetrical: about one third of the respondents switch to the neutral option. Asymmetry is observed only for the initial options “pretty (tasteless)”, where only 10% of respondents moved to the center, and “fairly (tasteless)”, where 18% did. Because of the substantial overall shift toward the neutral position (20%), the scale with a center point cannot be considered the preferred option for assessing brand image, especially for long scales.
Transfer of Likert scale responses to bipolar scales
Fig. 3 shows the translation of answers from the 4-point Likert scale with a full description to the 5-point Likert scale with a full description (left) and from the bipolar 6-point scale with a full description to the bipolar 7-point scale with a full description (right). To interpret these data correctly, the baseline level of instability was calculated for each of the scales. As with the fully binary scale, the baseline level of instability is the proportion of respondents who did not choose the same answer option both times. It is 29% for the 4-point Likert scale with a full description, 35% for the 5-point Likert scale with a full description, 52% for the bipolar 6-point scale with a full description and 53% for the bipolar 7-point scale with a full description.
Even after allowing for the apparent stability that arises when answers are chosen at random (as described by Schmittlein, 1984), the baseline level of instability grows with the number of response options offered, so binary scales remain the most stable. These differences are of considerable practical significance.
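The baseline level of instability used above is the share of respondents whose two answers on the same scale differ. A minimal sketch in Python, assuming the paired answers are available as (first, second) tuples: the function name and the sample pairs are hypothetical, while the percentages in `reported` are the ones quoted in the text (about 15% for the fully binary scale, then 29%, 35%, 52% and 53% for the fully described 4-, 5-, 6- and 7-point scales). No chance correction of the kind attributed to Schmittlein (1984) is attempted here.

```python
def instability(pairs):
    """Share of respondents whose second answer differs from their first."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no paired answers given")
    changed = sum(1 for first, second in pairs if first != second)
    return changed / len(pairs)


# Hypothetical respondents answering twice on the same 4-point Likert scale.
same_scale_pairs = [
    ("agree", "agree"),
    ("agree", "fully agree"),
    ("disagree", "disagree"),
    ("fully agree", "fully agree"),
    ("disagree", "agree"),
]
rate = instability(same_scale_pairs)  # 2 of 5 respondents changed their answer

# Baseline instability (in %) quoted in the text, keyed by number of answer options.
reported = {2: 15, 4: 29, 5: 35, 6: 52, 7: 53}
rates = [reported[k] for k in sorted(reported)]
is_increasing = all(a < b for a, b in zip(rates, rates[1:]))
print(rate, is_increasing)
```

With the quoted figures, instability rises at every step from two to seven answer options, which is the pattern the text describes.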
Although many proponents of multi-category scales argue that more than two answer options are needed to capture shades of meaning, the price of such precision is lower reliability, which raises the fundamental question of the validity of the results of such scales ... instead it can only reflect the respondent’s desire to avoid the answer.
The “yes” answers on a 4-point Likert scale with a full description
The 4-point Likert scale captured the “yes” answers from the fully binary scale very well. The combined share of “fully agree” (22%) and “agree” (68%) answers turned out to be almost identical to the share of “yes” answers on the fully binary scale. The deviation remains within the baseline level of instability (the sum of 2% and 8% is slightly below 14%). The same results were obtained for the “no” answers. Conversion from the fully binary scale to the 4-point Likert scale with a full description is therefore consistent, which makes it relatively easy to compare results obtained with the two scales. It is interesting to note that the majority of “yes” answers go to the more conservative option “agree” rather than “completely agree”.
A center point on the scale makes translation difficult
Introducing a center point, as on the 5-point Likert scale with a full description, somewhat complicates the translation of results from the fully binary scale. As column 4 shows, 21% of the “yes” answers and 36% of the “no” answers shift to the “partly agree, partly disagree” zone. As a result, only 73% of the original “yes” answers remain positive on the 5-point Likert scale with a full description (the sum of 55% and 18%), and only 52% of the original “no” answers remain negative (the sum of 14% and 38%).
This means that empirical results obtained with the 5-point Likert scale with a full description understate respondents' affirmative answers in comparison with the fully binary scale and the 4-point Likert scale.
Translating responses from the fully binary scale to the monopolar 4-point scale with a full description (column 5) shows that respondents carry over positive answers quite effectively: the three positive options on the monopolar 4-point scale (“a little”, “quite” and “very”) cover 96% of the initial “yes” answers. This does not happen with negative answers, however: 53% of them moved to the “a little” zone, and only 38% of respondents chose “not at all”. This means that, at least when assessing brand image, the monopolar 4-point scale causes a significant shift toward positive responses.
When converting from the fully binary scale to the bipolar 6-point scale with a full description, the correspondence of positive answers remains rather high. Of those who chose “yes” in the first case, 91% also chose one of the three positive answers in the second. The correspondence of negative answers is lower: only 62% of respondents who chose “no” on the fully binary scale chose one of the three negative options on the bipolar 6-point scale with a full description. In practice, this means that results from such a scale are shifted slightly toward positive responses when compared with a simple choice between “yes” and “no”.
Finally, the conversion from the fully binary scale to the bipolar 7-point scale with a full description, which has a center point, led to the same results as for the 5-point Likert scale with a full description.
The answer “neither one nor the other” attracted a significant number of responses, reducing the share of positive answers to 76% (17% who chose “a little”, plus 32% who chose “pretty” and 27% who chose “very”) and the share of negative answers to 48%.
Table 2 shows the proportion of positive, negative and overall matches between the first survey and the different scales. In general, these results indicate quite significant deviations in answers depending on the scale used. The study of answer translation also identified some systematic deviations. The affirmative binary scale invites evasion of a response, so when it is used one should always expect lower rates of positive responses than with rating scales.
When moving from the 4-point Likert scale with a full description to the bipolar 6-point scale with a full description, respondents' answers generally match expectations: answers that fell on the extreme points of the scale in the first case were distributed over four zones (two positive and two negative) in the second. For negative answers, the two extreme options contain 74% of all initial “completely disagree” answers, and for positive answers, 84% of all initial “completely agree” answers. The same principle holds for the two central options of the 4-point scale. The only unexpected result was that 29% of those who chose “disagree” on the 4-point scale chose “a little” on the 6-point scale, thus changing a negative rating to a positive one.
Moving from the 5-point Likert scale to the bipolar 7-point scale produced similar results: extreme answers on the 7-point scale cover 92% of the initial “completely disagree” answers and 79% of the initial “disagree” answers. A switch to a positive assessment is also observed here: 16% of the “disagree” answers moved to the “a little” zone in the second case.
In addition, a significant shift was noted for the original answer “partly agree, partly not”.
Transfer of answers from scales with a description of extreme points to scales with a full description
Answers on the 5-point Likert scale and the bipolar 7-point scale with full descriptions were compared with answers on the same scales with descriptions of the extreme points only, and the baseline levels of instability were calculated: 35% for the 5-point Likert scale with a full description, 53% for the bipolar 7-point scale with a full description, 46% for the 5-point Likert scale with a description of extreme points and 52% for the bipolar 7-point scale with a description of extreme points.
This work also analyzes the number of extreme responses. If only the extreme points of the scale are described and this description serves as a reference point for respondents, one can expect more respondents to choose these answer options. This assumption is confirmed in practice: in this study, only 20% of respondents chose the extreme options on the 5-point Likert scale with a full description, compared with 27% on the 5-point Likert scale with a description of extreme points (χ2 = 69, df = 1, p-value < 0.001), and only 19% did so on the bipolar 7-point scale with a full description, compared with 21% on the bipolar 7-point scale with a description of extreme points (χ2 = 7.5, df = 1, p-value = 0.006). The differences are significant for both scales.
Fig. 4 shows the translation results. Overall, the switching rate in the transition from a scale with a full description to a scale with a description of extreme points is 42% for the 5-point scale and 54% for the 7-point scale.
These results show that the level of switching when moving from a 7-point scale is almost identical to the level of switching when respondents answer twice on the same scale (a test for proportions on the two baseline levels of instability and the switching rate indicates that the differences are not statistically significant: χ2 = 1.5, df = 2, p = 0.477).
The analysis of these transitions led to the following conclusions. Approximately a third of the respondents who first answered on the 5-point Likert scale with a description of extreme points and then on the 5-point Likert scale with a full description moved from “fully agree” and “completely disagree” to “agree” and “disagree”, respectively (Fig. 4, left). However, among those respondents who initially chose “agree” or “disagree”, only a few moved to “fully agree” (8%) or “completely disagree” (13%). These results offer practical confirmation of the earlier assumption that a scale with a description of extreme points encourages respondents to choose the extreme answer options.
Fig. 4 (right) shows the transition from the bipolar 7-point scale with a description of extreme points to the bipolar 7-point scale with a full description. The answers follow the same tendency as for the 5-point Likert scale; the only difference is that the level of switching is generally higher, which corresponds to the higher baseline instability of this scale.
Conclusions
The main result of this work is an understanding of the consequences of choosing a particular scale when designing a questionnaire. The findings made it possible to develop strategies for comparing respondents' answers across different scales.
The second significant result is a fuller understanding of which respondent behavior patterns, characteristic of different scales, affect the results of a study and should be taken into account when designing questionnaires. A purely binary scale is characterized by a very low baseline level of instability (14%) compared with formats that offer more response options. This is an important observation that calls into question the validity of multi-category scales for assessing brand image. The level of positive answers decreases in two cases: if the respondent is given “freedom of answer” and if the scale has a center point. In contrast, a monopolar scale increases the number of positive responses. Multi-category 7-point scales (Cox, 1980) are characterized by a high level of instability, and their use can generate useless data instead of detailed information, reducing the overall validity of the assessment compared with a simple binary format.
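The conclusions about positive and negative answers can be checked against the repetition rates in Table 2. A minimal sketch in Python, assuming the table values as printed in this article; the dictionary keys and variable names are hypothetical shorthand, and the "gap" between positive and negative repetition rates is a reading aid introduced here, not a statistic from the study.

```python
# Table 2 as given in the article: (positive %, negative %, all %) repetition rates.
TABLE_2 = {
    "fully binary": (86, 83, 85),
    "affirmative binary": (63, 92, 75),
    "4-point Likert": (89, 76, 84),
    "5-point Likert": (73, 53, 65),
    "monopolar 4-point": (96, 38, 72),
    "bipolar 6-point": (91, 62, 79),
    "bipolar 7-point": (76, 48, 65),
}

# Gap between positive and negative repetition rates, in percentage points.
gap = {scale: pos - neg for scale, (pos, neg, _all) in TABLE_2.items()}

# Scales on which "no" answers are preserved better than "yes" answers.
favours_negative = [scale for scale, g in gap.items() if g < 0]

# Scale with the strongest tilt toward positive answers.
largest_positive_gap = max(gap, key=gap.get)

# Does adding a center point lower the share of preserved positive answers?
MIDPOINT_PAIRS = [
    ("4-point Likert", "5-point Likert"),
    ("bipolar 6-point", "bipolar 7-point"),
]
midpoint_lowers_positive = all(
    TABLE_2[with_mid][0] < TABLE_2[without_mid][0]
    for without_mid, with_mid in MIDPOINT_PAIRS
)

print(favours_negative, largest_positive_gap, midpoint_lowers_positive)
```

On these figures, only the affirmative binary scale preserves negative answers better than positive ones, the monopolar scale shows the strongest tilt toward positive answers, and both scales with a center point preserve fewer positive answers than their counterparts without one.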