How to analyze Likert type dependent variables

By , April 15, 2010 5:15 pm

Suppose your dependent variable (DV) is a Likert scale or something similar. That is, it’s some sort of rating, from 1 to 5 or 1 to 7 or some such. And suppose you want to regress that on several independent variables. What should you do?

There are three broad categories of regression models that might be applicable. A lot of people routinely use linear regression (often simply called regression). Others routinely say this is incorrect, and that you should use ordinal logistic regression. And yet others will do things such as multinomial logistic regression, or collapsing the DV into two categories, and then doing binary logistic. Which is right?

The short answer to this is to quote Sir David Cox:

“There are no routine statistical questions, only questionable statistical routines.”

Let’s get more specific. Suppose you are a doctor studying back pain, and suppose your DV is response to a scale:

How much pain are you in on a typical day?
1 – None
2 – Barely noticeable
3 – Moderate
4 – Severe
5 – Excruciating

and your independent variables are things like age, sex, injury status, time since injury and so on.

If one is strict about it, linear regression requires a continuous DV – and we do not have one, at least as we’ve measured it, although it could be argued that there is a latent underlying variable here that is continuous. But you’d be hard pressed to prove that the difference between “none” and “barely noticeable” is the same as that between (say) “moderate” and “severe”. Technically, if you follow Stevens’s categories of nominal, ordinal, interval, ratio, your DV is ordinal, and should be analyzed with some form of ordinal logistic regression.

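But the most common type (by far) of ordinal logistic regression is the proportional odds model, which assumes proportional odds: each independent variable has the same effect on the odds at every cut point of the scale (“none” vs. anything worse, “barely noticeable” or less vs. anything worse, and so on). That assumption might be violated, in which case you might want to use multinomial logistic.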
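To make the assumption behind the most common ordinal logistic model (usually called proportional odds) concrete, here is a minimal Python sketch. The cut points and the coefficient are made-up numbers, not estimates from any data, and the parameterization (log odds of a rating at or below category j equals cut point j minus beta times x) is one common convention; some software flips the sign. The sketch builds the five category probabilities for two values of a single predictor and shows that the shift in cumulative log odds is the same at every cut point.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(x, beta, cutpoints):
    """P(Y = j | x) for j = 1..K, assuming logit P(Y <= j | x) = cutpoint_j - beta * x."""
    cum = [logistic(c - beta * x) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

def cumulative_log_odds(probs):
    """Log odds of Y <= j at each of the K-1 cut points, recovered from category probabilities."""
    out, running = [], 0.0
    for p in probs[:-1]:
        running += p
        out.append(math.log(running / (1.0 - running)))
    return out

# Hypothetical values for illustration only: 4 cut points for the 5 pain categories.
CUTPOINTS = [-1.0, 0.5, 2.0, 3.5]
BETA = 0.8

probs_x0 = category_probs(0.0, BETA, CUTPOINTS)
probs_x1 = category_probs(1.0, BETA, CUTPOINTS)

# Under proportional odds, moving x from 0 to 1 shifts every cumulative log odds by -BETA.
shifts = [b - a for a, b in zip(cumulative_log_odds(probs_x0),
                                cumulative_log_odds(probs_x1))]
print([round(p, 3) for p in probs_x0])
print([round(p, 3) for p in probs_x1])
print(shifts)
```

If the shifts differed noticeably from one cut point to the next in real data, that would be the kind of violation that pushes you toward a less restrictive model.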

Since those are relatively unusual methods, some people just collapse the categories into (say) “severe” or “excruciating” vs. anything less than that.

Which is right?

The great advantages of linear regression are its ease of interpretation and its familiarity. But it might be wrong.
Ordinal logistic is more likely to be correct, but is less known and harder to understand.
Multinomial logistic is even harder to understand, and is a very complex model, with many parameters to estimate.
Collapsing the variable will only very rarely be correct. It throws away information, and that’s rarely a good thing to do.

So, here’s what I recommend:
Do ordinal logistic regression and test the assumptions. Then, if the assumptions are met, also do linear regression and compare the results by making a scatterplot of one set of predicted values vs. the other. If they are very similar (YOU decide. Statistical analysis requires thought and judgment) then go with linear regression. If the assumptions are NOT met, then also do multinomial logistic regression, and compare those two sets of results, opting for the simpler ordinal model if results are very similar.
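The comparison step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ordinal model's output is taken to be a set of category probabilities per person, which I turn into a single predicted value by taking the expected rating (one reasonable choice, not the only one), and the numbers for the three patients are invented. The helper names are hypothetical. The correlation and the largest gap are numerical companions to the scatterplot, not a replacement for looking at it.

```python
import math

CATEGORIES = [1, 2, 3, 4, 5]  # the five pain ratings

def expected_rating(probs, categories=CATEGORIES):
    """Collapse predicted category probabilities into one predicted value."""
    return sum(c * p for c, p in zip(categories, probs))

def pearson_r(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def compare_predictions(ordinal_probs, linear_preds):
    """Pair up the two sets of predictions and summarize how far apart they are."""
    ordinal_preds = [expected_rating(p) for p in ordinal_probs]
    gaps = [abs(o - l) for o, l in zip(ordinal_preds, linear_preds)]
    return {
        "pairs": list(zip(ordinal_preds, linear_preds)),  # the points for the scatterplot
        "correlation": pearson_r(ordinal_preds, linear_preds),
        "max_abs_diff": max(gaps),
    }

# Invented predictions for three patients, for illustration only.
ordinal_probs = [
    [0.50, 0.30, 0.15, 0.04, 0.01],
    [0.10, 0.25, 0.40, 0.20, 0.05],
    [0.02, 0.08, 0.25, 0.40, 0.25],
]
linear_preds = [1.9, 2.8, 3.7]

summary = compare_predictions(ordinal_probs, linear_preds)
print(summary)
```

How close is close enough is still your call; the summary just puts numbers next to the picture.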

71 Responses to “How to analyze Likert type dependent variables”

  1. Peter Flom says:

    Hi Naveen, if you want to run *separate* regressions then it’s not a big problem, it’s just 5 ordinal logistic models. If you want to run a multivariate analysis (that is, more than one DV simultaneously) then it is complex.

  2. Naveen says:

    Thanks a lot, Peter

  3. Naveen says:

    Hi Peter
    please suggest, can we use one way ANOVA even if assumption of homogeneity of variance is not satisfied?

  4. Naveen says:

    can problem of unequal variances be rectified for a given data set?

  5. Peter Flom says:

    Naveen – no, if the assumptions are violated you shouldn’t do ANOVA. How to fix it is a big topic. Transformations can sometimes help.

  6. Naveen says:

    In case assumption of homogeneous variances is not met , can we use welch test in place of ANOVA?

  7. Naveen says:

    Does log transformation of variables change their original relationship?

  8. Peter Flom says:

    Yes, taking logs does change the relationship, otherwise it wouldn’t work. You could also use nonparametric regression.

  9. Naveen says:

    Thanks Peter

  10. Naveen says:

    Peter what is nomological validity. Kindly suggest some articles/readings on this.

  11. Peter Flom says:

    I have no idea. Never heard of that in 15 years.

  12. Luis says:

    Hi Peter,

    My dependent variable is an indicator calculated as the simple average of 3 sub-indicators that use a 5 point Likert scale.

    The sub-indicators are categorical and ordered, but the indicator (dependent variable) obviously it is not because it is just an average of the 3 sub-indicators. The indicator (dependent variable) behaves like a continuous variable.

    What would you recommend in this case?

    Thanks a lot for your kind advice

    Best regards,

    Luis

  13. Peter Flom says:

    Hi Luis

    If you average 3 5-point scales you can get any of 13 different values (from 3/3 to 15/3). You could try ordinary (OLS) regression and see how the residuals look. But the results might clump so that you get a lot fewer than 13 actual values.

  14. Naveen says:

    Hi Peter
    Greetings of the day
    Kindly suggest any basic book on research methodology and SPSS.
    Regards
    Naveen

  15. Peter Flom says:

    Hi Naveen I don’t know SPSS and it’s been a long time since I studied research methods, so I don’t know recent books, sorry

  16. Naveen says:

    Kindly comment

    If while checking (using SPSS) normality of data (likert scale), two procedures ,

    (1) Analyze – Desc – Explore – Normality of data plot

    (2) Analyze – non para -legacy -1 sample k s test

    are giving contradictory results, which one to rely on ? Which is considered more robust to check normality?

    ( procedure 1 is showing non normality whereas pro. 2 shows data to be normal).

    Regards

    Naveen Dua

  17. Peter Flom says:

    Naveen – I don’t know SPSS, so I don’t know what the first run is doing.

  18. Naveen says:

    Kindly guide
    If we have a dependent variable, absenteeism rate of employees in organisation( factual information collected on 5 point Likert scale) which is not normally distributed and
    an independent variable (categorical variable say with 3 categories) and
    we intend to find out difference in means of categories based on independent variable w. r. t. absenteeism rate,
    is it appropriate to use kruskal wallis test for the purpose?

    Regards

  19. Naveen says:

    what is post hoc test for Kruskal wallis test?

  20. Peter Flom says:

    Hi Naveen

    Given that you have so many questions, you might try posting on CrossValidated. Or, if you’d like to hire me, we can discuss that.

    Thanks

    Peter

  21. Naveen says:

    what is cross validated?
