How to analyze Likert type dependent variables

By , April 15, 2010 5:15 pm

Suppose your dependent variable (DV) is a Likert scale or something similar; that is, it’s some sort of rating, from 1 to 5, 1 to 7, or some such. Suppose, too, that you want to regress it on several independent variables. What should you do?

p-values and modus tollens

By , April 14, 2010 1:29 pm

Modus tollens in logic
In logic, there is an argument style called modus tollens:

If H0 then R
Not R
therefore
Not H0

This is a valid argument.

Modus tollens misapplied to p-values
Some people misapply this to p-values, saying:

If H0 then probably R
Not R
therefore
Probably not H0

This is not valid.
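One way to see why the probabilistic version fails is to put numbers on it with Bayes' rule. The sketch below is only an illustration: the function name and all of the probabilities are hypothetical, chosen to show that an observation can be improbable under H0 and yet leave H0 very probable.

```python
def posterior_h0(prior_h0, p_obs_given_h0, p_obs_given_alt):
    """Return P(H0 | observation) by Bayes' rule.

    All three inputs are hypothetical probabilities supplied by the user:
    the prior probability of H0, and the probability of the observed
    outcome under H0 and under the alternative.
    """
    numerator = prior_h0 * p_obs_given_h0
    denominator = numerator + (1 - prior_h0) * p_obs_given_alt
    return numerator / denominator


# The observed outcome is unlikely under H0 (0.05), but it is just as
# unlikely under the alternative, and H0 was very plausible to begin with.
result = posterior_h0(prior_h0=0.99, p_obs_given_h0=0.05, p_obs_given_alt=0.05)
print(round(result, 4))
```

With these made-up inputs the posterior probability of H0 stays at about 0.99. Seeing something that is improbable under H0 does not, by itself, make H0 improbable; that also depends on the prior and on how probable the observation is under the alternative.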

My own rules of data analysis

By , April 14, 2010 1:04 pm

The answer you get depends on the question you ask

In many substantive fields, students take one, two, or perhaps three statistics courses during graduate school. These typically cover things such as descriptive statistics, ANOVA, regression, and perhaps a couple of variants of regression such as logistic regression. These are good tools for many purposes, but it’s a very limited toolbox, and that limits the number of questions you can ask. Perhaps the really interesting substantive question is one that you can’t answer with those methods. But if you ask a statistician or data analyst, you may find that the right method to answer your question does exist.

You can’t see something you’re not looking for

The more specific your question, the better you will be able to answer it; but if it’s too specific, you may miss something else.  Researchers need to learn to adapt the focus of their investigations.

If you’re not surprised, you haven’t learned anything (well, not much, anyway)

Isaac Asimov once said “The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka!’ (I found it!) but ‘That’s funny …’”. That is, surprising.  It’s fine to confirm what you already suspected, but the real advances are made when you find things you did not expect.

and

Any analysis worth doing can be done in more than one way

This gets back to the toolbox: which method should I use? But even within a method, there are often options. Should I transform variables? Which covariates should I include? How complex should my model be? What effect sizes should I report?

Often, these and other related questions do not have simple answers, but rather a range of reasonable choices.

PROC LOGISTIC: Complete and quasi-complete separation

By , April 13, 2010 11:26 am

Description of separation in PROC LOGISTIC

If you picture the data as a 2 x 2 crosstab, then quasi-complete separation occurs when one of the cells is 0.  Complete separation occurs when one cell in each row and column is 0.
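The 2 x 2 description above can be sketched as a small check. This is a hypothetical helper, not anything PROC LOGISTIC provides; it assumes one binary predictor (rows) crossed with a binary outcome (columns), and that every row and column contains at least one observation.

```python
def classify_separation(table):
    """Classify a 2 x 2 crosstab of counts, given as [[a, b], [c, d]].

    Rows are the two predictor levels, columns the two outcome levels.
    Returns "complete", "quasi-complete", or "none".
    """
    (a, b), (c, d) = table
    # Assumption: no empty row or column, otherwise the table is degenerate.
    if 0 in (a + b, c + d, a + c, b + d):
        raise ValueError("every row and column needs at least one observation")
    zero_cells = [a, b, c, d].count(0)
    if zero_cells == 2:
        # With non-empty margins, two zeros must sit on a diagonal:
        # one empty cell in each row and each column.
        return "complete"
    if zero_cells == 1:
        return "quasi-complete"
    return "none"


print(classify_separation([[10, 0], [0, 12]]))
print(classify_separation([[10, 0], [3, 12]]))
print(classify_separation([[5, 4], [3, 12]]))
```

The three calls show, in order, a table with an empty cell in each row and column, a table with a single empty cell, and a table with no empty cells.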


PROC LOGISTIC: Reference coding and effect coding

By , April 13, 2010 10:09 am

Description of the problem with effect coding
When you have a categorical independent variable with more than 2 levels, you need to define it with a CLASS statement. In PROC GLM the default coding for this is dummy coding. In PROC LOGISTIC, it’s effect coding. To me, effect coding is quite unnatural.
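To make the difference concrete, here is a rough sketch of the design columns each scheme produces for a hypothetical three-level variable with levels A, B, and C, using C as the reference level. The function names are mine, written in Python for illustration only; they are not SAS syntax. (In SAS itself, the PARAM=REF option on the CLASS statement requests reference coding.)

```python
def reference_coding(levels, ref):
    """Map each level to 0/1 indicator columns; the reference level is all zeros."""
    others = [lvl for lvl in levels if lvl != ref]
    return {lvl: [1 if lvl == o else 0 for o in others] for lvl in levels}


def effect_coding(levels, ref):
    """Same columns as reference coding, except the reference level is all -1s."""
    others = [lvl for lvl in levels if lvl != ref]
    coding = {}
    for lvl in levels:
        if lvl == ref:
            coding[lvl] = [-1] * len(others)
        else:
            coding[lvl] = [1 if lvl == o else 0 for o in others]
    return coding


levels = ["A", "B", "C"]  # hypothetical levels, with "C" as the reference
for name, scheme in (("reference", reference_coding), ("effect", effect_coding)):
    print(name)
    for lvl, row in scheme(levels, "C").items():
        print(" ", lvl, row)
```

Under reference coding, each coefficient compares a level with the reference level. Under effect coding, each coefficient compares a level with the average across levels, which is why the same model can show different-looking estimates under the two schemes.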
