Dichotomous Logistic Regression
Why Logistic Regression is Needed
One might try to use OLS regression with dichotomous DVs. There are several reasons why this is a bad idea:
1. The residuals cannot be normally distributed (as the OLS model assumes), since they can take on only two values for each combination of levels of the IVs.
2. The OLS model makes nonsensical predictions, since the DV is not continuous; e.g., it may predict that someone does something more than ‘all the time’.
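Both problems are easy to see on a small made-up example. The sketch below is only an illustration: the data are hypothetical (one IV, x, and a DV, y, coded 0/1), and the straight-line fit uses NumPy's polyfit as a stand-in for whatever OLS routine you would normally use.

```python
import numpy as np

# Hypothetical toy data: one IV (x) and a dichotomous DV (y) coded 0/1.
x = np.arange(0, 11, dtype=float)
y = (x >= 5).astype(float)

# Ordinary least squares fit of y on x (a straight line).
slope, intercept = np.polyfit(x, y, 1)
predicted = intercept + slope * x

# At any given x the residual can only be (0 - predicted) or (1 - predicted),
# so the residuals cannot follow a normal distribution.
residuals = y - predicted

print("lowest prediction: ", round(float(predicted.min()), 3))
print("highest prediction:", round(float(predicted.max()), 3))
```

With these particular numbers the fitted line dips below 0 at the low end of x and rises above 1 at the high end, which is the "more than all the time" problem in miniature.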
A Very Quick Introduction to Logistic Regression
Logistic regression deals with these issues by transforming the DV. Rather than using the categorical responses directly, it models the log of the odds of being in a particular category for each combination of values of the IVs. The odds are the same as in gambling, e.g., 3:1 indicates that the event is three times as likely to occur as not. To consider the effect of the IVs, we take the ratio of the odds at different values of an IV. We then work with the log of that ratio so that the final number goes from negative infinity to infinity, so that 0 indicates no effect, and so that the result is symmetric around 0, rather than 1. The log of the odds is known as the logit, and the coefficients of the model are logs of odds ratios.
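The arithmetic behind odds and their logs can be sketched in a few lines. This is only an illustration using the Python standard library; the function names (odds, logit, inverse_logit) are chosen here for clarity and do not come from any particular statistics package.

```python
import math

def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1 - p)

def logit(p):
    """Log of the odds; ranges from negative infinity to infinity."""
    return math.log(odds(p))

def inverse_logit(z):
    """Map a log-odds value back to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))

print(odds(0.75))                 # 3.0, i.e., odds of 3:1
print(logit(0.5))                 # 0.0, since even odds are 1:1
print(logit(0.75), logit(0.25))   # same size, opposite sign

# Comparing two groups: equal odds give a ratio of 1 and a log ratio of 0.
odds_ratio = odds(0.75) / odds(0.75)
print(math.log(odds_ratio))       # 0.0, i.e., no effect
```

Note that odds of 3:1 and 1:3 sit at very different distances from 1 (3 versus about 0.33), but their logs are equally far from 0. That is the symmetry the log transformation buys.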