**The problem of coding 0 and 1 in PROC LOGISTIC**

PROC LOGISTIC can be used to run logistic regression on a dichotomous dependent variable. Often, these variables are coded 0 and 1, with 0 for 'no' or the equivalent and 1 for 'yes' or the equivalent. In this case, we are usually interested in modeling the probability of a 'yes'. However, by default, SAS models the probability of the lower-ordered value, 0 (which would be a 'no').

For example, we might be interested in modeling the presence of a disease, with 0 meaning the person is not infected and 1 meaning he or she is infected. To keep it simple, I will use one independent variable: sex, coded as 1 for female and 0 for male. The weight variable holds the number of people in each combination of disease and sex. So:

    data today;
      input disease female weight;
      datalines;
    0 0 100
    1 0 200
    0 1 200
    1 1 100
    ;

We then run PROC LOGISTIC:

    proc logistic data = today;
      model disease = female;
      weight weight;
    run;

and get, among other output, a parameter estimate of 1.39 for female (an odds ratio of about 4), even though the data make it clear that men are much more likely to be infected.
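
To see where that number comes from, the weighted counts can be worked through by hand. This is a sketch using only the four rows of the data step above, treating `weight` as the number of people in each cell:

$$
\begin{aligned}
\text{odds}(\text{disease}=1 \mid \text{male}) &= \frac{200}{100} = 2 \\
\text{odds}(\text{disease}=1 \mid \text{female}) &= \frac{100}{200} = 0.5 \\
\text{OR}_{\text{female}}(\text{disease}=1) &= \frac{0.5}{2} = 0.25, \qquad \ln 0.25 \approx -1.39 \\
\text{OR}_{\text{female}}(\text{disease}=0) &= \frac{2}{0.5} = 4, \qquad \ln 4 \approx 1.39
\end{aligned}
$$

So a coefficient of 1.39 for female is what the data give when the modeled event is disease=0. Modeling disease=1 instead flips the sign of the coefficient and inverts the odds ratio.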

**Evidence of a 0-1 coding problem in PROC LOGISTIC**

The evidence that this is happening is one line in the output:

    Probability modeled is disease=0

and several lines in the log:

    NOTE: PROC LOGISTIC is modeling the probability that disease=0.
    One way to change this to model the probability that disease=1
    is to specify the response variable option EVENT='1'

**Solving 0-1 coding problems in PROC LOGISTIC**

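There are several solutions. The simplest is not the one mentioned in the log, but rather the DESCENDING option on the PROC LOGISTIC statement, which reverses the order of the response levels so that 1 is the level being modeled.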

    proc logistic data = today descending;
      model disease = female;
      weight weight;
    run;

Another method is the one mentioned in the log, which is more general:

    proc logistic data = today;
      model disease(event = '1') = female;
      weight weight;
    run;
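
Whichever option is used, it can be worth checking the direction of the effect against the raw counts. The following is a minimal sketch, assuming the `today` data set from the data step above is still available; the PROC FREQ step is an addition for illustration and not part of the original example.

```sas
/* Cross-tabulate sex by disease status, weighting each row by its count. */
/* RELRISK requests the odds ratio and relative risks for the 2x2 table.  */
proc freq data = today;
  tables female*disease / relrisk;
  weight weight;
run;
```

In the resulting table, 200 of the 300 men and 100 of the 300 women are infected, so the estimate for female from a model of disease=1 should point toward lower odds of disease for women.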
