Note: This is a brief introduction to observational designs. For more on this type of study, see two books by Paul Rosenbaum: <em>Observational Studies</em> and, for a less mathematical approach, <em>Design of Observational Studies</em>.
In statistics and research design, there are two types of study: experiments and observational studies. (Some people also use the term “quasi-experiment,” but I do not like it.) In an experiment, the key thing is <em>randomization</em>: we assign subjects (e.g. people) to different conditions (e.g. drug and placebo) at random. Often, though, such assignment is not possible or not ethical. In the social sciences, it is rarely possible. We cannot, for example, randomly assign people to different levels of education; we can only observe relationships between (say) education and political party.
When we present social science research, we often get the “correlation is not causation” reply. Indeed, the two are not equivalent. More precisely, “correlation does not imply causation” – that is, two things can be related without one causing the other. Sometimes, though, this is taken too far. While correlation does not <em>imply</em> causation, it is <em>evidence</em> of causation. And we can strengthen that evidence. There are at least three ways to do so:
1) Control for other variables that might be relevant.
2) Show a large effect size, even after those controls.
3) Show the mechanism that makes the relationship work.
Let’s take each in turn:
1) Control for other variables. This means taking them into account. There are various ways to do this in statistics, but most commonly we add them to some form of regression equation. A negative example: the more firefighters show up at a fire, the more damage is done. This does NOT mean that firefighters cause damage – and if we control for the size of the fire, the relationship inverts. A positive example: smoking is related to lung cancer (and many other ills). That relationship persists even if we control for age, ethnicity, sex, weight, diet, exercise, and many other variables. This increases the evidence that smoking causes cancer.
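The firefighter example can be sketched numerically. The data below are simulated for illustration (none of the numbers come from a real fire department); the point is only that the raw association between firefighters and damage is positive, while the regression coefficient on firefighters turns negative once fire size is held fixed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Simulated data (made up for illustration): bigger fires draw more
# firefighters AND cause more damage; firefighters themselves
# slightly reduce damage.
size = rng.uniform(0, 10, n)                        # size of the fire
firefighters = 2 * size + rng.normal(0, 1, n)
damage = 5 * size - 1 * firefighters + rng.normal(0, 1, n)

# Raw correlation: firefighters and damage move together.
raw_corr = np.corrcoef(firefighters, damage)[0, 1]

# Controlling for fire size via multiple regression:
# damage ~ intercept + size + firefighters
X = np.column_stack([np.ones(n), size, firefighters])
coef, *_ = np.linalg.lstsq(X, damage, rcond=None)

print(f"raw correlation:         {raw_corr:.2f}")  # strongly positive
print(f"firefighter coefficient: {coef[2]:.2f}")   # negative: inverted
```

The inversion happens because fire size is a common cause of both variables; once it enters the regression, the leftover association reflects only the (negative) direct effect built into the simulation.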
2) Show a large effect size, even after controlling for other variables. Large effect sizes are hard to explain by other causes. That doesn’t make other explanations impossible; it just means that any variable we haven’t controlled for would need a strong relationship to both of the variables we are studying. In the case of smoking, this was eminently true: smokers have much higher rates of lung cancer, even after accounting for all those other factors. Could something else account for this relationship? Yes. But it is hard to imagine what it could be.
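This intuition has a classic formalization, Cornfield's inequality (which I am bringing in here; it is not part of the text above): for an unmeasured confounder to fully explain an observed relative risk R, the confounder must itself be at least R times more common among the exposed. A rough worked example with made-up counts:

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Ratio of the outcome rate in the exposed group to the rate in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

# Made-up illustrative counts: 90 cancer cases among 1,000 smokers,
# 9 cases among 1,000 non-smokers.
rr = relative_risk(90, 1000, 9, 1000)
print(f"observed relative risk: {rr:.1f}")  # 10.0

# Cornfield's inequality: a hidden confounder that fully accounts for
# this association would have to be at least rr times more prevalent
# among smokers than non-smokers.
print(f"required confounder prevalence ratio: at least {rr:.1f}")
```

With an effect that large, any candidate confounder has to clear a very high bar, which is why large effect sizes strengthen the causal case.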
3) Show the mechanism that makes the relationship work. Relationships that are unexplained are more suspect than those that are explained. Of course, our explanation could be wrong, but a good explanation is an additional bit of evidence that the relationship is real. For smoking and lung cancer, the mechanism is clear: tobacco smoke contains known carcinogens that reach lung tissue directly.