The perils of categorizing continuous variables
Researchers often categorize continuous variables. For example, the birth weight of human infants is often categorized as “low birth weight” vs. “normal”; sometimes the categories are “very low birth weight”, “low birth weight” and “normal”. The cutoff for low birth weight is usually 2.5 kg. IQ test scores are categorized with labels such as “gifted”. Depression test scores are categorized. And so on. This rarely makes sense, either statistically or substantively.
From a statistical point of view, categorizing a continuous (or nearly continuous) variable throws away information, leading (nearly always) to less power. It also assumes that we have placed every cutoff in exactly the right spot.
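The loss of power can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: a standard-normal predictor, a linear effect with a hypothetical slope of 0.3, unit-variance noise, and a Fisher z test of zero correlation. The function names and default values are made up for this example and do not come from any real study.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n, z_crit=1.96):
    """Two-sided test of zero correlation using the Fisher z approximation."""
    return abs(math.atanh(r)) * math.sqrt(n - 3) > z_crit


def estimate_power(n=100, slope=0.3, cutoff=0.0, n_sims=1000, seed=1):
    """Return (power with continuous x, power with x split at cutoff)."""
    rng = random.Random(seed)
    hits_continuous = 0
    hits_categorized = 0
    for _ in range(n_sims):
        # Assumed data-generating model: y depends linearly on x plus noise.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [slope * xi + rng.gauss(0.0, 1.0) for xi in x]
        # Categorize the same predictor into two groups at the cutoff.
        x_cat = [1.0 if xi > cutoff else 0.0 for xi in x]
        if is_significant(pearson_r(x, y), n):
            hits_continuous += 1
        # Skip the test if every observation fell on one side of the cutoff.
        if len(set(x_cat)) == 2 and is_significant(pearson_r(x_cat, y), n):
            hits_categorized += 1
    return hits_continuous / n_sims, hits_categorized / n_sims
```

Calling `estimate_power()` returns two proportions; under these assumptions the second (categorized) should come out noticeably lower than the first, and moving `cutoff` away from 0 should widen the gap.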
Substantively, it invokes “magical thinking” – that is, the belief that something huge happens at the cutoff. With the birth weight example, a baby of 2.49 kg is treated as being very different from one of 2.51 kg, but very like one of 1.51 kg (unless there is a third category of “very low birth weight”). There’s no biological reason to think this happens. Substantive experts sometimes argue for cutoffs because they aid diagnosis and treatment, but this is only true if diagnosis is based on only one symptom and if treatment is dichotomous. This may sometimes be the case, but it is rare. Neonates are not diagnosed based solely on weight (there is also Apgar score, length of pregnancy and other factors) and treatment is not dichotomous. It is true that a baby will either be in intensive care or not. But even if a baby is not in the NICU, hospital staff can be informed that the baby is at some risk. Babies can stay in the hospital for longer or shorter times. New parents can be given varied advice, and so on.
Another example is depression. We can diagnose depression based on psychological tests such as the Beck Depression Inventory. But categorizing people still doesn’t make much sense. Depression varies along a continuum. Treatment also varies. Neither psychotherapy nor drug treatment is dichotomous. Therapy can be more or less frequent. Dosages of drugs can be higher or lower. Nor should we diagnose depression based solely on the scores on one test – not when other symptoms are available.
So, does categorization ever make sense? Yes. There may be some situations where treatment really is dichotomous (although it is surprisingly difficult to think of them). In addition, sometimes there really is a big gap. For example, if looking at consumption of alcohol among teens and young adults, the age at which drinking becomes legal would be key. Even here, though, there may be better models, such as spline regression.
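As a rough sketch of the spline idea, a linear spline with a single knot lets the slope change at the legal age without forcing a jump or discarding the variation within each age group. The code below fits y = b0 + b1*x + b2*(x - knot)+ by ordinary least squares using only the standard library. The knot value of 21.0 in the usage note is a hypothetical choice, the helper names are made up for this example, and a real analysis would more likely use a statistics package.

```python
def hinge(x, knot):
    """Truncated linear basis term (x - knot)+ used by a linear spline."""
    return max(0.0, x - knot)


def solve_linear_system(a, b):
    """Solve a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def fit_linear_spline(xs, ys, knot):
    """Least-squares fit of y = b0 + b1*x + b2*(x - knot)+; returns [b0, b1, b2]."""
    rows = [[1.0, x, hinge(x, knot)] for x in xs]
    k = 3
    # Normal equations: (X'X) coef = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef, x, knot):
    """Fitted value at age x for coefficients from fit_linear_spline."""
    return coef[0] + coef[1] * x + coef[2] * hinge(x, knot)
```

With ages in `xs` and consumption in `ys`, `fit_linear_spline(xs, ys, knot=21.0)` returns the intercept, the slope below the knot, and the change in slope above it. Age stays continuous throughout; only the slope is allowed to differ on either side of the legal age.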
I totally agree with your conclusion. I wrote a similar blog post, ‘9. Dichotomization as the Devil’s Tool’, at http://www.AllenFleishmanBiostatistics. In it, I pointed out that at BEST, one would need to increase N by at least 60% to compensate for the loss of information. Under more realistic assumptions (i.e., dichotomization somewhere other than the median) the N would need to increase by a factor of 4.
Another consideration is that analyses of dichotomized data assume a very large N. Finally, many common design elements (covariates, strata, interactions) are not easily tested by analyses of dichotomous data.
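The sample-size figures in the comment above can be sanity-checked with a short calculation. The sketch below assumes a standard-normal predictor with a linear, fairly weak effect, in which case the squared correlation between the predictor and its two-group version is φ(c)² / (p(1 − p)), where c is the cutoff in standard-deviation units, φ is the normal density, and p is the proportion below the cutoff. The function name is made up for this example, and the result is an approximation under those assumptions, not necessarily the calculation used in the blog post mentioned.

```python
import math
from statistics import NormalDist


def required_n_multiplier(cutoff_z):
    """Approximate factor by which N must grow after splitting at cutoff_z.

    Assumes a standard-normal predictor with a linear effect; cutoff_z is
    the cutoff expressed in standard deviations from the mean.
    """
    std_normal = NormalDist()
    p_below = std_normal.cdf(cutoff_z)
    density = std_normal.pdf(cutoff_z)
    # Squared correlation between X and the indicator 1[X > cutoff].
    efficiency = density ** 2 / (p_below * (1.0 - p_below))
    return 1.0 / efficiency


for cutoff_z in (0.0, 0.5, 1.0, 1.5):
    print(cutoff_z, round(required_n_multiplier(cutoff_z), 2))
```

At the median (`cutoff_z = 0.0`) the multiplier equals π/2, an increase of roughly 57%, which is close to the figure quoted above. The multiplier grows as the cutoff moves into the tail and approaches 4 at about 1.5 standard deviations from the mean.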
Thanks Allen. I will check out your blog.