Regression to the mean is a well-known statistical artifact that affects any two measurements that are correlated, but not perfectly. It was first noticed by Sir Francis Galton in the late 19th century. He noted that the tallest fathers tend to have sons who are not as tall, and, similarly, that the shortest fathers tend to have sons who are not as short. But this is not because of any general tendency toward mediocrity: indeed, the range of people’s heights shows no sign of diminishing. How can this be?

First, let’s give an example, using the same type of measure that Galton was talking about: heights of people in two generations. Let’s make up some data. We know that average height is increasing over time, and we know that the heights of fathers and sons (or parents and children, generally) are correlated, but not perfectly. That is, taller fathers tend to have taller sons, but sons are not exactly the same height as their fathers, nor are they a constant amount taller than their fathers. Let’s model this.
In R we can simulate heights with something like this:
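# 100 fathers' heights: normal with mean 68 and SD 2.5
fht = rnorm(100, 68, 2.5)
# each son's height: his father's height plus normal noise with mean 1 and SD 2
sonht = fht + rnorm(100, 1, 2)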
With this, the average height of fathers is
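To see the effect in data simulated this way, here is a minimal sketch, not a definitive analysis. It makes several assumptions that are not in the text above: a much larger sample (100000 pairs rather than 100) so the averages are stable, an arbitrary seed, a cutoff of the tallest 10% of fathers, and a comparison in standard-deviation units within each generation. The standardizing matters because, in this simulation, sons are about 1 unit taller than their fathers on average in raw terms. The names n, z_f, z_son, and tall are made up for illustration.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
n = 100000                       # assumption: far more pairs than the 100 used above

fht   = rnorm(n, 68, 2.5)        # fathers: mean 68, SD 2.5
sonht = fht + rnorm(n, 1, 2)     # sons: father's height plus noise (mean 1, SD 2)

# Standardize each generation against its own mean and SD
z_f   = (fht - mean(fht)) / sd(fht)
z_son = (sonht - mean(sonht)) / sd(sonht)

# Pick out the tallest 10% of fathers (the cutoff is an illustrative choice)
tall = fht >= quantile(fht, 0.9)

r              = cor(fht, sonht)    # correlation between the two generations
mean_z_fathers = mean(z_f[tall])    # how far above average the tall fathers are
mean_z_sons    = mean(z_son[tall])  # how far above average their sons are

cat("correlation:", round(r, 2), "\n")
cat("tall fathers, mean z:", round(mean_z_fathers, 2), "\n")
cat("their sons, mean z:  ", round(mean_z_sons, 2), "\n")
```

With these parameters the theoretical correlation is 2.5 / sqrt(2.5^2 + 2^2), or about 0.78, so the sons of the tallest fathers should sit above their own generation's average, but by fewer standard deviations than their fathers do.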