The title of this post is a quote from Herman Friedman, my favorite professor in graduate school. Herman was one of my most important professors; he not only taught the subject matter, he taught how to think about statistics. What did he mean when he said:
If you’re not surprised, you haven’t learned anything
When we have a set of data to explore, whether we have strong hypotheses or not, whatever our research questions are, whatever our analytic methods are, one way of categorizing the results is on what could be called a “surprisingness” scale. At the low end are results that simply confirm what was found by others. These results are not unimportant; replicating research is vital. But one reason replicating research is vital is that you might not replicate the results. If we knew that any result we saw reported was sure to be found in our data, we wouldn’t need to replicate it.
At higher levels of surprisingness, more interesting things happen. As Isaac Asimov said:
The most exciting phrase to hear in science, the one that heralds new discoveries, is not “Eureka” but “That’s funny…”.
What does this mean for the practice of statistics? All too often, researchers choose methods that limit their capacity to be surprised.
- They remove outliers automatically (outliers are, by definition, surprising!).
- If results are surprising, they look for ways to remove the surprise rather than explore it, and if results are not surprising, they rush to publish.
- They choose only one method of analysis. Certainly regression (for example) is often a good method. But what else could we look at?
- They rush to inferential statistics before fully exploring their data.
- They ignore the surprisingness of nonsignificance. This is part of the misuse of p values. “Statistical significance” doesn’t mean importance, and lack of statistical significance doesn’t mean lack of importance. For example, if you found some group of humans where the men were not significantly taller than the women, that would be very important and very surprising.
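As a small illustration of the first point in the list above, here is a minimal sketch of flagging outliers for a closer look instead of deleting them automatically. The function name `flag_outliers`, the choice of Tukey's 1.5 × IQR fences, and the height values are all assumptions made for illustration; the data are made up, and other outlier rules would serve equally well.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Pair each value with a flag saying whether it lies outside Tukey's fences.

    Nothing is removed: the point is to mark surprising values so they
    can be examined, not to discard them.
    """
    # Quartiles from the standard library (default "exclusive" method).
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(v, v < low or v > high) for v in values]


# Made-up illustrative data, not taken from any study.
heights_cm = [158, 162, 165, 167, 168, 170, 171, 173, 175, 210]

flagged = flag_outliers(heights_cm)
surprises = [v for v, is_outlier in flagged if is_outlier]
print("Values worth a closer look:", surprises)
```

The full data set stays intact, so any later analysis can be run both with and without the flagged values, and the difference between the two becomes something to explore rather than something hidden.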
Look at your data in multiple ways. Open your mind and be surprised. To close with one more quote, this one from Yogi Berra:
You can see a lot by looking