What is the mean?
The average, or mean, or, more formally, the arithmetic mean, is one of the simplest statistics there is. You have a bunch of numbers, you add them up, divide by how many there are, and … that’s it! How could you go wrong with the mean? Well … it’s surprisingly easy to do so.
First, a good example, to set the stage. If you weigh 5 people (Peter, John, Mary, Ed and Sally) and they weigh 180, 190, 130, 186 and 100 pounds, you can just add those up (786), divide by 5, and you’ve got the average weight for those 5 people: 157.2 pounds. That’s the mean.
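If you like to check arithmetic by machine, here is a minimal Python sketch of the same calculation. The variable names are my own, chosen for illustration; `statistics.mean` is part of the Python standard library.

```python
from statistics import mean

# Weights in pounds for the five people in the example.
weights_lb = {"Peter": 180, "John": 190, "Mary": 130, "Ed": 186, "Sally": 100}

# Add them up and divide by how many there are.
average_weight = mean(list(weights_lb.values()))
print(average_weight)  # 157.2
```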
Now, what can go wrong?
Averaging rates or proportions
Naively averaging rates is a bad idea. Suppose I drive to work at a constant speed of 60 miles per hour and drive home at 40 miles per hour, over the same route. What’s my average speed? EASY! The mean of the two numbers is (60+40)/2 = 50. So, my average speed is 50, right? Wrong. Let’s say it’s 60 miles to work. Then the trip to work takes me 1 hour, the trip home takes me 1.5 hours, and the total time is 2.5 hours to drive 120 miles. 120/2.5 = 48, not 50. The slower leg takes longer, so it counts for more of the trip.
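The commute can be sketched in a few lines of Python. The function `average_speed` is a hypothetical helper of mine, not a library routine: it divides total distance by total time. When the legs are the same length, as here, that works out to the harmonic mean of the speeds, which the standard library provides as `statistics.harmonic_mean`.

```python
from statistics import harmonic_mean

def average_speed(legs):
    """Average speed over a trip given as (distance_miles, speed_mph) legs."""
    total_distance = sum(distance for distance, _ in legs)
    total_time = sum(distance / speed for distance, speed in legs)
    return total_distance / total_time

# 60 miles to work at 60 mph, 60 miles home at 40 mph.
commute = [(60, 60), (60, 40)]

naive = (60 + 40) / 2            # the tempting answer
correct = average_speed(commute)  # total distance / total time

print(naive)    # 50.0
print(correct)  # 48.0
# With equal distances, the harmonic mean of the speeds agrees (about 48).
print(harmonic_mean([60, 40]))
```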
If a baseball player bats .200 for the first half of the season, and .400 for the second half, what is his average for the whole season? It must be .300, right? WRONG. In fact, there is not enough information, because it depends on how many at bats he had in each half. Maybe in the first half he comes to bat 100 times and gets 20 hits; in the second half, he comes to bat 500 times, and gets 200 hits. Then, for the season, he has 600 at bats and 220 hits, and his average is .367.
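Here is the same batting example as a short Python sketch. The helper `season_average` is a name I made up for illustration; the idea is simply to pool the hits and the at bats before dividing, rather than averaging the two averages.

```python
def season_average(splits):
    """Batting average from (hits, at_bats) pairs, pooled over all splits."""
    total_hits = sum(hits for hits, _ in splits)
    total_at_bats = sum(at_bats for _, at_bats in splits)
    return total_hits / total_at_bats

# First half: 20 hits in 100 at bats (.200).
# Second half: 200 hits in 500 at bats (.400).
halves = [(20, 100), (200, 500)]

naive = (20 / 100 + 200 / 500) / 2  # average of the two averages
pooled = season_average(halves)     # 220 hits / 600 at bats

print(round(naive, 3))   # 0.3
print(round(pooled, 3))  # 0.367
```

The pooled figure is a weighted mean of the two half-season averages, with the at bats as weights, which is why the busier second half pulls it toward .400.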
Averaging times is also problematic. Suppose you want to find out the average time you went to bed in the last week. Your bedtimes were: 10 PM, 10 PM, 11 PM, 1 AM, 2 AM, 10 PM and 10 PM. How to find the average? (10 + 10 + 11 + 1 + 2 + 10 + 10)/7 = 7.71? Huh? Your average bedtime was not between 7 and 8!
One way to solve this is to measure each bedtime in, say, hours past the previous noon. That gives 10, 10, 11, 13, 14, 10 and 10, and now the average is 11.14, or just about 11:09 PM. That makes sense.
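The hours-past-noon trick can be sketched in Python as follows. The function `mean_bedtime` is a hypothetical helper, and it assumes every bedtime falls between noon and the following noon. Note that 2 AM is 14 hours past noon, so the shifted values add up to 78.

```python
def mean_bedtime(bedtimes_24h):
    """Average bedtimes given as hours on a 24-hour clock (22 = 10 PM, 1 = 1 AM).

    Assumes every bedtime falls between noon and the following noon.
    Returns (hour, minute) on a 24-hour clock.
    """
    # Re-express each time as hours past the previous noon: 22 -> 10, 1 -> 13.
    hours_past_noon = [(h - 12) % 24 for h in bedtimes_24h]
    mean_hours = sum(hours_past_noon) / len(hours_past_noon)
    # Convert back to clock time in whole minutes, so we never print "11:60".
    minutes_past_midnight = (round(mean_hours * 60) + 12 * 60) % (24 * 60)
    return divmod(minutes_past_midnight, 60)

# 10 PM, 10 PM, 11 PM, 1 AM, 2 AM, 10 PM, 10 PM
week = [22, 22, 23, 1, 2, 22, 22]

hour, minute = mean_bedtime(week)
print(f"{hour}:{minute:02d}")  # 23:09, i.e. about 11:09 PM
```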
If there are extreme values, often called outliers, then the mean can be, if not exactly wrong, then certainly misleading. If you are figuring the average height of a group of college students, and your sample happens to include the center on the basketball team, who is 7’2″ tall, then your average won’t be a very good representation of the real average height at your school.
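To see how much one extreme value can move the mean, here is a small Python sketch. The five classmates' heights are numbers I made up purely for illustration; only the 7’2″ center (86 inches) comes from the example above.

```python
from statistics import mean

# Hypothetical heights in inches, invented for this illustration.
classmates_in = [66, 68, 70, 64, 69]
center_in = 7 * 12 + 2  # 7 feet 2 inches = 86 inches

mean_without_center = mean(classmates_in)
mean_with_center = mean(classmates_in + [center_in])

print(mean_without_center)  # 67.4
print(mean_with_center)     # 70.5
```

In this made-up sample, a single very tall student raises the mean by about three inches, to a value taller than every one of the other five.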
So, even with the mean, you can go wrong.
In later articles, I will look at some other measures of central tendency.
I specialize in helping graduate students and researchers in psychology, education, economics and the social sciences with all aspects of statistical analysis. Many new and relatively uncommon statistical techniques are available, and these may widen the field of hypotheses you can investigate. Graphical techniques are often misapplied, but, done correctly, they can summarize a great deal of information in a single figure. I can help with writing papers, writing grant applications, and doing analysis for grants and research.
Specialties: Regression, logistic regression, cluster analysis, statistical graphics, quantile regression.