When it comes to measures of central tendency or location, the arithmetic mean and the median get a lot of praise. Other measures such as the trimmed mean, geometric mean and so on also get some mention. But what about the mode?

Question: What is principal component analysis in super-layman terms?

My answer: Principal component analysis is a dimension reduction method.

Suppose you have a great many variables – too many to deal with effectively. If you want to replace them with a smaller number of variables, while losing as little information as possible, PCA is one way to do it.

It is different from (but related to) factor analysis, which attempts to find latent factors – that is, things that cannot be directly measured.

The language used in these two methods is extremely confusing.
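
To make the "fewer variables, little information lost" idea concrete, here is a minimal sketch using NumPy. Everything in it is an assumption for illustration: the data are simulated (six observed variables driven by two underlying signals plus a little noise), and the helper name `pca` is made up. It computes the components from the singular value decomposition of the centered data, which is one common way to do PCA, not the only one.

```python
import numpy as np

def pca(X, n_components):
    """Project X (rows = observations) onto its top principal components.

    Hypothetical helper for illustration: returns the component scores
    and the share of total variance each kept component explains.
    """
    Xc = X - X.mean(axis=0)                      # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # variance share per component
    scores = Xc @ Vt[:n_components].T            # the new, smaller set of variables
    return scores, explained[:n_components]

# Simulated data (an assumption): 6 variables built from 2 hidden signals.
rng = np.random.default_rng(0)
signals = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = signals @ mixing + 0.05 * rng.normal(size=(200, 6))

scores, explained = pca(X, 2)
print(scores.shape)      # 200 rows, now with 2 columns instead of 6
print(explained.sum())   # share of the original variance kept by 2 components
```

Because the six variables here really are mixtures of two signals, two components keep almost all of the variance. Real data are rarely this tidy, so the share retained is something to check rather than assume.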

Question: How does ridge regression work?

My answer: OLS estimators are BLUE – best linear unbiased estimators – when the Gauss-Markov assumptions hold.

But sometimes forcing unbiasedness causes other problems. In particular, if the independent variables are fairly collinear, then the variances of the parameter estimates will be huge and small differences in the input data can make huge differences in the parameter estimates.

Ridge regression allows some bias in order to lower the variance.
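
The bias-for-variance trade can be seen in a small simulation. This is a sketch under stated assumptions, not a recipe: the data are simulated with two nearly identical predictors, there is no intercept, the penalty value of 1.0 is arbitrary, and `ridge_fit` is a hypothetical helper that uses the closed-form estimate (X'X + lambda I)^-1 X'y. Setting the penalty to zero gives ordinary least squares.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate; lam = 0 reduces to ordinary least squares."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(1)
ols_coefs, ridge_coefs = [], []

# Refit both models on 500 fresh samples to see how much the estimates move.
for _ in range(500):
    x1 = rng.normal(size=50)
    x2 = x1 + 0.01 * rng.normal(size=50)     # nearly a copy of x1 (collinear)
    X = np.column_stack([x1, x2])
    y = x1 + x2 + rng.normal(size=50)        # true coefficients are (1, 1)
    ols_coefs.append(ridge_fit(X, y, 0.0))   # OLS
    ridge_coefs.append(ridge_fit(X, y, 1.0)) # ridge with an arbitrary penalty

ols_var = np.var(ols_coefs, axis=0)
ridge_var = np.var(ridge_coefs, axis=0)
print("OLS coefficient variances:  ", ols_var)
print("Ridge coefficient variances:", ridge_var)
print("Ridge coefficient means:    ", np.mean(ridge_coefs, axis=0))
```

In practice the penalty is usually chosen by cross-validation and the predictors are standardized first; both steps are left out here to keep the sketch short.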

Question: What does it mean when standard deviation is higher than the mean?

My answer: It depends on the nature of the data.

If all of the values are positive, then it indicates that there is quite a bit of spread. The ratio sd/mean is the coefficient of variation (CV), which can be useful for comparing the degree of “spreadoutedness” of two distributions with different means.

But if some of the data are negative, then the comparison of sd to mean stops having any meaning.

For example, suppose there are 3 variables X, Y and Z:

X: 1, 2, 3, 4, 5

Y: 10, 20, 30, 40, 50

Z: -2, -1, 0, 1, 2

Intuitively, these all seem to have the same spread in comparison to their size.

X has mean = 3, sd = 1.58, CV = 0.53

Y has mean = 30, sd = 15.81, CV = 0.53

Z has mean = 0, sd = 1.58, CV = undefined (sd divided by a mean of 0)
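
The figures for X, Y and Z can be reproduced with Python's standard `statistics` module. This is a small sketch: `coefficient_of_variation` is a made-up helper name, it uses the sample standard deviation (n - 1 in the denominator), and it returns `None` when the mean is zero because the ratio is not defined there.

```python
import statistics

def coefficient_of_variation(values):
    """Sample sd divided by the mean; None when the mean is zero."""
    mean = statistics.mean(values)
    if mean == 0:
        return None                       # sd / 0 has no meaningful value
    return statistics.stdev(values) / mean

data = {
    "X": [1, 2, 3, 4, 5],
    "Y": [10, 20, 30, 40, 50],
    "Z": [-2, -1, 0, 1, 2],
}

for name, values in data.items():
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    cv = coefficient_of_variation(values)
    print(name, "mean =", mean, "sd =", round(sd, 2), "CV =", cv)
```

X and Z have exactly the same standard deviation, yet only X gets a usable CV, which is the point: the ratio only makes sense when the values sit on one side of zero.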

Question: What makes a good statistician?

My answer: Curiosity, persistence, some anal tendencies, a willingness to ask questions, a willingness to question answers.