Super simple macros to make a statistician’s life easier

By Peter Flom, September 4, 2010 6:46 pm

I will be presenting this at NESUG in Baltimore in November.

Macros can be a very complex topic, but some very simple macros can make life easier for a data analyst or statistician. I give a very basic introduction to macros from the perspective of a data analyst, and present some macros I have found useful. I include only certain types of macros, deliberately choosing the options I find easiest to understand and use. Again, this is a paper intended for statisticians and data analysts, not programmers. I am following the KISS principle: Keep It Simple, Statistician!


SAS tip: Why you always should use a RUN statement

By Peter Flom, July 18, 2010 5:23 pm

OK, lots of places say that RUN statements make code look cleaner, but that invoking another PROC statement causes the previous PROC to be submitted anyway. So it sounds like the RUN statement is a sort of esthetic extra.

But leaving it out can bite you.


Using ridits to assign scores to categories of ordinal scales

By Peter Flom, June 10, 2010 10:40 am

When dealing with ordinal data, many methods require you to assign a number or score to each level of a variable. For instance, if you ask people whether their political orientation is very conservative, somewhat conservative, moderate, somewhat liberal or very liberal, you might assign these levels scores of 1, 2, 3, 4 and 5, respectively. But that choice is somewhat arbitrary.

One alternative was suggested by Bross (1958) and brought to my attention in reading Alan Agresti’s excellent book, Analysis of Ordinal Categorical Data.
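As a rough illustration of the idea, here is a minimal Python sketch of ridit scores as they are usually defined: the score for a category is the proportion of observations in lower categories plus half the proportion in the category itself. The function name and the counts are hypothetical, chosen only for the example; they are not taken from Bross or Agresti.

```python
def ridit_scores(counts):
    """Return one ridit score per ordered category.

    counts: observed frequencies, ordered from lowest to highest category.
    Score = (number below the category + half the number in it) / total.
    """
    total = sum(counts)
    scores = []
    below = 0  # running count of observations in lower categories
    for c in counts:
        scores.append((below + c / 2) / total)
        below += c
    return scores


# Hypothetical counts for: very conservative, somewhat conservative,
# moderate, somewhat liberal, very liberal
counts = [10, 20, 40, 20, 10]
scores = ridit_scores(counts)
print(scores)
```

Unlike the scores 1 through 5, these depend on how many people fall into each category, so sparsely populated categories at the extremes sit close to their neighbors.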

Book review: SAS and R by Ken Kleinman and Nicholas J. Horton

By Peter Flom, May 10, 2010 2:26 pm

There are many books that teach you to use SAS or that teach you to use R. There is at least one book that teaches R to people who know SAS or SPSS (R for SAS and SPSS Users by Robert Muenchen, and it’s very good).


When imputing interactions, squares, and so on, transform then impute

By Peter Flom, May 2, 2010 5:59 pm

In a recent article in Sociological Methodology entitled “How to impute interactions, squares, and other transformed variables”, Paul T. von Hippel shows that, when you have missing data and are using interactions, squares, or other transformed variables in a regression, it is better to transform first, and then impute.

In multiple imputation, the problem of missing data is dealt with by imputing multiple sets of data, and then combining the results. When there are no interactions or quadratics, the process is well understood. But relatively little is known about the proper procedure when you do have transformations. von Hippel shows, using both mathematics and example data, that it is better to first transform the data that you do have, and then impute. This leads to the odd situation that, e.g., the imputed values of X^2 are not equal to the squares of the imputed values of X. But doing it in the reverse order (that is, imputing and then transforming) yields biased estimates of the regression coefficients.

This is so both for ordinary least squares regression and other regression models.
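To make the two orderings concrete, here is a small Python sketch. It is only an illustration of the order of operations: it uses simple mean imputation as a stand-in, which is not multiple imputation and does not reproduce von Hippel's bias results. The data and the helper name `impute_mean` are hypothetical.

```python
from statistics import mean


def impute_mean(column):
    """Fill None entries with the mean of the observed entries.

    A deliberately simple stand-in for a real imputation model.
    """
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


# Hypothetical data with one missing value
x = [1.0, 2.0, None, 4.0, 5.0]

# Transform, then impute: square the observed values first, then
# impute X and X^2 as if X^2 were just another variable.
x_sq = [None if v is None else v * v for v in x]
x_imputed = impute_mean(x)
x_sq_imputed = impute_mean(x_sq)

# Impute, then transform: impute X, then square the completed column.
x_sq_from_imputed = [v * v for v in x_imputed]

print(x_imputed[2], x_sq_imputed[2], x_sq_from_imputed[2])
```

In the first ordering, the filled-in value of X^2 is not the square of the filled-in value of X, which is exactly the odd situation described above.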

I found the article fascinating and accessible.