PROC LOGISTIC: Concordant and discordant

By , April 25, 2010 6:48 pm

Description of concordant and discordant in SAS PROC LOGISTIC
Part of the default output from PROC LOGISTIC is a table that has entries including ‘percent concordant’ and ‘percent discordant’. To me, this implies the percent that would correctly be assigned, based on the results of the logistic regression. But that is not what it is. It looks at all possible pairs of observations. A pair is concordant if the observation with the larger value of X also has the larger value of Y. A pair is discordant if the observation with the larger value of X has the smaller value of Y; here, X and Y are the predicted value and the actual value.
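
The pair counting described above can be sketched in a few lines of Python. This is only an illustration of the definition, not a reproduction of how PROC LOGISTIC computes its table: the function name pair_percentages and the toy data are made up for this example, and the sketch assumes that pairs with the same actual value are skipped (neither observation has the larger Y) and that pairs with equal predicted values are counted as tied.

```python
from itertools import combinations


def pair_percentages(predicted, actual):
    """Return (percent concordant, percent discordant, percent tied).

    Hypothetical helper: compares every pair of observations whose
    actual values differ.
    """
    concordant = discordant = tied = 0
    for (p1, y1), (p2, y2) in combinations(zip(predicted, actual), 2):
        if y1 == y2:
            continue  # no "larger Y" in this pair, so it is not compared
        # Orient the pair: p_high_y belongs to the observation with larger Y.
        p_high_y, p_low_y = (p1, p2) if y1 > y2 else (p2, p1)
        if p_high_y > p_low_y:
            concordant += 1   # larger predicted value goes with larger Y
        elif p_high_y < p_low_y:
            discordant += 1   # larger predicted value goes with smaller Y
        else:
            tied += 1         # equal predicted values
    total = concordant + discordant + tied
    if total == 0:
        raise ValueError("need at least one pair with different actual values")
    return tuple(100.0 * count / total for count in (concordant, discordant, tied))


# Toy data: predicted probabilities and observed 0/1 outcomes.
predicted = [0.9, 0.7, 0.4, 0.4, 0.2]
actual = [1, 0, 1, 0, 0]
result = pair_percentages(predicted, actual)
print("concordant %.1f%%, discordant %.1f%%, tied %.1f%%" % result)
```

In the toy data there are six pairs with different outcomes: four are concordant, one is discordant, and one is tied. Note that none of these percentages is the share of observations classified correctly.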

SAS v R: Getting help

By , April 25, 2010 10:26 am

I introduced this series the other day. Next up in the list is “getting help”. In both SAS and R, there are many sources of help. SAS has one that the usual R package does not have – technical support – although if you read the comments to the above article, you see that there are commercial versions of R that do have it. I won’t say much about this, because I think it’s bound up with the fact that R is free and SAS is not. I do find SAS technical support very helpful.

Why grant writers need statisticians

By , April 23, 2010 1:35 pm

There are many reasons to write a grant, and many places to apply for one – from small grants for a few thousand dollars, to multi-year grants for many millions of dollars.  If your grant involves any sort of data analysis or data collection, even something very simple, it can be worth your while to consult with a statistician.  It is better to consult early in the process.  Although consulting costs money in the short term, it can save you a lot of time and money in the long term, and can improve your chances of getting a grant.

Some ways a statistician can help a grant writer:

1) Finding instruments – not all statisticians can do this, but many (including myself) can. There is a huge array of psychological instruments out there.

2) Making data collection appropriate – when people come to me with data, it has often been collected in ways that make it hard to analyze. Then I spend hours manipulating the data into the proper format. If they had come to me before starting, it would have taken me a lot less time to show them a better way.

3) Power analysis. Many federal agencies, such as the National Institutes of Health, actually require a power analysis. Even if you aren’t required to do one, it can be very helpful to do so – to see how many subjects you will need to detect various effects.

4) Analysis plan. If you come to the statistician (such as me) early, then he or she can help you answer the questions you want to ask, rather than the questions that the statistical techniques you are familiar with can answer. There is a wide range of statistical techniques out there, and it’s better to let the substantive questions drive the analysis than the other way round. A good carpenter has a big set of tools; but if you are not a carpenter, you may only have a few.

5) Doing the actual analysis – Once you get your grant, and start collecting data, you’ll want to analyze it. A good statistician can do it accurately and quickly, and show you the results in ways you understand.
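
To give a feel for the power analysis in item 3, here is a rough Python sketch of one common textbook case: the number of subjects per group needed to compare two group means, using the normal approximation n = 2 * ((z_alpha + z_beta) / d)^2. The function name subjects_per_group, the choice of a two-sided test at alpha = 0.05, the 80% power target, and the standardized effect sizes are all assumptions made for illustration. A real grant application would need an analysis matched to its actual design.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided, two-group comparison.

    Hypothetical helper using the normal approximation; effect_size is the
    difference in means divided by the common standard deviation.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = standard_normal.inv_cdf(power)           # quantile for desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# How many subjects are needed to detect various (assumed) effect sizes?
for d in (0.2, 0.5, 0.8):
    print(d, subjects_per_group(d))
```

Under these assumptions, smaller effects need far more subjects: about 25 per group for an effect size of 0.8, but 393 per group for 0.2. Exact t-test calculations give slightly larger numbers than this approximation.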

SAS v. R: Ease of learning

By , April 19, 2010 10:01 am

Two days ago, I wrote an introduction to this series.

Today, I will discuss ease of learning. Unlike the earlier post (and, I hope, most of the ones to come) this one is inherently subjective. “Ease of learning” is not the same for everyone – indeed, one thing I’d like to explore here and in the comments is why some people find SAS easier to learn, while others find R easier to learn. (Note that I am only discussing ease of use for statistical analysis and the data management necessary to do that analysis.)

SAS vs. R: Introduction and request

By , April 17, 2010 10:10 am

Lately, across the statistical blogosphere, the recurring discussion of R vs. SAS has started up again. In this series of posts, I’ll offer my opinions of the programs, and supply some information. In this post, I introduce the series and say a little about my background, so you can see where my opinions come from.

I’m a data analyst/statistician. Mostly, I work with researchers in the social and behavioral sciences, education, and medical fields. I’ve been using SAS for about 15 years and R for about 5, and I use SAS more than R. I am not a programmer.

There are many statistical packages, but there are two (SAS and R) that I use regularly – in fact, I use both every day. I like both. I don’t want to give up either. But they are very different.

Websites for SAS and R
For more on SAS, see their home page; for more on R, see the R project page.

Two basic differences between SAS and R
Two uncontroversial differences are:
1. SAS is commercial, R is free.

That is, with SAS, you pay an annual license fee, which varies depending on many factors. R, on the other hand, is free. Anyone can download it. (R is a dialect of S. There is also a commercial version of S, called S-PLUS, but I haven’t used it, and I don’t see it mentioned much. There is also at least one commercial version of R; see the comments.)

2. SAS has tech support, R does not.
This is, of course, related to the first point. One of the things you are paying for with SAS is tech support – available by phone or e-mail. I have found SAS tech support to be among the best of any software I’ve used.

What I plan to cover in future posts
1. Ease of learning
2. Getting help
3. Error messages
4. Speed
5. Available statistics

Request for assistance
This series is really for two groups of people: Those trying to learn a little about these two prominent statistical software packages, and those who already know a lot, but want to discuss things. It might get a bit heated in the latter group – people have strong opinions. Please keep it civil.

If you have particularly good links on any of the above, or ideas for more topics in this series, or anything else you’d like to contribute, let me know.
I plan to search using Google (of course) and also look through SAS-L, R-help, and Stack Overflow quite a bit. If there are other good places for me to look, let me know that too.